Tutorial: Javadoc

Overview

To access Javadoc associated with a program element, we use member predicate getDoc of class Element, which returns a Documentable. Class Documentable, in turn, offers a member predicate getJavadoc to retrieve the Javadoc attached to the element in question, if any.

Javadoc comments are represented by class Javadoc, which provides a view of the comment as a tree of JavadocElement nodes. Each JavadocElement is either a JavadocTag, representing a tag, or a JavadocText, representing a piece of free-form text.

The most important member predicates of class Javadoc are:

  • getAChild - retrieves a top-level JavadocElement node in the tree representation.
  • getVersion - returns the value of the @version tag, if any.
  • getAuthor - returns the value of the @author tag, if any.

For example, the following query finds all classes that have both an @author tag and a @version tag, and returns this information:

import java

from Class c, Javadoc jdoc, string author, string version
where jdoc = c.getDoc().getJavadoc() and
    author = jdoc.getAuthor() and
    version = jdoc.getVersion()
select c, author, version

JavadocElement defines member predicates getParent and getAChild to navigate up and down the tree of elements. Its subclass JavadocTag also provides a predicate getTagName to return the tag’s name, and a predicate getText to access the text associated with the tag.

We could rewrite the above query to use this API instead of getAuthor and getVersion:

import java

from Class c, Javadoc jdoc, JavadocTag authorTag, JavadocTag versionTag
where jdoc = c.getDoc().getJavadoc() and
    authorTag.getTagName() = "@author" and authorTag.getParent() = jdoc and
    versionTag.getTagName() = "@version" and versionTag.getParent() = jdoc
select c, authorTag.getText(), versionTag.getText()

Class JavadocTag has several subclasses representing specific kinds of Javadoc tags:

  • ParamTag represents @param tags; member predicate getParamName returns the name of the parameter being documented.
  • ThrowsTag represents @throws tags; member predicate getExceptionName returns the name of the exception being documented.
  • AuthorTag represents @author tags; member predicate getAuthorName returns the name of the author.
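
As a small sketch of these subclasses in use, the author lookup from the earlier queries could be phrased with AuthorTag instead of comparing tag names. It relies only on the predicates already mentioned in this tutorial (getParent, getDoc, getJavadoc, and getAuthorName); the variable names are arbitrary.

```ql
import java

// Find classes whose Javadoc comment carries an @author tag at the top level,
// and report the author name recorded by that tag.
from Class c, AuthorTag authorTag
where authorTag.getParent() = c.getDoc().getJavadoc()
select c, authorTag.getAuthorName()
```

Compared with the getTagName version, the tag kind is expressed through the type of the variable, so no string comparison against "@author" is needed.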

Example: Finding spurious @param tags

As an example of using the CodeQL Javadoc API, let us write a query that finds @param tags that refer to a non-existent parameter.

For example, consider the following program:

import java.util.List;

class A {
    /**
     * @param lst a list of strings
     */
    public String get(List<String> list) {
        return list.get(0);
    }
}

Here, the @param tag on A.get misspells the name of parameter list as lst. Our query should be able to find such cases.

To begin with, we write a query that finds all callables (that is, methods or constructors) and their @param tags:

import java

from Callable c, ParamTag pt
where c.getDoc().getJavadoc() = pt.getParent()
select c, pt

It is now easy to add another conjunct to the where clause, restricting the query to @param tags that refer to a non-existent parameter: we simply need to require that no parameter of c has the name pt.getParamName().

import java

from Callable c, ParamTag pt
where c.getDoc().getJavadoc() = pt.getParent() and
    not c.getAParameter().hasName(pt.getParamName())
select pt, "Spurious @param tag."

Example: Finding spurious @throws tags

A related, but somewhat more involved, problem is finding @throws tags that refer to an exception that the method in question cannot actually throw.

For example, consider the following Java program:

import java.io.IOException;

class A {
    /**
     * @throws IOException thrown if some IO operation fails
     * @throws RuntimeException thrown if something else goes wrong
     */
    public void foo() {
        // ...
    }
}

Notice that the Javadoc comment of A.foo documents two thrown exceptions: IOException and RuntimeException. The former is clearly spurious: A.foo does not have a throws IOException clause, and thus cannot throw this kind of exception. On the other hand, RuntimeException is an unchecked exception, so it can be thrown even if there is no explicit throws clause listing it. Therefore, our query should flag the @throws tag for IOException, but not the one for RuntimeException.

Recall from above that the CodeQL library represents @throws tags using class ThrowsTag. This class does not provide a member predicate for determining the exception type that is being documented, so we first need to implement our own version. A simple version might look as follows:

RefType getDocumentedException(ThrowsTag tt) {
    result.hasName(tt.getExceptionName())
}

Similarly, Callable does not come with a member predicate for querying all exceptions that the method or constructor may possibly throw. We can, however, implement this ourselves by using getAnException to find every exception listed in the throws clause of the callable, and then getType to resolve the corresponding exception types:

predicate mayThrow(Callable c, RefType exn) {
    exn.getASupertype*() = c.getAnException().getType()
}

Note the use of getASupertype* to find both exceptions declared in a throws clause and their subtypes. For instance, if a method has a throws IOException clause, it may throw MalformedURLException, which is a subtype of IOException.
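
To see what the reflexive transitive closure adds, a minimal exploratory sketch is shown below. It uses only getAnException, getType, and getASupertype from the text above, with getASupertype+ (one or more steps) in place of getASupertype* so that only proper subtypes are listed. The variable names declared and subtype are illustrative.

```ql
import java

// For each callable, list an exception type named in its throws clause
// together with a proper subtype of that type. These subtypes are the
// extra exceptions that mayThrow accepts because of getASupertype*.
from Callable c, RefType declared, RefType subtype
where declared = c.getAnException().getType() and
    subtype.getASupertype+() = declared
select c, declared, subtype
```

On a large database this can produce many rows, so it is meant for inspecting a few callables rather than as a finished query.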

Now we can write a query for finding all callables c and @throws tags tt such that:

  • tt belongs to a Javadoc comment attached to c.
  • c cannot throw the exception documented by tt.

import java

// Insert the definitions from above

from Callable c, ThrowsTag tt, RefType exn
where c.getDoc().getJavadoc() = tt.getParent+() and
    exn = getDocumentedException(tt) and
    not mayThrow(c, exn)
select tt, "Spurious @throws tag."

This query finds several results in the LGTM.com demo projects.

Improvements

Currently, there are two problems with this query:

  1. getDocumentedException is too liberal: it will return any reference type with the right name, even if it is in a different package and not actually visible in the current compilation unit.
  2. mayThrow is too restrictive: it does not account for unchecked exceptions, which do not need to be declared.

To see why the former is a problem, consider the following program:

class IOException extends Exception {}

class B {
    /** @throws IOException an IO exception */
    void bar() throws IOException {}
}

This program defines its own class IOException, which is unrelated to the class java.io.IOException in the standard library: they are in different packages. Our getDocumentedException predicate does not check packages, however, so it will consider the @throws tag to refer to both IOException classes, and thus flag the tag as spurious, since B.bar cannot actually throw java.io.IOException.

As an example of the second problem, method A.foo from our previous example was annotated with a @throws RuntimeException tag. Our current version of mayThrow, however, would think that A.foo cannot throw a RuntimeException, and thus flag the tag as spurious.

We can make mayThrow less restrictive by introducing a new class to represent unchecked exceptions, which are just the subtypes of java.lang.RuntimeException and java.lang.Error:

class UncheckedException extends RefType {
    UncheckedException() {
        this.getASupertype*().hasQualifiedName("java.lang", "RuntimeException") or
        this.getASupertype*().hasQualifiedName("java.lang", "Error")
    }
}

Now we incorporate this new class into our mayThrow predicate:

predicate mayThrow(Callable c, RefType exn) {
    exn instanceof UncheckedException or
    exn.getASupertype*() = c.getAnException().getType()
}

Fixing getDocumentedException is more complicated, but we can easily cover three common cases:

  1. The @throws tag specifies the fully qualified name of the exception.
  2. The @throws tag refers to a type in the same package.
  3. The @throws tag refers to a type that is imported by the current compilation unit.

The first case can be covered by changing getDocumentedException to compare the exception name given in the @throws tag against the qualified name of the type. To handle the second and third cases, we can introduce a new predicate visibleIn that checks whether a reference type is visible in a compilation unit, either by virtue of belonging to the same package or by being explicitly imported. We then rewrite getDocumentedException as follows:

predicate visibleIn(CompilationUnit cu, RefType tp) {
    cu.getPackage() = tp.getPackage()
    or
    exists(ImportType it | it.getCompilationUnit() = cu | it.getImportedType() = tp)
}

RefType getDocumentedException(ThrowsTag tt) {
    result.getQualifiedName() = tt.getExceptionName()
    or
    (result.hasName(tt.getExceptionName()) and visibleIn(tt.getFile(), result))
}

This version finds far fewer, but more interesting, results in the LGTM.com demo projects.
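
For reference, the definitions developed in this section can be assembled into a single query file, as sketched below. Nothing new is introduced: the class, the predicates, and the from/where/select clause are the ones given above, with comments added.

```ql
import java

/** An unchecked exception type: a subtype of RuntimeException or Error. */
class UncheckedException extends RefType {
    UncheckedException() {
        this.getASupertype*().hasQualifiedName("java.lang", "RuntimeException") or
        this.getASupertype*().hasQualifiedName("java.lang", "Error")
    }
}

/** Holds if `c` may throw `exn`: it is unchecked, or declared (possibly via a supertype). */
predicate mayThrow(Callable c, RefType exn) {
    exn instanceof UncheckedException or
    exn.getASupertype*() = c.getAnException().getType()
}

/** Holds if `tp` is in the same package as `cu` or is imported by a single-type import. */
predicate visibleIn(CompilationUnit cu, RefType tp) {
    cu.getPackage() = tp.getPackage()
    or
    exists(ImportType it | it.getCompilationUnit() = cu | it.getImportedType() = tp)
}

/** Gets a type that the `@throws` tag `tt` plausibly refers to. */
RefType getDocumentedException(ThrowsTag tt) {
    result.getQualifiedName() = tt.getExceptionName()
    or
    (result.hasName(tt.getExceptionName()) and visibleIn(tt.getFile(), result))
}

from Callable c, ThrowsTag tt, RefType exn
where c.getDoc().getJavadoc() = tt.getParent+() and
    exn = getDocumentedException(tt) and
    not mayThrow(c, exn)
select tt, "Spurious @throws tag."
```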

Currently, visibleIn only considers single-type imports, but it would be possible to extend it with support for other kinds of imports.
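
One possible extension is sketched below. It rests on assumptions that are not part of this tutorial: that the CodeQL Java library provides a class ImportOnDemandFromPackage with a member predicate getPackageHoldingImport for imports such as import java.io.*;, and that Package supports hasName with the full package name. Both should be checked against the library version in use. The sketch also treats java.lang as always visible, since Java imports that package implicitly.

```ql
import java

// Sketch only: the on-demand import handling relies on library names that
// are assumptions here (ImportOnDemandFromPackage, getPackageHoldingImport).
predicate visibleIn(CompilationUnit cu, RefType tp) {
    // Same package as the compilation unit.
    cu.getPackage() = tp.getPackage()
    or
    // Single-type import, for example: import java.io.IOException;
    exists(ImportType it | it.getCompilationUnit() = cu | it.getImportedType() = tp)
    or
    // On-demand import of a whole package, for example: import java.io.*;
    exists(ImportOnDemandFromPackage iod | iod.getCompilationUnit() = cu |
        iod.getPackageHoldingImport() = tp.getPackage()
    )
    or
    // Types in java.lang are implicitly imported into every compilation unit.
    tp.getPackage().hasName("java.lang")
}
```

This version is deliberately approximate. It does not model static imports or on-demand imports from a type, and it compares packages only, so it may also accept nested types that an on-demand package import would not bring into scope by their simple name.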

What next?