LGTM Enterprise 1.22.2

Analysis FAQs

Brief answers to common questions about analysis in LGTM.

For more detailed information about how LGTM works, see LGTM analysis. For information on what happens when you add a new project, see What happens to new projects?

Does LGTM analyze forks?

LGTM doesn't analyze forks. A forked project is usually very similar to the project it was derived from, so analyzing it is like analyzing the same codebase multiple times. If LGTM analyzed forks, alerts would show up multiple times, and would be attributed to the same developer multiple times. Analyzing forked projects would also be a significant waste of computing resources.

If you try to add a fork to LGTM, you'll be instructed to add the original repository instead.

Which languages are supported?

This release of LGTM includes support for the following languages:

Supported language       Language code
C and C++                cpp
C#                       csharp
COBOL                    cobol
Java                     java
JavaScript/TypeScript    javascript
Python                   python

Language codes can be used to search by language. For example, to see all Java queries, search for language:java and then click Queries in the sidebar at the left of the results. For more information, see Searching.
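
As a small illustration, the search term for a language can be built from the table above. This is only a sketch: the dictionary and the function name language_search_term are made up for this example and are not part of LGTM.

```python
# Language codes copied from the table of supported languages.
LANGUAGE_CODES = {
    "C and C++": "cpp",
    "C#": "csharp",
    "COBOL": "cobol",
    "Java": "java",
    "JavaScript/TypeScript": "javascript",
    "Python": "python",
}


def language_search_term(language_name: str) -> str:
    """Return the search term for a supported language (hypothetical helper)."""
    return f"language:{LANGUAGE_CODES[language_name]}"


print(language_search_term("Java"))  # language:java
```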

How does LGTM determine which languages to analyze?

When you add a project to LGTM, LGTM attempts to analyze every supported language. If the analysis succeeds, we assume the repository contains that language (the analysis always fails if there isn't any of the language in the repository).

The most recent revision of the codebase is analyzed first. If this succeeds, analysis of historic revisions begins, working backward through the revision history.

For compiled languages, such as Java, the default environment used by LGTM may not be able to build the application. If this happens, analysis for that language is disabled. So a project with JavaScript and Java code may initially only be detected as a JavaScript project because the Java code failed to build.
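
The detection rule described above can be sketched as follows. This is a simplified model, not LGTM's implementation: the function detect_languages and the try_analysis callback are hypothetical, and the outcomes are invented to mirror the JavaScript and Java example.

```python
from typing import Callable, List

# Language codes for the languages supported in this release.
SUPPORTED = ["cpp", "csharp", "cobol", "java", "javascript", "python"]


def detect_languages(try_analysis: Callable[[str], bool]) -> List[str]:
    """Keep every language whose trial analysis succeeded (hypothetical helper)."""
    return [code for code in SUPPORTED if try_analysis(code)]


# Assumed outcomes for a repository that contains JavaScript and Java,
# where the Java build fails in the default environment.
outcomes = {"javascript": True, "java": False}
detected = detect_languages(lambda code: outcomes.get(code, False))
print(detected)  # ['javascript']
```

In this model the project is initially reported as a JavaScript project only, even though it also contains Java code.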

What if LGTM fails to analyze a language?

LGTM may occasionally fail to analyze one (or more) of the languages in your project. This may happen for a variety of reasons, for example if the build of your project uses an uncommon command, or if LGTM fails to detect the correct Python version.

Note that it's possible to customize the LGTM build/extraction to meet your organization's requirements. For more details, see Customizing code extraction.

Also, if LGTM failed to detect a language in your project, or if analysis failed for a specific language, you can (re)trigger analysis. For more information, see Adding a language to a project.

If you still can't get this to work, tell your LGTM administrator, who will in turn let our Support team know. Don't forget to provide as much information as possible, including the URL for your project, as well as details of any build requirements.

Does LGTM show alerts for the codebase, or just the latest commit?

LGTM stores both types of information. To see:

  • All alerts found in the codebase, as of the most recently analyzed commit: on the project page, click Alerts to display the Active alerts list
  • The impact of the changes in a commit on alerts: on the Overview tab of the project page, click any commit to display the commit results

When you enable automatic code review for pull requests for a project, you can click through from your repository host to LGTM and see the alerts that will be introduced or fixed by the changes if the pull request is merged. For a list of repository host systems for which you can set up automated code review, see About LGTM.

How often does LGTM check for new commits?

LGTM checks the external repository host for new commits roughly once a day. When a new commit is found, LGTM analyzes it and updates the latest alerts view. This explains why you won't immediately see analysis results or alerts on LGTM for that commit. If you want a more rapid response, you should enable automated code review for pull requests. That way you’ll get results on proposed changes straight away, even before they are merged into your codebase.

What is upload analysis?

By default, LGTM is set up to run in full analysis mode. This means that it builds and analyzes every revision of your project, enabling it to attribute changes to individual committers, and providing the information needed to display personalized alert lists to contributors.

Some projects have highly specialized build environments that would be difficult to set up in LGTM. These projects are usually configured for upload analysis. With upload analysis, the code is built outside LGTM, and the QL command-line tools are used to create a snapshot for LGTM to analyze.

When a project is configured for upload analysis, instead of analyzing all commits for the project, LGTM only analyzes a commit when an administrator uploads a snapshot.

Your local LGTM administrator can enable upload analysis for your projects.

You won't see the following tabs for projects configured for this analysis mode:

Additionally, the My projects page doesn't display history information for such projects and some data may be missing in the My alerts page.

What is sparse analysis?

By default, LGTM is set up to run in full analysis mode. This means that it builds and analyzes every revision of your project, enabling it to attribute changes to individual committers, and providing the information needed to display personalized alert lists to contributors.

Full analysis may not be suitable for large projects, or projects whose history you are not interested in. You can analyze such projects in sparse analysis mode.

Your local LGTM administrator can enable sparse analysis for your projects.

Sparse analysis is particularly useful for larger projects as it reduces their computational burden. When a project is configured for sparse analysis, instead of analyzing all commits for the project, we only analyze and build the commits that are, or were, observed as being the HEAD of the repository. We identify HEAD commits when we poll the status of the repository. When sparse analysis is enabled, results are not contiguous because LGTM ignores some commits.
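
To illustrate why results are not contiguous, here is a minimal sketch of which commits would be analyzed in sparse mode. The commit names, the poll observations, and the function sparse_commits are all hypothetical.

```python
from typing import List


def sparse_commits(history: List[str], heads_seen_at_poll: List[str]) -> List[str]:
    """Return the commits, in history order, that were seen as HEAD during a poll."""
    seen = set(heads_seen_at_poll)
    return [commit for commit in history if commit in seen]


# Five commits, oldest first. The repository was polled three times and
# HEAD was c2, then c2 again, then c5.
history = ["c1", "c2", "c3", "c4", "c5"]
polled_heads = ["c2", "c2", "c5"]

analyzed = sparse_commits(history, polled_heads)
skipped = [commit for commit in history if commit not in analyzed]
print(analyzed)  # ['c2', 'c5']
print(skipped)   # ['c1', 'c3', 'c4']
```

Commits c3 and c4 were pushed between two polls, so they were never observed as HEAD and are not analyzed individually.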

You won't see the following tabs for projects configured for this analysis mode:

Additionally, the My projects page doesn't display history information for such projects and some data may be missing in the My alerts page.

What format do I use for the project repository URL of a project under TFVC control?

When adding a project to LGTM, you need to specify the URL for the project repository. For projects under TFVC control hosted using Azure DevOps Server (formerly TFS) or Azure DevOps Services (formerly VSTS), LGTM is very flexible about the format of URLs you can use. Valid URLs are given below for the same TFVC project repository, hosted using Azure DevOps Services:

https://yourorganization.visualstudio.com/DefaultCollection/dev-repo/your%20team/_versionControl?path=%24%2Fdev-repo%2Fpath%2Fto%2Fbranch

https://yourorganization.visualstudio.com/DefaultCollection/dev-repo/_versionControl?path=$/dev-repo/path/to/branch

https://yourorganization.visualstudio.com/dev-repo/_versionControl?path=$/dev-repo/path/to/branch

https://yourorganization.visualstudio.com/dev-repo?path=$/dev-repo/path/to/branch

https://yourorganization.visualstudio.com/dev-repo

As seen from the examples, the following URL items are optional:

  • team name
  • _versionControl
  • DefaultCollection (note that this item is optional for Azure DevOps Services only—it's mandatory for Azure DevOps Server)

You'll notice that no branch path is specified at the end of the URL in the last example. This means that the whole repository, as opposed to a specific branch, will be added to LGTM. If you add a whole repository, be aware that you're likely to get duplicate alerts if the same coding problem appears on more than one branch.
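
If you want to check which branch path a given URL refers to, the path parameter can be read with Python's standard library. This is only a sketch for inspecting URLs by hand; it does not show how LGTM itself parses them, and the function name tfvc_branch_path is made up for this example.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def tfvc_branch_path(url: str) -> Optional[str]:
    """Return the decoded 'path' query parameter, or None if the URL has none."""
    query = parse_qs(urlsplit(url).query)
    values = query.get("path")
    return values[0] if values else None


base = "https://yourorganization.visualstudio.com"

# Percent-encoded and plain forms refer to the same branch path.
encoded = tfvc_branch_path(base + "/dev-repo/_versionControl?path=%24%2Fdev-repo%2Fpath%2Fto%2Fbranch")
plain = tfvc_branch_path(base + "/dev-repo?path=$/dev-repo/path/to/branch")
whole_repo = tfvc_branch_path(base + "/dev-repo")

print(encoded)     # $/dev-repo/path/to/branch
print(plain)       # $/dev-repo/path/to/branch
print(whole_repo)  # None
```

A result of None corresponds to the last example above, where the whole repository is added rather than a specific branch.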

What's the default build environment for C# projects?

The build environment depends on the setup implemented by your system administrator. You may need .NET Core SDK to be installed in the build environment, and Mono (Linux) or a suitable version of Visual Studio (Windows). For more information on the autobuild process, and how to customize the build if these default settings don't work, see C# extraction.

What's the default build environment for Java projects?

The build environment depends on the setup implemented by your system administrator. On top of that, LGTM will automatically recognize the following Java build methods:

Why bother compiling code?

For each project, LGTM creates a detailed database to represent the hierarchical nature of the codebase. This database is analyzed using queries written in the QL language, which is optimized for querying recursive and hierarchical data. For more about the QL language see Introduction to the QL language.

For compiled languages, it turns out that much of the information you care about is only available during compilation time. Some of it (like language versions or special libraries) may depend on the precise flags passed to the compiler. Some of it may come from libraries that are downloaded as part of the build process. Some of it may rely on sources generated during the build. It may be impossible to make sense of the different parts of the system without knowing how they are meant to fit together, and this is usually implicitly defined by the build configuration. Getting all of this information into the QL database requires a build to succeed.

This approach gives more accurate results than traditional linting tools and creates a database that can be used to ask more complex and broader questions.

Why are some of the alerts in my Python projects false positives?

There are many language differences between Python 2 and 3, so LGTM analyzes Python 2 and Python 3 codebases differently. If your project is misclassified then you're likely to see many false positives from queries that target language differences.

How is the Python version identified?

The strategy used by LGTM to detect the Python version depends on the type of repository used to store your project:

For projects stored in Git repositories:

  1. LGTM looks in the configuration files (setup.py or .travis.yml) to see if a version of Python is specified.
  2. If a version is specified, LGTM uses that version to build the project. If both Python 2 and Python 3 are specified in the configuration files, LGTM uses Python 2.
  3. If no version is specified, LGTM looks at the dates of the commits. If all commits to the repository are dated after January 1, 2017 (shortly after Python 3.6 was released), LGTM assumes the project is written in Python 3 and uses that version. Otherwise, LGTM uses Python 2.

For projects stored in repositories other than Git (for example, Subversion or TFVC), the detection is much simpler. If a Python version is specified in the configuration files mentioned above, it will be used. If it isn't, LGTM will default to using Python 2.
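
The decision rules above can be summarized in a short sketch. This is a model of the documented behavior, not LGTM's code: the function detect_python_version and its parameters are hypothetical, and the handling of a Git repository with no commits (falling back to Python 2) is an assumption.

```python
from datetime import date
from typing import Iterable, Set

# Commits must all be dated after this day for Python 3 to be assumed.
CUTOFF = date(2017, 1, 1)


def detect_python_version(configured: Set[int], commit_dates: Iterable[date], is_git: bool) -> int:
    """Return 2 or 3 following the documented detection strategy (sketch)."""
    if configured:
        # A version from setup.py or .travis.yml takes priority.
        # Python 2 is used when both versions are declared.
        return 2 if 2 in configured else 3
    if is_git:
        dates = list(commit_dates)
        # Assumption: an empty history falls back to Python 2.
        if dates and all(d > CUTOFF for d in dates):
            return 3
    # Non-Git repositories, or Git repositories with older commits.
    return 2


print(detect_python_version({2, 3}, [], is_git=True))                 # 2
print(detect_python_version(set(), [date(2018, 5, 1)], is_git=True))  # 3
print(detect_python_version(set(), [date(2018, 5, 1)], is_git=False)) # 2
```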

A few snippets of configuration files are shown below.

A project is treated as using Python 3 if it has a Trove classifier for Python 3 in its setup.py file, for example:

classifiers=[
   ...
   'Programming Language :: Python :: 3',
   ...
   ],

or if it specifies Python 3 compatibility in its .travis.yml file, for example:

language: python
python:
   - "3.6"

You can use the lgtm.yml project configuration file to override the Python version that LGTM detects. For example, the following lgtm.yml snippet tells LGTM to use Python 3:

extraction:
  python:
    python_setup:
      version: 3

For more information about configuring Python analysis using the lgtm.yml project configuration file, see Python extraction.

Why don't the stats match those in my repo?

Repository statistics are usually cumulative and comprehensive, including everyone who has ever contributed to the project since its creation.

LGTM classifies (or tags) files according to their source and purpose. "Normal" code files are untagged, whereas other code files (such as automatically generated code, test code, or library code) are tagged. Results from tagged files are excluded from analysis and statistics; results from untagged files are included. That way, we are confident that the LGTM data you see is a true representation of coding effort. You can customize file classification. See File classification for more information.

Another reason is that there may be developers who only contribute code in languages that LGTM doesn't support.

Why does my commit show as failed?

The Overview tab for each project shows the commits analyzed, most recent first, with any commits that don't change analyzed code hidden. If analysis failed for any commits, you'll see a red dot after the commit message and the statistics will be zero.

Hover over the dot to display a tooltip with more information:

  • We were unable to build—seen on projects where only one language is analyzed by LGTM
  • 1 language could not be built—seen on projects where more than one language is analyzed by LGTM

This means either LGTM failed to build/analyze the commit itself, or it failed to build/analyze one of the parent commits.

If the commit, or a parent of the commit, couldn't be analyzed then LGTM can't accurately determine which alerts are new/fixed in the commit—the changes might have been made in an earlier commit.

If there's any doubt about when an alert or line of code was added/removed, LGTM suppresses the information. This is better than publishing inaccurate data. For more information on how alerts are attributed to people, see Attributing alerts.
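
The rule can be expressed as a small check. This is a sketch under stated assumptions: the function can_attribute, the commit names, and the data structures are hypothetical, and only direct parents are considered.

```python
from typing import Dict, List


def can_attribute(commit: str, parents: Dict[str, List[str]], analyzed: Dict[str, bool]) -> bool:
    """True only if the commit and all of its direct parents were analyzed."""
    if not analyzed.get(commit, False):
        return False
    return all(analyzed.get(parent, False) for parent in parents.get(commit, []))


# A linear history c1 <- c2 <- c3 in which analysis of c2 failed.
parents = {"c1": [], "c2": ["c1"], "c3": ["c2"]}
analyzed = {"c1": True, "c2": False, "c3": True}

for commit in ["c1", "c2", "c3"]:
    print(commit, can_attribute(commit, parents, analyzed))
```

In this example c3 built successfully, but its parent c2 did not, so alerts that first appear in c3 could have been introduced in either commit and are not attributed.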

Why can't I query my project?

When LGTM is upgraded, you may not be able to query a project or download a snapshot immediately. This is because LGTM has not yet analyzed a commit using the latest version of the database schema and query libraries.

The main reasons why this can happen are:

  1. The upgrade was very recent (within the last 24 hours) and we haven't yet polled the repository for new commits. For more information, see How often does LGTM check for new commits?
  2. We have tried to build and analyze a commit since the latest upgrade, but the build or analysis failed. This could be because the project no longer builds (with the current lgtm.yml configuration).

Suggested fixes

You can trigger an analysis on LGTM by pushing a new commit to the repository. If analysis fails, you should check the project's analysis configuration settings to make sure that the project still builds successfully on LGTM.

If LGTM still can't analyze your project, or if you have any questions about the latest upgrade, contact your LGTM Enterprise system administrator (within your organization).