LGTM Enterprise 1.23.1

Using sparse analysis

Sparse analysis is a way of reducing the computational burden of a project. You may find it useful to enable this mode for large projects.

Overview

When you enable sparse analysis for a project, instead of building and analyzing all commits to the default branch of a project, LGTM only builds and analyzes the HEAD commit at the time the repository is polled for changes. This applies to the analysis for all languages in a project.

Each project repository is automatically polled once a day to check for changes. This means that a maximum of one commit per day is analyzed (except when you trigger a manual poll of the repository after the automated poll, in which case an additional commit is analyzed). The changes that are analyzed comprise all of the modifications to the code since the previous analysis. Conflating changes in this way can reduce the number of build/analysis jobs considerably. A drawback, however, is that it's no longer possible to determine exactly when specific changes were made or to be certain about which developer made a specific change. LGTM cannot, therefore, provide developer-level data and, as a consequence of this, you won't see the following tabs for projects that you have configured for sparse analysis:

  • Project overview
  • Contributors

Additionally, the My projects page doesn't display project history information for sparsely analyzed projects and some data may be missing from the My alerts page.

Analysis of a project history

It's important to note that, if you choose to use sparse analysis on a project, LGTM won't perform historic analysis—that is, it won't step back through the project history analyzing changes that were committed prior to the project being added to LGTM.

By default, LGTM performs full analysis on projects, so a newly added project is likely to run in full analysis mode for a while before you decide to switch it to sparse analysis. When you enable sparse analysis, all of the data for contiguous commits that have already been analyzed, or for commits that are currently queued for analysis, is retained.

Enabling sparse analysis when adding projects

You can enable sparse analysis for one or more projects when you add them using the Projects administration page.

After entering the URLs of the projects, click Advanced options and select the Add projects in sparse analysis mode check box. For more details, see Adding projects.

Enabling sparse analysis for an existing project

To enable sparse analysis for a project that has already been added to LGTM Enterprise:

  1. Go to the Projects administration page.

  2. Find the project that you want to configure for sparse analysis, and click the project name to display the project settings page.

  3. Click the Analysis settings tab.

  4. Select Sparse from the Analysis mode drop-down list.

  5. The Minimum number of days between sparse mode builds field is displayed. Leave this set to 0. This is the recommended setting for sparse analysis. You only need to change this if you want to reduce the default frequency of sparse analysis.

  6. Click Save in the Analysis options section of the page.

    A message is displayed at the top of the page to confirm that the analysis options have been updated.

Frequency of analysis

By default—in the absence of manually triggered poll jobs—sparse analysis ensures that changes to a repository are analyzed at most once every 24 hours.

Poll jobs are scheduled to be spread evenly across the 24-hour period, with the poll job for a specific repository occurring at roughly the same time every day. Analysis results will therefore be added to LGTM at different times throughout the day for each project that you have configured to use sparse analysis.

The amount of change that is analyzed after each daily poll job, and the elapsed time between the commits that are analyzed, depend on the activity that has occurred during that 24-hour period.

Example 1

The diagram below shows five 24-hour periods. The poll points represent the automated polls that occur at the end of each 24-hour period. The periods do not correspond to calendar days, as LGTM schedules polling to occur throughout the day. No manual polls have been triggered. The triangles represent commits merged into the default branch. Commits E, I, J, and K are the HEAD commits when the automated polls occur. All changes between each of these commits and the previously identified HEAD commit are conflated and analyzed as if they were made in a single commit.

Several commits were made in Period 2, so there may have been many changes between commit E and commit I. Commit J happened just after an automated poll job and was the only commit in Period 3, so there may be few changes to analyze between I and J. Analysis is triggered by a poll job that finds changes in the repository, so although commit J happened at the beginning of Period 3, results are not added to LGTM until nearly 24 hours later. The next commit (K) does not happen until 47 hours after commit J. It occurs just before an automated poll, so there is only a short lag between the commit being made and the analysis results appearing in LGTM.
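The behavior in Example 1 can be sketched as a small simulation. This is an illustration only, not LGTM code: the function names are hypothetical, and the commit timestamps are assumptions chosen to match the description above (commit J shortly after a poll, commit K 47 hours later and shortly before a poll).

```python
from bisect import bisect_right

# Hypothetical timestamps, in hours since the start of Period 1.
# Only the relationships described in the text are taken from the source.
COMMITS = [("A", 3.0), ("B", 8.0), ("C", 12.0), ("D", 17.0), ("E", 22.0),
           ("F", 27.0), ("G", 31.0), ("H", 38.0), ("I", 44.0),
           ("J", 48.5), ("K", 95.5)]
AUTOMATED_POLLS = [24.0, 48.0, 72.0, 96.0, 120.0]  # end of Periods 1 to 5


def sparse_analyses(commits, polls):
    """Return one (poll_time, head, conflated_commits) tuple per poll that finds changes."""
    commits = sorted(commits, key=lambda c: c[1])
    times = [t for _, t in commits]
    results = []
    analyzed = 0  # number of commits already covered by an earlier analysis
    for poll in sorted(polls):
        upto = bisect_right(times, poll)  # commits made at or before this poll
        if upto > analyzed:
            batch = [name for name, _ in commits[analyzed:upto]]
            results.append((poll, batch[-1], batch))  # the last commit is HEAD
            analyzed = upto
    return results


def lag_hours(commits, polls):
    """Hours between each analyzed HEAD commit and the poll that picked it up."""
    when = dict(commits)
    return {head: poll - when[head]
            for poll, head, _ in sparse_analyses(commits, polls)}


for poll, head, batch in sparse_analyses(COMMITS, AUTOMATED_POLLS):
    print(f"poll at {poll:5.1f} h: analyze HEAD {head}, conflating {batch}")
```

With these assumed timestamps, the polls analyze E, I, J, and K in turn, the poll at the end of Period 5 finds nothing new, and `lag_hours` reports a lag of 23.5 hours for J but only 0.5 hours for K.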

Manually triggered poll jobs in sparse mode

With the default configuration of sparse analysis, manually triggered poll jobs may result in more than one set of analysis results being added to LGTM in a 24-hour period.

Example 2

The diagram below shows three manually triggered poll jobs. The first of these, in Period 2, adds analysis results for the changes made in commits F and G. At the end of Period 2, the automated poll job detects that further changes have been made, so the project is built and analyzed for a second time in that 24-hour period, adding analysis results for commits H and I. The manual poll job in Period 3 causes commit J to be analyzed some hours earlier than it would have been otherwise. When the automated poll occurs at the end of Period 3, no further changes are detected. Likewise, in Period 4, no changes are detected when the manual poll is triggered. Commit K is then merged into the branch and is analyzed when the automated poll occurs at the end of Period 4.
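The sequence in Example 2 can be traced with a short sketch that interleaves manual and automated polls. It is illustrative only: `run_polls` is a hypothetical helper, and all timestamps (hours since the start of Period 1) are assumptions chosen to be consistent with the description above, with commit E taken as already analyzed.

```python
# Hypothetical commit times for Periods 2 to 4 (hours since the start of Period 1).
COMMITS = [("F", 27.0), ("G", 31.0), ("H", 38.0), ("I", 44.0),
           ("J", 48.5), ("K", 95.5)]

# Automated polls end each period; the manual poll times are assumptions.
POLLS = [(33.0, "manual"), (48.0, "automated"),
         (55.0, "manual"), (72.0, "automated"),
         (80.0, "manual"), (96.0, "automated")]


def run_polls(commits, polls):
    """Return (hour, kind, analyzed_commits) for every poll.

    The list of analyzed commits is empty when a poll detects no changes,
    in which case no build or analysis takes place.
    """
    pending = sorted(commits, key=lambda c: c[1])
    log = []
    for hour, kind in sorted(polls):
        # Everything committed since the last analysis is conflated.
        batch = [name for name, t in pending if t <= hour]
        pending = pending[len(batch):]
        log.append((hour, kind, batch))
    return log


for hour, kind, batch in run_polls(COMMITS, POLLS):
    outcome = f"analyze {batch}" if batch else "no changes detected"
    print(f"{hour:5.1f} h  {kind:9s}  {outcome}")
```

Under these assumptions, Period 2 produces two sets of results (F and G, then H and I), the manual poll in Period 3 picks up J early, and the manual poll in Period 4 finds nothing because K has not yet been merged.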

Reducing the number of analyzed commits even further

Usually the default behavior of sparse analysis provides a sufficient reduction in the number of build/analysis jobs performed for a project, while still allowing you to trigger poll jobs manually if you need more than one analysis on a particular day. If required, however, you can reduce the frequency further by setting the Minimum number of days between sparse mode builds field on the Analysis settings tab to a value greater than 0.
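To get a feel for the effect of this setting, the sketch below counts builds for a busy repository over one week. It rests on assumptions that this page does not confirm: that the minimum gap is measured from the start of the previous build, that it applies to every poll (manual or automated), and that each poll finds new changes. The function name is hypothetical.

```python
def builds_with_minimum_gap(poll_hours, min_days):
    """Return the poll times (in hours) at which a build would start.

    Assumption: a poll that finds changes starts a build only if at least
    `min_days` full days have passed since the previous build started.
    With min_days = 0, every poll that finds changes starts a build.
    """
    builds = []
    for hour in sorted(poll_hours):
        if not builds or hour - builds[-1] >= min_days * 24:
            builds.append(hour)
    return builds


# One automated poll per day for a week; new commits are assumed before each poll.
daily_polls = [24 * day for day in range(1, 8)]

for min_days in (0, 2, 3):
    builds = builds_with_minimum_gap(daily_polls, min_days)
    print(f"minimum {min_days} day(s): {len(builds)} builds at hours {builds}")
```

In this model, a setting of 0 gives seven builds in the week, 2 gives four, and 3 gives three. Check the actual behavior on your own instance before relying on these numbers.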
