Note: This documentation is for the legacy command-line tool odasa.
The final version was released in May 2020. Support for this tool expires in May 2021.

For documentation on the new-generation CodeQL CLI, see CodeQL CLI.
In particular, you may find the Notes for legacy QL CLI users useful in planning your migration.

Overview

Semmle supports SARIF as an output format for sharing static analysis results. SARIF is designed to represent the output of a broad range of static analysis tools, and many features in the SARIF specification are considered “optional”. This document details the output produced when using the format type sarifv2.1.0, which corresponds to the SARIF v2.1.0.csd1 specification.

The QL command-line tools can produce multiple versions of SARIF. For further information on selecting a file format for your analysis results, see analyzeSnapshot. You can also download results from LGTM in sarifv2.1.0 format. For information on exporting results from LGTM, see Exporting alerts to SARIF in the LGTM help.

SARIF specification and schema

This topic is intended to be read alongside the detailed SARIF specification. For more information on the specification and the SARIF schema, see the SARIF specification documentation on GitHub.

Change notes

Changes between versions

Semmle version | Format type | Changes
1.20.2 | sarifv2.1.0 | First version of this format.
1.22.2 | sarifv2.1.0 | For information, see Semmle 1.22 release notes.

Future changes to the output

The output produced for a given format type (for example, sarifv2.1.0) may change in future Semmle releases. We will endeavor to maintain backwards compatibility with consumers of the generated SARIF by ensuring that:

  • No field that is marked as “Always” generated will be removed.

  • The circumstances under which “Optional” fields are generated may change. Consumers of the Semmle SARIF output should be robust to the presence or absence of these fields.

New output fields may be added in future releases under the same format type. These additions are not considered to break backwards compatibility, and consumers should be robust to the presence of newly added fields.

New format argument types may be added in future versions of Semmle, for example, to support new versions of SARIF. These have no guarantee of backwards compatibility, unless explicitly documented.
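A consumer that follows this guidance might read each result along the lines of the sketch below. It is only an illustration: the function name summarize_result and the shape of the returned summary are hypothetical, and the sample field futureField stands in for any property added in a later release.

```python
def summarize_result(result):
    """Summarize one SARIF result object (a dict parsed from JSON).

    Fields documented as "Always" are read directly. "Optional" fields are
    read with a default, and any unrecognized fields are simply ignored.
    """
    return {
        "rule_id": result["ruleId"],                        # documented as Always
        "message": result["message"].get("text", ""),       # message is Always; text read defensively
        "path_count": len(result.get("codeFlows", [])),     # Optional
        "related_count": len(result.get("relatedLocations", [])),  # Optional
    }
```

Because the function never iterates over or validates the full set of keys, a result that gains new properties in a later release is summarized in the same way.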

Generated SARIF objects

This section details each SARIF component that may be generated, along with any specific circumstances. We omit any properties that are never generated.

sarifLog object

JSON property name | When is this generated? | Notes
$schema | Always | Provides a link to the SARIF schema.
version | Always | The version of the SARIF used to generate the output.
runs | Always | An array containing a single run object, for one language.
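As a minimal sketch of how a consumer might rely on these three properties, the hypothetical helper below parses a SARIF file's text, checks that the documented top-level properties are present, and returns the single run. The helper name and the error messages are illustrative assumptions, not part of any Semmle tool.

```python
import json


def load_single_run(sarif_text):
    """Parse SARIF text and return (version, run) for the single run it contains."""
    log = json.loads(sarif_text)
    # All three top-level properties are documented as always generated.
    for key in ("$schema", "version", "runs"):
        if key not in log:
            raise ValueError("missing top-level property: " + key)
    runs = log["runs"]
    if len(runs) != 1:
        raise ValueError("expected exactly one run, found %d" % len(runs))
    return log["version"], runs[0]
```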

run object

JSON property name | When is this generated? | Notes
tool | Always |
originalUriBaseIds | Always | A dictionary of uriBaseIds to artifactLocations representing the original locations on the analysis machine. At a minimum, this will contain the %SRCROOT% uriBaseId, which represents the root location on the analysis machine of the source code for the analyzed project. Each artifactLocation will contain the uri and description properties.
artifacts | Always | An array containing at least one artifact object for every file referenced in a result.
results | Always |
newLineSequences | Always |
columnKind | Always |
properties | Always | A dictionary containing the two entries listed after this table.

The properties dictionary will contain two entries:

  • semmle.sourceLanguage, which will be set to one of the Semmle language identifiers.

    • C or C++ — language="cpp"
    • C# — language="csharp"
    • Go — language="go"
    • Java — language="java"
    • JavaScript — language="javascript"
    • Python — language="python"

  • semmle.formatSpecifier, which identifies the format specifier passed to the command line tools.
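The two entries can be read directly from the run object. The sketch below is illustrative only: the function name describe_run is hypothetical, and the display names in the lookup table are taken from the language list above.

```python
# Display names for the Semmle language identifiers listed above.
LANGUAGE_NAMES = {
    "cpp": "C or C++",
    "csharp": "C#",
    "go": "Go",
    "java": "Java",
    "javascript": "JavaScript",
    "python": "Python",
}


def describe_run(run):
    """Return (language display name, format specifier) for a run object."""
    props = run["properties"]  # documented as always generated
    language_id = props["semmle.sourceLanguage"]
    # Fall back to the raw identifier if it is not in the table above.
    return LANGUAGE_NAMES.get(language_id, language_id), props["semmle.formatSpecifier"]
```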

tool object

JSON property name | When is this generated? | Notes
driver | Always |

toolComponent object

JSON property name | When is this generated? | Notes
name | Always | Set to “Semmle Core” for output from the QL command-line tools. If the output was generated using a different tool, a different name is reported, and the format may not be as described here.
organization | Always | Set to “Semmle”.
version | Always | Set to the Semmle release version, for example “1.20.2”.
rules | Always | An array of reportingDescriptor objects that represent rules. This array will contain, at a minimum, all the rules that were run during this analysis, but may contain rules which were available but not run. For more detail about enabling queries, see defaultConfiguration.

reportingDescriptor object (for rule)

reportingDescriptor objects may be used in multiple places in the SARIF specification. When a reportingDescriptor is included in the rules array of a Semmle toolComponent object, it has the following properties:

JSON property name | When is this generated? | Notes
id | Always | Will contain the @id property specified in the query that defines the rule, which is usually of the format language/rule-name (for example, cpp/unsafe-format-string). If your organization defines the @opaqueid property in the query, it will be used instead.
name | Always | Will contain the @id property specified in the query. See the id property above for an example.
shortDescription | Always | Will contain the @name property specified in the query that defines the rule.
fullDescription | Always | Will contain the @description property specified in the query that defines the rule.
defaultConfiguration | Always | A reportingConfiguration object, with the enabled property set to true or false, and a level property set according to the @severity property specified in the query that defines the rule. The level property is omitted if the @severity property was not specified.
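Because the rules array may include rules that were available but not run, a consumer may want to index it by id and check the enabled flag. The following is a sketch under that assumption: the function name index_rules and the summary layout are hypothetical, and shortDescription is read through its text property, as defined for message strings in the SARIF specification.

```python
def index_rules(run):
    """Map each rule id to a small summary of its reportingDescriptor."""
    index = {}
    for rule in run["tool"]["driver"]["rules"]:
        config = rule["defaultConfiguration"]
        index[rule["id"]] = {
            "title": rule["shortDescription"].get("text", ""),
            "enabled": config["enabled"],
            # level may be absent when the query did not specify @severity
            "level": config.get("level"),
        }
    return index
```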

artifact object

JSON property name | When is this generated? | Notes
location | Always | An artifactLocation object.
index | Always | The index of the artifact object.
contents | Optionally | If results are generated using the --sarif-add-file-contents flag, and the source code is available at the time the SARIF file is generated, then the contents property is populated with an artifactContent object, with the text property set.

artifactLocation object

JSON property name | When is this generated? | Notes
uri | Always |
index | Always |
uriBaseId | Optionally | If the file is relative to some known abstract location, such as the root source location on the analysis machine, this will be set.
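One way to turn an artifactLocation into a full location on the analysis machine is to combine it with the run's originalUriBaseIds dictionary. The sketch below assumes simple string joining is sufficient. The function name resolve_uri is hypothetical, and a production consumer would probably want proper URI handling.

```python
def resolve_uri(artifact_location, original_uri_base_ids):
    """Resolve an artifactLocation's uri against run.originalUriBaseIds."""
    uri = artifact_location["uri"]
    base_id = artifact_location.get("uriBaseId")  # Optional
    base = original_uri_base_ids.get(base_id) if base_id is not None else None
    if base is None:
        # No base id, or an unknown one: return the uri unchanged.
        return uri
    # Join with exactly one slash between the base and the relative uri.
    return base["uri"].rstrip("/") + "/" + uri.lstrip("/")
```

For example, with a %SRCROOT% entry whose uri is file:///home/build/project/ (a made-up path), the relative uri src/main.c resolves to file:///home/build/project/src/main.c.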

result object

The composition of the results depends on the options provided to Semmle Core. By default, the results are grouped by unique message format string and primary location. Thus, two results that occur at the same location with the same underlying message will appear as a single result in the output. This behavior can be disabled by using the flag --ungroup-results, in which case no results are grouped.

JSON property name | When is this generated? | Notes
ruleId | Always | See the description of the id property in reportingDescriptor object (for rule).
ruleIndex | Always |
message | Always | A message describing the problem(s) occurring at this location. This message may be a SARIF “Message with placeholder”, containing links that refer to locations in the relatedLocations property.
locations | Always | An array containing a single location object.
partialFingerprints | Always | A dictionary from named fingerprint types to the fingerprint. This will contain, at a minimum, a value for the primaryLocationLineHash, which provides a fingerprint based on the context of the primary location.
codeFlows | Optionally | This array may be populated with one or more codeFlow objects if the query that defines the rule for this result is of @kind path-problem.
relatedLocations | Optionally | This array will be populated if the query that defines the rule for this result has a message with placeholder options. Each unique location is included once.
suppressions | Optionally | If the result is suppressed, then this will contain a single suppression object, with the @kind property set to IN_SOURCE. If this result is not suppressed, but there is at least one result that has a suppression, then this will be set to an empty array. Otherwise, it will not be set.
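The suppressions property therefore has three possible states: absent, an empty array, or an array with one suppression object. A consumer can treat the first two identically, as in this small sketch (the function name is_suppressed is hypothetical):

```python
def is_suppressed(result):
    """Return True only when the result carries at least one suppression object.

    A missing property (no result in the output has a suppression) and an
    empty array (other results do, this one does not) both mean "not suppressed".
    """
    return bool(result.get("suppressions"))
```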

location object

JSON property name | When is this generated? | Notes
physicalLocation | Always |
id | Optionally | location objects that appear in the relatedLocations array of a result object may contain the id property.
message | Optionally | location objects may contain the message property if they appear in the relatedLocations array of a result object, or in the threadFlowLocation.location property.
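The id property is what ties a link in a result's message to an entry in relatedLocations. The sketch below assumes the embedded link syntax from the SARIF specification, [link text](id), where the id is a non-negative integer. The function name resolve_message_links is hypothetical, and the regular expression deliberately ignores the specification's escaping rules and any links whose target is a URI.

```python
import re

# Matches [link text](123); escaped brackets and URI targets are not handled.
_EMBEDDED_LINK = re.compile(r"\[([^\]]+)\]\((\d+)\)")


def resolve_message_links(result):
    """Return (link text, related location or None) for each link in the message."""
    by_id = {
        location["id"]: location
        for location in result.get("relatedLocations", [])
        if "id" in location
    }
    text = result["message"].get("text", "")
    return [
        (match.group(1), by_id.get(int(match.group(2))))
        for match in _EMBEDDED_LINK.finditer(text)
    ]
```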

physicalLocation object

JSON property name | When is this generated? | Notes
artifactLocation | Always |
region | Optionally | If the given physicalLocation exists in a text file, such as a source code file, then the region property may be present.
contextRegion | Optionally | May be present if this location has an associated snippet.

region object

There are two types of region object produced by Semmle:

  • Line/column offset regions
  • Character offset and length regions 

Any region produced by Semmle may be specified in either format, and consumers should robustly handle either type.

For line/column offset regions, the following properties will be set:

JSON property name | When is this generated? | Notes
startLine | Always |
startColumn | Optionally | Not included if equal to the default value of 1.
endLine | Optionally | Not included if identical to startLine.
endColumn | Always |
snippet | Optionally |

For character offset and length regions, the following properties will be set:

JSON property name | When is this generated? | Notes
charOffset | Optionally | Provided if startLine, startColumn, endLine, and endColumn are not populated.
charLength | Optionally | Provided if startLine, startColumn, endLine, and endColumn are not populated.
snippet | Optionally |
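Since either form may appear, a consumer might normalize regions before using them. The following sketch fills in the omitted defaults described in the tables above. The function name normalize_region and the "kind" key in the returned dictionary are hypothetical conventions for this example.

```python
def normalize_region(region):
    """Return a region dict with omitted defaults filled in, tagged by kind."""
    if "startLine" in region:
        start_line = region["startLine"]
        return {
            "kind": "line_column",
            "startLine": start_line,
            "startColumn": region.get("startColumn", 1),   # omitted when equal to 1
            "endLine": region.get("endLine", start_line),  # omitted when same as startLine
            "endColumn": region["endColumn"],              # documented as Always
        }
    if "charOffset" in region:
        return {
            "kind": "char_offset",
            "charOffset": region["charOffset"],
            "charLength": region.get("charLength"),
        }
    raise ValueError("region has neither startLine nor charOffset")
```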

codeFlow object

JSON property name | When is this generated? | Notes
threadFlows | Always |

threadFlow object

JSON property name | When is this generated? | Notes
locations | Always |

threadFlowLocation object

JSON property name | When is this generated? | Notes
location | Always |
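Taken together, the codeFlow, threadFlow, and threadFlowLocation objects form a simple nesting that leads to ordinary location objects. A sketch of walking that nesting for one result is shown below. The function name iter_paths is hypothetical.

```python
def iter_paths(result):
    """Yield each path of a result as a list of location objects.

    Follows result.codeFlows[].threadFlows[].locations[].location.
    Results without codeFlows (an Optional property) yield nothing.
    """
    for code_flow in result.get("codeFlows", []):
        for thread_flow in code_flow["threadFlows"]:
            yield [step["location"] for step in thread_flow["locations"]]
```

Each yielded list preserves the order of the threadFlow's locations array.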