Semmle 1.19

Overview

This space contains one page for each of the C++ built-in queries for the most recent release of the QL tools. Each page contains:

  • Summary of key metadata for the query
  • QL code for the query
  • Help information for the query
  • Labels derived from the query metadata

About the queries

The queries built into the QL tools include queries that:

  • Run on LGTM—selected because they find issues that are important to the majority of developers and/or their results have very high precision. That is, a high percentage of the alerts they report are true results. For a list of all the default LGTM queries, see LGTM.com and search: language=cpp. Note that the results may include queries that are scheduled for release in the next version of Semmle's product range.
  • Generate additional alerts—some of these queries are relevant only if you're working in a field which has special coding standards, for example, the Joint Strike Fighter Air Vehicle C++ Coding Standard.
  • Calculate metrics—these give you more general information about a project.
  • Demonstrate other ways to output data using QL—for example, generating a table, chart or graph of results. These are intended to be run using the QL plugins and extensions.

Exploring the queries

The heatmap below shows the labels for the C++ built-in queries. Click a label to view all queries with that tag or query type.

About the security queries

There are two query suites for C/C++ security analysis: default and all. For most projects we recommend that you run queries from the default suite. The all suite contains a few additional rules that perform points-to analysis and may time out on some projects; these rules test for a further set of CWEs.

Data sources

Many queries rely on tracking the flow of data from sources that cannot be guaranteed to be safe. Where possible, the source of the potentially unsafe data is reported in the query violation message, indicating why that particular use of the data was considered dangerous. Here we describe how some common sources of data are classified, to make clear why they are considered potentially dangerous.

User Input

A common data source is data that comes from a user. This must be treated as untrusted unless it is validated: a malicious user can send unexpected data with undesirable effects, such as enabling a SQL injection attack (a sketch of such a flow follows the list of sources below).

The sources that we consider user input are listed below:

  1. Getting the value of a system environment variable
    The environment in which a program is run is often under the control of the user who runs the program.

  2. Getting a parameter from an HttpServletRequest
    Spoofing a request can give the user control even over parameters which are not usually set by the user.

  3. Getting the query string from an HttpServletRequest
    The query string comes from the URL, which is under the control of the user.

  4. Getting the header from an HttpServletRequest
    Spoofing a request can give the user control over the header.

  5. Getting the value of a property from a Properties object
    Properties objects are commonly written to and read from disk, which may be under the control of the user.

  6. Getting the output of a ResultSet
    Databases may contain user input that has been stored, and as such must be treated as untrusted.

  7. Getting the value of a cookie
    Cookies are controlled by the client, and so their values must be treated as untrusted.

  8. Getting the hostname of a request using reverse DNS
    If the user controls their DNS server, then they can return whatever result they wish for a reverse DNS lookup.
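
To illustrate the first of these sources, the sketch below shows how data read from an environment variable can flow, unvalidated, into a SQL statement. This is illustrative C++ only: execute_query is a hypothetical stand-in for a real database API, and any call that executes the string as SQL would be an equivalent sink.

    #include <cstdlib>
    #include <iostream>
    #include <string>

    // Hypothetical sink: stands in for any database API that executes
    // its argument as SQL.
    void execute_query(const std::string &sql) {
        std::cout << "executing: " << sql << '\n';
    }

    int main() {
        // Source: the USER environment variable is controlled by whoever
        // runs the program, so its value must be treated as untrusted.
        const char *user = std::getenv("USER");
        if (user == nullptr)
            return 0;

        // Sink: splicing the untrusted value into the statement enables
        // SQL injection, e.g. USER="x'; DROP TABLE users; --".
        std::string sql = "SELECT * FROM users WHERE name = '" +
                          std::string(user) + "';";
        execute_query(sql);
        return 0;
    }

A data flow query that tracks this kind of flow would report the std::getenv call as the source in its violation message, making it clear why the execute_query call is flagged.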

Security analysis testing

The C/C++ security analyses are regularly evaluated against the SAMATE Juliet tests maintained by the US National Institute of Standards and Technology (NIST). This ensures that the quality and discrimination of the results are maintained as the queries are updated, for example, for changes to the C++ language, improvements to the QL library, or enhancements to the code extraction process.

Summary of results

The following table summarizes the results for the latest release of the C/C++ security queries run against the SAMATE Juliet 1.3 tests. In the table, each row represents a weakness, and the columns show the following information:

  • TP – count of true positive results: the code has a known security weakness, and Semmle's analyses correctly identify the defect.
  • FP – count of false positive results: the code has no known security weakness, but Semmle's analyses are overcautious and report a potential problem.
  • TN – count of true negative results: the code has no known security weakness, and Semmle's analyses correctly pass the code as secure.
  • FN – count of false negative results: the code has a known security weakness, but Semmle's analyses fail to identify the defect.
  • Count – the number of test cases that contain the weakness (equal to TP + FN); each weakness also has an equal number of corresponding non-flawed test cases (equal to TN + FP).

In an ideal implementation of the analyses, the number of false positives (FP) and false negatives (FN) would both be zero, but in general this is impossible to achieve by static analysis. The figures for FP and FN show where there are limitations in the present implementation.

CWE TP FP TN FN Count
CWE-022 3900 0 4800 900 4800
CWE-078 3900 0 4800 900 4800
CWE-114 390 0 576 186 576
CWE-119 2168 0 12336 10168 12336
CWE-134 2460 0 2880 420 2880
CWE-190 5646 113 6799 1266 6912
CWE-197 495 0 864 369 864
CWE-200 54 0 54 0 54
CWE-242 18 0 18 0 18
CWE-327 54 0 54 0 54
CWE-367 36 0 36 0 36
CWE-416 360 0 459 99 459
CWE-457 180 0 948 768 948
CWE-468 37 0 37 0 37
CWE-665 104 0 193 89 193
CWE-676 18 0 18 0 18
CWE-681 18 0 54 36 54
CWE-772 1713 462 1340 89 1802
CWE-835 3 0 6 3 6

Interpreting the results

The report CAS Static Analysis Tool Study – Methodology, published by the Center for Assured Software of the US National Security Agency, defines four ways to measure success:

  • Precision = TP / (TP + FP)
  • Recall = TP / (TP + FN)
  • F-score = 2 × (Precision × Recall) / (Precision + Recall)
  • Discrimination rate = number of discriminated tests / number of tests

For each of these metrics, a higher score is better. There is a trade-off between precision and recall: tuning an analysis to increase one typically reduces the other. The F-score, the harmonic mean of precision and recall, quantifies the balance between the two. A test is discriminated when the tool both reports the flaw in the flawed variant of a test case and does not report it in the corresponding fixed variant.
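
As a worked example, take the CWE-190 row from the summary table above (TP = 5646, FP = 113, FN = 1266):

  Precision = 5646 / (5646 + 113) = 5646 / 5759 ≈ 0.98
  Recall = 5646 / (5646 + 1266) = 5646 / 6912 ≈ 0.82
  F-score = 2 × (0.98 × 0.82) / (0.98 + 0.82) ≈ 0.89

These values match the 98% precision, 89% F-score, and 82% recall reported for CWE-190 in the table below.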

The following table shows the results of calculating these metrics for the results shown above. These scores compare very favorably with those of the sample tools tested by the Center for Assured Software.

CWE Precision F-score Recall Disc. Rate
CWE-022 100% 90% 81% 81%
CWE-078 100% 90% 81% 81%
CWE-114 100% 81% 68% 68%
CWE-119 100% 30% 18% 18%
CWE-134 100% 92% 85% 85%
CWE-190 98% 89% 82% 80%
CWE-197 100% 73% 57% 57%
CWE-200 100% 100% 100% 100%
CWE-242 100% 100% 100% 100%
CWE-327 100% 100% 100% 100%
CWE-367 100% 100% 100% 100%
CWE-416 100% 88% 78% 78%
CWE-457 100% 32% 19% 19%
CWE-468 100% 100% 100% 100%
CWE-665 100% 70% 54% 54%
CWE-676 100% 100% 100% 100%
CWE-681 100% 50% 33% 33%
CWE-772 79% 86% 95% 69%
CWE-835 100% 67% 50% 50%

Conclusions

The tests suggest that Semmle has made judicious choices in balancing the number of false positive results (an incorrect warning is issued) and false negative results (a true defect is not identified). Where comparative results are available for other tools, the Semmle analyses stand out for their exceptional accuracy.