Semmle 1.18
Skip to end of metadata
Go to start of metadata

The topics in this space are also available for download as a single PDF file: CCPP-Semmle-1.18.pdf.

About this document

This document was generated automatically from the default C and C++ dashboard templates included in the Semmle 1.18 release. The standard dashboard for C analysis contains a core set of rules. For C++ analysis these core rules are supplemented with additional rules, indicated by (C++ only) in the category name. See Supported languages for details of which language versions can be analyzed using this release.

This release includes the following new default rules:

  • Too many arguments to formatting function - A printf-like function called with too many arguments will ignore the excess arguments and output less than might have been intended.

  • Upcast array used in pointer arithmetic - An array with elements of a derived struct type is cast to a pointer to the base type of the struct. If pointer arithmetic or an array dereference is done on the resulting pointer, it will use the width of the base type, leading to misaligned reads.

See also the security rules for C variant analysis: C Security.

Terminology

Different industries use a variety of terms to describe mistakes in coding, issues found by software users/testers and potential improvements. Throughout this document, we use the following terms to discuss software quality:
  • RuleA definition of coding errors, bad coding practice or other opportunities to improve the quality of a project. Rules are grouped by category (see table).
  • Query—A request for information—written in QL (the Semmle query language)—used to extract a data set from a database. Rules are implemented using QL queries.
  • Alert/Violation—Code that fails a rule. The meaning of each alert and the best course of action to take varies according to the category of rule that was broken.
  • QL—The query language created by Semmle to facilitate efficient querying of hierarchical data. All rules and metrics are defined by QL queries.

The QL queries that define the rules and metrics described in Semmle standards are distributed as part of Semmle analysis. You can create copies of standard queries and customize them so that they match your internal rules of best coding practice. See Introduction to QL for an introduction to the QL language and Learning QL for an overview of the resources for learning to write your own queries.

Categories of rule

The coding rules are grouped into the following main categories:
CategoryDetectsTypical cost of addressingBenefits of fixing violations of these rules
Correctness Coding mistakesLow, local changes onlyImmediate reduction in risk of software defects
ReadabilityCorrect but confusing syntax or namingLow, local changes onlyImmediate reduction in risk of introducing new defects in future changes to the code
Maintainability Opportunities for structural improvementHigh, refactoring normally required

For frequently updated code, substantial cost/time saving for future changes

For infrequently updated or machine generated code, little benefit

Useless CodeCode that may not be usedVary according to level of refactoring requiredImmediate reduction in risk of introducing new defects in future changes to the code
MetricsTrends in code quality--

Excluding the main categories, all other topics are organized in alphabetic order and consequently the order does not imply any level of priority.

Summary of rules and metrics

  • Correctness
    • Common Errors
    • Consistent Use
      • Inconsistent nullness checkThe result value of a function is often checked for nullness, but not always. Since the value is mostly checked, it is likely that the function can return null values in some cases, and omitting the check could crash the program.
      • Inconsistent operation on return valueA function is called, and the same operation is usually performed on the return value - for example, free, delete, close etc. However, in some cases it is not performed. These unusual cases may indicate misuse of the API and could cause resource leaks.
      • Return value of a function is ignoredA call to a function ignores its return value, but more than 80% of the total number of calls to the function check the return value. Check the return value of functions consistently, especially for functions like 'fread' or the 'scanf' functions that return the status of the operation.
    • Dangerous Conversions
      • Bad check for oddnessUsing "x % 2 == 1" to check whether x is odd does not work for negative numbers.
      • Implicit downcast from bitfieldA bitfield is implicitly downcast to a smaller integer type. This could lead to loss of upper bits of the bitfield.
      • Lossy pointer castA pointer type is converted to a smaller integer type. This may lead to loss of information in the variable and is highly non-portable.
      • Multiplication result converted to larger typeA multiplication result that is converted to a larger type can be a sign that the result can overflow the type converted from.
      • Non-zero value cast to pointerA constant value other than zero is converted to a pointer type. This is a dangerous practice, since the meaning of the numerical value of pointers is platform dependent.
      • SlicingAssigning a non-reference instance of a derived type to a variable of the base type slices off all members added by the derived class, and can cause an unexpected state.
    • Dangerous Conversions - C++ only
      • Upcast array used in pointer arithmeticAn array with elements of a derived struct type is cast to a pointer to the base type of the struct. If pointer arithmetic or an array dereference is done on the resulting pointer, it will use the width of the base type, leading to misaligned reads.
    • Exceptions - C++ only
      • Accidental rethrowWhen there is nothing to rethrow, attempting to rethrow an exception will terminate the program.
      • Catching by valueCatching an exception by value will create a copy of the thrown exception, thereby potentially slicing the original exception object.
      • Leaky catchIf an exception is allocated on the heap, then it should be deleted when caught.
      • Throwing pointersExceptions should be objects rather than pointers to objects.
    • Use of Libraries
      • Non-constant format stringPassing a non-constant 'format' string to a printf-like function can lead to a mismatch between the number of arguments defined by the 'format' and the number of arguments actually passed to the function. If the format string ultimately stems from an untrusted source, this can be used for exploits.
      • Possibly wrong buffer size in string copyCalling 'strncpy' with the size of the source buffer as the third argument may result in a buffer overflow.
      • Potentially overflowing call to snprintfUsing the return value from snprintf without proper checks can cause overflow.
      • Potentially unsafe call to strncatCalling 'strncat' with the size of the destination buffer as the third argument may result in a buffer overflow.
      • Potentially unsafe use of strcatUsing 'strcat' without checking the size of the source string may result in a buffer overflow
      • Sizeof with side effectsThe sizeof operator should not be used on expressions that contain side effects. It is subtle whether the side effects will occur or not.
      • Suspicious 'sizeof' useTaking 'sizeof' of an array parameter is often mistakenly thought to yield the size of the underlying array, but it always yields the machine pointer size.
      • Suspicious call to memsetUse of memset where the size argument is computed as the size of some non-struct type. When initializing a buffer, you should specify its size as <number of elements> * <size of one element> to ensure portability.
      • Too few arguments to formatting functionCalling a printf-like function with too few arguments can be a source of security issues.
      • Too many arguments to formatting functionA printf-like function called with too many arguments will ignore the excess arguments and output less than might have been intended.
      • Wrong type of arguments to formatting functionCalling a printf-like function with the wrong type of arguments causes unpredictable behavior.
    • Use of Libraries - C++ only
      • Return c_str of local std::stringReturning the c_str of a locally allocated std::string could cause the program to crash or behave non-deterministically because the memory is deallocated when the std::string goes out of scope.
  • Readability
    • Control Flow
      • Avoid floats in for loopsFloating point variables should not be used as loop counters. For loops are best suited to simple increments and termination conditions; while loops are preferable for more complex uses.
      • Dubious NULL checkThe address of a field (except the first) will never be NULL, so it is misleading, at best, to check for that case.
      • Empty branch of conditionalAn empty block after a conditional can be a sign of an omission and can decrease maintainability of the code. Such blocks should contain an explanatory comment to aid future maintainers.
      • Error-prone name of loop variableThe iteration variable of a nested loop should have a descriptive name: short names like i, j, or k can cause confusion except in very simple loops.
      • For loop variable changed in bodyNumeric variables being used within a for loop for iteration counting should not be modified in the body of the loop. Reserve for loops for straightforward iterations, and use a while loop instead for more complex cases.
      • Futile conditionalAn if-statement with an empty then-branch and no else-branch may indicate that the code is incomplete.
      • No trivial switch statementsUsing a switch statement when there are fewer than two non-default cases leads to unclear code.
    • Declarations (1)
      • Declaration hides parameterA local variable hides a parameter. This may be confusing. Consider renaming one of them.
      • Declaration hides variableA local variable hides another local variable from a surrounding scope. This may be confusing. Consider renaming one of the variables.
      • Function declared in blockFunctions should always be declared at file scope. It is confusing to declare a function at block scope, and the visibility of the function is not what would be expected.
      • Large object passed by valueAn object larger than 64 bytes is passed by value to a function. Passing large objects by value unnecessarily use up scarce stack space, increase the cost of calling a function and can be a security risk. Use a pointer to the object instead.
      • Local variable hides global variableA local variable or parameter that hides a global variable of the same name. This may be confusing. Consider renaming one of the variables.
    • Expressions
      • Comparison result is always the sameWhen a comparison operation, such as x < y, always returns the same result, it means that the comparison is redundant and may mask a bug because a different check was intended.
      • Comparison with canceling sub-expressionIf the same sub-expression is added to both sides of a comparison, and there is no possibility of overflow or rounding, then the sub-expression is redundant and could be removed.
      • Self comparisonComparing a variable to itself always produces the same result, unless the purpose is to check for integer overflow or floating point NaN.
      • Unclear comparison precedenceUsing comparisons as operands of other comparisons is unusual in itself, and most readers will require parentheses to be sure of the precedence.
      • Unsigned comparison to zeroAn unsigned value is always non-negative, even if it has been assigned a negative number, so the comparison is redundant and may mask a bug because a different check was intended.
    • JSF - C++ only
      • Inconsistent virtual inheritanceA base class shall not be both virtual and non-virtual in the same hierarchy.
      • Redefined default parameterAn inherited default parameter shall never be redefined. Default values are bound statically which is confusing when combined with dynamically bound function calls.
      • Resource not released in destructorAll resources acquired by a class should be released by its destructor. Avoid the use of the 'open / close' pattern, since C++ constructors and destructors provide a safer way to handle resource acquisition and release. Best practice in C++ is to use the 'RAII' technique: constructors allocate resources and destructors free them.
      • Undisciplined multiple inheritanceMultiple inheritance should only be used in the following restricted form: n interfaces plus m private implementations, plus at most one protected implementation. Multiple inheritance can lead to complicated inheritance hierarchies that are difficult to comprehend and maintain.
    • Safe Language - C++ only
      • Constant return type on memberA 'const' modifier on a member function return type is useless. It is usually a typo or misunderstanding, since the syntax for a 'const' function is 'int foo() const', not 'const int foo()'.
      • Constructor with default arguments will be used as a copy constructorConstructors with default arguments should not be signature-compatible with a copy constructor when their default arguments are taken into account.
      • Exception thrown in destructorThrowing an exception from a destructor may cause immediate program termination.
      • Inconsistent definition of copy constructor and assignment ('Rule of Two')Classes that have an explicit copy constructor or copy assignment operator may behave inconsistently if they do not have both.
      • Incorrect constructor delegationA constructor in C++ cannot delegate part of the object initialization to another by calling it. This is likely to leave part of the object uninitialized.
      • No virtual destructorAll base classes with a virtual function should define a virtual destructor. If an application attempts to delete a derived class object through a base class pointer, the result is undefined if the base class destructor is non-virtual.
      • Overloaded assignment does not return 'this'An assignment operator should return a reference to *this. Both the standard library types and the built-in types behave in this manner.
      • Virtual call from constructor or destructorVirtual functions should not be invoked from a constructor or destructor of the same class. Confusingly, virtual functions are resolved statically (not dynamically) in constructors and destructors for the same class. The call should be made explicitly static by qualifying it using the scope resolution operator.
    • Safe Language - core
      • Ambiguously signed bit-field memberBit fields with integral types should have explicit signedness only. For example, use `unsigned int` rather than `int`. It is implementation specific whether an `int`-typed bit field is signed, so there could be unexpected sign extension or overflow.
      • Constant return typeA 'const' modifier on a function return type is useless and should be removed for clarity.
      • Irregular enum initializationIn an enumerator list, the = construct should not be used to explicitly initialize members other than the first, unless all items are explicitly initialized. An exception is the pattern to use the last element of an enumerator list to get the number of possible values.
      • Missing return statementAll functions that are not void should return a value on every exit path.
      • No raw arrays in interfacesArrays should not be used in interfaces. Arrays degenerate to pointers when passed as parameters. This array decay problem has long been known to be a source of errors. Consider using std::vector or encapsulating the array in an Array class.
      • Use of gotoThe goto statement shall not be used.
      • Use of integer where enum is preferredEnumeration types should be used instead of integer types (and constants) to select from a limited series of choices.
    • Size (1)
      • Block with too many statementsBlocks with too many consecutive statements are candidates for refactoring. Only complex statements are counted here (eg. for, while, switch ...). The top-level logic will be clearer if each complex statement is extracted to a function.
      • Complex conditionBoolean expressions that are too deeply nested are hard to read and understand. Consider naming intermediate results as local variables.
      • Long switch caseA switch statement with too much code in its cases can make the control flow hard to follow. Consider wrapping the code for each case in a function and just using the switch statement to invoke the appropriate function in each case.
  • Maintainability
    • Complexity
    • Coupling
      • Duplicate include guardUsing the same include guard macro in more than one header file may cause unexpected behavior from the compiler.
      • Feature envyA function that uses more functions and variables from another file than functions and variables from its own file. This function might be better placed in the other file, to avoid exposing internals of the file it depends on.
      • Inappropriate IntimacyTwo files share too much information about each other (accessing many operations or variables in both directions). It would be better to invert some of the dependencies to reduce the coupling between the two files.
      • Include header files onlyThe #include pre-processor directive should only be used to include header files.
      • Missing header guardHeader files should contain header guards (#defines to prevent the file from being included twice). This prevents errors and inefficiencies caused by repeated inclusion.
      • Outgoing dependencies per fileThe number of files that a file depends on.
      • Public global variablesThe total number of global variables in each file with external (public) visibility.
    • Declarations
      • Magic numbers'Magic constants' should be avoided: if a nontrivial constant is used repeatedly, it should be encapsulated into a const variable or macro definition.
      • Magic strings'Magic constants' should be avoided: if a nontrivial constant is used repeatedly, it should be encapsulated into a const variable or macro definition.
      • Short global nameGlobal variables should have descriptive names, to help document their use, avoid namespace pollution and reduce the risk of shadowing with local variables.
    • Documentation
      • Commented-out codeCommented-out code makes the remaining code more difficult to read.
      • FIXME commentComments containing 'FIXME' indicate that the code has known bugs.
      • Poorly documented large functionLarge functions that have no or almost no comments are likely to be too complex to understand and maintain. The larger a function is, the more problematic the lack of comments.
      • TODO commentComments containing 'TODO' indicate that the code may be in an incomplete state.
      • Undocumented API functionFunctions used from outside the file they are declared in should be documented, as they are part of a public API. Without comments, modifying such functions is dangerous because callers easily come to rely on their exact implementation.
    • Memory Management
    • Size
  • Metrics
  • Useless code
    • Duplicate Code - C++ only
      • Mostly duplicate classMore than 80% of the methods in this class are duplicated in another class. Create a common supertype to improve code sharing.
    • Duplicate Code - core
      • Duplicate functionThere is another identical implementation of this function. Extract the code to a common file or superclass or delegate to improve sharing.
      • Mostly duplicate fileThere is another file that shares a lot of the code with this file. Merge the two files to improve maintainability.
      • Mostly duplicate functionThere is another function that shares a lot of the code with this one. Extract the code to a common file/superclass or delegate to improve sharing.
      • Mostly similar fileThere is another file that shares a lot of the code with this file. Notice that names of variables and types may have been changed. Merge the two files to improve maintainability.
    • Unused local variableA local variable that is never called or accessed may be an indication that the code is incomplete or has a typo.
    • Unused static functionA static function that is never called or accessed may be an indication that the code is incomplete or has a typo.
    • Unused static variableA static variable that is never accessed may be an indication that the code is incomplete or has a typo.