Semmle 1.20
Skip to end of metadata
Go to start of metadata

Name: Unclear validation of array index

Description: Accessing an array without first checking that the index is within the bounds of the array can cause undefined behavior and can also be a security risk.

ID: cpp/unclear-array-index-validation

Kind: problem

Severity: warning

Query: ImproperArrayIndexValidation.ql
/**
 * @name Unclear validation of array index
 * @description Accessing an array without first checking
 *              that the index is within the bounds of the array can
 *              cause undefined behavior and can also be a security risk.
 * @kind problem
 * @id cpp/unclear-array-index-validation
 * @problem.severity warning
 * @tags security
 *       external/cwe/cwe-129
 */
import cpp
import semmle.code.cpp.controlflow.Guards
private import semmle.code.cpp.rangeanalysis.RangeAnalysisUtils
import semmle.code.cpp.security.TaintTracking

predicate hasUpperBound(VariableAccess offsetExpr) {
  exists(
    BasicBlock controlled, LocalScopeVariable offsetVar, SsaDefinition def
  | controlled.contains(offsetExpr) and
    linearBoundControls(controlled, def, offsetVar) and
    offsetExpr = def.getAUse(offsetVar))
}

pragma[noinline]
predicate linearBoundControls(BasicBlock controlled, SsaDefinition def, LocalScopeVariable offsetVar) {
  exists(GuardCondition guard, boolean branch |
    guard.controls(controlled, branch) and
    cmpWithLinearBound(guard, def.getAUse(offsetVar), Lesser(), branch)
  )
}

from Expr origin, ArrayExpr arrayExpr, VariableAccess offsetExpr
where tainted(origin, offsetExpr)
  and offsetExpr = arrayExpr.getArrayOffset()
  and not hasUpperBound(offsetExpr)
select
  offsetExpr,
  "$@ flows to here and is used in an array indexing expression, potentially causing an invalid access.",
  origin, "User-provided value"

C and C++ do not have built-in bounds checking for array indexing expressions such as x[i]. If i is out of bounds then the program will read/write whatever data happens to be at that address. An attacker who is able to control the value of i might be able to read or modify data which they are not authorized to access.

Recommendation

Always check the bounds of array indexing expressions, especially if the index value is derived from user-controlled data.

Example

In this example, a string is read from a socket and converted to an int. This int is then used to index the data array. Because the index value is a user-controlled value, it could be out of bounds.

int example(int socket, int data[]) {
  char inputBuffer[CHAR_ARRAY_SIZE];
  int recvResult;
  int i;

  recvResult = recv(socket, inputBuffer, CHAR_ARRAY_SIZE - 1, 0);
  if (recvResult < 0)
  {
    return -1;
  }

  inputBuffer[recvResult] = '\0';
  i = atoi(inputBuffer);

  // BAD: i has not been validated.
  return data[i];
}

Below, the problem has been fixed by adding a guard:

int example(int socket, int data[], int ndata) {
  char inputBuffer[CHAR_ARRAY_SIZE];
  int recvResult;
  int i;

  recvResult = recv(socket, inputBuffer, CHAR_ARRAY_SIZE - 1, 0);
  if (recvResult < 0)
  {
    return -1;
  }

  inputBuffer[recvResult] = '\0';
  i = atoi(inputBuffer);

  if (i < 0 || ndata <= i) {
    return -1;
  }

  // GOOD: i has been validated.
  return data[i];
}

References
  • Common Weakness Enumeration: CWE-129.