Semmle 1.22
Skip to end of metadata
Go to start of metadata

Name: Upcast array used in pointer arithmetic

Description: An array with elements of a derived struct type is cast to a pointer to the base type of the struct. If pointer arithmetic or an array dereference is done on the resulting pointer, it will use the width of the base type, leading to misaligned reads.

ID: cpp/upcast-array-pointer-arithmetic

Kind: path-problem

Severity: warning

Precision: high

Query: CastArrayPointerArithmetic.ql
/**
 * @name Upcast array used in pointer arithmetic
 * @description An array with elements of a derived struct type is cast to a
 *              pointer to the base type of the struct. If pointer arithmetic or
 *              an array dereference is done on the resulting pointer, it will
 *              use the width of the base type, leading to misaligned reads.
 * @kind path-problem
 * @problem.severity warning
 * @precision high
 * @tags correctness
 *       reliability
 *       security
 *       external/cwe/cwe-119
 *       external/cwe/cwe-843
 * @id cpp/upcast-array-pointer-arithmetic
 */

import cpp
import semmle.code.cpp.dataflow.DataFlow
import DataFlow::PathGraph

class CastToPointerArithFlow extends DataFlow::Configuration {
  CastToPointerArithFlow() {
    this = "CastToPointerArithFlow"
  }

  override predicate isSource(DataFlow::Node node) {
    not node.asExpr() instanceof Conversion and
    introducesNewField(
      node.asExpr().getType().(DerivedType).getBaseType(),
      node.asExpr().getConversion*().getType().(DerivedType).getBaseType()
      
    )
  }

  override predicate isSink(DataFlow::Node node) {
    exists(PointerAddExpr pae |
      pae.getAnOperand() = node.asExpr()
    ) or
    exists(ArrayExpr ae |
      ae.getArrayBase() = node.asExpr()
    )
  }
}

/**
 * `derived` has a (possibly indirect) base class of `base`, and at least one new
 * field has been introduced in the inheritance chain after `base`.
 */
predicate introducesNewField(Class derived, Class base) {
  (
    exists(Field f |
      f.getDeclaringType() = derived and
      derived.getABaseClass+() = base
    ) or
    introducesNewField(derived.getABaseClass(), base)
  )
}

from DataFlow::PathNode source, DataFlow::PathNode sink, CastToPointerArithFlow cfg
where cfg.hasFlowPath(source, sink)
  and source.getNode().asExpr().getFullyConverted().getUnspecifiedType() = sink.getNode().asExpr().getFullyConverted().getUnspecifiedType()
select sink, source, sink, "Pointer arithmetic here may be done with the wrong type because of the cast $@.", source, "here"

A pointer to a derived class may be implicitly converted to a pointer to its base type when passed as an argument to a function expecting a pointer to the base type. If pointer arithmetic or an array dereference is then used, it will be performed using the size of the base type. This can lead to reading data from unexpected fields in the derived type.

Recommendation

Only convert pointers to single objects. If you must work with a sequence of objects that are converted to a base type, use an array of pointers rather than a pointer to an array.

Example

class Base {
public:
	int x;
}

class Derived: public Base {
public:
	int y;
};

void dereference_base(Base *b) {
	b[2].x;
}

void dereference_derived(Derived *d) {
	d[2].x;
}

void test () {
	Derived[4] d;
	dereference_base(d); // BAD: implicit conversion to Base*

	dereference_derived(d); // GOOD: implicit conversion to Derived*, which will be the right size
}