Skip to content

Conversation

@MathiasVP
Copy link
Contributor

For a long time we've had an ugly hack in the C/C++ dataflow library related to postfix operator uses. This hack comes from the fact that we want dataflow in cases like:

int x = source();
sink(x++);

but not in:

int x = source();
x++;
sink(x);

The hack got this by modifying the location of the read attached to x++. However, it's both ugly and error prone (and, in fact, for some range analysis work I'm doing this was causing infinite looping in the shared range analysis library).

This PR gets rid of this hack and instead properly models this in our SSA library. So for something like:

int x = source();
sink(x++);

we create a new SSA source variable, definition and use such that the above program is more faithfully represented as:

int x = source();
int x_tmp = x;
x++;
sink(x_tmp);

this ensures that we get the correct dataflow behavior.

Commit-by-commit review recommended.

  • bd09aad adds loads of tests
  • e0a7889 reverts the hack. You can see how this makes the tests regress.
  • 3d445be and db3f22a performs small refactorings that shouldn't have any effect on any results, but will make the actual change much easier.
  • Finally, 445cca1 implements the strategy outlines above, and
  • 28fec0c shows that the regressions are fixed.

I've tested that the isUseAfterPostfixCrement0 predicate indeed correspond exactly to the uses of postfix increments and decrements in lots of databases (by checking whether all the StoreInstructions for which isUseAfterPostfixCrement0(store, _) holds come from here, and vice versa).

@github-actions github-actions bot added the C++ label Jan 27, 2026
@MathiasVP MathiasVP marked this pull request as ready for review January 27, 2026 18:56
@MathiasVP MathiasVP requested a review from a team as a code owner January 27, 2026 18:56
Copilot AI review requested due to automatic review settings January 27, 2026 18:56
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Refactors the C/C++ IR dataflow SSA implementation to correctly model postfix increment/decrement by introducing an explicit “saved value” SSA variable, eliminating the previous location-based workaround and restoring correct dataflow behavior.

Changes:

  • Added new dataflow tests covering postfix/prefix crements, compound assignments, and conditional-expression cases.
  • Updated expected AST/IR flow outputs for the new tests.
  • Updated SSA implementation to represent postfix crements via dedicated “saved” source variables/defs/uses, replacing the prior hack.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
cpp/ql/test/library-tests/dataflow/dataflow-tests/test.cpp Adds new test cases validating desired postfix crement flow behavior and related edge cases.
cpp/ql/test/library-tests/dataflow/dataflow-tests/test-source-sink.expected Updates expected AST/IR flow results corresponding to the new tests.
cpp/ql/lib/semmle/code/cpp/ir/dataflow/internal/SsaImpl.qll Implements the new SSA modeling for postfix crements by adding saved-value source variables and wiring corresponding defs/uses.
Comments suppressed due to low confidence (2)

cpp/ql/lib/semmle/code/cpp/ir/dataflow/internal/SsaImpl.qll:50

  • Same grammar nit in the doc comment here: "generated by an postfix" should be "generated by a postfix".
   * Holds if `store` is the `StoreInstruction` generated by an postfix

cpp/ql/lib/semmle/code/cpp/ir/dataflow/internal/SsaImpl.qll:697

  • This call site uses hasBaseSourceVariableAndIndirectrion; if you fix the typo in the predicate name, make sure this reference is updated to match.
      this.hasBaseSourceVariableAndIndirectrion(v, indirection)

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@MathiasVP MathiasVP added the no-change-note-required This PR does not need a change note label Jan 27, 2026
@MathiasVP MathiasVP changed the title C++: Get rid of an ugly work around in dataflow C++: Get rid of an ugly workaround in dataflow Jan 27, 2026
IdrissRio
IdrissRio previously approved these changes Jan 29, 2026
Copy link
Contributor

@IdrissRio IdrissRio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It took me some time to review this since my IR knowledge is fairly limited. That said, the approach makes a lot of sense and the result is much cleaner. I only have one question regarding conditional post increment.

I locally tested pointer dereference with postfix sink(*p++) and it works, still I would add it as a test (up to you).

Moreover DCA looks good.

sink(x);

x = source();
sink((b ? x : y)++); // $ ast MISSING: ir
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this due to strict_count(left.getAUse()) == 2? If so, anything that looks like (non trivial expression)++ would be considered missing right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didnt investigate it too deeply, but I think this is related to our less than ideal handling of value categories related to ternary expressions. We have a related internal issue for this. I'll make sure to link this comment to that issue.


private module SourceVariables {
/**
* Holds if `store` is the `StoreInstruction` generated by n postfix
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small typo. It should be a postifix :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Fixed in 61a53fa

@MathiasVP
Copy link
Contributor Author

It took me some time to review this since my IR knowledge is fairly limited. That said, the approach makes a lot of sense and the result is much cleaner. I only have one question regarding conditional post increment.

I locally tested pointer dereference with postfix sink(*p++) and it works, still I would add it as a test (up to you).

Moreover DCA looks good.

Thanks a lot, Idriss! 🙇 We have some mean query tests of the form *p++ for example here so I think we're good.

@IdrissRio IdrissRio self-requested a review January 29, 2026 12:23
@MathiasVP MathiasVP merged commit 1b1c9c6 into github:main Jan 29, 2026
18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

C++ no-change-note-required This PR does not need a change note

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants