Shared: Provenance-based filtering of flow summaries #21051

hvitved · 2025-12-16T13:40:36Z

This PR aligns the logic across languages for how flow summaries are prioritized based on provenance and exactness (that is, whether a model is defined directly for a function or for a function that is implemented/overridden).

A flow summary is considered relevant if:

1. It is a manual exact model, or
2. It is a manual inexact model and there is no exact manual (neutral) model, or
3. It is a generated model and (a) there is no source code available for the modeled callable, (b) there is no manual (neutral) model, and (c) the model is inexact and there is no generated exact (neutral) model.

Note that for dynamic languages we currently pretend that no source code is available for functions with flow summaries, so 3.(a) holds vacuously.
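As a rough illustration of the rules above, here is a language-neutral sketch in Python. It is not the QL used in the shared library: the `Model` record, the `relevant_summaries` function, and the `has_source` flag are hypothetical names. The sketch also assumes that 3.(c) is meant to let exact generated models through, so that only inexact generated models are blocked by an exact generated (neutral) model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    """A hypothetical model row as seen from one particular callable."""
    manual: bool           # True for manual provenance, False for generated
    exact: bool            # True if the model is defined directly on the callable
    neutral: bool = False  # True for a neutral model, which only suppresses others


def relevant_summaries(models, has_source):
    """Return the flow summaries for one callable that survive filtering."""

    def exists(manual, exact=None):
        # Is there any (possibly neutral) model with this provenance/exactness?
        return any(
            m.manual == manual and (exact is None or m.exact == exact)
            for m in models
        )

    result = []
    for m in models:
        if m.neutral:
            continue  # neutral models are not summaries themselves
        if m.manual:
            # Rules 1 and 2: exact wins; inexact only without an exact manual model.
            keep = m.exact or not exists(manual=True, exact=True)
        else:
            # Rule 3, with (c) read as described in the assumption above.
            keep = (
                not has_source
                and not exists(manual=True)
                and (m.exact or not exists(manual=False, exact=True))
            )
        if keep:
            result.append(m)
    return result
```

For a dynamic language, a caller of this sketch would always pass `has_source=False`, which matches the note about 3.(a).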

Points 2 and 3.c represent a change for e.g. Java, where we would previously union exact and inexact models, which meant that it was not possible to overrule inexact models. As a consequence, some inexact manual models have been replicated. DCA for Java reports some lost java/sensitive-log results on apache_solr, but looking at those results, they all have flow paths of length > 150, so they are almost certainly false positives, and most likely a consequence of 3.c.

In order for the logic to be defined in the shared flow summary library, I had to move provenance and exactness information into the propagatesFlow predicate, which is a breaking change.

Lastly, I have applied the ::Range pattern to the SummarizedCallable class for all languages except C++, which currently does not expose this class. This means that SummarizedCallable::Range will contain all flow summaries, whereas SummarizedCallable will only contain relevant summaries.

rust/ql/lib/codeql/rust/dataflow/FlowSummary.qll

shared/dataflow/codeql/dataflow/internal/FlowSummaryImpl.qll

java/ql/lib/semmle/code/java/dataflow/internal/DataFlowDispatch.qll

rust/ql/test/library-tests/dataflow/models/models.qlref

michaelnebel · 2026-01-22T09:59:12Z

csharp/ql/lib/semmle/code/csharp/dataflow/internal/FlowSummaryImpl.qll

```diff
+    c.fromSource() and
+    not c.getFile().isStub() and
+    not (
+      c.getFile().extractedQlTest() and
```

Maybe this deserves a comment (explaining that QL test files where the body is just a `throw` are considered stub-like and thus not part of the source code).

michaelnebel

Really nice work @hvitved !
Only a couple of minor questions/remarks.

yoff

Python 👍

aschackmull · 2026-01-23T08:59:53Z

Offline feedback recap: Some of the added Java models look wrong. Tom and I identified several issues: the code snippet to identify and generate the missing models lacked the signature, and notably the signature can be different from the overridden method. Also, some existing manual exact models were missing signatures, which caused them to wrongly apply to inherited overloads.
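To make the signature problem concrete, here is a small Python sketch of name-and-signature matching. It assumes that a model row with an empty signature matches every overload of the named method. The function `model_applies` and the `put` overloads are hypothetical and only serve as an illustration.

```python
def model_applies(model_name, model_signature, method_name, method_signature):
    """Return True if a model row would be applied to the given method.

    Assumption: an empty model signature matches any overload with that name.
    """
    return model_name == method_name and model_signature in ("", method_signature)


# Two hypothetical overloads of the same method name.
overloads = ["(Object,Object)", "(String)"]

# A model without a signature applies to both overloads ...
unsigned = [s for s in overloads if model_applies("put", "", "put", s)]

# ... whereas a model with a signature applies only to the intended one.
signed = [s for s in overloads if model_applies("put", "(Object,Object)", "put", s)]
```

In this sketch `unsigned` contains both overloads while `signed` contains only `(Object,Object)`, which is why a model without a signature can end up on an overload it was never meant for.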

java/ql/lib/semmle/code/java/dataflow/internal/FlowSummaryImpl.qll

Missing manual models were added using the following code added to `FlowSummaryImpl.qll`:

```ql
private predicate testsummaryElement(
  Input::SummarizedCallableBase c, string namespace, string type, boolean subtypes, string name,
  string signature, string ext, string originalInput, string originalOutput, string kind,
  string provenance, string model, boolean isExact
) {
  exists(string input, string output, Callable baseCallable |
    summaryModel(namespace, type, subtypes, name, signature, ext, originalInput, originalOutput,
      kind, provenance, model) and
    baseCallable = interpretElement(namespace, type, subtypes, name, signature, ext, isExact) and
    (
      c.asCallable() = baseCallable and input = originalInput and output = originalOutput
      or
      correspondingKotlinParameterDefaultsArgSpec(baseCallable, c.asCallable(), originalInput,
        input) and
      correspondingKotlinParameterDefaultsArgSpec(baseCallable, c.asCallable(), originalOutput,
        output)
    )
  )
}

private predicate testsummaryElement2(
  string namespace, string type, boolean subtypes, string name, string signature, string ext,
  string originalInput, string originalOutput, string kind, string provenance, string model,
  string namespace2, string type2
) {
  exists(Input::SummarizedCallableBase c |
    testsummaryElement(c, namespace2, type2, _, _, _, ext, originalInput, originalOutput, kind,
      provenance, model, false) and
    testsummaryElement(c, namespace, type, subtypes, name, _, _, _, _, _, provenance, _, true) and
    signature = paramsString(c.asCallable()) and
    not testsummaryElement(c, _, _, _, _, _, _, originalInput, originalOutput, kind, provenance,
      _, true)
  )
}

private string getAMissingManualModel(string namespace2, string type2) {
  exists(
    string namespace, string type, boolean subtypes, string name, string signature, string ext,
    string originalInput, string originalOutput, string kind, string provenance, string model
  |
    testsummaryElement2(namespace, type, subtypes, name, signature, ext, originalInput,
      originalOutput, kind, provenance, model, namespace2, type2) and
    result =
      "- [\"" + namespace + "\", \"" + type + "\", True, \"" + name + "\", \"" + signature +
        "\", \"\", \"" + originalInput + "\", \"" + originalOutput + "\", \"" + kind + "\", \"" +
        provenance + "\"]"
  )
}
```

aschackmull

Java LGTM

github-actions bot added the C#, C++, Java, Python, Go, Ruby, Rust, Swift, and DataFlow Library labels Dec 16, 2025

hvitved force-pushed the shared/flow-summary-provenance-filtering branch 3 times, most recently from a3e585d to eb48820 Compare December 17, 2025 19:45

github-actions bot added the JS label Dec 18, 2025

hvitved force-pushed the shared/flow-summary-provenance-filtering branch from 1e946f8 to 30a0791 Compare December 18, 2025 10:06

This was referenced Jan 5, 2026

C#: Narrow provenance printing in tests. #21094

Closed

Rust: Refactor MaD provenance-based filtering #21072

Merged

hvitved force-pushed the shared/flow-summary-provenance-filtering branch 3 times, most recently from 0fbea88 to 5a2881d Compare January 13, 2026 10:08

github-advanced-security bot found potential problems Jan 13, 2026

rust/ql/lib/codeql/rust/dataflow/FlowSummary.qll Fixed

hvitved force-pushed the shared/flow-summary-provenance-filtering branch from 5a2881d to a941f4a Compare January 13, 2026 10:59

github-advanced-security bot found potential problems Jan 13, 2026

shared/dataflow/codeql/dataflow/internal/FlowSummaryImpl.qll Fixed

hvitved force-pushed the shared/flow-summary-provenance-filtering branch 2 times, most recently from bf632b3 to c6383ff Compare January 13, 2026 13:36

github-advanced-security bot found potential problems Jan 13, 2026

hvitved force-pushed the shared/flow-summary-provenance-filtering branch 2 times, most recently from 9f81377 to 0057ae3 Compare January 13, 2026 14:43

github-advanced-security bot found potential problems Jan 13, 2026

rust/ql/test/library-tests/dataflow/models/models.qlref Fixed

hvitved force-pushed the shared/flow-summary-provenance-filtering branch 2 times, most recently from 1933d1c to 72dfe9c Compare January 14, 2026 08:30

hvitved force-pushed the shared/flow-summary-provenance-filtering branch from 5d74edd to 27c102a Compare January 21, 2026 13:00

hvitved requested review from geoffw0, michaelnebel, owen-mc and yoff January 22, 2026 09:28

michaelnebel reviewed Jan 22, 2026

yoff previously approved these changes Jan 22, 2026

hvitved dismissed yoff’s stale review via eaaa4f2 January 23, 2026 13:33

hvitved force-pushed the shared/flow-summary-provenance-filtering branch from 27c102a to eaaa4f2 Compare January 23, 2026 13:33

github-advanced-security bot found potential problems Jan 23, 2026

hvitved force-pushed the shared/flow-summary-provenance-filtering branch from eaaa4f2 to e486026 Compare January 26, 2026 08:37

hvitved added 13 commits January 26, 2026 12:39

Shared: Provenance-based filtering of flow summaries

4ce04e4

C#: Adapt to changes in FlowSummaryImpl

b11b091

Rust: Adapt to changes in FlowSummaryImpl

c4e0dda

Ruby: Adapt to changes in FlowSummaryImpl

c975ae5

Swift: Adapt to changes in FlowSummaryImpl

47d9e8a

Go: Adapt to changes in FlowSummaryImpl

739748c

Python: Adapt to changes in FlowSummaryImpl

0adece7

C++: Adapt to changes in FlowSummaryImpl

3b1e062

JS: Adapt to changes in FlowSummaryImpl

93dad86

Add change notes

0f6bae0

C#: Revert change to getASummarizedCallableTarget

732c60c

Shared: Shadow hasManualModel in RelevantSummarizedCallable

df09f02

hvitved force-pushed the shared/flow-summary-provenance-filtering branch from e486026 to df09f02 Compare January 26, 2026 11:40

aschackmull approved these changes Jan 26, 2026

hvitved merged commit b974a84 into github:main Jan 26, 2026
108 checks passed

hvitved deleted the shared/flow-summary-provenance-filtering branch January 26, 2026 16:24

6 participants
