@hvitved hvitved commented Dec 16, 2025

This PR aligns the logic across languages for how flow summaries are prioritized based on provenance and exactness (that is, whether a model is defined directly for a function or for a function that is implemented/overridden).

A flow summary is considered relevant if:

  1. It is a manual exact model, or
  2. It is a manual inexact model and there is no exact manual (neutral) model, or
  3. It is a generated model and (a) there is no source code available for the modeled callable, (b) there is no manual (neutral) model, and (c) the model is inexact and there is no generated exact (neutral) model.

Note that for dynamic languages we currently pretend that no source code is available for functions with flow summaries, so 3.(a) holds vacuously.
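
The three rules above can be sketched as a small filter. This is an illustrative Python model, not the QL in the shared flow summary library: the `Model` record, its fields, and `is_relevant` are hypothetical names. It also assumes that 3.(c) is meant as "a generated exact model qualifies, and a generated inexact model qualifies only if there is no generated exact (neutral) model".

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    """Hypothetical record for one model attached to a callable."""
    provenance: str        # "manual" or "generated"
    exact: bool            # True if defined directly for the callable
    neutral: bool = False  # True for a neutral model


def is_relevant(summary: Model, models: list[Model], has_source: bool) -> bool:
    """Classify `summary`, given all models (summaries and neutrals) for the same callable."""
    manual = [m for m in models if m.provenance == "manual"]
    if summary.provenance == "manual":
        # Rule 1: exact. Rule 2: inexact with no exact manual (neutral) model.
        return summary.exact or not any(m.exact for m in manual)
    generated_exact = any(m.exact for m in models if m.provenance == "generated")
    return (
        not has_source                               # 3.(a)
        and not manual                               # 3.(b)
        and (summary.exact or not generated_exact)   # 3.(c), under the assumption above
    )


# An inherited manual summary is overruled by an exact manual neutral model.
inherited = Model("manual", exact=False)
override = Model("manual", exact=True, neutral=True)
overruled = not is_relevant(inherited, [inherited, override], has_source=True)
```

For a dynamic language, a caller of this sketch would always pass `has_source=False`, which is what makes 3.(a) hold vacuously.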

Points 2 and 3.c represent a change for e.g. Java, where we would previously union exact and inexact models, which meant that it was not possible to overrule inexact models. As a consequence, some inexact manual models have been replicated. DCA for Java reports some lost java/sensitive-log results on apache_solr, but looking at those results, they all have flow paths of length > 150, so they are almost certainly false positives, and most likely a consequence of 3.c.
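
To make the change in points 2 and 3.c concrete, the sketch below contrasts a union of exact and inexact models with a prioritized selection. It is a simplified Python illustration, not the Java library code: the function names are hypothetical, summaries are plain strings, and `exact_neutral=True` stands in for an exact neutral model.

```python
def union_models(exact, inexact):
    """Previous Java behaviour as described above: exact and inexact models are combined."""
    return sorted(set(exact) | set(inexact))


def prioritized_models(exact, inexact, exact_neutral=False):
    """New behaviour: any exact model, including a neutral one, overrules inexact models."""
    if exact or exact_neutral:
        return sorted(set(exact))
    return sorted(set(inexact))


own = ["Argument[this] -> ReturnValue"]   # defined directly for the method
inherited = ["Argument[0] -> ReturnValue"]  # defined for an overridden method

before = union_models(own, inherited)
after = prioritized_models(own, inherited)
```

In this sketch `after` no longer contains the inherited summary, so keeping it would require restating it as an exact model.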

In order for the logic to be defined in the shared flow summary library, I had to move provenance and exactness information into the propagatesFlow predicate, which is a breaking change.

Lastly, I have applied the ::Range pattern to the SummarizedCallable class for all languages except C++, which currently does not expose this class. This means that SummarizedCallable::Range will contain all flow summaries, whereas SummarizedCallable will only contain relevant summaries.

@hvitved hvitved force-pushed the shared/flow-summary-provenance-filtering branch 3 times, most recently from a3e585d to eb48820 Compare December 17, 2025 19:45
@github-actions github-actions bot added the JS label Dec 18, 2025
@hvitved hvitved force-pushed the shared/flow-summary-provenance-filtering branch from 1e946f8 to 30a0791 Compare December 18, 2025 10:06
@hvitved hvitved force-pushed the shared/flow-summary-provenance-filtering branch 3 times, most recently from 0fbea88 to 5a2881d Compare January 13, 2026 10:08
@hvitved hvitved force-pushed the shared/flow-summary-provenance-filtering branch from 5a2881d to a941f4a Compare January 13, 2026 10:59
@hvitved hvitved force-pushed the shared/flow-summary-provenance-filtering branch 2 times, most recently from bf632b3 to c6383ff Compare January 13, 2026 13:36
@hvitved hvitved force-pushed the shared/flow-summary-provenance-filtering branch 2 times, most recently from 9f81377 to 0057ae3 Compare January 13, 2026 14:43
@hvitved hvitved force-pushed the shared/flow-summary-provenance-filtering branch 2 times, most recently from 1933d1c to 72dfe9c Compare January 14, 2026 08:30
@hvitved hvitved force-pushed the shared/flow-summary-provenance-filtering branch from 5d74edd to 27c102a Compare January 21, 2026 13:00
c.fromSource() and
not c.getFile().isStub() and
not (
c.getFile().extractedQlTest() and

Maybe this deserves a comment (that QL test files where the body is just a throw are considered stub-like and thus not part of the source code).

@michaelnebel michaelnebel left a comment

Really nice work @hvitved !
Only a couple of minor questions/remarks.

yoff previously approved these changes Jan 22, 2026

@yoff yoff left a comment

Python 👍

@aschackmull

Offline feedback recap: Some of the added Java models look wrong. Tom and I identified several issues: the code snippet used to identify and generate the missing models lacked the signature, and notably the signature can differ from that of the overridden method. Also, some existing manual exact models were missing signatures, which caused them to wrongly apply to inherited overloads.
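
The signature problem can be illustrated with a toy matcher. The following Python sketch is a hypothetical simplification: it assumes that a model row with an empty signature matches every overload of the named method, while a row with a signature matches only that overload. `Method`, `row_matches`, and the example type are invented for illustration and do not reproduce the actual `interpretElement` logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Method:
    """Hypothetical stand-in for a method visible on a type."""
    declaring_type: str
    name: str
    signature: str  # e.g. "(String)"


def row_matches(row_name: str, row_signature: str, m: Method) -> bool:
    """Match by name; an empty row signature places no constraint on the parameters."""
    return m.name == row_name and row_signature in ("", m.signature)


# Hypothetical type `Sub` declares write(byte[]) and inherits write(String).
visible_on_sub = [
    Method("Sub", "write", "(byte[])"),
    Method("Base", "write", "(String)"),
]

# A row intended only for Sub.write(byte[]), written without and with a signature.
unsigned = [m.signature for m in visible_on_sub if row_matches("write", "", m)]
signed = [m.signature for m in visible_on_sub if row_matches("write", "(byte[])", m)]
```

Under that assumption, the unsigned row also covers the inherited `write(String)` overload, while the signed row covers only the intended one.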

@hvitved hvitved force-pushed the shared/flow-summary-provenance-filtering branch from 27c102a to eaaa4f2 Compare January 23, 2026 13:33
@hvitved hvitved force-pushed the shared/flow-summary-provenance-filtering branch from eaaa4f2 to e486026 Compare January 26, 2026 08:37
Missing manual models were added using the following code added to `FlowSummaryImpl.qll`:

```ql
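    /**
     * Holds if the summary model row with the given columns applies to `c`, either
     * directly or via the corresponding Kotlin parameter-defaults callable.
     * `isExact` is the exactness reported by `interpretElement`.
     */
    private predicate testsummaryElement(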
      Input::SummarizedCallableBase c, string namespace, string type, boolean subtypes, string name,
      string signature, string ext, string originalInput, string originalOutput, string kind,
      string provenance, string model, boolean isExact
    ) {
      exists(string input, string output, Callable baseCallable |
        summaryModel(namespace, type, subtypes, name, signature, ext, originalInput, originalOutput,
          kind, provenance, model) and
        baseCallable = interpretElement(namespace, type, subtypes, name, signature, ext, isExact) and
        (
          c.asCallable() = baseCallable and input = originalInput and output = originalOutput
          or
          correspondingKotlinParameterDefaultsArgSpec(baseCallable, c.asCallable(), originalInput,
            input) and
          correspondingKotlinParameterDefaultsArgSpec(baseCallable, c.asCallable(), originalOutput,
            output)
        )
      )
    }

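    /**
     * Holds if some callable has an inexact model defined on `namespace2`/`type2` with the
     * given `originalInput`, `originalOutput`, `kind`, and `provenance`, and also has an
     * exact model defined on `namespace`/`type` with the same provenance, but has no exact
     * model with the same input, output, kind, and provenance. `signature` is computed
     * from the callable itself rather than copied from the model row.
     */
    private predicate testsummaryElement2(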
      string namespace, string type, boolean subtypes, string name, string signature, string ext,
      string originalInput, string originalOutput, string kind, string provenance, string model,
      string namespace2, string type2
    ) {
      exists(Input::SummarizedCallableBase c |
        testsummaryElement(c, namespace2, type2, _, _, _, ext, originalInput, originalOutput, kind,
          provenance, model, false) and
        testsummaryElement(c, namespace, type, subtypes, name, _, _, _, _, _, provenance, _, true) and
        signature = paramsString(c.asCallable()) and
        not testsummaryElement(c, _, _, _, _, _, _, originalInput, originalOutput, kind, provenance,
          _, true)
      )
    }

    /** Gets a data extension row for a missing exact model found by `testsummaryElement2`. */
    private string getAMissingManualModel(string namespace2, string type2) {
      exists(
        string namespace, string type, boolean subtypes, string name, string signature, string ext,
        string originalInput, string originalOutput, string kind, string provenance, string model
      |
        testsummaryElement2(namespace, type, subtypes, name, signature, ext, originalInput,
          originalOutput, kind, provenance, model, namespace2, type2) and
        result =
          "- [\"" + namespace + "\", \"" + type + "\", True, \"" + name + "\", \"" + signature +
            "\", \"\", \"" + originalInput + "\", \"" + originalOutput + "\", \"" + kind + "\", \"" +
            provenance + "\"]"
      )
    }
```
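
For readers less familiar with QL, the selection and row construction above amount roughly to the following. This Python sketch is only an approximation: the function names and sample values are hypothetical, models are reduced to `(input, output, kind, provenance)` tuples for a single callable, and no escaping is applied, as in the QL.

```python
def missing_exact(inexact, exact):
    """Inexact tuples whose provenance also occurs among the exact models but that have no exact copy."""
    exact_provenances = {provenance for (_, _, _, provenance) in exact}
    return [t for t in inexact if t[3] in exact_provenances and t not in exact]


def missing_model_row(namespace, type_, name, signature, input_, output, kind, provenance):
    """Format one row; `subtypes` is fixed to True and `ext` to the empty string, as in the QL."""
    return (
        f'- ["{namespace}", "{type_}", True, "{name}", "{signature}", "", '
        f'"{input_}", "{output}", "{kind}", "{provenance}"]'
    )


# Hypothetical models for one callable.
inexact = [("Argument[0]", "Argument[this]", "taint", "manual")]
exact = [("Argument[this]", "ReturnValue", "taint", "manual")]

rows = [
    missing_model_row("java.io", "Writer", "write", "(String)", *t)
    for t in missing_exact(inexact, exact)
]
```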
@hvitved hvitved force-pushed the shared/flow-summary-provenance-filtering branch from e486026 to df09f02 Compare January 26, 2026 11:40

@aschackmull aschackmull left a comment

Java LGTM

@hvitved hvitved merged commit b974a84 into github:main Jan 26, 2026
108 checks passed
@hvitved hvitved deleted the shared/flow-summary-provenance-filtering branch January 26, 2026 16:24