[red-knot] Use ternary decision diagrams (TDDs) for visibility constraints #15861

dcreager · 2025-01-31T21:51:14Z

We now use ternary decision diagrams (TDDs) to represent visibility constraints. A TDD is just like a BDD (binary decision diagram), but with "ambiguous" as an additional allowed value. Unlike the previous representation, TDDs are strongly normalizing, so equivalent ternary formulas are represented by exactly the same graph node, and can be compared for equality in constant time.
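To make the BDD/TDD difference concrete, here is a minimal sketch of a ternary decision diagram and its evaluation. It is not the code from this PR: the names `Truthiness`, `NodeId`, `Interior`, `Tdd`, and `and_of_two_atoms` are hypothetical, and the diagram is built by hand rather than through a normalizing builder. The only point is that each interior node has three outgoing edges instead of two, and that evaluation is a walk from the root to a terminal.

```rust
/// Three-valued truthiness (names are illustrative assumptions).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Truthiness {
    AlwaysTrue,
    Ambiguous,
    AlwaysFalse,
}

/// A node is either a terminal or an index into `Tdd::interiors`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum NodeId {
    Terminal(Truthiness),
    Interior(usize),
}

/// An interior node tests one atom and has one outgoing edge per ternary value.
/// A BDD node would only have `if_true` and `if_false`.
#[derive(Clone, Copy, Debug)]
pub struct Interior {
    pub atom: usize,
    pub if_true: NodeId,
    pub if_ambiguous: NodeId,
    pub if_false: NodeId,
}

pub struct Tdd {
    pub interiors: Vec<Interior>,
}

impl Tdd {
    /// Walks from `root` to a terminal, following the edge selected by each atom's value.
    /// In this sketch the walk is a plain loop, so its depth is not bounded by the stack.
    pub fn evaluate(&self, root: NodeId, atoms: &[Truthiness]) -> Truthiness {
        let mut node = root;
        loop {
            match node {
                NodeId::Terminal(value) => return value,
                NodeId::Interior(index) => {
                    let interior = self.interiors[index];
                    node = match atoms[interior.atom] {
                        Truthiness::AlwaysTrue => interior.if_true,
                        Truthiness::Ambiguous => interior.if_ambiguous,
                        Truthiness::AlwaysFalse => interior.if_false,
                    };
                }
            }
        }
    }
}

/// Hand-built diagram for `C0 ∧ C1` under ternary AND.
pub fn and_of_two_atoms() -> (Tdd, NodeId) {
    use Truthiness::{AlwaysFalse, AlwaysTrue, Ambiguous};
    let interiors = vec![
        // Index 0: reached when C0 is true, so the result is just C1.
        Interior {
            atom: 1,
            if_true: NodeId::Terminal(AlwaysTrue),
            if_ambiguous: NodeId::Terminal(Ambiguous),
            if_false: NodeId::Terminal(AlwaysFalse),
        },
        // Index 1: reached when C0 is ambiguous; only a false C1 decides the result.
        Interior {
            atom: 1,
            if_true: NodeId::Terminal(Ambiguous),
            if_ambiguous: NodeId::Terminal(Ambiguous),
            if_false: NodeId::Terminal(AlwaysFalse),
        },
        // Index 2: the root, which tests C0.
        Interior {
            atom: 0,
            if_true: NodeId::Interior(0),
            if_ambiguous: NodeId::Interior(1),
            if_false: NodeId::Terminal(AlwaysFalse),
        },
    ];
    (Tdd { interiors }, NodeId::Interior(2))
}
```

Note that the ambiguous edge of the root does not lead straight to the ambiguous terminal: it leads to a second node that can still resolve to false.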

We currently have a slight 1-3% performance regression with this in place, according to local testing. However, we also have a 5× increase in performance for pathological cases, since we can now remove the recursion limit when we evaluate visibility constraints.

As follow-on work, we are now closer to being able to remove the simplify_visibility_constraint calls in the semantic index builder. In the vast majority of cases, we now see (for instance) that the visibility constraint after an if statement, for bindings of symbols that weren't rebound in any branch, simplifies back to true. But there are still some cases where we generate constraints that are cyclic. With fixed-point cycle support in salsa, or with some careful analysis of the still-failing cases, we might be able to remove those.

carljm · 2025-02-01T00:17:55Z

Very excited about this!

Per your request (and the "draft" status), I'll refrain from a detailed look at the code until you declare it ready for review. Just one general thought, based on the title of the PR: when I looked at this previously, my conclusion was that we don't actually need TDDs here, BDDs suffice. If we build a BDD of the constraints, and then evaluate that BDD, anytime a decision node evaluates to "ambiguous", we can immediately short-circuit to a result of "ambiguous". I guess another way to say this is that a TDD where the "ambiguous" exit of every node leads immediately to the "ambiguous" terminal (which is our scenario, I think), is isomorphic to a BDD where the "ambiguous" exits can be implied.

Just mentioning that in case it's useful or allows any simplification (I don't know if it does, since I haven't looked at your code yet). Ultimately I'll be happy with any solution that performs well and allows us to get rid of the arbitrary depth limit, no matter what we call it!

(One other thought: if we are getting rid of the depth limit, we should run this on Black and verify that the big perf regression @sharkdp observed is in fact gone.)

dcreager · 2025-02-01T02:09:03Z

> If we build a BDD of the constraints, and then evaluate that BDD, anytime a decision node evaluates to "ambiguous", we can immediately short-circuit to a result of "ambiguous".

Hmm, I thought we'd need to model ambiguous edges fully, since (e.g.) ambiguous ∧ false = false. So if we have a constraint of C1 ∧ C2, where C1 is amb and C2 is false, I think your optimization would over-approximate the result to amb instead of false. Am I following that right?
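The counterexample can be checked with a small sketch of ternary (Kleene) AND next to a short-circuiting evaluator. Both functions and the `Truthiness` enum are hypothetical names written for this illustration, not code from the PR; `and_short_circuit` is one possible reading of the "stop at the first ambiguous node" idea.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Truthiness {
    AlwaysTrue,
    Ambiguous,
    AlwaysFalse,
}

/// Ternary (Kleene) AND: false dominates, then ambiguous, then true.
pub fn and(a: Truthiness, b: Truthiness) -> Truthiness {
    use Truthiness::{AlwaysFalse, AlwaysTrue, Ambiguous};
    match (a, b) {
        (AlwaysFalse, _) | (_, AlwaysFalse) => AlwaysFalse,
        (Ambiguous, _) | (_, Ambiguous) => Ambiguous,
        (AlwaysTrue, AlwaysTrue) => AlwaysTrue,
    }
}

/// Evaluates a conjunction left to right and stops at the first input that is
/// not true, including an ambiguous one. This models the short-circuit idea.
pub fn and_short_circuit(inputs: &[Truthiness]) -> Truthiness {
    for &input in inputs {
        match input {
            Truthiness::Ambiguous => return Truthiness::Ambiguous,
            Truthiness::AlwaysFalse => return Truthiness::AlwaysFalse,
            Truthiness::AlwaysTrue => {}
        }
    }
    Truthiness::AlwaysTrue
}
```

With C1 ambiguous and C2 false, `and` yields `AlwaysFalse`, while `and_short_circuit` stops at C1 and reports `Ambiguous`, which is the over-approximation described above.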

> (One other thought: if we are getting rid of the depth limit, we should run this on Black and verify that the big perf regression @sharkdp observed is in fact gone.)

Ah, that makes me a bit happier about the performance numbers I'm seeing! Codspeed is reporting a 3% regression with this in place. When I test locally, using hyperfine on black, I see similar numbers. But that might not be apples-to-apples, since main has a recursion limit in place but this feature branch does not.

carljm · 2025-02-01T16:05:51Z

> I thought we'd need to model ambiguous edges fully, since (e.g.) ambiguous ∧ false = false

You're right, of course! I remembered that I'd reached the conclusion we could do this with just BDDs, but when I tried to recall the reasoning for that conclusion, I thought I'd come up with a simpler reason, but instead it was just a wrong reason :)

The actual reason I'd reached this conclusion before was because OxiDD will allow you to supply "resolutions" (that is, t/f values) for nodes in a BDD and will then simplify the BDD accordingly, such that the resolved nodes disappear from it. So I thought if we built a BDD and then resolved the nodes that we can infer a statically-known truthiness for, the BDD would either simplify to just a terminal, or not, in which case it remains ambiguous.

But I don't know if that's actually a good/efficient approach, compared to building a TDD and evaluating it!

> Codspeed is reporting a 3% regression with this in place. When I test locally, using hyperfine on black, I see similar numbers. But that might not be apples-to-apples,

IIRC @sharkdp was seeing like 5x regression on Black without the recursion limit, so it seems like you're doing much better on the pathological cases. It's too bad if this comes with an across-the-board regression, but I think it's worth it to get rid of the arbitrary limit. We probably will need such limits in some places, but they are a really bad experience when users encounter them, so as much as we can avoid them, or make them so high that users will never encounter them in realistic code, we should try to do so.

Of course we should also take a look at profiles and see if there are ways to bring down the regression here!

github-actions · 2025-02-03T15:24:30Z

`ruff-ecosystem` results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

dcreager · 2025-02-03T15:26:26Z

I've extracted the refactoring parts of this into a separate PR to make everything easier to review: #15913

codspeed-hq · 2025-02-03T15:36:09Z

CodSpeed Performance Report

Merging #15861 will not alter performance

Comparing dcreager/tdd (ec35dcf) with main (6bb3235)

Summary

✅ 32 untouched benchmarks

dcreager · 2025-02-03T18:36:31Z

> IIRC @sharkdp was seeing like 5x regression on Black without the recursion limit, so it seems like you're doing much better on the pathological cases. It's too bad if this comes with an across-the-board regression, but I think it's worth it to get rid of the arbitrary limit. We probably will need such limits in some places, but they are a really bad experience when users encounter them, so as much as we can avoid them, or make them so high that users will never encounter them in realistic code, we should try to do so.
>
> Of course we should also take a look at profiles and see if there are ways to bring down the regression here!

As expected, the execution time shifts from being mostly in the evaluate step on main to being mostly in the build step with this PR.

[Two profile screenshots, one for main and one for this PR, are not reproduced here.]

dcreager · 2025-02-03T19:47:59Z

I think this is at a good point now. In local testing, I've gotten the performance regression down to 1-2%. Interested to see what codspeed says about the latest commits on the branch.

carljm · 2025-02-03T20:05:05Z

For some reason CodSpeed thinks it doesn't have data on the baseline for this PR, but just comparing the results here with the ones on main (or vc-api), it looks like about 2% regression (89.5 ms -> 91.6 ms).

carljm · 2025-02-03T20:09:42Z

> As follow-on work, we are now closer to being able to remove the simplify_visibility_constraint calls in the semantic index builder. In the vast majority of cases, we now see (for instance) that the visibility constraint after an if statement, for bindings of symbols that weren't rebound in any branch, simplifies back to true. But there are still some cases where we generate constraints that are cyclic. With fixed-point cycle support in salsa, or with some careful analysis of the still-failing cases, we might be able to remove those.

I would definitely like to see the failing cases and understand why they need to be cyclic, or if there's some missing simplification (other than the "no new bindings" heuristic that doesn't work with terminals) that could avoid the cycles. But this can be a follow-up.

This extracts some pure refactoring noise from #15861. This changes the API for creating and evaluating visibility constraints, but does not change how they are represented internally. There should be no behavioral or performance changes in this PR.

Changes:

- Hide the internal representation, so that we can make changes to it in #15861.
- Add a separate builder type for visibility constraints. (With TDDs, we will have some additional builder state that we can throw away once we're done constructing.)
- Remove a layer of helper methods from `UseDefMapBuilder`, making `SemanticIndexBuilder` responsible for constructing whatever visibility constraints it needs.

carljm · 2025-02-03T21:20:00Z

Summarizing from Discord: I think a) such code would never actually occur in a stub file; they shouldn't even have function bodies other than ellipsis (and perhaps we should make this explicit in how we handle them), and b) it would be fine to go ahead and remove the simplify calls and mark that file xfail (though ideal if we could mark only the pyi version of it as xfail?) pending fixpoint iteration, which I think would solve the panic.


carljm

Looks great to me!

Micha or David may have Rust/perf suggestions I missed, but I think it's OK to go ahead with this and handle any such comments as follow-ups.

crates/red_knot_python_semantic/src/visibility_constraints.rs

dcreager · 2025-02-03T21:28:13Z

> Summarizing from Discord: I think a) such code would never actually occur in a stub file; they shouldn't even have function bodies other than ellipsis (and perhaps we should make this explicit in how we handle them), and b) it would be fine to go ahead and remove the simplify calls and mark that file xfail (though ideal if we could mark only the pyi version of it as xfail?) pending fixpoint iteration, which I think would solve the panic.

Done. Performance between keeping and removing the simplify calls seems to be a wash. I've removed them in the interests of simpler code, and added a pyi-only xfail for the one failing test.

MichaReiser · 2025-02-03T21:34:25Z

I haven't reviewed the code yet, but the regression might simply come from the fact that we now evaluate (or simplify) more constraints than before: the semantic indexer processes all constraints, whereas simplification during inference was lazy (and only ran for the queried definitions).

carljm · 2025-02-03T23:24:10Z

Hmm, the switch to IndexVec (or at least, something in the recent commits) seems to have grown the regression; now we're seeing a significant regression even in incremental check, which is a bit strange.

carljm

Semantics and code here look good to me. May be worth understanding why the regression seems to have grown again, and see if that reproduces locally on a larger codebase like Black.

dcreager · 2025-02-04T00:07:57Z

> Hmm, the switch to IndexVec (or at least, something in the recent commits) seems to have grown the regression; now we're seeing a significant regression even in incremental check, which is a bit strange.

I do see a 2-3% regression between the IndexVec commit and the one immediately before it. My hunch is that it's because we're not storing ConstraintId and InteriorNodeId directly, and have to convert into that from the bit-hacked u32 on each access. That does a bounds check each time. I'm going to revert that commit and add a note about why we're using Vec. (Alternatively I could add an unchecked constructor for the ID types.)

dcreager · 2025-02-04T00:24:49Z

> I'm going to revert that commit and add a note about why we're using Vec. (Alternatively I could add an unchecked constructor for the ID types.)

Ackshually I can write a custom Idx impl for a couple of types and bypass the asserts, I think.

dcreager · 2025-02-04T01:49:24Z

Still seeing the larger regression with custom Idx impls, and again with a revert of the Vec → IndexVec commit. I'm stumped, will look at this some more in the morning.

MichaReiser

Nice :) I only have nit/documentation comments.

It would be nice if we could use IndexVec or a newtype wrapper and implement std::ops::Index but it seems you tried that and it regressed performance.

It would also be great if we understand the reason for the perf regression better. Where are we spending more time now? Is it because we do more work eagerly or is it because the caching is expensive? Could we remove the caching?

MichaReiser · 2025-02-04T08:08:55Z

crates/red_knot_python_semantic/src/visibility_constraints.rs

+            ALWAYS_TRUE => f.field(&format_args!("AlwaysTrue")).finish(),
+            AMBIGUOUS => f.field(&format_args!("Ambiguous")).finish(),
+            ALWAYS_FALSE => f.field(&format_args!("AlwaysFalse")).finish(),


Suggested change

-            ALWAYS_TRUE => f.field(&format_args!("AlwaysTrue")).finish(),
-            AMBIGUOUS => f.field(&format_args!("Ambiguous")).finish(),
-            ALWAYS_FALSE => f.field(&format_args!("AlwaysFalse")).finish(),
+            ALWAYS_TRUE => f.field(&"AlwaysTrue").finish(),
+            AMBIGUOUS => f.field(&"Ambiguous").finish(),
+            ALWAYS_FALSE => f.field(&"AlwaysFalse").finish(),

The format_args! part makes this render as a symbol, not a string: ScopedVisibilityConstraintId(AlwaysTrue) as opposed to ScopedVisibilityConstraintId("AlwaysTrue"). It's small and aesthetic but I like it better! I've added a comment explaining why
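The rendering difference can be reproduced with a small standalone sketch. The types `Id` and `QuotedId` and the sentinel `ALWAYS_TRUE` below are hypothetical stand-ins, not the PR's `ScopedVisibilityConstraintId`; the standard-library behavior being relied on is that `fmt::Arguments` implements `Debug` by forwarding to `Display`, whereas `&str` is debug-formatted with quotes.

```rust
use std::fmt;

/// Hypothetical id type; the real type in the PR has more structure.
pub struct Id(pub u32);

/// Hypothetical sentinel value, chosen only for this example.
pub const ALWAYS_TRUE: u32 = u32::MAX;

impl fmt::Debug for Id {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let mut tuple = f.debug_tuple("Id");
        match self.0 {
            // `format_args!` produces `fmt::Arguments`, whose `Debug` output has no quotes.
            ALWAYS_TRUE => tuple.field(&format_args!("AlwaysTrue")),
            value => tuple.field(&value),
        };
        tuple.finish()
    }
}

/// Same shape, but passing a `&str` field, which is debug-formatted with quotes.
pub struct QuotedId;

impl fmt::Debug for QuotedId {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_tuple("Id").field(&"AlwaysTrue").finish()
    }
}
```

Formatting `Id(u32::MAX)` with `{:?}` gives `Id(AlwaysTrue)`, while `QuotedId` gives `Id("AlwaysTrue")`.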

MichaReiser · 2025-02-04T08:09:39Z

crates/red_knot_python_semantic/src/visibility_constraints.rs

+        let mut f = f.debug_tuple("ScopedVisibilityConstraintId");
+        match *self {
+            ALWAYS_TRUE => f.field(&format_args!("AlwaysTrue")).finish(),
+            AMBIGUOUS => f.field(&format_args!("Ambiguous")).finish(),
+            ALWAYS_FALSE => f.field(&format_args!("AlwaysFalse")).finish(),
+            _ => f.field(&self.0).finish(),
+        }


Nit: Makes it clearer what's different/shared between the branches:

Suggested change

         let mut f = f.debug_tuple("ScopedVisibilityConstraintId");
         match *self {
-            ALWAYS_TRUE => f.field(&format_args!("AlwaysTrue")).finish(),
-            AMBIGUOUS => f.field(&format_args!("Ambiguous")).finish(),
-            ALWAYS_FALSE => f.field(&format_args!("AlwaysFalse")).finish(),
-            _ => f.field(&self.0).finish(),
-        }
+            ALWAYS_TRUE => f.field(&"AlwaysTrue"),
+            AMBIGUOUS => f.field(&"Ambiguous"),
+            ALWAYS_FALSE => f.field(&"AlwaysFalse"),
+            _ => f.field(&self.0),
+        };
+        f.finish()

Done (modulo above)

MichaReiser · 2025-02-04T08:11:37Z

crates/red_knot_python_semantic/src/visibility_constraints.rs

+// _Atoms_ are the underlying Constraints, which are the variables that are evaluated by the
+// ternary function.
+//
+// _Interior nodes_ provide the TDD structure for the formula. Interior nodes are stored in an


I'm not familiar with TDD myself (I'm familiar with test driven development but that's not it). Would it make sense to extend the module level documentation and also include a link to what TDD means in this context?

MichaReiser · 2025-02-04T08:12:56Z

crates/red_knot_python_semantic/src/visibility_constraints.rs

+/// ([`VisibilityConstraints::constraints`]). An atom consists of an index into this arena, and a
+/// copy number.


It's unclear to me what a copy number is. Can you expand the comment to cover this in more detail? It may also be worthwhile to explain the internal representation (how many constraints can be expressed, etc.).

MichaReiser · 2025-02-04T08:15:44Z

crates/red_knot_python_semantic/src/visibility_constraints.rs

+const ALWAYS_TRUE: ScopedVisibilityConstraintId = ScopedVisibilityConstraintId::ALWAYS_TRUE;
+const AMBIGUOUS: ScopedVisibilityConstraintId = ScopedVisibilityConstraintId::AMBIGUOUS;
+const ALWAYS_FALSE: ScopedVisibilityConstraintId = ScopedVisibilityConstraintId::ALWAYS_FALSE;
+const SMALLEST_TERMINAL: u32 = ALWAYS_FALSE.0;


I'd have expected this to also be defined on ScopedVisibilityConstraintId, since it is used in is_terminal, with only an alias here.

MichaReiser · 2025-02-04T08:18:36Z

crates/red_knot_python_semantic/src/visibility_constraints.rs

+    constraint_cache: FxHashMap<Constraint<'db>, u32>,
+    interior_cache: FxHashMap<InteriorNode, u32>,


If we end up not using IndexVec, I'd still recommend creating a newtype wrapper around u32 to signify what the u32 here means (it's an index into the constraints/interiors vec?)

We're back to using IndexVec

MichaReiser · 2025-02-04T08:30:06Z

crates/red_knot_python_semantic/src/visibility_constraints.rs

+    constraints: Vec<Constraint<'db>>,
+    interiors: Vec<InteriorNode>,


This struct defines a fair amount of heap-allocated data structures and requires a lot of hashing. It would be nice not to have to do that, but I don't see how we can restructure the code to avoid it (unless maybe by combining some maps and encoding the or/and in the key, but that might well turn out to be slower if the map has to resize more often because of it).

The hashing is an important part of the data structure, since we have to intern all of the nodes to guarantee that the TDD is reduced. (I included a description of what it means to be reduced, and why that's important, in the module documentation that I added).
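As a rough illustration of why interning needs the hash map, here is a minimal hash-consing sketch. It is an assumption-laden stand-in, not the PR's builder: `NodeId`, `InteriorNode`, `Builder`, the terminal constants, and the use of `std::collections::HashMap` (instead of `FxHashMap`) are all choices made for this example. It applies two reduction rules: a node whose three edges agree is replaced by that shared child, and a structurally identical node reuses the id allocated earlier.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct NodeId(pub u32);

// Hypothetical terminal ids, placed at the top of the u32 range for this sketch.
pub const ALWAYS_TRUE: NodeId = NodeId(u32::MAX);
pub const AMBIGUOUS: NodeId = NodeId(u32::MAX - 1);
pub const ALWAYS_FALSE: NodeId = NodeId(u32::MAX - 2);

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct InteriorNode {
    pub atom: u32,
    pub if_true: NodeId,
    pub if_ambiguous: NodeId,
    pub if_false: NodeId,
}

#[derive(Default)]
pub struct Builder {
    interiors: Vec<InteriorNode>,
    cache: HashMap<InteriorNode, NodeId>,
}

impl Builder {
    /// Returns the id for `node`, allocating a new interior node only if needed.
    pub fn add_interior(&mut self, node: InteriorNode) -> NodeId {
        // Rule 1: if every edge leads to the same child, the test is redundant.
        if node.if_true == node.if_ambiguous && node.if_ambiguous == node.if_false {
            return node.if_true;
        }
        // Rule 2: a structurally identical node must map to the same id.
        if let Some(&id) = self.cache.get(&node) {
            return id;
        }
        let id = NodeId(self.interiors.len() as u32);
        self.interiors.push(node);
        self.cache.insert(node, id);
        id
    }

    /// Number of distinct interior nodes allocated so far.
    pub fn interior_count(&self) -> usize {
        self.interiors.len()
    }
}
```

Because every node goes through `add_interior`, two sub-diagrams built this way are structurally identical exactly when their `NodeId`s are equal, so comparing them is a single integer comparison.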

MichaReiser · 2025-02-04T08:34:13Z

crates/red_knot_python_semantic/src/visibility_constraints.rs

+impl Idx for ScopedVisibilityConstraintId {
+    #[inline]
+    fn new(value: usize) -> Self {
+        assert!(value <= (SMALLEST_TERMINAL as usize));


Does it make any difference if you remove the assert here? Just to make sure we compare apples with apples.

Once I had my performance tests taking into account thermal throttling, it turned out the Vec vs IndexVec, and assert vs debug_assert, both did not make a difference. Leaving this as an assert so that we get a proper panic if we ever try to analyze a file that needs more than 16 million constraints!


* main: (66 commits) [red-knot] Use ternary decision diagrams (TDDs) for visibility constraints (#15861) [`pyupgrade`] Rename private type parameters in PEP 695 generics (`UP049`) (#15862) Simplify the `StringFlags` trait (#15944) [`flake8-pyi`] Make `PYI019` autofixable for `.py` files in preview mode as well as stubs (#15889) Docs (`linter.md`): clarify that Python files are always searched for in subdirectories (#15882) [`flake8-pyi`] Make PEP-695 functions with multiple type parameters fixable by PYI019 again (#15938) [red-knot] Use unambiguous invalid-syntax-construct for suppression comment test (#15933) Make `Binding::range()` point to the range of a type parameter's name, not the full type parameter (#15935) Update black deviations (#15928) [red-knot] MDTest: Fix line numbers in error messages (#15932) Preserve triple quotes and prefixes for strings (#15818) [red-knot] Hand-written MDTest parser (#15926) [`pylint`] Fix missing parens in unsafe fix for `unnecessary-dunder-call` (`PLC2801`) (#15762) nit: docs for ignore & select (#15883) [airflow] `BashOperator` has been moved to `airflow.providers.standard.operators.bash.BashOperator` (AIR302) (#15922) [`flake8-logging`] `.exception()` and `exc_info=` outside exception handlers (`LOG004`, `LOG014`) (#15799) [red-knot] Enforce specifying paths for mdtest code blocks in a separate preceding line (#15890) [red-knot] Internal refactoring of visibility constraints API (#15913) [red-knot] Implicit instance attributes (#15811) [`flake8-comprehensions`] Handle extraneous parentheses around list comprehension (`C403`) (#15877) ...

dcreager force-pushed the dcreager/tdd branch 2 times, most recently from 38a04d1 to 3476728 Compare January 31, 2025 22:50

AlexWaygood added the red-knot Multi-file analysis & type inference label Feb 1, 2025

dcreager force-pushed the dcreager/tdd branch from 0425403 to aef9dd2 Compare February 3, 2025 15:15

dcreager mentioned this pull request Feb 3, 2025

[red-knot] Internal refactoring of visibility constraints API #15913

Merged

dcreager changed the base branch from main to dcreager/vc-api February 3, 2025 15:24

dcreager force-pushed the dcreager/tdd branch from 6b90163 to 30da9fd Compare February 3, 2025 15:42

dcreager marked this pull request as ready for review February 3, 2025 19:48

dcreager requested review from carljm, MichaReiser, AlexWaygood and sharkdp as code owners February 3, 2025 19:48

dcreager force-pushed the dcreager/tdd branch from ad4202b to 48fea30 Compare February 3, 2025 19:48

dcreager force-pushed the dcreager/vc-api branch from d1578ee to b4eb067 Compare February 3, 2025 19:48

Base automatically changed from dcreager/vc-api to main February 3, 2025 20:13

dcreager added 4 commits February 3, 2025 15:13

Add TDD implementation

5c2d731

Replace the old implementation with the new TDD one!

1116336

Remove NodeKind

5b6bce5

Check cache second

48fce26

Reapply "Remove simplify calls!!!"

6e47aa9

This reverts commit 6fb24c6.

carljm approved these changes Feb 3, 2025


Add .pyi xfail for RET503

61b2335

dcreager added 3 commits February 3, 2025 17:36

Use IndexVec

bd5ebf7

Handle two-terminal case in cmp_atoms

d7f070c

Remove unneeded clippy allows

7eb1f2b

carljm approved these changes Feb 3, 2025


dcreager added 3 commits February 3, 2025 19:26

Custom Idx impl for Atom

cf6b269

Add custom Idx impl for ScopedVisibilityConstraintId

a387636

Go back to Vec I guess

ff472b7

MichaReiser approved these changes Feb 4, 2025


MichaReiser reviewed Feb 4, 2025


dcreager added 2 commits February 4, 2025 09:27

Revert "Go back to Vec I guess"

977da30

This reverts commit ff472b7.

Revert "Remove simplify calls!!!"

086bb58

This reverts commit 2f74a51.

dcreager force-pushed the dcreager/tdd branch from 3ede45c to 086bb58 Compare February 4, 2025 16:34

dcreager added 5 commits February 4, 2025 13:25

Move finish calls

20ada6a

Add module docs describing TDDs

82e8e3e

Make SMALLEST_TERMINAL an id

7e286a7

Document copy number better

3377315

Document format_args usage

ec35dcf

dcreager merged commit 444b055 into main Feb 4, 2025
21 checks passed

dcreager deleted the dcreager/tdd branch February 4, 2025 19:32
