Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[red-knot] Use ternary decision diagrams (TDDs) for visibility constraints #15861

Merged
merged 27 commits into from
Feb 4, 2025

Conversation

dcreager
Copy link
Member

@dcreager dcreager commented Jan 31, 2025

We now use ternary decision diagrams (TDDs) to represent visibility constraints. A TDD is just like a BDD (binary decision diagram), but with "ambiguous" as an additional allowed value. Unlike the previous representation, TDDs are strongly normalizing, so equivalent ternary formulas are represented by exactly the same graph node, and can be compared for equality in constant time.

We currently have a slight 1-3% performance regression with this in place, according to local testing. However, we also have a 5× increase in performance for pathological cases, since we can now remove the recursion limit when we evaluate visibility constraints.

As follow-on work, we are now closer to being able to remove the simplify_visibility_constraint calls in the semantic index builder. In the vast majority of cases, we now see (for instance) that the visibility constraint after an if statement, for bindings of symbols that weren't rebound in any branch, simplifies back to true. But there are still some cases we generate constraints that are cyclic. With fixed-point cycle support in salsa, or with some careful analysis of the still-failing cases, we might be able to remove those.

@dcreager dcreager force-pushed the dcreager/tdd branch 2 times, most recently from 38a04d1 to 3476728 Compare January 31, 2025 22:50
@carljm
Copy link
Contributor

carljm commented Feb 1, 2025

Very excited about this!

Per your request (and the "draft" status), I'll refrain from a detailed look at the code until you declare it ready for review. Just one general thought, based on the title of the PR: when I looked at this previously, my conclusion was that we don't actually need TDDs here, BDDs suffice. If we build a BDD of the constraints, and then evaluate that BDD, anytime a decision node evaluates to "ambiguous", we can immediately short-circuit to a result of "ambiguous". I guess another way to say this is that a TDD where the "ambiguous" exit of every node leads immediately to the "ambiguous" terminal (which is our scenario, I think), is isomorphic to a BDD where the "ambiguous" exits can be implied.

Just mentioning that in case it's useful or allows any simplification (I don't know if it does, since I haven't looked at your code yet.). Ultimately I'll be happy with any solution that performs well and allows us to get rid of the arbitrary depth limit, no matter what we call it!

(One other thought: if we are getting rid of the depth limit, we should run this on Black and verify that the big perf regression @sharkdp observed is in fact gone.)

@AlexWaygood AlexWaygood added the red-knot Multi-file analysis & type inference label Feb 1, 2025
@dcreager
Copy link
Member Author

dcreager commented Feb 1, 2025

If we build a BDD of the constraints, and then evaluate that BDD, anytime a decision node evaluates to "ambiguous", we can immediately short-circuit to a result of "ambiguous".

Hmm, I thought we'd need to model ambiguous edges fully, since (e.g.) ambiguous ∧ false = false. So if we have a constraint of C1 ∧ C2, where C1 is amb and C2 is false, I think your optimization would over-approximate the result to amb instead of false. Am I following that right?

(One other thought: if we are getting rid of the depth limit, we should run this on Black and verify that the big perf regression @sharkdp observed is in fact gone.)

Ah, that makes me a bit happier about the performance numbers I'm seeing! Codspeed is reporting a 3% regression with this in place. When I test locally, using hyperfine on black, I see similar numbers. But that might not be apples-to-apples, since main has a recursion limit in place but this feature branch does not.

@carljm
Copy link
Contributor

carljm commented Feb 1, 2025

I thought we'd need to model ambiguous edges fully, since (e.g.) ambiguous ∧ false = false

You're right, of course! I remembered that I'd reached the conclusion we could do this with just BDDs, but when I tried to recall the reasoning for that conclusion, I thought I'd come up with a simpler reason, but instead it was just a wrong reason :)

The actual reason I'd reached this conclusion before was because OxiDD will allow you to supply "resolutions" (that is, t/f values) for nodes in a BDD and will then simplify the BDD accordingly, such that the resolved nodes disappear from it. So I thought if we built a BDD and then resolved the nodes that we can infer a statically-known truthiness for, the BDD would either simplify to just a terminal, or not, in which case it remains ambiguous.

But I don't know if that's actually a good/efficient approach, compared to building a TDD and evaluating it!

Codspeed is reporting a 3% regression with this in place. When I test locally, using hyperfine on black, I see similar numbers. But that might not be apples-to-apples,

IIRC @sharkdp was seeing like 5x regression on Black without the recursion limit, so it seems like you're doing much better on the pathological cases. It's too bad if this comes with an across-the-board regression, but I think it's worth it to get rid of the arbitrary limit. We probably will need such limits in some places, but they are a really bad experience when users encounter them, so as much as we can avoid them, or make them so high that users will never encounter them in realistic code, we should try to do so.

Of course we should also take a look at profiles and see if there are ways to bring down the regression here!

Copy link
Contributor

github-actions bot commented Feb 3, 2025

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

@dcreager dcreager changed the base branch from main to dcreager/vc-api February 3, 2025 15:24
@dcreager
Copy link
Member Author

dcreager commented Feb 3, 2025

I've extracted the refactoring parts of this into a separate PR to make everything easier to review: #15913

Copy link

codspeed-hq bot commented Feb 3, 2025

CodSpeed Performance Report

Merging #15861 will not alter performance

Comparing dcreager/tdd (ec35dcf) with main (6bb3235)

Summary

✅ 32 untouched benchmarks

@dcreager
Copy link
Member Author

dcreager commented Feb 3, 2025

IIRC @sharkdp was seeing like 5x regression on Black without the recursion limit, so it seems like you're doing much better on the pathological cases. It's too bad if this comes with an across-the-board regression, but I think it's worth it to get rid of the arbitrary limit. We probably will need such limits in some places, but they are a really bad experience when users encounter them, so as much as we can avoid them, or make them so high that users will never encounter them in realistic code, we should try to do so.

Of course we should also take a look at profiles and see if there are ways to bring down the regression here!

As expected, the execution time shifts from being mostly in the evaluate step on main:

main

to being mostly in the build step with this PR:

feature

@dcreager
Copy link
Member Author

dcreager commented Feb 3, 2025

I think this is at a good point now. In local testing, I've gotten the performance regression down to 1-2%. Interested to see what codspeed says about the latest commits on the branch.

@dcreager dcreager marked this pull request as ready for review February 3, 2025 19:48
@carljm
Copy link
Contributor

carljm commented Feb 3, 2025

For some reason CodSpeed thinks it doesn't have data on the baseline for this PR, but just comparing the results here with the ones on main (or vc-api), it looks like about 2% regression (89.5 ms -> 91.6 ms).

@carljm
Copy link
Contributor

carljm commented Feb 3, 2025

As follow-on work, we are now closer to being able to remove the simplify_visibility_constraint calls in the semantic index builder. In the vast majority of cases, we now see (for instance) that the visibility constraint after an if statement, for bindings of symbols that weren't rebound in any branch, simplifies back to true. But there are still some cases we generate constraints that are cyclic. With fixed-point cycle support in salsa, or with some careful analysis of the still-failing cases, we might be able to remove those.

I would definitely like to see the failing cases and understand why they need to be cyclic, or if there's some missing simplification (other than the "no new bindings" heuristic that doesn't work with with terminals) that could avoid the cycles. But this can be a follow-up.

dcreager added a commit that referenced this pull request Feb 3, 2025
This extracts some pure refactoring noise from
#15861. This changes the API for
creating and evaluating visibility constraints, but does not change how
they are respresented internally. There should be no behavioral or
performance changes in this PR.

Changes:

- Hide the internal representation isn't changed, so that we can make
changes to it in #15861.
- Add a separate builder type for visibility constraints. (With TDDs, we
will have some additional builder state that we can throw away once
we're done constructing.)
- Remove a layer of helper methods from `UseDefMapBuilder`, making
`SemanticIndexBuilder` responsible for constructing whatever visibility
constraints it needs.
Base automatically changed from dcreager/vc-api to main February 3, 2025 20:13
@carljm
Copy link
Contributor

carljm commented Feb 3, 2025

Summarizing from Discord: I think a) such code would never actually occur in a stub file; they shouldn't even have function bodies other than ellipsis (and perhaps we should make this explicit in how we handle them), and b) it would be fine to go ahead and remove the simplify calls and mark that file xfail (though ideal if we could mark only the pyi version of it as xfail?) pending fixpoint iteration, which I think would solve the panic.

Copy link
Contributor

@carljm carljm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great to me!

Micha or David may have Rust/perf suggestions I missed, but I think it's OK to go ahead with this and handle any such comments as follow-ups.

@dcreager
Copy link
Member Author

dcreager commented Feb 3, 2025

Summarizing from Discord: I think a) such code would never actually occur in a stub file; they shouldn't even have function bodies other than ellipsis (and perhaps we should make this explicit in how we handle them), and b) it would be fine to go ahead and remove the simplify calls and mark that file xfail (though ideal if we could mark only the pyi version of it as xfail?) pending fixpoint iteration, which I think would solve the panic.

Done. Performance between keeping and removing the simplify calls seems to be a wash. I've removed them in the interests of simpler code, and added a pyi-only xfail for the one failing test.

@MichaReiser
Copy link
Member

MichaReiser commented Feb 3, 2025

I haven't reviewed the code yet but the regression might simply come from that we now evaluate (or simplify) more constraints than before because the semantic indexer processes all constraints whereas simplification during inference was lazy (and only for the queried definitions)

@carljm
Copy link
Contributor

carljm commented Feb 3, 2025

Hmm, the switch to IndexVec (or at least, something in the recent commits) seems to have grown the regression; now we're seeing a significant regression even in incremental check, which is a bit strange.

Copy link
Contributor

@carljm carljm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Semantics and code here look good to me. May be worth understanding why the regression seems to have grown again, and see if that reproduces locally on a larger codebase like Black.

@dcreager
Copy link
Member Author

dcreager commented Feb 4, 2025

Hmm, the switch to IndexVec (or at least, something in the recent commits) seems to have grown the regression; now we're seeing a significant regression even in incremental check, which is a bit strange.

I do see a 2-3% regression between the IndexVec commit and the one immediately before it. My hunch is that it's because we're not storing ConstraintId and InteriorNodeId directly, and have to convert into that from the bit-hacked u32 on each access. That does a bounds check each time. I'm going to revert that commit and add a note about why we're using Vec. (Alternatively I could add an unchecked constructor for the ID types.)

@dcreager
Copy link
Member Author

dcreager commented Feb 4, 2025

I'm going to revert that commit and add a note about why we're using Vec. (Alternatively I could add an unchecked constructor for the ID types.)

Ackshually I can write a custom Idx impl for a couple of types and bypass the asserts, I think.

@dcreager
Copy link
Member Author

dcreager commented Feb 4, 2025

Still seeing the larger regression with custom Idx impls, and again with a revert of the VecIndexVec commit. I'm stumped, will look at this some more in the morning.

Copy link
Member

@MichaReiser MichaReiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice :) I only have nit/documentation comments.

It would be nice if we could use IndexVec or a newtype wrapper and implement std::ops::Index but it seems you tried that and it regressed performance.

It would also be great if we understand the reason for the perf regression better. Where are we spending more time now? Is it because we do more work eagerly or is it because the caching is expensive? Could we remove the caching?

Comment on lines 189 to 191
ALWAYS_TRUE => f.field(&format_args!("AlwaysTrue")).finish(),
AMBIGUOUS => f.field(&format_args!("Ambiguous")).finish(),
ALWAYS_FALSE => f.field(&format_args!("AlwaysFalse")).finish(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ALWAYS_TRUE => f.field(&format_args!("AlwaysTrue")).finish(),
AMBIGUOUS => f.field(&format_args!("Ambiguous")).finish(),
ALWAYS_FALSE => f.field(&format_args!("AlwaysFalse")).finish(),
ALWAYS_TRUE => f.field(&"AlwaysTrue").finish(),
AMBIGUOUS => f.field(&"Ambiguous").finish(),
ALWAYS_FALSE => f.field(&"AlwaysFalse").finish(),

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The format_args! part makes this render as a symbol, not a string: ScopedVisibilityConstraintId(AlwaysTrue) as opposed to ScopedVisibilityConstraintId("AlwaysTrue"). It's small and aesthetic but I like it better! I've added a comment explaining why

Comment on lines 187 to 193
let mut f = f.debug_tuple("ScopedVisibilityConstraintId");
match *self {
ALWAYS_TRUE => f.field(&format_args!("AlwaysTrue")).finish(),
AMBIGUOUS => f.field(&format_args!("Ambiguous")).finish(),
ALWAYS_FALSE => f.field(&format_args!("AlwaysFalse")).finish(),
_ => f.field(&self.0).finish(),
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Makes it clearer what's different/shared between the branches:

Suggested change
let mut f = f.debug_tuple("ScopedVisibilityConstraintId");
match *self {
ALWAYS_TRUE => f.field(&format_args!("AlwaysTrue")).finish(),
AMBIGUOUS => f.field(&format_args!("Ambiguous")).finish(),
ALWAYS_FALSE => f.field(&format_args!("AlwaysFalse")).finish(),
_ => f.field(&self.0).finish(),
}
let mut f = f.debug_tuple("ScopedVisibilityConstraintId");
match *self {
ALWAYS_TRUE => f.field(&"AlwaysTrue"),
AMBIGUOUS => f.field(&"Ambiguous"),
ALWAYS_FALSE => f.field(&"AlwaysFalse"),
_ => f.field(&self.0),
}
f.finish()

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done (modulo above)

// _Atoms_ are the underlying Constraints, which are the variables that are evaluated by the
// ternary function.
//
// _Interior nodes_ provide the TDD structure for the formula. Interior nodes are stored in an
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not familiar with TDD myself (I'm familiar with test driven development but that's not it). Would it make sense to extend the module level documentation and also include a link to what TDD means in this context?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines 218 to 219
/// ([`VisibilityConstraints::constraints`]). An atom consists of an index into this arena, and a
/// copy number.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unclear to me what a copy number is. Can you expand the comment to cover this in more detail. It may also be worthwhile to explain the internal representation. How many constraints can be expressed etc.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

const ALWAYS_TRUE: ScopedVisibilityConstraintId = ScopedVisibilityConstraintId::ALWAYS_TRUE;
const AMBIGUOUS: ScopedVisibilityConstraintId = ScopedVisibilityConstraintId::AMBIGUOUS;
const ALWAYS_FALSE: ScopedVisibilityConstraintId = ScopedVisibilityConstraintId::ALWAYS_FALSE;
const SMALLEST_TERMINAL: u32 = ALWAYS_FALSE.0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd expected this to also be defined on ScopedVisiblityConstraitId as it is used in is_terminal and that there's only an alias here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines 284 to 285
constraint_cache: FxHashMap<Constraint<'db>, u32>,
interior_cache: FxHashMap<InteriorNode, u32>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we end up not using IndexVec, I still recommend to create a newtype wrapper around u32 to signify what the u32 here means (it's an index into the constraints/interiors vec?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're back to using IndexVec

Comment on lines 282 to 283
constraints: Vec<Constraint<'db>>,
interiors: Vec<InteriorNode>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This struct defines a fair amount of heap-allocated data structures and requires a lot of hashing. It would be nice not to have to do that, but I don't see how we can restructure the code to avoid it (unless maybe combining some maps and encoding the or/and in the key but that might as well turn out to be slower if it happens that the map has to resize more often because of it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hashing is an important part of the data structure, since we have to intern all of the nodes to guarantee that the TDD is reduced. (I included a description of what it means to be reduced, and why that's important, in the module documentation that I added).

impl Idx for ScopedVisibilityConstraintId {
#[inline]
fn new(value: usize) -> Self {
assert!(value <= (SMALLEST_TERMINAL as usize));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make any difference if you remove the assert here just to make sure we compare apples with apples

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once I had my performance tests taking into account thermal throttling, it turned out the Vec vs IndexVec, and assert vs debug_assert, both did not make a difference. Leaving this as an assert so that we get a proper panic if we ever try to analyze a file that needs more than 16 million constraints!

@dcreager dcreager merged commit 444b055 into main Feb 4, 2025
21 checks passed
@dcreager dcreager deleted the dcreager/tdd branch February 4, 2025 19:32
dcreager added a commit that referenced this pull request Feb 4, 2025
* main: (66 commits)
  [red-knot] Use ternary decision diagrams (TDDs) for visibility constraints (#15861)
  [`pyupgrade`] Rename private type parameters in PEP 695 generics (`UP049`) (#15862)
  Simplify the `StringFlags` trait (#15944)
  [`flake8-pyi`] Make `PYI019` autofixable for `.py` files in preview mode as well as stubs (#15889)
  Docs (`linter.md`): clarify that Python files are always searched for in subdirectories (#15882)
  [`flake8-pyi`] Make PEP-695 functions with multiple type parameters fixable by PYI019 again (#15938)
  [red-knot] Use unambiguous invalid-syntax-construct for suppression comment test (#15933)
  Make `Binding::range()` point to the range of a type parameter's name, not the full type parameter (#15935)
  Update black deviations (#15928)
  [red-knot] MDTest: Fix line numbers in error messages (#15932)
  Preserve triple quotes and prefixes for strings (#15818)
  [red-knot] Hand-written MDTest parser (#15926)
  [`pylint`] Fix missing parens in unsafe fix for `unnecessary-dunder-call` (`PLC2801`) (#15762)
  nit: docs for ignore & select (#15883)
  [airflow] `BashOperator` has been moved to `airflow.providers.standard.operators.bash.BashOperator` (AIR302) (#15922)
  [`flake8-logging`] `.exception()` and `exc_info=` outside exception handlers (`LOG004`, `LOG014`) (#15799)
  [red-knot] Enforce specifying paths for mdtest code blocks in a separate preceding line (#15890)
  [red-knot] Internal refactoring of visibility constraints API (#15913)
  [red-knot] Implicit instance attributes (#15811)
  [`flake8-comprehensions`] Handle extraneous parentheses around list comprehension (`C403`) (#15877)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
red-knot Multi-file analysis & type inference
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants