-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[red-knot] Use ternary decision diagrams (TDDs) for visibility constraints #15861
Conversation
38a04d1
to
3476728
Compare
Very excited about this! Per your request (and the "draft" status), I'll refrain from a detailed look at the code until you declare it ready for review. Just one general thought, based on the title of the PR: when I looked at this previously, my conclusion was that we don't actually need TDDs here, BDDs suffice. If we build a BDD of the constraints, and then evaluate that BDD, anytime a decision node evaluates to "ambiguous", we can immediately short-circuit to a result of "ambiguous". I guess another way to say this is that a TDD where the "ambiguous" exit of every node leads immediately to the "ambiguous" terminal (which is our scenario, I think), is isomorphic to a BDD where the "ambiguous" exits can be implied. Just mentioning that in case it's useful or allows any simplification (I don't know if it does, since I haven't looked at your code yet.). Ultimately I'll be happy with any solution that performs well and allows us to get rid of the arbitrary depth limit, no matter what we call it! (One other thought: if we are getting rid of the depth limit, we should run this on Black and verify that the big perf regression @sharkdp observed is in fact gone.) |
Hmm, I thought we'd need to model ambiguous edges fully, since (e.g.)
Ah, that makes me a bit happier about the performance numbers I'm seeing! Codspeed is reporting a 3% regression with this in place. When I test locally, using hyperfine on black, I see similar numbers. But that might not be apples-to-apples, since |
You're right, of course! I remembered that I'd reached the conclusion we could do this with just BDDs, but when I tried to recall the reasoning for that conclusion, I thought I'd come up with a simpler reason, but instead it was just a wrong reason :) The actual reason I'd reached this conclusion before was because OxiDD will allow you to supply "resolutions" (that is, t/f values) for nodes in a BDD and will then simplify the BDD accordingly, such that the resolved nodes disappear from it. So I thought if we built a BDD and then resolved the nodes that we can infer a statically-known truthiness for, the BDD would either simplify to just a terminal, or not, in which case it remains ambiguous. But I don't know if that's actually a good/efficient approach, compared to building a TDD and evaluating it!
IIRC @sharkdp was seeing like 5x regression on Black without the recursion limit, so it seems like you're doing much better on the pathological cases. It's too bad if this comes with an across-the-board regression, but I think it's worth it to get rid of the arbitrary limit. We probably will need such limits in some places, but they are a really bad experience when users encounter them, so as much as we can avoid them, or make them so high that users will never encounter them in realistic code, we should try to do so. Of course we should also take a look at profiles and see if there are ways to bring down the regression here! |
0425403
to
aef9dd2
Compare
|
I've extracted the refactoring parts of this into a separate PR to make everything easier to review: #15913 |
CodSpeed Performance ReportMerging #15861 will not alter performanceComparing Summary
|
6b90163
to
30da9fd
Compare
As expected, the execution time shifts from being mostly in the evaluate step on to being mostly in the build step with this PR: |
I think this is at a good point now. In local testing, I've gotten the performance regression down to 1-2%. Interested to see what codspeed says about the latest commits on the branch. |
ad4202b
to
48fea30
Compare
d1578ee
to
b4eb067
Compare
For some reason CodSpeed thinks it doesn't have data on the baseline for this PR, but just comparing the results here with the ones on main (or |
I would definitely like to see the failing cases and understand why they need to be cyclic, or if there's some missing simplification (other than the "no new bindings" heuristic that doesn't work with with terminals) that could avoid the cycles. But this can be a follow-up. |
This extracts some pure refactoring noise from #15861. This changes the API for creating and evaluating visibility constraints, but does not change how they are respresented internally. There should be no behavioral or performance changes in this PR. Changes: - Hide the internal representation isn't changed, so that we can make changes to it in #15861. - Add a separate builder type for visibility constraints. (With TDDs, we will have some additional builder state that we can throw away once we're done constructing.) - Remove a layer of helper methods from `UseDefMapBuilder`, making `SemanticIndexBuilder` responsible for constructing whatever visibility constraints it needs.
Summarizing from Discord: I think a) such code would never actually occur in a stub file; they shouldn't even have function bodies other than ellipsis (and perhaps we should make this explicit in how we handle them), and b) it would be fine to go ahead and remove the simplify calls and mark that file xfail (though ideal if we could mark only the pyi version of it as xfail?) pending fixpoint iteration, which I think would solve the panic. |
This reverts commit 6fb24c6.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great to me!
Micha or David may have Rust/perf suggestions I missed, but I think it's OK to go ahead with this and handle any such comments as follow-ups.
Done. Performance between keeping and removing the simplify calls seems to be a wash. I've removed them in the interests of simpler code, and added a pyi-only xfail for the one failing test. |
I haven't reviewed the code yet but the regression might simply come from that we now evaluate (or simplify) more constraints than before because the semantic indexer processes all constraints whereas simplification during inference was lazy (and only for the queried definitions) |
Hmm, the switch to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Semantics and code here look good to me. May be worth understanding why the regression seems to have grown again, and see if that reproduces locally on a larger codebase like Black.
I do see a 2-3% regression between the |
Ackshually I can write a custom |
Still seeing the larger regression with custom |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice :) I only have nit/documentation comments.
It would be nice if we could use IndexVec
or a newtype wrapper and implement std::ops::Index
but it seems you tried that and it regressed performance.
It would also be great if we understand the reason for the perf regression better. Where are we spending more time now? Is it because we do more work eagerly or is it because the caching is expensive? Could we remove the caching?
ALWAYS_TRUE => f.field(&format_args!("AlwaysTrue")).finish(), | ||
AMBIGUOUS => f.field(&format_args!("Ambiguous")).finish(), | ||
ALWAYS_FALSE => f.field(&format_args!("AlwaysFalse")).finish(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ALWAYS_TRUE => f.field(&format_args!("AlwaysTrue")).finish(), | |
AMBIGUOUS => f.field(&format_args!("Ambiguous")).finish(), | |
ALWAYS_FALSE => f.field(&format_args!("AlwaysFalse")).finish(), | |
ALWAYS_TRUE => f.field(&"AlwaysTrue").finish(), | |
AMBIGUOUS => f.field(&"Ambiguous").finish(), | |
ALWAYS_FALSE => f.field(&"AlwaysFalse").finish(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The format_args!
part makes this render as a symbol, not a string: ScopedVisibilityConstraintId(AlwaysTrue)
as opposed to ScopedVisibilityConstraintId("AlwaysTrue")
. It's small and aesthetic but I like it better! I've added a comment explaining why
let mut f = f.debug_tuple("ScopedVisibilityConstraintId"); | ||
match *self { | ||
ALWAYS_TRUE => f.field(&format_args!("AlwaysTrue")).finish(), | ||
AMBIGUOUS => f.field(&format_args!("Ambiguous")).finish(), | ||
ALWAYS_FALSE => f.field(&format_args!("AlwaysFalse")).finish(), | ||
_ => f.field(&self.0).finish(), | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Makes it clearer what's different/shared between the branches:
let mut f = f.debug_tuple("ScopedVisibilityConstraintId"); | |
match *self { | |
ALWAYS_TRUE => f.field(&format_args!("AlwaysTrue")).finish(), | |
AMBIGUOUS => f.field(&format_args!("Ambiguous")).finish(), | |
ALWAYS_FALSE => f.field(&format_args!("AlwaysFalse")).finish(), | |
_ => f.field(&self.0).finish(), | |
} | |
let mut f = f.debug_tuple("ScopedVisibilityConstraintId"); | |
match *self { | |
ALWAYS_TRUE => f.field(&"AlwaysTrue"), | |
AMBIGUOUS => f.field(&"Ambiguous"), | |
ALWAYS_FALSE => f.field(&"AlwaysFalse"), | |
_ => f.field(&self.0), | |
} | |
f.finish() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done (modulo above)
// _Atoms_ are the underlying Constraints, which are the variables that are evaluated by the | ||
// ternary function. | ||
// | ||
// _Interior nodes_ provide the TDD structure for the formula. Interior nodes are stored in an |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not familiar with TDD myself (I'm familiar with test driven development but that's not it). Would it make sense to extend the module level documentation and also include a link to what TDD means in this context?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
/// ([`VisibilityConstraints::constraints`]). An atom consists of an index into this arena, and a | ||
/// copy number. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's unclear to me what a copy number is. Can you expand the comment to cover this in more detail. It may also be worthwhile to explain the internal representation. How many constraints can be expressed etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
const ALWAYS_TRUE: ScopedVisibilityConstraintId = ScopedVisibilityConstraintId::ALWAYS_TRUE; | ||
const AMBIGUOUS: ScopedVisibilityConstraintId = ScopedVisibilityConstraintId::AMBIGUOUS; | ||
const ALWAYS_FALSE: ScopedVisibilityConstraintId = ScopedVisibilityConstraintId::ALWAYS_FALSE; | ||
const SMALLEST_TERMINAL: u32 = ALWAYS_FALSE.0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd expected this to also be defined on ScopedVisiblityConstraitId
as it is used in is_terminal
and that there's only an alias here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
constraint_cache: FxHashMap<Constraint<'db>, u32>, | ||
interior_cache: FxHashMap<InteriorNode, u32>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we end up not using IndexVec
, I still recommend to create a newtype wrapper around u32
to signify what the u32
here means (it's an index into the constraints/interiors vec?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're back to using IndexVec
constraints: Vec<Constraint<'db>>, | ||
interiors: Vec<InteriorNode>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This struct defines a fair amount of heap-allocated data structures and requires a lot of hashing. It would be nice not to have to do that, but I don't see how we can restructure the code to avoid it (unless maybe combining some maps and encoding the or
/and
in the key but that might as well turn out to be slower if it happens that the map has to resize more often because of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The hashing is an important part of the data structure, since we have to intern all of the nodes to guarantee that the TDD is reduced. (I included a description of what it means to be reduced, and why that's important, in the module documentation that I added).
impl Idx for ScopedVisibilityConstraintId { | ||
#[inline] | ||
fn new(value: usize) -> Self { | ||
assert!(value <= (SMALLEST_TERMINAL as usize)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make any difference if you remove the assert here just to make sure we compare apples with apples
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once I had my performance tests taking into account thermal throttling, it turned out the Vec
vs IndexVec
, and assert
vs debug_assert
, both did not make a difference. Leaving this as an assert
so that we get a proper panic if we ever try to analyze a file that needs more than 16 million constraints!
3ede45c
to
086bb58
Compare
* main: (66 commits) [red-knot] Use ternary decision diagrams (TDDs) for visibility constraints (#15861) [`pyupgrade`] Rename private type parameters in PEP 695 generics (`UP049`) (#15862) Simplify the `StringFlags` trait (#15944) [`flake8-pyi`] Make `PYI019` autofixable for `.py` files in preview mode as well as stubs (#15889) Docs (`linter.md`): clarify that Python files are always searched for in subdirectories (#15882) [`flake8-pyi`] Make PEP-695 functions with multiple type parameters fixable by PYI019 again (#15938) [red-knot] Use unambiguous invalid-syntax-construct for suppression comment test (#15933) Make `Binding::range()` point to the range of a type parameter's name, not the full type parameter (#15935) Update black deviations (#15928) [red-knot] MDTest: Fix line numbers in error messages (#15932) Preserve triple quotes and prefixes for strings (#15818) [red-knot] Hand-written MDTest parser (#15926) [`pylint`] Fix missing parens in unsafe fix for `unnecessary-dunder-call` (`PLC2801`) (#15762) nit: docs for ignore & select (#15883) [airflow] `BashOperator` has been moved to `airflow.providers.standard.operators.bash.BashOperator` (AIR302) (#15922) [`flake8-logging`] `.exception()` and `exc_info=` outside exception handlers (`LOG004`, `LOG014`) (#15799) [red-knot] Enforce specifying paths for mdtest code blocks in a separate preceding line (#15890) [red-knot] Internal refactoring of visibility constraints API (#15913) [red-knot] Implicit instance attributes (#15811) [`flake8-comprehensions`] Handle extraneous parentheses around list comprehension (`C403`) (#15877) ...
We now use ternary decision diagrams (TDDs) to represent visibility constraints. A TDD is just like a BDD (binary decision diagram), but with "ambiguous" as an additional allowed value. Unlike the previous representation, TDDs are strongly normalizing, so equivalent ternary formulas are represented by exactly the same graph node, and can be compared for equality in constant time.
We currently have a slight 1-3% performance regression with this in place, according to local testing. However, we also have a 5× increase in performance for pathological cases, since we can now remove the recursion limit when we evaluate visibility constraints.
As follow-on work, we are now closer to being able to remove the
simplify_visibility_constraint
calls in the semantic index builder. In the vast majority of cases, we now see (for instance) that the visibility constraint after anif
statement, for bindings of symbols that weren't rebound in any branch, simplifies back totrue
. But there are still some cases we generate constraints that are cyclic. With fixed-point cycle support in salsa, or with some careful analysis of the still-failing cases, we might be able to remove those.