Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[red-knot] don't include Unknown in the type for a conditionally-defined import #13563

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

pilleye
Copy link
Contributor

@pilleye pilleye commented Sep 30, 2024

Summary

Fixes the bug described in #13514 where an unbound public type defaulted to the type or Unknown, whereas it should only be the type if unbound.

Test Plan

Added a new test case

@pilleye pilleye force-pushed the pilleye/public-known branch 2 times, most recently from 4fcbb88 to 2436cb2 Compare September 30, 2024 03:05
@pilleye pilleye changed the title [red-knot] don't include Unknown in the type for a conditionally-defined import #13514 [red-knot] don't include Unknown in the type for a conditionally-defined import Sep 30, 2024
Copy link
Contributor

github-actions bot commented Sep 30, 2024

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

@AlexWaygood AlexWaygood added the red-knot Multi-file analysis & type inference label Sep 30, 2024
@pilleye pilleye force-pushed the pilleye/public-known branch 2 times, most recently from 4e1fcc7 to b67b0b3 Compare September 30, 2024 20:35
Copy link
Contributor

@carljm carljm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thank you!! Just a few comment removals and this looks good to go.

Comment on lines 5486 to 5496
// TODO: once we support `sys.version_info` branches,
// we can set `--target-version=py311` in this test
// and the inferred type will just be `BaseExceptionGroup` --Alex
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can now remove this comment, and the similar one in two below tests, because the lack of sys.version_info handling no longer results in a spurious Unknown here.

@carljm
Copy link
Contributor

carljm commented Sep 30, 2024

Oh, I didn't notice tests are failing. We'll need to update the tests, and I'll need to look at the resulting assertion changes to make sure we aren't getting any undesired side effects from this.

@pilleye
Copy link
Contributor Author

pilleye commented Sep 30, 2024

Ah, I missed those tests on the latest version, will fix tonight.

@pilleye
Copy link
Contributor Author

pilleye commented Oct 1, 2024

There seems to be a weird side-effect where we're narrowing Unbound | Unknown types to just Unknown instead of Unbound: https://github.com/pilleye/ruff/blob/pilleye/public-known/crates/red_knot_python_semantic/src/types/infer.rs#L5313

Misread the case

@carljm
Copy link
Contributor

carljm commented Oct 1, 2024

So one thing I'm realizing here is that we are using assert_public_ty in some tests where we really intend to be testing "what is the local type of this variable at the end of the scope" -- this was natural because those used to be the same thing, but in this PR we are definitively making them different.

So for tests like basic_for_loop, for example, what we should really do instead of using assert_public_ty is to add a reveal_type(x) as the last line in the test code, and then assert_file_diagnostics that we get the expected type revealed there, which will include Unbound. This will be a bit more verbose for now, but it'll get better in the upcoming test framework.

@pilleye
Copy link
Contributor Author

pilleye commented Oct 1, 2024

Working through a global-related bugs right now.

We seem to be incorrectly shadowing global builtins with Never here: https://github.com/pilleye/ruff/blob/pilleye/public-known/crates/red_knot_python_semantic/src/types/infer.rs#L4431

This is currently narrowed to Literal[1].

@pilleye pilleye force-pushed the pilleye/public-known branch 2 times, most recently from d1c8432 to 2acaeef Compare October 1, 2024 01:55
@carljm
Copy link
Contributor

carljm commented Oct 1, 2024

Ah ok, so that's going to be due to the use of global_symbol_ty in TypeInferenceBuilder::lookup_name. We need to use a function there that does union with unbound (if unbound is possible), so that we can replace unbound with the builtin definition. So the set of functions in types.rs: global_symbol_ty, symbol_ty, symbol_ty_by_name, symbol_ty_by_id are going to have to be split between the import case and the global-lookup-within-the-module case.

I would probably keep those functions as is, and add a new global_symbol_lookup function that a) still inserts Type::Unbound in case unbound is possible, and b) should also use only definitions, not declared types, even if there are declarations (so it's like just the second case in symbol_ty_by_id). (We may want a new test to verify (b) as well.) We may not even need global_symbol_lookup to be a function; I think the only callsite will be in TypeInferenceBuilder::lookup_name, so it can probably just be inlined there, since it shouldn't be more than a couple lines.

Let me know if any of that doesn't make sense! Thanks for working through this. I knew the first version of this PR was too simple to be true 😆

@pilleye
Copy link
Contributor Author

pilleye commented Oct 1, 2024

Haha the first version seemed suspiciously easy

I think I got the tests working, but definitely feels way jankier than I would want. Would appreciate comments!

I think trying to genericize symbol_ty_by_id adds unnecessary complexity, so I might just inline that code as you said.

Copy link
Contributor

@carljm carljm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! You did what I asked for yesterday, but I now realize that in some cases I asked for the wrong thing 😆 Sorry about that! I've talked this over with the team (by which I mean @AlexWaygood in this case) and I think we've got a clearer plan in mind now; let me know if any of the comments are not clear.

Thanks for all your work on this! You definitely didn't pick an easy first task :)

Comment on lines 3279 to +3369
// the type as seen from external modules (`Unknown`)
// is different from the type inside the module itself (`Unbound`):
assert_public_ty(&db, "src/package/foo.py", "x", "Unbound");
assert_public_ty(&db, "src/package/bar.py", "x", "Unknown");
assert_public_ty(&db, "src/package/foo.py", "x", "Never");
assert_public_ty(&db, "src/package/bar.py", "x", "Never");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is a behavior change we want. If you import a fully undeclared/undefined name, you should get a diagnostic, but the imported name should have type Unknown in this case. Otherwise you'll just get lots of useless cascading errors about doing invalid things with Never.

There is currently a special case in TypeInferenceBuilder::infer_import_from_definition where we transform Unbound to Unknown -- that's why this test exists. That special case becomes obsolete with this PR (we'll never get Unbound there anymore), but I think the special case should be updated to check for Never, emit a diagnostic (use self.add_diagnostic), and transform the Never to Unknown. (It's also fine if you want to just add a TODO for the diagnostic and we can add it in a later PR; we'll also want a similar diagnostic that's currently missing in infer_import_definition, for a nonexistent module.)

}

#[test]
fn resolve_unannotated_unbound_public() -> anyhow::Result<()> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: it's only maybe unbound

Suggested change
fn resolve_unannotated_unbound_public() -> anyhow::Result<()> {
fn resolve_unannotated_maybe_unbound_public() -> anyhow::Result<()> {

@@ -4158,6 +4159,40 @@ mod tests {
Ok(())
}

#[test]
fn resolve_unbound_public() -> anyhow::Result<()> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fn resolve_unbound_public() -> anyhow::Result<()> {
fn resolve_maybe_undeclared_maybe_unbound_public() -> anyhow::Result<()> {

Comment on lines +4412 to +4500
assert_public_ty(&db, "src/a.py", "r", "Literal[2]");
assert_public_ty(&db, "src/a.py", "s", "Literal[5]");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is one of those cases where part of the whole purpose of the test is actually to show that we know which names might be unbound, so we should really update the test such that it doesn't use (only)assert_public_ty, because it's really not the public type we want to assert on.

I earlier suggested an approach using reveal_type, and I think that's actually a fine option in the tests below where you use it.

I realized a simpler approach in some of these tests (particularly the global-scope ones) might be to keep these assert_public_ty but add another helper assert_may_be_unbound (which uses the use-def map directly) and explicitly assert here that r and s may be unbound. This way we actually get both the assertion on the public type, and an assertion that locally we do know the name might be unbound.

@@ -4516,7 +4551,7 @@ mod tests {
)
.unwrap();

assert_public_ty(&db, "src/a.py", "y", "Unbound | Literal[2, 3]");
assert_public_ty(&db, "src/a.py", "y", "Literal[2, 3]");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above, should have an assertion in some form that y can be unbound here

Comment on lines 5770 to 5858
// and the inferred type will just be `BaseExceptionGroup` --Alex
//
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment (including the two previous lines GH won't let me comment on) can be removed.

Comment on lines 5796 to 5884
// and the inferred type will just be `BaseExceptionGroup` --Alex
//
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And this one can be removed too

Comment on lines +103 to +104
/// Shorthand for `symbol_ty_by_id` that looks up a global symbol from the context of being in that
/// module.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Shorthand for `symbol_ty_by_id` that looks up a global symbol from the context of being in that
/// module.
/// Shorthand for `symbol_ty_by_id` that looks up a global symbol from the context of being in that
/// module, including fallback to globals if (possibly) not defined.

db,
global_scope(db, file),
symbol,
SymbolTableLookupContext::GlobalWithinModule,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I realized that a simpler approach that doesn't require threading this context through would be to just use symbol_ty_by_id (which can always use Never and never Unbound or Unknown as the unbound-ty), and then explicitly use the use-def map here to check directly whether the name may be undeclared and may be unbound, and then if so, look up the builtin type and union that with the symbol public type.

@@ -93,6 +100,24 @@ pub(crate) fn global_symbol_ty<'db>(db: &'db dyn Db, file: File, name: &str) ->
symbol_ty(db, global_scope(db, file), name)
}

/// Shorthand for `symbol_ty_by_id` that looks up a global symbol from the context of being in that
/// module.
pub(crate) fn global_symbol_lookup<'db>(db: &'db dyn Db, file: File, name: &str) -> Type<'db> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think it might be fine to inline this into TypeInferenceBuilder::lookup_name, since I don't think it will have any other call sites? If we do keep it as a separate function, then I think it should include the builtins-fallback part too; that's part of a "global symbol lookup".

Also I don't think it needs to be pub(crate).

Uses `Type::Never` instead of `Type::Unknown` for the case where
a publicly available variable is unbound. In the case where it is
unbound, we want a union of its actual type with `Never` instead of
`Unbound`, because the `Unbound` case will cause runtime error.
@pilleye pilleye marked this pull request as draft October 2, 2024 00:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
red-knot Multi-file analysis & type inference
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants