Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[red-knot] don't include Unknown in the type for a conditionally-defined import #13563

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
65 changes: 45 additions & 20 deletions crates/red_knot_python_semantic/src/types.rs
Original file line number Diff line number Diff line change
Expand Up @@ -43,48 +43,55 @@ pub fn check_types(db: &dyn Db, file: File) -> TypeCheckDiagnostics {
diagnostics
}

#[derive(Clone, Copy, Eq, PartialEq, Debug)]
enum SymbolTableLookupContext {
PublicImport,
GlobalWithinModule,
}

/// Infer the public type of a symbol (its type as seen from outside its scope).
fn symbol_ty_by_id<'db>(db: &'db dyn Db, scope: ScopeId<'db>, symbol: ScopedSymbolId) -> Type<'db> {
fn symbol_ty_by_id<'db>(
db: &'db dyn Db,
scope: ScopeId<'db>,
symbol: ScopedSymbolId,
context: SymbolTableLookupContext,
) -> Type<'db> {
let _span = tracing::trace_span!("symbol_ty_by_id", ?symbol).entered();

let use_def = use_def_map(db, scope);

// If the symbol is declared, the public type is based on declarations; otherwise, it's based
// on inference from bindings.
if use_def.has_public_declarations(symbol) {
let unbound_ty = use_def
.public_may_be_unbound(symbol)
.then_some(match context {
SymbolTableLookupContext::PublicImport => Type::Never,
SymbolTableLookupContext::GlobalWithinModule => Type::Unbound,
});

// If we want to try to use declarations and the symbol is declared, the public type is based on
// declarations; otherwise, it's based on inference from bindings.
if context == SymbolTableLookupContext::PublicImport && use_def.has_public_declarations(symbol)
{
let declarations = use_def.public_declarations(symbol);
// If the symbol is undeclared in some paths, include the inferred type in the public type.
let undeclared_ty = if declarations.may_be_undeclared() {
Some(bindings_ty(
db,
use_def.public_bindings(symbol),
use_def
.public_may_be_unbound(symbol)
.then_some(Type::Unknown),
))
Some(bindings_ty(db, use_def.public_bindings(symbol), unbound_ty))
} else {
None
};
// Intentionally ignore conflicting declared types; that's not our problem, it's the
// problem of the module we are importing from.
declarations_ty(db, declarations, undeclared_ty).unwrap_or_else(|(ty, _)| ty)
} else {
bindings_ty(
db,
use_def.public_bindings(symbol),
use_def
.public_may_be_unbound(symbol)
.then_some(Type::Unbound),
)
bindings_ty(db, use_def.public_bindings(symbol), unbound_ty)
}
}

/// Shorthand for `symbol_ty` that takes a symbol name instead of an ID.
/// Shorthand for `symbol_ty_by_id` that takes a symbol name instead of an ID.
fn symbol_ty<'db>(db: &'db dyn Db, scope: ScopeId<'db>, name: &str) -> Type<'db> {
let table = symbol_table(db, scope);
table
.symbol_id_by_name(name)
.map(|symbol| symbol_ty_by_id(db, scope, symbol))
.map(|symbol| symbol_ty_by_id(db, scope, symbol, SymbolTableLookupContext::PublicImport))
.unwrap_or(Type::Unbound)
}

Expand All @@ -93,6 +100,24 @@ pub(crate) fn global_symbol_ty<'db>(db: &'db dyn Db, file: File, name: &str) ->
symbol_ty(db, global_scope(db, file), name)
}

/// Shorthand for `symbol_ty_by_id` that looks up a global symbol from the context of being in that
/// module.
Comment on lines +103 to +104
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Shorthand for `symbol_ty_by_id` that looks up a global symbol from the context of being in that
/// module.
/// Shorthand for `symbol_ty_by_id` that looks up a global symbol from the context of being in that
/// module, including fallback to globals if (possibly) not defined.

pub(crate) fn global_symbol_lookup<'db>(db: &'db dyn Db, file: File, name: &str) -> Type<'db> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think it might be fine to inline this into TypeInferenceBuilder::lookup_name, since I don't think it will have any other call sites? If we do keep it as a separate function, then I think it should include the builtins-fallback part too; that's part of a "global symbol lookup".

Also I don't think it needs to be pub(crate).

let table = symbol_table(db, global_scope(db, file));

table
.symbol_id_by_name(name)
.map(|symbol| {
symbol_ty_by_id(
db,
global_scope(db, file),
symbol,
SymbolTableLookupContext::GlobalWithinModule,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I realized that a simpler approach that doesn't require threading this context through would be to just use symbol_ty_by_id (which can always use Never and never Unbound or Unknown as the unbound-ty), and then explicitly use the use-def map here to check directly whether the name may be undeclared and may be unbound, and then if so, look up the builtin type and union that with the symbol public type.

)
})
.unwrap_or_else(|| Type::Unbound)
}

/// Infer the type of a binding.
pub(crate) fn binding_ty<'db>(db: &'db dyn Db, definition: Definition<'db>) -> Type<'db> {
let inference = infer_definition_types(db, definition);
Expand Down
107 changes: 76 additions & 31 deletions crates/red_knot_python_semantic/src/types/infer.rs
Original file line number Diff line number Diff line change
Expand Up @@ -49,7 +49,7 @@ use crate::semantic_index::SemanticIndex;
use crate::stdlib::builtins_module_scope;
use crate::types::diagnostic::{TypeCheckDiagnostic, TypeCheckDiagnostics};
use crate::types::{
bindings_ty, builtins_symbol_ty, declarations_ty, global_symbol_ty, symbol_ty,
bindings_ty, builtins_symbol_ty, declarations_ty, global_symbol_lookup, symbol_ty,
typing_extensions_symbol_ty, BytesLiteralType, ClassType, FunctionKind, FunctionType,
StringLiteralType, Truthiness, TupleType, Type, TypeArrayDisplay, UnionType,
};
Expand Down Expand Up @@ -2210,7 +2210,7 @@ impl<'db> TypeInferenceBuilder<'db> {
let ty = if file_scope_id.is_global() {
Type::Unbound
} else {
global_symbol_ty(self.db, self.file, name)
global_symbol_lookup(self.db, self.file, name)
};
// Fallback to builtins (without infinite recursion if we're already in builtins.)
if ty.may_be_unbound(self.db) && Some(self.scope) != builtins_module_scope(self.db) {
Expand Down Expand Up @@ -3016,6 +3016,7 @@ mod tests {
use ruff_db::system::{DbWithTestSystem, SystemPathBuf};
use ruff_db::testing::assert_function_query_was_not_run;
use ruff_python_ast::name::Name;
use test_case::test_case;

use super::TypeInferenceBuilder;

Expand Down Expand Up @@ -3353,7 +3354,7 @@ mod tests {
}

#[test]
fn imported_unbound_symbol_is_unknown() -> anyhow::Result<()> {
fn imported_unbound_symbol_is_never() -> anyhow::Result<()> {
let mut db = setup_db();

db.write_files([
Expand All @@ -3364,8 +3365,8 @@ mod tests {

// the type as seen from external modules (`Unknown`)
// is different from the type inside the module itself (`Unbound`):
assert_public_ty(&db, "src/package/foo.py", "x", "Unbound");
assert_public_ty(&db, "src/package/bar.py", "x", "Unknown");
assert_public_ty(&db, "src/package/foo.py", "x", "Never");
assert_public_ty(&db, "src/package/bar.py", "x", "Never");
Comment on lines 3366 to +3369
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is a behavior change we want. If you import a fully undeclared/undefined name, you should get a diagnostic, but the imported name should have type Unknown in this case. Otherwise you'll just get lots of useless cascading errors about doing invalid things with Never.

There is currently a special case in TypeInferenceBuilder::infer_import_from_definition where we transform Unbound to Unknown -- that's why this test exists. That special case becomes obsolete with this PR (we'll never get Unbound there anymore), but I think the special case should be updated to check for Never, emit a diagnostic (use self.add_diagnostic), and transform the Never to Unknown. (It's also fine if you want to just add a TODO for the diagnostic and we can add it in a later PR; we'll also want a similar diagnostic that's currently missing in infer_import_definition, for a nonexistent module.)


Ok(())
}
Expand Down Expand Up @@ -3913,7 +3914,7 @@ mod tests {
)?;

// TODO: sys.version_info, and need to understand @final and @type_check_only
assert_public_ty(&db, "src/a.py", "x", "Unknown | EllipsisType");
assert_public_ty(&db, "src/a.py", "x", "EllipsisType | Unknown");

Ok(())
}
Expand Down Expand Up @@ -4245,6 +4246,40 @@ mod tests {
Ok(())
}

#[test]
fn resolve_unbound_public() -> anyhow::Result<()> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fn resolve_unbound_public() -> anyhow::Result<()> {
fn resolve_maybe_undeclared_maybe_unbound_public() -> anyhow::Result<()> {

let mut db = setup_db();

db.write_dedented(
"src/a.py",
"
if flag:
x: int = 1
",
)?;

assert_public_ty(&db, "src/a.py", "x", "int");

Ok(())
}

#[test]
fn resolve_unannotated_unbound_public() -> anyhow::Result<()> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: it's only maybe unbound

Suggested change
fn resolve_unannotated_unbound_public() -> anyhow::Result<()> {
fn resolve_unannotated_maybe_unbound_public() -> anyhow::Result<()> {

let mut db = setup_db();

db.write_dedented(
"src/a.py",
"
if flag:
x = 1
",
)?;

assert_public_ty(&db, "src/a.py", "x", "Literal[1]");

Ok(())
}

#[test]
fn literal_int_arithmetic() -> anyhow::Result<()> {
let mut db = setup_db();
Expand Down Expand Up @@ -4461,8 +4496,8 @@ mod tests {
)?;

assert_public_ty(&db, "src/a.py", "x", "Literal[3, 4, 5]");
assert_public_ty(&db, "src/a.py", "r", "Unbound | Literal[2]");
assert_public_ty(&db, "src/a.py", "s", "Unbound | Literal[5]");
assert_public_ty(&db, "src/a.py", "r", "Literal[2]");
assert_public_ty(&db, "src/a.py", "s", "Literal[5]");
Comment on lines +4499 to +4500
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is one of those cases where part of the whole purpose of the test is actually to show that we know which names might be unbound, so we should really update the test such that it doesn't use (only)assert_public_ty, because it's really not the public type we want to assert on.

I earlier suggested an approach using reveal_type, and I think that's actually a fine option in the tests below where you use it.

I realized a simpler approach in some of these tests (particularly the global-scope ones) might be to keep these assert_public_ty but add another helper assert_may_be_unbound (which uses the use-def map directly) and explicitly assert here that r and s may be unbound. This way we actually get both the assertion on the public type, and an assertion that locally we do know the name might be unbound.

Ok(())
}

Expand Down Expand Up @@ -4603,7 +4638,7 @@ mod tests {
)
.unwrap();

assert_public_ty(&db, "src/a.py", "y", "Unbound | Literal[2, 3]");
assert_public_ty(&db, "src/a.py", "y", "Literal[2, 3]");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above, should have an assertion in some form that y can be unbound here

}

#[test]
Expand Down Expand Up @@ -4719,24 +4754,29 @@ mod tests {
let y_ty = symbol_ty(&db, function_scope, "y");
let x_ty = symbol_ty(&db, function_scope, "x");

assert_eq!(x_ty.display(&db).to_string(), "Unbound");
assert_eq!(x_ty.display(&db).to_string(), "Never");
assert_eq!(y_ty.display(&db).to_string(), "Literal[1]");

Ok(())
}

#[test]
fn conditionally_global_or_builtin() -> anyhow::Result<()> {
#[test_case("")]
// Tests that we only use the definition of a symbol instead of its declaration when we are
// checking module globals without a nonlocal binding.
Comment on lines +4764 to +4765
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So on further thought and discussion with the team, I think I pointed you in the wrong direction here yesterday; sorry about that! I think we should always use the same logic, respecting both declarations and definitions. (Maybe we will want to consider using only definitions for a not-global nonlocal reference, but that's not a change we should make in this PR.)

I still think the test case you added here is useful! But I would want the assertion to change such that when you add the annotation, you get Literal[copyright] | int instead of Literal[copyright] | Literal[1].

And if the assertion changes, it's probably clearer to be a fully separate test rather than a #[test_case]?

#[test_case(": int"; "with a declaration")]
fn conditionally_global_or_builtin(annotation: &'static str) -> anyhow::Result<()> {
let mut db = setup_db();

db.write_dedented(
"/src/a.py",
"
if flag:
copyright = 1
def f():
y = copyright
&format!(
"
if flag:
copyright {annotation} = 1
def f():
y = copyright
",
),
)?;

let file = system_path_to_file(&db, "src/a.py").expect("Expected file to exist.");
Expand Down Expand Up @@ -4784,7 +4824,7 @@ mod tests {
let y_ty = symbol_ty(&db, class_scope, "y");
let x_ty = symbol_ty(&db, class_scope, "x");

assert_eq!(x_ty.display(&db).to_string(), "Unbound | Literal[2]");
assert_eq!(x_ty.display(&db).to_string(), "Literal[2]");
assert_eq!(y_ty.display(&db).to_string(), "Literal[1]");

Ok(())
Expand Down Expand Up @@ -5447,7 +5487,7 @@ mod tests {
",
)?;

assert_public_ty(&db, "src/a.py", "x", "Unbound | int");
assert_public_ty(&db, "src/a.py", "x", "int");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

another case where it's important that we assert on may-be-unbound


Ok(())
}
Expand Down Expand Up @@ -5553,7 +5593,7 @@ mod tests {
",
)?;

assert_public_ty(&db, "src/a.py", "x", "Unbound | int");
assert_public_ty(&db, "src/a.py", "x", "int");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and another


Ok(())
}
Expand All @@ -5567,6 +5607,8 @@ mod tests {
db.write_dedented(
"src/a.py",
"
from typing import reveal_type

async def foo():
class Iterator:
def __next__(self) -> int:
Expand All @@ -5578,12 +5620,14 @@ mod tests {

async for x in Iterator():
pass

reveal_type(x)
",
)?;

// We currently return `Todo` for all `async for` loops,
// including loops that have invalid syntax
assert_scope_ty(&db, "src/a.py", &["foo"], "x", "Unbound | @Todo");
assert_file_diagnostics(&db, "src/a.py", &["Revealed type is 'Unbound | @Todo'."]);

Ok(())
}
Expand All @@ -5595,6 +5639,8 @@ mod tests {
db.write_dedented(
"src/a.py",
"
from typing import reveal_type

async def foo():
class IntAsyncIterator:
async def __anext__(self) -> int:
Expand All @@ -5606,11 +5652,13 @@ mod tests {

async for x in IntAsyncIterable():
pass

reveal_type(x)
",
)?;

// TODO(Alex) async iterables/iterators!
assert_scope_ty(&db, "src/a.py", &["foo"], "x", "Unbound | @Todo");
assert_file_diagnostics(&db, "src/a.py", &["Revealed type is 'Unbound | @Todo'."]);

Ok(())
}
Expand All @@ -5631,7 +5679,7 @@ mod tests {
&db,
"src/a.py",
"x",
r#"Unbound | Literal[1] | Literal["a"] | Literal[b"foo"]"#,
r#"Literal[1] | Literal["a"] | Literal[b"foo"]"#,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should assert on maybe-unbound here

);

Ok(())
Expand Down Expand Up @@ -5660,7 +5708,7 @@ mod tests {
"src/a.py",
&["Object of type 'NotIterable' is not iterable."],
);
assert_public_ty(&db, "src/a.py", "x", "Unbound | Unknown");
assert_public_ty(&db, "src/a.py", "x", "Unknown");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and here


Ok(())
}
Expand Down Expand Up @@ -5783,10 +5831,7 @@ mod tests {

assert_file_diagnostics(&db, "src/a.py", &[]);

// TODO: once we support `sys.version_info` branches,
// we can set `--target-version=py311` in this test
// and the inferred type will just be `BaseExceptionGroup` --Alex
assert_public_ty(&db, "src/a.py", "e", "Unknown | BaseExceptionGroup");
assert_public_ty(&db, "src/a.py", "e", "BaseExceptionGroup");

Ok(())
}
Expand All @@ -5812,7 +5857,7 @@ mod tests {
// and the inferred type will just be `BaseExceptionGroup` --Alex
//
Comment on lines 5857 to 5858
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment (including the two previous lines GH won't let me comment on) can be removed.

// TODO more precise would be `ExceptionGroup[OSError]` --Alex
assert_public_ty(&db, "src/a.py", "e", "Unknown | BaseExceptionGroup");
assert_public_ty(&db, "src/a.py", "e", "BaseExceptionGroup");

Ok(())
}
Expand All @@ -5838,7 +5883,7 @@ mod tests {
// and the inferred type will just be `BaseExceptionGroup` --Alex
//
Comment on lines 5883 to 5884
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And this one can be removed too

// TODO more precise would be `ExceptionGroup[TypeError | AttributeError]` --Alex
assert_public_ty(&db, "src/a.py", "e", "Unknown | BaseExceptionGroup");
assert_public_ty(&db, "src/a.py", "e", "BaseExceptionGroup");

Ok(())
}
Expand Down Expand Up @@ -6101,7 +6146,7 @@ mod tests {
",
)?;

assert_scope_ty(&db, "src/a.py", &["foo", "<listcomp>"], "z", "Unbound");
assert_scope_ty(&db, "src/a.py", &["foo", "<listcomp>"], "z", "Never");

// (There is a diagnostic for invalid syntax that's emitted, but it's not listed by `assert_file_diagnostics`)
assert_file_diagnostics(&db, "src/a.py", &[]);
Expand Down
Loading