Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[red-knot] more ergonomic and efficient handling of known builtin classes #13615

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

Slyces
Copy link
Contributor

@Slyces Slyces commented Oct 3, 2024

Summary

This PR introduces a new enumeration BuiltinType that serves 2 purposes:

  • df16f21: give a common syntax for the convenience shortcuts to get a builtin instance
    • For that purpose, the enum doesn't need to be exhaustive - just to have the types we often need
  • c98ce74: save if a class is a builtin on creation, to save time on call when we need to check for custom behaviour (str(...), bool(...), ...)

I think for the first purpose that's mainly a syntax preference, you should be able to tell fairly fast if you prefer it that way.
For the second, that's mainly an optimisation that we might need once we handle more specific behaviour for builtin used as callable - this could clearly wait until we implement more of those.

Test Plan

No tests were added, but the existing test suite is enough to check if this introduced a regression.

This enumeration allows us to have shorter syntax for very common
builtin types. This mainly allows to gather convenience methods under
one common syntax (one place to look for).
On `ClassType` creation, check if the class is a common builtin and save
that information for later use during inference. This will allow us to
save time on every call for builtin class/functions (e.g.
`str`,`int`,`float`,`bool`, ...)
Copy link
Contributor

github-actions bot commented Oct 3, 2024

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Copy link
Contributor

@carljm carljm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great as a general direction! A few nits and thoughts on naming/API.

/// Non-exhaustive enumeration of builtin types to allow for easier syntax when interacting with
/// the most common builtin types (e.g. int, str, ...).
///
/// Feel free to expend this enum if you ever find yourself using the same builtin type in multiple
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Feel free to expend this enum if you ever find yourself using the same builtin type in multiple
/// Feel free to expand this enum if you ever find yourself using the same builtin type in multiple

}

pub fn to_instance(&self, db: &'db dyn Db) -> Type<'db> {
builtins_symbol_ty(db, self.as_str()).to_instance(db)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe

Suggested change
builtins_symbol_ty(db, self.as_str()).to_instance(db)
self.to_class(db).to_instance(db)

@@ -424,27 +416,35 @@ impl<'db> Type<'db> {
(Type::Never, _) => true,
(_, Type::Never) => false,
(Type::IntLiteral(_), Type::Instance(class))
if class.is_stdlib_symbol(db, "builtins", "int") =>
if matches!(class.is_builtin(db), Some(BuiltinType::Int)) =>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a method on ClassType to make this less verbose? I would probably even call it is_builtin:

Suggested change
if matches!(class.is_builtin(db), Some(BuiltinType::Int)) =>
if class.is_builtin(db, BuiltinType::Int) =>

And then rename the is_builtin field to just builtin; is_ prefix suggests a boolean, not an optional enum.

@@ -1220,9 +1279,26 @@ pub struct ClassType<'db> {
definition: Definition<'db>,

body_scope: ScopeId<'db>,

is_builtin: Option<BuiltinType>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned above, I'd name this just builtin -- is_ prefix suggests a boolean.

/// Feel free to expend this enum if you ever find yourself using the same builtin type in multiple
/// places.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum BuiltinType {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming nit: let's rename this to BuiltinClass. This actually only handles classes, not all kinds of types (functions are types too).

Speaking of which, I kinda want to unify this PR and FunctionKind, in terms of both naming and API. The one trick there is that we want to represent known functions that aren't actually builtins (e.g. reveal_type). The same may be true in future for other stdlib classes!

So maybe ultimately what we'll actually want is KnownClass and KnownFunction enums, and a known field (which is an Option<KnownClass> or Option<KnownFunction> - I like this Option approach better than FunctionKind::Ordinary), and is_known method, on both ClassType and FunctionType. But this would require generalizing what you have in this PR to not assume builtins module, and instead classify the known functions/classes by module as well. I would be happy with doing all of this now, in this PR, or waiting on it if you prefer to land a simpler version of this PR for now.

cc @AlexWaygood in case you hate my naming/API preferences :)

Copy link
Member

@AlexWaygood AlexWaygood Oct 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I considered the pros and cons of Option<KnownFunction> vs having FunctionKind::Ordinary while working on my previous PR. I ended up going with FunctionKind::Ordinary because I have a general preference having a flat list of possible states rather than representing possible states using an enum of enums. It makes it easier to count exactly how many possible states there are, and giving the default state a name (Ordinary) rather than using None makes it abundantly clear to new readers of the code what the default state represents. It's also more ergonomic to pattern-match on a flat list rather than a nested enum.

I don't feel too strongly, but a flat enum would be my preference :)

Copy link
Contributor

@carljm carljm Oct 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think clearer naming is the primary reason I prefer Option<KnownFunction>. I don't like the name FunctionKind because it is so unnecessarily generic -- it doesn't clarify anything at all about what characteristic(s) of the function we are actually talking about. What happens when we later need to categorize functions according to some totally different cross-cutting categorization?

It seems clearer to me to have known = Some(KnownFunction) mean it's a known function (and then the enum specifies which one), and known = None to mean "not a known function." I suppose we could still use the KnownFunction name and have KnownFunction::NotKnown or KnownFunction::None -- it just seems weird to me to have a variant of KnownFunction that means... not a known function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're definitely right that FunctionKind is not a great name. And I think you're right that renaming it to KnownFunction addresses some of the concern I had that it wouldn't be obvious what the default state signified. It's pretty obvious that None indicates that it's not a known function!

I'm still not a massive fan of nested enums, but I guess in this case it's okay. Feel free to proceed :)

@carljm carljm added the red-knot Multi-file analysis & type inference label Oct 4, 2024
@carljm carljm changed the title Feat/builtins enum [red-knot] more ergonomic and efficient handling of known builtin classes Oct 4, 2024
}

impl<'db> ClassType<'db> {
/// Find if a class is a builtin type.
pub fn maybe_builtin(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like the name of this method currently because it implies that if it returns None the class is not a builtin. But that's not necessarily true, since our list of known builtins isn't exhaustive. I'd prefer maybe_known_builtin. Or just maybe_known if we go with the broader refactor/rename to handle any known class, even if it's not a builtin.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
red-knot Multi-file analysis & type inference
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants