Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove defaults #3

Merged
merged 13 commits into from
Jan 20, 2025
Merged

Conversation

johannahaffner
Copy link
Contributor

@johannahaffner johannahaffner commented Jan 17, 2025

As you can see, this is a tiny tweak.

However, I'm wondering if removing defaults is ...itself a reasonable default? In other words, if this should be an option that defaults to True, same as short_arrays. And if it should then be implemented for other Python types as well.
(For this reason, I have not updated the tests yet.)

PS: With this tweak, the optimistix documentation for IndirectLevenbergMarquardt looks like this

Screenshot 2025-01-17 at 18 35 29

instead of like this:

Screenshot 2025-01-17 at 18 41 27

x-ref: patrick-kidger/optimistix#109 (comment)

@johannahaffner johannahaffner marked this pull request as draft January 17, 2025 17:44
@patrick-kidger
Copy link
Owner

I think making this an option sounds reasonable!

if this should be an option that defaults to True, same as short_arrays

Yeah, much like short_arrays I think this is best enabled by default.

And if it should then be implemented for other Python types as well.

I'm not sure what you mean by this! Which other Python types might we also wish to adjust?

@johannahaffner
Copy link
Contributor Author

I'm not sure what you mean by this! Which other Python types might we also wish to adjust?

NamedTuples, for instance, can have default values as well IIRC. I would tend to not include that, but I wanted to check if there is something you have in mind that I don't.


I'll get to work then :)

@johannahaffner
Copy link
Contributor Author

johannahaffner commented Jan 17, 2025

By the way, just looking at this: why do we not support/special-case jaxtyping aliases that are defined as a union, like ScalarLike? I get that this would add a dependency, so do you think these would be better handled by customising the display where relevant?

For instance, I find the long enumeration of things that ScalarLike is an alias for doesn't add any clarity in the example above (lambda_0). Personally, I might find the existence of bool and numpy.bool rather confusing as part of this.

@patrick-kidger
Copy link
Owner

Good point about namedtuples. I feel like that for that we probably do want to include defaults, since they're more tuple than class? Not sure. I basically never use them, FWIW.

As for handling pretty unions: this was on my mind as well, but I didn't have an immediate nice way to handle it here. Two possible options:

  • Have downstream libraries wrap their FooScalarLike definitions withif getattr(typing, "GENERATING_DOCUMENTATION", True), and instead define it as a dummy object with a different repr if we are generating documentation.
  • Offer some mechanism here that jaxtyping could use (like __pdoc__) to unconditionally get a pretty repr all the time.

I definitely wouldn't try to handle it here, otherwise we'll pick up a circular dependency jaxtyping<>wadler_lindig!

@johannahaffner johannahaffner marked this pull request as ready for review January 18, 2025 12:15
@johannahaffner
Copy link
Contributor Author

johannahaffner commented Jan 18, 2025

Good point about namedtuples. I feel like that for that we probably do want to include defaults, since they're more tuple than class? Not sure. I basically never use them, FWIW.

I never use them either. They're used extensively in blackjax to define the sampler states, but I don't think they have default values in there. That is pretty much the only place I come across them. I opted to ignore these for now, if there is ever a request then we can add them.


As for the PR: showing defaults is now turned off by default, but fields that have default values will still be shown if that value has been tweaked. As a concrete example, if an optimistix solver/search/descent defaults to a two_norm, but a max_norm has been passed instead, the field will be shown in the display.

As for the unions: I'll get back to you on that!

wadler_lindig/_definitions.py Outdated Show resolved Hide resolved
wadler_lindig/_definitions.py Outdated Show resolved Hide resolved
wadler_lindig/_definitions.py Outdated Show resolved Hide resolved
@johannahaffner
Copy link
Contributor Author

All done :)

docs/requirements.txt Outdated Show resolved Hide resolved
Comment on lines 288 to 289
or field.default is dataclasses.MISSING
or field.default != getattr(obj, field.name)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think just the second check is needed here? If field.default is dataclasses.MISSING then we'll definitely also trigger the next check.

FWIW btw just a plain getattr isn't safe if the dataclass is only partially initialised:

@dataclass
class Foo:
    x: int
    
    def __init__(self):
         wl.pformat(self)
        self.x = 4

(Not sure if we have a test for this, I just know to look out for it.)
This is the reason for the _WithRepr("<uninitialised>") in the other getattr above. In particular this scenario arises when runtime-type-checking inputs to dataclass initialisation.

I'm also thinking that the != check may fail if we compare nonscalar numpy arrays, for which __eq__ is overridden. It has the potential to be slightly more verbose but I think an is check might be safer here.

Putting this together, maybe what we need here is:

uninitialised = _WithRepr("<uninitialised>")
objs = []
for field in dataclasses.fields(obj):
    if field.repr:
        value = getattr(obj, field.name, uninitialised)
        if not (kwargs["hide_defaults"] and value is field.default):
            objs.append((field.name.removeprefix(type_name), value))
objs = named_objs(objs, **kwargs)

Or something like that anyway.

This comment was marked as outdated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implemented, this does everything it should! I did not know about the getattr footgun, thanks for pointing it out!

@johannahaffner
Copy link
Contributor Author

Just saw that this does not catch the case where the default is only defined in __init__, like this

@dataclasses.dataclass
class Foo:
    array: np.ndarray
    def __init__(self, array: np.ndarray = np.ones((3, 4))):
         self.array = array

I'll figure this out.

@johannahaffner
Copy link
Contributor Author

Hm not sure if this can be done reliably. The parameter name in __init__ does not necessarily have to match the attribute name, and I think I would need to link those? I'm going to enjoy the sunshine a bit and come back to this :)

@johannahaffner
Copy link
Contributor Author

Do you have an idea for dealing with this? The only thing that I can come up with is:

  • get default values for fields
  • get signature of __init__ for the dataclass
  • get source of __init__ for the dataclass
  • map init args to fields
  • check values

and that seems very, very clunky!

@patrick-kidger
Copy link
Owner

I hope you enjoyed the sunshine today :)

Honestly, I'm not sure we can handle it. Someone could do completely arbitrary things inside __init__, after all!

Maybe there's an elegant way for the author of the dataclass to opt-in to this behaviour instead? For example something like this would still work (and could maybe still be improved upon):

_x_default = 4

@dataclass
class Foo:
  x: int = _x_default

  def __init__(self, x=_x_default):
    self.x = x

@johannahaffner
Copy link
Contributor Author

I think that makes sense. Should this be an info box in the documentation or something in the FAQ?

@johannahaffner
Copy link
Contributor Author

I hope you enjoyed the sunshine today :)

And thanks, I did!

@patrick-kidger
Copy link
Owner

I'm not sure how common this is at the moment / if this is something we should advertise.

I'm guessing you're thinking about this because you found some use cases in the Equinox ecosystem that would make this useful?

@johannahaffner
Copy link
Contributor Author

johannahaffner commented Jan 18, 2025

I'm guessing you're thinking about this because you found some use cases in the Equinox ecosystem that would make this useful?

In optimistix, defaults for the norms are defined in this way at the solver level, and verbose options too.
For searches and descents, default values are usually defined at the level of the fields.

Edit: e.g. here:

https://github.com/patrick-kidger/optimistix/blob/dcafc480d4a2bb9ec3b62658d5ae136db847b2d8/optimistix/_solver/bfgs.py#L335

@johannahaffner
Copy link
Contributor Author

I think the most pragmatic way forward here is to do our own workarounds in optimistix where helpful, and leave this thread here as a minimal documentation of the issue?

@patrick-kidger
Copy link
Owner

Agreed. We could probably write an eqxi decorator that copies defaults from __init__ into the field, so that this works. Maybe even just during docgen. And it's on us as users to make sure that method arguments and dataclass fields share a name.

@patrick-kidger patrick-kidger merged commit f1fbe39 into patrick-kidger:main Jan 20, 2025
2 checks passed
@johannahaffner johannahaffner deleted the remove-defaults branch January 20, 2025 11:40
@johannahaffner
Copy link
Contributor Author

Agreed on the second point, and I'll put the first point on my list of not-super-urgent TODOs :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants