Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add merge() for tuples #57052

Closed
wants to merge 1 commit into from
Closed

add merge() for tuples #57052

wants to merge 1 commit into from

Conversation

aplavin
Copy link
Contributor

@aplavin aplavin commented Jan 15, 2025

merge() makes sense for any indexed collection, while it's currently only defined for dicts and namedtuples.

Here, I define it for regular tuples, with the same semantics:

julia> merge((a=1, b=2, c=3), (a=4, b=5))
(a = 4, b = 5, c = 3)

julia> merge((1,2,3), (4,5))
(4, 5, 3)

Potentially, the merge docstring should also be generalized to say that it merges all collections in the same way – merge(a, b)[i] == b[i] for i in keys(b), and a[i] for other i.

Will add tests and update docs if this addition is ok. Maybe also define for vectors?

@LilithHafner LilithHafner added needs tests Unit tests are required for this change feature Indicates new feature / enhancement requests labels Jan 15, 2025
@LilithHafner
Copy link
Member

merge currently only works on collections of key-value pairs, with semantics based on key, not index, when both keys and indices are present (as in NamedTuple).

julia> merge((a=1, b=2, c=3), (c=4, b=5))
(a = 1, b = 5, c = 4)

I appreciate the idea and the proposal but I do not think this method is consistent with the generic function's semantics. If there's a compelling usecase, perhaps we could find an alternative way to meet it.

@aplavin
Copy link
Contributor Author

aplavin commented Jan 15, 2025

Indeed, the keys/indices story is a bit of a mixup in Julia – but that's a general issue, not really specific to this PR.
All merge methods, including the proposed one for tuples, merge with respect to keys() of corresponding objects.keys are integers for tuples, and symbols for namedtuples.

@nsajko nsajko added the collections Data structures holding multiple items, e.g. sets label Jan 15, 2025
@@ -349,6 +349,9 @@ function _front(v, t...)
(v, _front(t...)...)
end


merge(a::Tuple, b::Tuple) = (b..., last(a, length(a) - length(b))...)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

julia> merge((), (7,))
ERROR: ArgumentError: Take length must be non-negative

Fix:

Suggested change
merge(a::Tuple, b::Tuple) = (b..., last(a, length(a) - length(b))...)
merge(a::Tuple, b::Tuple) = (b..., last(a, max(0, length(a) - length(b)))...)

@MasonProtter
Copy link
Contributor

MasonProtter commented Jan 15, 2025

I feel a deep discomfort with this method, but I must mention @LilithHafner that the NamedTuple case also violates the semantics you mention, since NamedTuples aren't really key-value pairs either. They're collections of values:

julia> nt = (a=1, b=2, c=3);

julia> [x for x in nt]
3-element Vector{Int64}:
 1
 2
 3

julia> Dict(nt)
ERROR: ArgumentError: AbstractDict(kv): kv needs to be an iterator of 2-tuples or pairs
Stacktrace:
 [1] _throw_dict_kv_error()
[...]

If the claim is that NamedTuples are key-value pairs because you can use keys, values, and pairs on them, then that's also true of Tuple.

All of that said, I'm rather happy with merge working on NamedTuple and rather uncomfortable with the idea of it working this way on Tuple. It just feels yucky and wrong, but I don't think there's really a principled argument against it...

@LilithHafner LilithHafner added the triage This should be discussed on a triage call label Jan 15, 2025
@MasonProtter
Copy link
Contributor

MasonProtter commented Jan 15, 2025

Maybe it'd help @aplavin if you could explain why you want this? Is it just because you feel it makes sense to exist, or is there a use-case for it? Perhaps a case-study of how this would be used would help with some of the apprehension people feel about it.

I'm struggling to picture what good this method could do, whereas I can imagine a lot of situations where I'd rather this threw a method error because I accidentally passed a Tuple but meant to pass a NamedTuple.

I also just think the meaning of this is likely to confuse users, even if it's actually a very simple and straightforward idea "from first principles".

@JeffreySarnoff
Copy link
Contributor

I do not want tuples merging by positional overwrite.

julia> nt1 = (a=1, b=2, c=3)
julia> nt2 = (b=4, g=5)

# this proposal would have

julia> merge(nt1, nt2)
(a = 1, b = 4, c = 3, g = 5)
julia> merge(values(n1), values(nt2))
(4, 5, 3)
julia> merge(values(nt2), values(nt1))
(1, 2, 3)

That seems less than ok.

@aplavin
Copy link
Contributor Author

aplavin commented Jan 16, 2025

Tbh, don't see how this behavior is significantly different from

julia> nt1 = (a=1, b=2, c=3)
julia> nt2 = (c=4, a=5)

julia> t1 = values(nt1)
julia> t2 = values(nt2)

julia> nt1[keys(nt2)]
(c = 3, a = 1)

julia> t1[keys(t2)]
(1, 2)

values(nt) explicitly ignores namedtuple names, that's the point of the function.

@LilithHafner
Copy link
Member

Triage agrees that this makes sense. However, all new features must have strong motivation (i.e. usecase where this is helpful) and we don't see that here.

@LilithHafner LilithHafner removed the triage This should be discussed on a triage call label Jan 16, 2025
@aplavin
Copy link
Contributor Author

aplavin commented Jan 16, 2025

all new features must have strong motivation

Interesting – didn't expect this to be universally required. I mean, it makes sense for adding new types/functions of course, but adding new methods for Base functions + Base types is different: this can only be done elsewhere with piracy.
Naturally, the bar should be somewhat lower, like "add all reasonable methods consistent with the generic function behavior".

My usecase is quite straightforward, I have a default set of options default_opts, and the user can override some of them by passing user_opts. As intuitively expected, user-provided options should take over the default for the same key.
Then, the final options are opts = merge(default_opts, user_opts). At least they should be – currently, it works when options are namedtuples but not when regular tuples.

@JeffBezanson
Copy link
Member

"add all reasonable methods consistent with the generic function behavior".

I guess the problem there is "reasonable" --- maybe sometimes a method is obvious, but here we did not see why a tuple would be used as a dictionary-like structure. Why would options be indexed positionally? Could you give some more details about the use case?

In any case, it is clear that we need to have a strong default preference for not adding things, or the bloat and maintenance load will just keep growing. This is needed to counter the natural ratchet effect of adding things usually not being breaking, but deleting things being breaking.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
collections Data structures holding multiple items, e.g. sets feature Indicates new feature / enhancement requests needs tests Unit tests are required for this change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants