[WIP] Forbid object lifetime changing pointer casts #136776

Draft, wants to merge 1 commit into base: master

BoxyUwU
Member

@BoxyUwU BoxyUwU commented Feb 9, 2025

Fixes #136702

r? @ghost

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Feb 9, 2025
@BoxyUwU
Member Author

BoxyUwU commented Feb 9, 2025

@bors try

bors added a commit to rust-lang-ci/rust that referenced this pull request Feb 9, 2025
…, r=<try>

[WIP] Forbid object lifetime changing pointer casts

Fixes rust-lang#136702

r? `@ghost`
@bors
Contributor

bors commented Feb 9, 2025

⌛ Trying commit d5ebeac with merge 44f3504...

@bors
Contributor

bors commented Feb 9, 2025

☀️ Try build successful - checks-actions
Build commit: 44f3504 (44f3504e96c944ae54fc72b5f5008f53f7eda001)

@BoxyUwU
Member Author

BoxyUwU commented Feb 9, 2025

@craterbot check

@craterbot
Collaborator

👌 Experiment pr-136776 created and queued.
🤖 Automatically detected try build 44f3504
🔍 You can check out the queue and this experiment's details.

ℹ️ Crater is a tool to run experiments across parts of the Rust ecosystem. Learn more

@craterbot craterbot added S-waiting-on-crater Status: Waiting on a crater run to be completed. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 9, 2025
@craterbot
Collaborator

🚧 Experiment pr-136776 is now running

@craterbot
Collaborator

🎉 Experiment pr-136776 is completed!
📊 169 regressed and 4 fixed (580506 total)
📰 Open the full report.

⚠️ If you notice any spurious failure please add them to the denylist!

@craterbot craterbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-crater Status: Waiting on a crater run to be completed. labels Feb 11, 2025
@RalfJung
Member

RalfJung commented Feb 11, 2025

Most of these are on github; in terms of crates.io regressions all we have is:

  • may
  • a bunch of crates using metrics, see e.g. this (for metrics-0.23) or this (for metrics-0.24, the latest version)

Overall, 142 regressions are caused by metrics and 14 by may; if we can get fixed versions of those crates out, that seems to mostly cover it.

EDIT: Ah, there's also cogo.

@traviscross traviscross added the I-lang-nominated Nominated for discussion during a lang team meeting. label Feb 12, 2025
@traviscross
Contributor

We discussed this in the lang triage call today. We wanted to think more about it, so we're leaving it nominated to discuss again.

@tmandry
Member

tmandry commented Feb 19, 2025

@BoxyUwU Do you think it would be possible to implement this as an FCW? We talked about this in lang triage today and would prefer to start with that if we can. If it's not feasible, a hard error can also work (I would say though that we should upstream PRs to any crates we break).

Another small thing I noticed is that the error message links to the Nomicon section on variance, but it would be ideal to link to a tracking issue or something describing this issue in particular.

@traviscross
Contributor

traviscross commented Feb 19, 2025

To add on to what tmandry said, in our discussions we did feel that the approach taken in this PR is generally the right way forward, and we're happy to see this progress so as to help clear the way for arbitrary_self_types and derive_coerce_pointee.

cc @rust-lang/lang

@BoxyUwU
Member Author

BoxyUwU commented Feb 26, 2025

@tmandry I do expect it to be possible to FCW this. We can likely do something hacky to fully emulate the fix (but as a lint). If that doesn't work out, all the regressions we found were relatively "simple" cases, which we could probably take advantage of (if need be) to lint a subset of the actual cases we'd break with this PR.

edit: see compiler-errors' comment, I'm not so convinced this will be possible to FCW anymore and will likely investigate improving the diagnostics here. I've already filed PRs to the affected crates to migrate them over to a transmute to avoid the breakage if this lands
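For readers following along, here is a minimal sketch of what such a migration might look like. The trait `Greet`, the type `Borrowed`, and the helper `erase_lifetime` are hypothetical names made up for illustration; they are not taken from any of the affected crates. The sketch assumes the old code widened the object lifetime bound with an `as` cast and that a `transmute` between the two fat-pointer types is the replacement.

```rust
use std::mem;

trait Greet {
    fn greet(&self) -> String;
}

// A type that borrows data, so `dyn Greet` objects made from it are not `'static`.
struct Borrowed<'a>(&'a str);

impl<'a> Greet for Borrowed<'a> {
    fn greet(&self) -> String {
        format!("hello, {}", self.0)
    }
}

/// Widens the object lifetime bound of a raw trait-object pointer.
///
/// # Safety
/// The caller must never use the returned pointer after `'a` has ended.
unsafe fn erase_lifetime<'a>(ptr: *const (dyn Greet + 'a)) -> *const (dyn Greet + 'static) {
    // Previously (hypothetical): `ptr as *const (dyn Greet + 'static)`,
    // the kind of cast this PR rejects.
    unsafe { mem::transmute::<*const (dyn Greet + 'a), *const (dyn Greet + 'static)>(ptr) }
}

fn demo() -> String {
    let owner = String::from("crater");
    let value = Borrowed(&owner);
    let short: &dyn Greet = &value;
    // SAFETY: `erased` is only dereferenced while `owner` and `value` are alive.
    let erased = unsafe { erase_lifetime(short) };
    let obj: &dyn Greet = unsafe { &*erased };
    obj.greet()
}
```

The `transmute` does the same thing the cast did, but it has to sit in an `unsafe` block with its own justification, which is arguably the point of the change.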

@compiler-errors
Member

I was thinking earlier that it may be possible to implement a lint to detect this, but it seems to me that MIR borrowck is not equipped to implement such a lint.

Specifically, it seems near impossible to answer whether a region outlives constraint (like, 'a: 'b) would not hold in a way that doesn't actually commit to that constraint, at least not without tons of false positives based on how NLL computes lower bounds for all of the regions it deals with in the MIR.

To fix this would require some significant engineering effort to refactor how NLL processes its region graph to make it easier to clone and reprocess with new constraints.

workingjubilee added a commit to workingjubilee/rustc that referenced this pull request Mar 4, 2025
…uto_to_object-hard-error, r=oli-obk

Make `ptr_cast_add_auto_to_object` lint into hard error

In Rust 1.81, we added a FCW lint (including linting in dependencies) against pointer casts that add an auto trait to dyn bounds.  This was part of work making casts of pointers involving trait objects stricter, and was part of the work needed to restabilize trait upcasting.

We considered just making this a hard error, but opted against it at that time due to breakage found by crater.  This breakage was mostly due to the `anymap` crate which has been a persistent problem for us.

It's now a year later, and the fact that this is not yet a hard error is giving us pause about stabilizing arbitrary self types and `derive(CoercePointee)`.  So let's see about making a hard error of this.

r? ghost

cc `@adetaylor` `@Darksonn` `@BoxyUwU` `@RalfJung` `@compiler-errors` `@oli-obk` `@WaffleLapkin`

Related:

- rust-lang#135881
- rust-lang#136702
- rust-lang#136776

Tracking:

- rust-lang#127323
- rust-lang#44874
- rust-lang#123430
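To illustrate the cast direction that commit message is about, here is a small sketch. The function names `drop_send` and `add_send` are hypothetical. It assumes, as the message describes, that only *adding* an auto trait through a pointer cast is rejected, while removing one stays an ordinary cast.

```rust
use std::fmt::Debug;

// Removing an auto trait from the object bounds is still an ordinary cast.
fn drop_send<'a>(ptr: *const (dyn Debug + Send + 'a)) -> *const (dyn Debug + 'a) {
    ptr as *const (dyn Debug + 'a)
}

// The opposite direction is what the lint, and now the hard error, targets:
//
//     fn add_send<'a>(ptr: *const (dyn Debug + 'a)) -> *const (dyn Debug + Send + 'a) {
//         ptr as *const (dyn Debug + Send + 'a) // rejected
//     }

fn demo() -> String {
    let value: i32 = 7;
    let wide: &(dyn Debug + Send) = &value;
    let narrow = drop_send(wide);
    // SAFETY: `narrow` points at `value`, which is still alive here.
    let obj: &dyn Debug = unsafe { &*narrow };
    format!("{:?}", obj)
}
```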
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Mar 5, 2025
Rollup merge of rust-lang#136764 - traviscross:TC/make-ptr_cast_add_auto_to_object-hard-error, r=oli-obk

@BoxyUwU BoxyUwU added the I-lang-easy-decision Issue: The decision needed by the team is conjectured to be easy; this does not imply nomination label Mar 13, 2025
@BoxyUwU
Member Author

BoxyUwU commented Mar 13, 2025

Hi @rust-lang/lang I'm "re nominating" this (even though it was already nominated lol). I'm not asking for a decision as to whether the breakage here is acceptable/okay, instead I'd like to see if lang is fine delegating the decision here to the types team or if y'all would like to be kept in the loop here and be included on an FCP here (which I think would be perfectly fine :-)).

From my POV lang already FCP'd the decision to require VTables on raw pointers to be valid so this is effectively just enforcing that as we should have been and as such could be seen as "just" a small tweak to the implementation of the type system which historically the types team has had autonomy over.

@traviscross
Contributor

traviscross commented Mar 13, 2025

@rustbot label -I-lang-easy-decision

We've been discussing this nomination near the start of every meeting to track status and see what our options might be due to what this is blocking. Niko has been checking in with people every week.

What Niko had discussed later yesterday in the RfL call on this is that the best option might be to go to a hard error but with targeted diagnostics that would detect this and issue a suggested code change.

My own estimation is that this is going to remain a lang + types matter, given our interest in this and given that there's likely to be non-trivial breakage here, and that we're likely to have lang views on what represents an acceptable user experience for this migration.

@rustbot rustbot removed the I-lang-easy-decision Issue: The decision needed by the team is conjectured to be easy; this does not imply nomination label Mar 13, 2025
@nikomatsakis
Contributor

@rustbot labels -I-lang-nominated

We discussed this in our meeting today. Meeting consensus is that, given that a warning is not feasible, we are in favor of going forward with this change, with the proviso that we will have an error message with actionable instructions and open PRs against known regressions.

Side note, informal design axioms for breaking changes...

  • First, don't break.
  • If you must break, give a warning.
  • If you can't give a warning, give actionable advice.
  • No matter what, fix as many folks as you can.

@rustbot rustbot removed the I-lang-nominated Nominated for discussion during a lang team meeting. label Mar 19, 2025
@BoxyUwU
Member Author

BoxyUwU commented Mar 22, 2025

PRs against affected crates have been opened and can be seen here:

There were three regressions I've not filed PRs against:


It feels a bit awkward to bring up after having filed these PRs but regardless it seems like due diligence to ask anyway; is it worth considering an alternative fix to this problem with arbitrary self types? A couple options:

Allow lifetime casts in unsafe code only

Someone asked on one of the PRs whether it would be reasonable to allow this code to continue to work when the code is placed in an unsafe block. This would mean that behaviour of as casts changes between safe/unsafe rather than unsafe simply allowing more operations to be performed.

This feels somewhat dubious to me as it is not super clear that a safety invariant is being introduced when as casting. unsafe is also tricky to learn as-is and it's already a source of confusion as to whether unsafe "disables" the borrow checker, I think this would make that problem worse even if it's only a fringe edge case.

Regardless, it would solve the soundness bug and minimize the breakage to some extent. Looking at the regressions, this would only avoid breaking a few of the affected crates, but this does include the metrics crate, which was by far the most common cause of breakage, and it would mean that people depending on old versions of the metrics crate stay unbroken.

We could then go ahead and break this even in unsafe contexts across an edition where it's more "morally correct" to make a breaking change.

@RalfJung I imagine you would probably have opinions about muddling the waters around what unsafe code does in this way (?)

Require construction of smart pointers that implement DispatchFromDyn via raw pointers to be unsafe

If unsafe fields existed we could require #[derive(CoercePointee)] struct SmartPtr<T: ?Sized>{ ptr: *const T } to actually be written as

struct SmartPtr<T: ?Sized>{
    /// SAFETY: When `T` is a dyn-type it must not have a lifetime bound greater than the underlying type the vtable is for
    unsafe ptr: *const T
}

This would enforce that arbitrary user-defined smart pointers are correct in the presence of pointer casts of dyn type lifetime bounds in the same way that all the smart pointers in std are. E.g. Box/Arc all have unsafe from_raw functions whose safety invariants imply that the vtable is fine.
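As a rough stable-Rust approximation of this idea (unsafe fields and `derive(CoercePointee)` are not available on stable), the same obligation can be pushed onto an `unsafe` constructor over a private field. This is only a sketch under that assumption; `from_raw`, `Shape`, and `Square` are hypothetical names, and the sketch does not implement `DispatchFromDyn`.

```rust
use std::ops::Deref;

/// Hypothetical smart pointer. The field is private, so code outside this
/// module can only create one through the `unsafe` constructor below.
struct SmartPtr<T: ?Sized> {
    ptr: *const T,
}

impl<T: ?Sized> SmartPtr<T> {
    /// # Safety
    /// `ptr` must be valid for reads for as long as the `SmartPtr` is used.
    /// When `T` is a dyn type, its lifetime bound must not be greater than
    /// that of the underlying type the vtable is for.
    unsafe fn from_raw(ptr: *const T) -> Self {
        SmartPtr { ptr }
    }
}

impl<T: ?Sized> Deref for SmartPtr<T> {
    type Target = T;

    fn deref(&self) -> &T {
        // SAFETY: guaranteed by the contract of `from_raw`.
        unsafe { &*self.ptr }
    }
}

trait Shape {
    fn area(&self) -> u32;
}

struct Square(u32);

impl Shape for Square {
    fn area(&self) -> u32 {
        self.0 * self.0
    }
}

fn demo() -> u32 {
    let square = Square(3);
    let obj: &dyn Shape = &square;
    // SAFETY: `square` outlives `p`, and the object lifetime bound is unchanged.
    let p = unsafe { SmartPtr::from_raw(obj as *const dyn Shape) };
    p.area()
}
```

With this shape, a wrong-lifetime pointer can still be produced by a cast, but storing it in the smart pointer requires the caller to promise the vtable is fine, mirroring what `Box::from_raw` and `Arc::from_raw` already require.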

This would mean that arbitrary_self_types_pointers (which permits directly using raw pointers as receivers without an intermediate smart pointer type) would not be possible to stabilize at any point down the road without backing out of this choice and going back down the road that this PR currently takes.

Another problem would be that this would block arbitrary self types on unsafe fields being stabilized which would be quite unfortunate given the importance of stabilizing this feature. It would also require derive(CoercePointee) to be blocked on this as well.

However, it would remove the need to error on casting *const dyn Trait to *const dyn Trait + Send, which was already approved by lang in #136764, resulting in there being no breaking changes required to stabilize arbitrary self types.


Going to re-nominate for lang with the question of which of these y'all would prefer (options are more thoroughly elaborated above):

  1. Just break this immediately on stable
  2. Break some subset of the crates on stable, while some will continue to work due to the casts happening to be in unsafe code which can then be broken over an edition
  3. Block arbitrary self types on unsafe fields which would no longer require this change for arbitrary self types stabilization.

@BoxyUwU BoxyUwU added the I-lang-nominated Nominated for discussion during a lang team meeting. label Mar 22, 2025
@RalfJung
Member

> Allow lifetime casts in unsafe code only

Purely conceptually, it seems fine to me to say that some as casts break library invariants and hence can only be done in an unsafe block. However, I don't know how hard this would be to teach. Cc @rust-lang/opsem

For this concrete question that would mean we have to allow such invalid-lifetime dyn trait values to exist temporarily (i.e., they satisfy the language invariant). Is that where we stand today, i.e., Miri would accept the as cast but then it can be used later to cause UB?

@RalfJung
Member

> Break some subset of the crates on stable, while some will continue to work due to the casts happening to be in unsafe code which can then be broken over an edition

This is quite dubious. We're retroactively attaching more safety obligations to an existing operation, and then if there's UB somewhere we tell you it's your fault since you wrote unsafe and thereby promised you upheld the safety obligation that didn't even exist yet when you wrote the code?

Does metrics happen to actually be sound under the new semantics, i.e. the safety obligation is actually satisfied?

@Mark-Simulacrum
Member

Is it accurate to say that the UB being "added" here is not detectable within e.g. Miri, because we lack the information about the lifetimes present to enforce that you didn't mess this up? It seems unfortunate if that's true, because I could easily see there being code out there that didn't satisfy this safety obligation but is already using transmute for other reasons. It's pretty common I think to see casts to 'static to allow temporarily, unsafely storing objects in some place, typically with an argument that it's actually safe so long as you're careful to only call/do stuff with them that you'd be able to with the original lifetime.

I don't see a clear alternative to this -- I think we are sort of stuck given past decisions -- but I hadn't seen that question brought up so wanted to raise it here. Or maybe I've misunderstood, and we're actually not adding UB from violating this condition -- merely working to prevent it, and only if you happen to explicitly do something "wrong" does your code actually break. (Essentially saying that you shouldn't leak such a value to safe code, but there's no UB from just having it).

= help: consider adding the following bound: `'a: 'b`
= note: requirement occurs because of a mutable pointer to `dyn Trait<'_>`
= note: mutable pointers are invariant over their type parameter
= help: see <https://doc.rust-lang.org/nomicon/subtyping.html> for more information about variance

Is it possible for us to note something in these errors pointing at some docs for why this is a bad idea? I could easily see someone just changing this to a transmute without realizing this is an intentional limitation of as casts.

I guess this falls under "dyn Trait metadata is invalid if it is not a pointer to a vtable for Trait that matches the actual dynamic trait the pointer or reference points to" in some sense (from https://doc.rust-lang.org/nightly/nomicon/what-unsafe-does.html) but maybe that should be clarified to say that it's not just "trait" but rather "trait and lifetime bounds on it" (or explicitly note this is a safety, not validity, invariant)...
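The note in the quoted diagnostic about mutable pointers being invariant can be seen in isolation with a small sketch. The function names here are hypothetical, and the example uses plain references rather than trait objects: `*const T` is covariant in `T`, so a lifetime inside `T` may shrink, while `*mut T` is invariant, so it may not change at all.

```rust
/// `*const T` is covariant in `T`, so a pointer to a longer-lived reference
/// can be used where a pointer to a shorter-lived one is expected.
fn shorten_const<'long, 'short>(p: *const &'long str) -> *const &'short str
where
    'long: 'short,
{
    p
}

// The `*mut` analogue is rejected by the borrow checker, because `*mut T`
// is invariant in `T`:
//
//     fn shorten_mut<'long, 'short>(p: *mut &'long str) -> *mut &'short str
//     where
//         'long: 'short,
//     {
//         p
//     }

fn demo() -> usize {
    let text: &'static str = "vtable";
    let p: *const &'static str = &text;
    let q = shorten_const(p);
    // SAFETY: `q` still points at `text`, which is alive for this whole function.
    let s: &str = unsafe { *q };
    s.len()
}
```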

@saethlin
Member

> Someone asked on one of the PRs whether it would be reasonable to allow this code to continue to work when the code is placed in an unsafe block. This would mean that behaviour of as casts changes between safe/unsafe rather than unsafe simply allowing more operations to be performed.

With regard to Ralf's ping, I think that this would be a teaching disaster and a regrettable wart on the language unless this is just to provide a smoother deprecation period. as is already hard enough to teach because of the number of different operations that it can be depending on the types. And to the specific request, that author seems motivated by a desire to not use transmute because transmute is powerful and scary. In this case I think that desire is wrongheaded; as far as I can tell these as casts deserve the attention that a transmute would draw.

@RalfJung
Member

> Is it accurate to say that the UB being "added" here is not detectable within e.g. Miri, because we lack the information about the lifetimes present to enforce that you didn't mess this up? It seems unfortunate if that's true, because I could easily see there being code out there that didn't satisfy this safety obligation but is already using transmute for other reasons. It's pretty common I think to see casts to 'static to allow temporarily, unsafely storing objects in some place, typically with an argument that it's actually safe so long as you're careful to only call/do stuff with them that you'd be able to with the original lifetime.
>
> I don't see a clear alternative to this -- I think we are sort of stuck given past decisions -- but I hadn't seen that question brought up so wanted to raise it here. Or maybe I've misunderstood, and we're actually not adding UB from violating this condition -- merely working to prevent it, and only if you happen to explicitly do something "wrong" does your code actually break. (Essentially saying that you shouldn't leak such a value to safe code, but there's no UB from just having it).

My understanding is that there's no immediate UB when doing the wrong-lifetime cast, but there can be UB further down the road since now we can make virtual function calls we shouldn't have been able to make. So, the cast breaks a library/safety invariant, but not a language/validity invariant. Miri can only check language invariants.
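A sketch of that distinction, using hypothetical names (`Source`, `Holder`, `read_through_raw`): the lifetime-widening step on its own only produces a pointer, and the problem would arise later, when a method guarded by `where Self: 'static` becomes callable through it. In this sketch the obligation is actually met because the concrete value really is `'static`, so nothing goes wrong; the same code handed a short-lived `Holder` would return a dangling `&'static str`.

```rust
use std::mem;

trait Source {
    // Only callable when the implementing type holds no short-lived borrows.
    fn text(&self) -> &'static str
    where
        Self: 'static;
}

struct Holder<'a>(&'a str);

impl<'a> Source for Holder<'a> {
    fn text(&self) -> &'static str
    where
        Self: 'static,
    {
        // `Self: 'static` implies `'a: 'static`, so this type-checks.
        self.0
    }
}

/// # Safety
/// The concrete type behind `obj` must really be `'static`; otherwise the
/// returned `&'static str` may dangle.
unsafe fn read_through_raw<'a>(obj: &'a (dyn Source + 'a)) -> &'static str {
    let short: *const (dyn Source + 'a) = obj;
    // The lifetime-changing step. Nothing has gone wrong yet at this point.
    let long: *const (dyn Source + 'static) = unsafe { mem::transmute(short) };
    // The dereference and call are where the library-level obligation matters:
    // `text` is now callable although the compiler never saw a proof of `'static`.
    let widened: &(dyn Source + 'static) = unsafe { &*long };
    widened.text()
}

fn demo() -> &'static str {
    let holder: Holder<'static> = Holder("static text");
    let obj: &dyn Source = &holder;
    // SAFETY: `Holder<'static>` is `'static`, so the where clause really holds.
    unsafe { read_through_raw(obj) }
}
```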

Labels
I-lang-nominated Nominated for discussion during a lang team meeting. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Development

Successfully merging this pull request may close these issues.

arbitrary_self_types + derive_coerce_pointee allows calling methods whose where clauses are violated