-
Notifications
You must be signed in to change notification settings - Fork 427
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Split-init allowed past guaranteed-exit control flow #26968
Comments
Thinking about this a bit more over the weekend, I'm increasingly convinced that the first two programs should be equivalent to: proc foo() {
var x;
}
foo(); which generates the reasonable error in production ATO:
|
Spec says
Here is an example: proc foo() {
var x;
if isOK { // assuming isOK is not param here
x = createSomeValue();
} else {
return false;
}
// do something with x
return true;
} It's similar if, instead of This other example is similar and also intended to work AFAIK: proc foo() {
var x;
if !isOK { // assuming isOK is not param here
return false;
}
x = createSomeValue();
// do something with x
return true;
} As a result, I would suggest solving the problem differently from this suggestion in the OP.
proc test() {
var x;
if true then
return;
x = 1;
} IMO this should compile regardless of whether the condition is param. That's exercising an intentional feature of split-init that is documented in the spec. Regarding this: proc test() {
var x;
if true then
return;
} I think we should focus on making decisions about this even simpler program as a starting point: proc test() {
var x;
return;
} It behaves similarly (doesn't compile with the same error). My preference would be to keep this an error, of one sort or another. Conceptually, we might consider this legal for purposes of split-init. (Why would we do that? because it would make the rules have a nicer separation of concerns between conditionals and other code). After deciding It would be OK with me to make that a warning rather than an error (since we can certainly compile it and just ignore that Note that we have discussed some odd behavior around split init + param conditionals before. See #14996. |
My understanding is that this conflicts fundamentally with other parts of the compilation process. E.g., we frequently use an early-return pattern + E.g.: proc foo(type t) {
var x;
if requiresSpecialHandling(t) {
return handleSpecially(t);
}
x = handleNormally(t); // will throw a compiler error if `requiresSpecialHandling` is true
} Currently, we do not resolve code after early returns, because early returns often do type-sensitive processing and rule out cases in which code after them will not work. However, determining the type of |
To be clear, what I meant was, "it should count as split init regardless...". See more below. We could just determine I would be OK with making it be an error to have a variable that's split-init when the initialization point can never be reached in the generated code. I think this needs to be an error about the variable being unused & not exactly one about split init. proc test() {
var x;
return;
} currently an error, I think it should stay an error, but a different error. I think the error should say something like "You have a variable proc test() {
var x;
if true then
return;
} currently the same error. I think it should get the same treatment: "You have a variable proc test() {
var x;
if true then
return;
x = 1;
} currently compiles, but it would be reasonable to make it an error because Depending on what we learn if we try to implement that, we should consider ignoring |
Another simple program we should consider is: proc test() {
var x;
return;
x = 1;
} which passes today, unfortunately for the stance I was taking above, as that suggests that perhaps proc test() {
var x;
if true then return;
x = 1;
} should as well. Breaking code post-2.0 questions aside, my instinct would still be to make both of the above illegal, similar to what Michael's saying here:
(which seems like a change from what you said previously, is that right Michael?)
|
Yes,
|
…inference, and split init etc. (#26955) Makes progress on Cray/chapel-private#7229, but does not resolve it. This PR takes the first step towards handling early-return or early-error patterns in the standard library. At the same time, it improves various passes' handling of diverse control flow (enables `continue`, `return`, etc in resolution, `break` and `continue` in init/deinit etc.). It also reduces cases in which default init etc. was resolved for declarations that are not reachable. See future work below for cases this PR does not handle. I approached this by making the following observation: a large number of passes / visitors rely on an approximating control flow information. * `ReturnTypeInferrer` (used for computing function return types) keeps a stack of frames along with information about whether they continued, had a `break`, or returned. This is used for iterating over `param` loops and ignoring unreachable `return`s. * `VarScopeVisitor` (used for not handling init/deinit/etc. in dead code, but also for enabling more split-init cases) * `Resolver` (not used prior to this PR, but would need to reason about returns etc. to avoid resolving dead code) In each case, this is done by creating a stack of frames (which contain data like `returnsOrThrows`, `continues`, etc), and updating that stack during traversals. In each case, there is distinct logic to reconcile information from various frames, and to handle constructs like `if` and `select` where `param`-true branches cause other branches to be ignored. This PR unifies those traversals by defining a new abstract base class called `BranchSensitiveVisitor`. This class does not have any `visit` methods of its own (the subclasses are intended to provide that). It provides methods such as `enterScope` and `exitScope` which take care of tracking and combining control flow information (e.g., deciding if all branches return/throw). It also provides methods `branchSensitivelyTraverse`, which implement common `param`-folding logic for conditionals and if/elses. It then switches _all frame-based traversals_ to use this new `BranchSensitiveVisitor`. This has numerous advantages: 1. There is less duplicated code (resolving a TODO in `return-type-inference.cpp`, which started as a watered down duplication of `VarScopeVisitor`'s code, but evolved in an orthogonal direction). 2. Each pass is guaranteed to behave the same way w.r.t. control flow. If, e.g., `Resolver` skips code because it's dead, we know that `ReturnTypeInferrer` and `VarScopeVisitor` will skip that code too). 3. It's easy to enhance analyses (like `Resolver`) with control flow handling without much additional code. As a nice side effect, `VarScopeVisitor` became aware of `continue` and `break` (previously only handled by return type inference), and `Resolver` became better at handling `break` and `continue` in `param` loop). This PR takes an opinionated stance on #26968 and disallows split-init across returns. This is a consequence of not visiting dead code for computing split-init types. ## AST Traversal Changers In all traversals, we want to skip visiting code when it's dead. This is both a performance concern (why waste time resolving functions that are not reached?) and a correctness concern (e.g., `compilerError` is guaranteed not to do anything directly after a `return`). How does one add this logic to every single `visit` / `traverse` call? I considered several options: 1. Adding a "prelude" to each `visit` method that returns early if the `BranchSensitiveVisitor` has determined that it will not be reached. This prelude could be a macro to avoid duplicating a lot of code. However, this approach is undesirable because it leads to some duplication (invocations of the macro at least; duplicate code at worst) in every single `visit` method. Further, authors of `visit` methods must remember to include this prelude lest the dead-code sensitivity be affected. 2. Following the lead of `ResolvedVisitor` and having a custom `visit` method that dispatches to some other method (e.g., `visitImpl`). This way, the `visit` can be defined once, and the dead-code checking can happen before the dispatch to `visitImpl`. However, this introduces a second dispatch, introduces churn (changing all `visit` to `visitImpl` on resolver), and is anti-compositional (if `visit` has different signatures, like `visit(node)` in Resolver and `visit(node, RV)` in `VarScopeVisitor`, `visitImpl` would have to also have a different signature; this introduces an m-by-n implementation problem). 3. Configuring the `traverse` function to exit early before even invoking `enter` and `exit`. This can be done using a template specialization to provide a "default value", and to avoid runtime overhead. I settled for 3. Specifically, all invocations of `traverse` now reference a template class with a static method, `AstVisitorPrecondition<T>::skipSubtree(node, visitor)` to decide whether to invoke `enter` / `exit` on a node and its children. The default template for `AstVisitorPrecondition` always returns true, which means the default behavior of `traverse` remains the same. However, visitors that want to return early / skip visiting code altogether can specialize the template and define a condition. `Resolver`, `VarScopeVisitor`, and `ReturnTypeInferrer` all do this; they check the current frame for signs that `return` has already been encountered, and if so, do not descend into any more AST nodes. For `Resolver`, this does not happen when `scopeResolveOnly` is set, since we always want to enter every node in that mode. I find this appealing because user code saw very little churn (10 lines to specialize the template for `Resolver`, e,g.), and each new `enter` / `exit` method added to `Resolver` will automatically work with early returns (since the `traverse` call which invokes `enter`/ `exit` knows to check). The precondition could be easily defined for all visitors affected by this PR. ## Macro Tweaks @jabraham17 and I have settled on a pattern in `chapel-py` in which the X-macro files (e.g., `uast-methods.h`) auto-defined missing X-macros and undefine them automatically. This helps reduce a lot of boilerplate. As I was experimenting with dispatch (approach 2) above, I found it convenient to apply this pattern to the Dyno code proper. Rather than polluting `uast-classes-list.h` and `type-classes-list.h` (which IMO serve as good references), I added wrapper files that handle the aforementioned niceties. I then switched `AstNode.h` and `Type.h` to using these wrappers, saving a considerable amount of noise. I did not end up with new uses of the adapters in my code (since I went with template specialization instead), but I thought the cleanup is worth including here. ## Module Changes This PR reverts (some) changes to module code from b789f5b and ee57b16, which were previously required to work around the resolver's inability to reason about early returns. Some code, however (`ChapelDomain.chpl` from b789f5b), relies on `compilerError` being treated as early return, which is future work. Reviewed by @arezaii -- thanks! ## Future Work - [ ] `compilerError`, which is considered an "early return" in production - [ ] branches that both "exit" but differently (e.g., code after `if bla then continue else break` does not execute) - [ ] param loops in the `VarScopeVisitor` family of passes (init/deinit, split init, copy elision) (see Cray/chapel-private#4209) - [ ] labeled `break` / `continue` (Cray/chapel-private#7245) ## Testing - [x] dyno tests - [x] performance testing -- frontend tests did not get any slower in CI or locally - [x] paratest ## Tested cases - [x] `VarScopeVisitor` now knows about `break` and `continue` - [x] `VarScopeVisitor` now stops traversing after `return`, which should enable reasoning about early returns (already tested) - [x] `Resolver` now reasons about control flow and skips resolving dead code - [x] `Resolver` now properly handles `continue`, `break`, and `return` in `param` loops
In production, this program compiles just fine:
but the following program does not:
I find this surprising, because, as far as Chapel is concerned, the code below return doesn’t happen. This is particularly true when we use early-returns to avoid compilerError:
combining all this, I think we end up with odd behavior in the following case:
I believe that we should emit an in both cases about missing split-initialization for
x
(whetherx = 11
is present or not). We should also clarify the specification to say that split init is not allowed across returns.The text was updated successfully, but these errors were encountered: