Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prefect 3 - Flow Run Crashes Silently #16373

Open
skohlleffel opened this issue Dec 13, 2024 · 2 comments
Open

Prefect 3 - Flow Run Crashes Silently #16373

skohlleffel opened this issue Dec 13, 2024 · 2 comments
Labels
enhancement An improvement of an existing feature

Comments

@skohlleffel
Copy link

Describe the current behavior

In Prefect 3, if a submitted/mapped task is not waited on and is the final task in the DAG, the flow will crash, but the final state will be reported as Completed. In Prefect 2, if a task crashed the flow, the final flow state was reported as such. The Prefect 3 behavior is not optimal. We migrated to Prefect 3 and didn't realize our flows were crashing since our automations weren't being triggered since the final flow states were Completed.

Silently Crashing Flow Example:

from prefect import flow, task

@task()
def add_one(x: int):
    return x + 1
    
@flow()
def silently_crashing_flow():
    add_one.map(x=[1,2,3,4,5])

The above flow will be reported as completed, but the add_one() task will crash with the following logs:

A future was garbage collected before it resolved. Please call `.wait()` or `.result()` on futures to ensure they resolve.

Finished in state NotReady(message="Upstream task run 'xxxxxxxx-xxxx' did not reach a 'COMPLETED' state.", type=PENDING, result=None)
Please wait for all submitted tasks to complete before exiting your flow by calling `.wait()` on the `PrefectFuture` returned from your `.submit()` calls.

Successful Flow Example:

from prefect import flow, task

@task()
def add_one(x: int):
    return x + 1
    
@flow()
def silently_crashing_flow():
    # calling `.result()` ensures the task completes before exiting
    add_one.map(x=[1,2,3,4,5]).result()

Describe the proposed behavior

The preferred behavior is to report the flow as Crashed in this scenario. The existing logs are very helpful in identifying how to resolve the issue, but if the flow is reported as Completed they aren't surfaced unless someone manually inspects the flow run in the UI.

Example Use

No response

Additional context

No response

@skohlleffel skohlleffel added the enhancement An improvement of an existing feature label Dec 13, 2024
@zzstoatzz
Copy link
Collaborator

zzstoatzz commented Dec 13, 2024

hi @skohlleffel - thanks for the issue!

this (not automatically resolving unresolved futures) was an intentional behavior change between prefect 2 and 3, but it sounds like you'd expect the final state of a flow run that has unresolved futures to have a non Completed state?

The preferred behavior is to report the flow as Crashed in this scenario.

I'm not sure that Crashed is what we would want, since Crashed typically means an OOM, SIGTERM or another infrastructure level error. Not resolving futures feels similar to forgetting to await an async function, which I agree is something you likely want to know (especially since the behavior changed between prefect 2 and 3) , but in my mind is not a Crashed state.

Perhaps there's a named Completed state that we could finish in (e.g. CompletedWithOrphans for the sake of discussion) if we discovered unresolved futures, what do you think?

@skohlleffel
Copy link
Author

@zzstoatzz That would be awesome! It would just be good to have a state that identifies that not all futures were resolved. That way, we can create alerts for that scenario.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement An improvement of an existing feature
Projects
None yet
Development

No branches or pull requests

2 participants