
Incorrect pod template declaration leads to the Spark Operator being stuck #2459

JeanMichelApeupres opened this issue Mar 5, 2025 · 0 comments
Labels
kind/bug Something isn't working


What happened?

  • ✋ I have searched the open/closed issues and my issue is not listed.

Hello,

We recently faced an issue with the Spark Operator when one of our clients submitted an incorrect SparkApplication using pod templates.
The following file shows the faulty YAML declaration (an extra `.requests` field inside the template spec): incorrect-spark-application.txt
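For illustration only (the actual attachment is not reproduced here), the misplacement described above would look something like this: `requests` placed directly under the container instead of nested under `resources`.

```yaml
# Hypothetical sketch of the faulty pod template, based on the description
# above — not the content of incorrect-spark-application.txt itself.
apiVersion: v1
kind: Pod
spec:
  containers:
    - name: spark-kubernetes-driver
      requests:            # invalid here: should be resources.requests
        cpu: "1"
        memory: "512Mi"
```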

The SparkApplication appears to be valid according to the CR definition, but once it is submitted the controller emits these logs (spark-controller.log), and listing the Spark Applications with kubectl shows an empty, stuck status:

```
kubectl get sparkApplications | grep spark-pi
spark-pi                                                                                                                 40s
```

instead of the expected output, for example:

```
submit-spark-pi-app-kwuasvyx                    COMPLETED       1          2024-07-31T09:53:45Z   2024-07-31T09:55:18Z   217d
```

Once in this state, no other SparkApplication can be submitted, and when we tried to restart the controller, it ended up in a CrashLoopBackOff with this kind of output: spark-controller-restart.log

Deleting the SparkApplication fixes the issue, but it would be nice to have a safeguard at submission time that automatically sets a FAILED (or similar) status.

Reproduction Code

To reproduce the issue, simply apply the SparkApplication YAML mentioned above.

Expected behavior

The SparkApplication should be rejected during CR validation.

Actual behavior

The pod templates from the SparkApplication are not valid when parsed against the OpenAPI schemas, leading to this behaviour.
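To sketch the kind of safeguard requested above, a strict unknown-field check on each container entry could reject such a template before submission. This is a hypothetical illustration, not the operator's actual code; `ALLOWED_CONTAINER_FIELDS` is an assumed subset of the Kubernetes Container schema's top-level fields.

```python
# Hypothetical sketch: reject pod-template container entries that carry
# fields not defined on the Kubernetes Container type. Not the operator's
# actual validation code.

# Assumed subset of valid top-level Container fields, for illustration.
ALLOWED_CONTAINER_FIELDS = {
    "name", "image", "command", "args", "env", "ports",
    "resources", "volumeMounts", "securityContext",
}

def find_unknown_fields(container: dict) -> list:
    """Return the container keys that are not valid Container fields."""
    return sorted(k for k in container if k not in ALLOWED_CONTAINER_FIELDS)

# Faulty container: `requests` sits directly on the container instead of
# under `resources` (mirrors the misplacement described in this issue).
faulty = {
    "name": "spark-kubernetes-driver",
    "image": "spark:3.5.3",
    "requests": {"cpu": "1", "memory": "512Mi"},  # wrong level
}

errors = find_unknown_fields(faulty)
if errors:
    # prints: rejected: unknown container fields ['requests']
    print(f"rejected: unknown container fields {errors}")
```

With such a check, the operator could mark the SparkApplication as FAILED immediately instead of getting stuck retrying an unparseable template.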

Environment & Versions

  • Kubernetes Version: 1.30.8
  • Spark Operator Version: 2.1.0
  • Apache Spark Version: 3.5.3
  • Webhooks disabled

Additional context

No response

Impacted by this bug?

Give it a 👍. We prioritize the issues with the most 👍.

@JeanMichelApeupres JeanMichelApeupres added the kind/bug Something isn't working label Mar 5, 2025