
feat(batch-jobs): Add step to pass API server env vars to spark jobs #624

Merged
4 commits merged into main on Jan 7, 2025

Conversation

deadlycoconuts
Contributor

@deadlycoconuts deadlycoconuts commented Jan 6, 2025

Description

This PR adds a small feature that allows the API server to pass certain pre-configured environment variable values from its own environment to the Spark drivers and executors (for batch prediction) that it spins up.

Modifications

  • api/batch/resource.go - Addition of a step to add pre-configured API server environment variables to the Spark driver/executor manifest
  • api/config/batch.go - Addition of a new config field to specify the environment values mentioned above

Tests

Checklist

  • Added PR label
  • Added unit test, integration, and/or e2e tests
  • Tested locally
  • Updated documentation
  • Updated Swagger spec if the PR introduces API changes
  • Regenerated Golang and Python client if the PR introduces API changes

Release Notes

NONE

@deadlycoconuts deadlycoconuts added the enhancement New feature or request label Jan 6, 2025
@deadlycoconuts deadlycoconuts self-assigned this Jan 6, 2025

@vinoth-gojek vinoth-gojek left a comment


Hi @deadlycoconuts, the changes LGTM!

Could you also share some background on why we need to pass these env vars to Spark jobs? That would help me visualise the changes in the e2e flow.

@deadlycoconuts
Contributor Author

Thanks for the quick review @vinoth-gojek !

Could you also share some background on why we need to pass these env vars to Spark jobs? That would help me visualise the changes in the e2e flow.

Ahh, the main motivation was to let the API server pass credentials (needed to access external services) to the Spark drivers/executors. These credentials may already exist as environment variables in the API server's environment, so this avoids having to specify them explicitly again in the Spark job configuration.

@deadlycoconuts
Contributor Author

I'll be merging this PR now! I've added a couple of unit tests since the review, but nothing else has changed. Thanks a lot for the review once again!

@deadlycoconuts deadlycoconuts merged commit 967788e into main Jan 7, 2025
33 checks passed
@deadlycoconuts deadlycoconuts deleted the add_api_server_env_vars_to_spark_jobs branch January 7, 2025 03:11