Add maxcompute batch predictor #626

Merged: @shydefoo merged 28 commits into main from add-maxcompute-batch-predictor on Feb 21, 2025

@shydefoo (Contributor) commented Jan 31, 2025

Description

  • This PR adds MaxCompute as a source and sink to batch prediction jobs. This allows users to perform batch prediction on data stored in MaxCompute and write the results back into MaxCompute.

Modifications

  • python/batch-predictor
    • Modified the merlinpyspark package source code
    • Added missing proto spec for PredictionJob protobuf
    • Added MaxCompute Jar dependencies to base.Dockerfile
  • python/sdk
    • Modified the Python SDK to include MaxComputeSource and MaxComputeSink classes
    • Modified swagger.yaml and regenerated the Python and Go clients
    • Tests and README will be added in a separate PR
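
As a rough illustration of the change described above, here is a minimal sketch of pairing a MaxCompute source with a MaxCompute sink in a batch prediction job config. Only the class names `MaxComputeSource` and `MaxComputeSink` come from this PR. Every field (`table`, `features`, `result_column`, `save_mode`), the `PredictionJobConfig` wrapper, the `validate` helper, and the `project.schema.table` naming are hypothetical stand-ins defined locally; they are not the SDK's actual signatures.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MaxComputeSource:
    """Hypothetical stand-in (not the SDK class): where input rows are read from."""
    table: str                                   # assumed format: "project.schema.table"
    features: List[str] = field(default_factory=list)


@dataclass
class MaxComputeSink:
    """Hypothetical stand-in (not the SDK class): where predictions are written."""
    table: str                                   # assumed format: "project.schema.table"
    result_column: str = "prediction"            # assumed default column name
    save_mode: str = "overwrite"                 # assumed option, for illustration only


@dataclass
class PredictionJobConfig:
    """Hypothetical wrapper that ties one source to one sink."""
    source: MaxComputeSource
    sink: MaxComputeSink

    def validate(self) -> None:
        # Both tables must have three non-empty dot-separated parts.
        for name, table in (("source", self.source.table), ("sink", self.sink.table)):
            parts = table.split(".")
            if len(parts) != 3 or not all(parts):
                raise ValueError(f"{name} table must look like project.schema.table: {table!r}")
        if not self.source.features:
            raise ValueError("source must list at least one feature column")
        if self.sink.result_column in self.source.features:
            raise ValueError("result column must not shadow a feature column")


# Read features from one table and write predictions back to another.
config = PredictionJobConfig(
    source=MaxComputeSource(
        table="demo_project.default.iris_features",
        features=["sepal_length", "sepal_width"],
    ),
    sink=MaxComputeSink(table="demo_project.default.iris_predictions"),
)
config.validate()
```

The local dataclasses keep the sketch runnable without the SDK installed; the real classes added in python/sdk should be consulted for the actual parameters.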

Tests

Checklist

  • Added PR label
  • Added unit, integration, and/or e2e tests
  • Tested locally
  • Updated documentation
  • Updated Swagger spec if the PR introduces API changes
  • Regenerated Golang and Python client if the PR introduces API changes

Release Notes

- Support MaxCompute source and sink for batch prediction jobs

@shydefoo added the enhancement label Feb 5, 2025
@shydefoo force-pushed the add-maxcompute-batch-predictor branch from 45925ec to a94aa8f on February 5, 2025 08:34
@shydefoo self-assigned this Feb 6, 2025
@shydefoo marked this pull request as ready for review February 6, 2025 07:29
@shydefoo (Contributor, Author) commented

@deadlycoconuts @bthari poke

@deadlycoconuts (Contributor) left a comment

Thanks a lot for all the changes! 🚀 Especially the reverse engineering of the proto and the additional check on the protoc version. Everything looks good; I just left a few nitpicking comments. Thanks a lot!

@shydefoo merged commit 75ead87 into main Feb 21, 2025
33 checks passed
@shydefoo deleted the add-maxcompute-batch-predictor branch February 21, 2025 14:48
Labels: enhancement