Prototype expansion of SQL transforms for single-node execution #59
Comments
Ray Java resources:
fyi @iasoon @valiantljk this issue is more complex than the other stuff you've tried, but it should help move one of our big features forward. Are either of you interested? : )
I don't fully understand this issue. Since you mentioned that these SQL transforms are implemented in Java, does this mean that we are adding Java support to our Beam runner?
Yes, we would have to add support for expanding Java PTransforms. I think we can limit the scope of this quite a bit while still delivering SQL execution.
This sounds cool if we are also targeting Java. I may ask my colleagues to take a look, in case one of them is interested in joining us.
One of the main targets for the Ray Beam Runner is to support SQL (and streaming SQL).
Beam's SQL support is implemented in Java. There are two parts for the execution of SQL transforms in Beam:
`ExpansionService` interface (sample of the gRPC implementation - this seems way too complicated, to be honest)

My idea:

`SqlSchemaTransformProvider` with id `"beam:schematransform:org.apache.beam:sql:v1"` this week.

The `RayJavaExpansionService` should then return the schema of the resulting PCollection, as well as the expanded graph of operations in protobuf format (the proto format).

The expansion is not enough to execute SQL, but it's the first step. The next step is to recognize Java Stages, and execute them in a Java process rather than a Python process (basically, a Java implementation of this code, where we return some kind of `JavaWorkerHandler`).
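
To make the two steps above a bit more concrete, here is a rough Python sketch of the idea. It is only an illustration, not the runner's or Beam's actual API. The only name taken from this issue is the transform id. `Stage`, `ExpansionResult`, `expand`, `pick_worker_handler`, and `PythonWorkerHandler` are hypothetical, and the "expanded graph" is simplified to a plain list of stages instead of the real protobuf messages.

```python
from dataclasses import dataclass

# Identifier quoted in the issue; every other name below is hypothetical.
SQL_TRANSFORM_ID = "beam:schematransform:org.apache.beam:sql:v1"


@dataclass
class Stage:
    name: str
    environment: str  # "java" or "python" (assumed labels)


@dataclass
class ExpansionResult:
    output_schema: dict  # field name -> type name
    stages: list         # expanded graph, simplified to a list of stages


def expand_sql(config):
    """Toy stand-in for the Java SQL provider; it does not parse the query."""
    return ExpansionResult(
        output_schema=dict(config["output_schema"]),
        stages=[Stage("BeamSqlCalc", "java")],
    )


# Registry of providers keyed by schema-transform identifier.
PROVIDERS = {SQL_TRANSFORM_ID: expand_sql}


def expand(identifier, config):
    """Look up the provider for `identifier` and return its expansion."""
    try:
        provider = PROVIDERS[identifier]
    except KeyError:
        raise ValueError(f"unknown schema transform: {identifier}") from None
    return provider(config)


def pick_worker_handler(stage):
    """Route a stage to a worker handler name based on its environment."""
    if stage.environment == "java":
        return "JavaWorkerHandler"
    return "PythonWorkerHandler"


# Usage: expand the SQL transform, then decide where each stage would run.
result = expand(
    SQL_TRANSFORM_ID,
    {"query": "SELECT id FROM PCOLLECTION", "output_schema": {"id": "INT64"}},
)
handlers = [pick_worker_handler(stage) for stage in result.stages]
```

The point of the sketch is the split of responsibilities: expansion only produces a schema and a graph, and a separate routing step decides which process executes each stage. A real implementation would read the environment from the expanded pipeline proto rather than from a string field.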