-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Return improved dataschema for empty results when all segments are pruned by broker #13831
Return improved dataschema for empty results when all segments are pruned by broker #13831
Conversation
377ea73
to
514c218
Compare
#13057 (all segments pruned on broker) is similar to the problem solved by this PR. We should be able to construct the empty response based on the schema. Do you want to also help fix that? |
@Jackie-Jiang Sure, I can address it. We have two options to solve it:
I'm thinking of going with option (1). Thoughts? Additionally a followup could be to add a broker-side LIMIT_0_PRUNER that will help short-circuit faster instead of routing queries to all servers. |
1 is not always possible in the following cases:
I prefer 2 because we should be able to derive the data type for transform based on input types in order to be SQL compatible. If we cannot get that right now, we can leave a TODO and put a type as placeholder. |
+1 to jackie's idea of leaving the column types as TODOs. We've noticed internally that the bigger issue is we're strictly missing the result table. We don't even use the column types from Pinot since the code usually 1) assumes the types ahead of time or 2) infers it from the json result. But the column names and rows are typically assumed to be there. |
I'm working on the broker side changes as well. I'll create that as a separate PR. |
514c218
to
d430a8d
Compare
Lost track of this PR.
Please take a look @jadami10 @gortiz @albertobastos |
d430a8d
to
04291d3
Compare
04291d3
to
ff95e23
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #13831 +/- ##
============================================
+ Coverage 61.75% 63.72% +1.96%
- Complexity 207 1469 +1262
============================================
Files 2436 2708 +272
Lines 133233 151424 +18191
Branches 20636 23376 +2740
============================================
+ Hits 82274 96489 +14215
- Misses 44911 47687 +2776
- Partials 6048 7248 +1200
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
considering fillEmptyResponseSchema
was improved in the linked PR, this seems like a straightforward addition
When all segments are pruned on the server, we construct a empty responseBlock . As this response block is constructed without access to segment data, we default to STRING as the datatype for the column - here is the code link . This problem does not exist for pure Aggregation queries.
This PR fixes that by just processing one segment.
Added tests to verify the code changes.