-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Config for max output segment size in UpsertCompactMerge task #14742
Config for max output segment size in UpsertCompactMerge task #14742
Conversation
75d14c6
to
66e9208
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #14742 +/- ##
============================================
+ Coverage 61.75% 63.82% +2.07%
- Complexity 207 1609 +1402
============================================
Files 2436 2703 +267
Lines 133233 150748 +17515
Branches 20636 23291 +2655
============================================
+ Hits 82274 96219 +13945
- Misses 44911 47332 +2421
- Partials 6048 7197 +1149
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
...org/apache/pinot/plugin/minion/tasks/upsertcompactmerge/UpsertCompactMergeTaskGenerator.java
Outdated
Show resolved
Hide resolved
...org/apache/pinot/plugin/minion/tasks/upsertcompactmerge/UpsertCompactMergeTaskGenerator.java
Outdated
Show resolved
Hide resolved
...org/apache/pinot/plugin/minion/tasks/upsertcompactmerge/UpsertCompactMergeTaskGenerator.java
Show resolved
Hide resolved
// Add the segment to the current group | ||
currentGroup.add(segment); | ||
currentValidDocsSum += validDocs; | ||
currentTotalDocsSum += validDocs + invalidDocs; | ||
currentOutputSegmentSizeInBytes += expectedSegmentSizeInBytes; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this ensure that we are packing enough segments (based on outputSegmentMaxSizeInBytes) per task. If so, this still does not ensure that a segment is of expectedSegmentSize?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this ensure that we are packing enough segments (based on outputSegmentMaxSizeInBytes) per task. If so, this still does not ensure that a segment is of expectedSegmentSize?
This ensures that the output segment size is within a certain threshold. We have named the config "outputSegmentMaxSize".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @tibrewalpratik17 . By default do we want to have 200Mb segments? This changes the default segment size and can overide maxNumRecordsPerSegment. We can make the new param opt-in and not apply default?
&& currentTotalDocsSum + validDocs + invalidDocs < maxRecordsPerTask
&& currentOutputSegmentSizeInBytes + expectedSegmentSizeInBytes < outputSegmentMaxSizeInBytes) {
Resolves #14634
Adding support to configure max output segment size for UpsertCompactMerge task.