Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make updating segment.tier and rebalancing table consistent on segment's tier #14516

Merged
merged 1 commit into from
Nov 21, 2024

Conversation

klsince
Copy link
Contributor

@klsince klsince commented Nov 21, 2024

Currently, updating segment's tier (a field in segment ZK metadata) and rebalancing table call TierSegmentSelectors to decide if a segment belongs to a specific tier separately. But tier selectors like TimeBasedTierSegmentSelector may return different tiers for a segment between the two steps, as the tier assignment is decided by comparing segment data time vs. current time, which keeps changing. This caused segment to be relocated and loaded on new servers, but with wrong segment tier.

This PR takes the segment tier assignments as calculated when updating segment's tier, and pass it to table rebalancer to use to make segment tier decision consistent.

@klsince
Copy link
Contributor Author

klsince commented Nov 21, 2024

Even with this fix, there might be another subtle race condition when multiple table rebalances are conducted by different threads (or even different controllers) in parallel, e.g.

  • one rebalancer decides to set segment tier to A to move segment to a server in tier A;
  • another rebalancer decides to set segment tier to B to move segment to a server in tier B, around the same time.

Due to race condition, server in tier A may load the segment, whose tier field in ZK metadata might have been set to tier B.

But this race condition should be rare, as tier migration usually goes to one direction, e.g. from default tier (hot), then to next tier (warm), then to next tier (cold) etc.. and with enough time gaps between tiers. So at each time point, rebalancers just decide if a segment should stay on current tier, or move to the immediate next tier. In this case, the race condition said above wouldn't happen, as rebalancers are not trying to move segment to different tiers around the same time. And if table just has two tiers, this wouldn't happen either.

To address it, I think we'd need a distributed lock via ZK to ensure that at any time only one table rebalance happens even across controllers. This can be addressed later if we see the need.

@codecov-commenter
Copy link

codecov-commenter commented Nov 21, 2024

Codecov Report

Attention: Patch coverage is 88.88889% with 3 lines in your changes missing coverage. Please review.

Project coverage is 63.86%. Comparing base (59551e4) to head (fa57a49).
Report is 1376 commits behind head on master.

Files with missing lines Patch % Lines
...ntroller/helix/core/PinotHelixResourceManager.java 85.71% 2 Missing ⚠️
...che/pinot/common/utils/config/TierConfigUtils.java 75.00% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #14516      +/-   ##
============================================
+ Coverage     61.75%   63.86%   +2.10%     
- Complexity      207     1569    +1362     
============================================
  Files          2436     2673     +237     
  Lines        133233   146756   +13523     
  Branches      20636    22505    +1869     
============================================
+ Hits          82274    93720   +11446     
- Misses        44911    46108    +1197     
- Partials       6048     6928     +880     
Flag Coverage Δ
custom-integration1 100.00% <ø> (+99.99%) ⬆️
integration 100.00% <ø> (+99.99%) ⬆️
integration1 100.00% <ø> (+99.99%) ⬆️
integration2 0.00% <ø> (ø)
java-11 63.82% <88.88%> (+2.11%) ⬆️
java-21 63.75% <88.88%> (+2.12%) ⬆️
skip-bytebuffers-false 63.85% <88.88%> (+2.10%) ⬆️
skip-bytebuffers-true 63.70% <88.88%> (+35.97%) ⬆️
temurin 63.86% <88.88%> (+2.10%) ⬆️
unittests 63.85% <88.88%> (+2.10%) ⬆️
unittests1 55.53% <60.00%> (+8.64%) ⬆️
unittests2 34.58% <74.07%> (+6.85%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.


🚨 Try these New Features:

@klsince klsince merged commit 2a2a2d1 into apache:master Nov 21, 2024
21 checks passed
@klsince klsince deleted the fix_tier_migration_race_condition branch November 21, 2024 21:24
davecromberge pushed a commit to davecromberge/pinot that referenced this pull request Nov 22, 2024
yashmayya pushed a commit to yashmayya/pinot that referenced this pull request Jan 3, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants