Updates forceCommit APIs to handle Pauseless #14828

Merged
merged 15 commits into from
Jan 24, 2025

Conversation

noob-se7en
Contributor

@noob-se7en noob-se7en commented Jan 16, 2025

Updates forceCommit APIs to handle Pauseless Ingestion.

(In Pauseless Ingestion, the segment is marked ONLINE in the ideal state before the commit has completed successfully. Hence the completion check needs to move from the ideal state to the segment ZK metadata, which is the correct indicator.)
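To illustrate the difference, here is a minimal sketch, not the code in this PR. The `SegmentStatus` enum, the class `ForceCommitCheckSketch`, and both maps are hypothetical stand-ins: the ideal state is simplified to one state string per segment, and the ZK metadata to one status per segment. It also assumes the earlier check treated an ONLINE segment as committed.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the status stored in segment ZK metadata.
enum SegmentStatus { IN_PROGRESS, COMMITTING, DONE, UPLOADED }

class ForceCommitCheckSketch {
  // Ideal-state based check (assumed earlier behavior): anything ONLINE counts as committed.
  // Under pauseless ingestion this can report completion too early.
  static Set<String> pendingByIdealState(List<String> segments, Map<String, String> idealState) {
    Set<String> pending = new LinkedHashSet<>();
    for (String segment : segments) {
      if (!"ONLINE".equals(idealState.get(segment))) {
        pending.add(segment);
      }
    }
    return pending;
  }

  // ZK-metadata based check: a segment is committed only once its status is DONE.
  static Set<String> pendingByZkMetadata(List<String> segments, Map<String, SegmentStatus> zkStatus) {
    Set<String> pending = new LinkedHashSet<>();
    for (String segment : segments) {
      SegmentStatus status = zkStatus.get(segment);
      if (status == null) {
        // No metadata: the segment was deleted, so there is nothing left to commit.
        continue;
      }
      if (status != SegmentStatus.DONE) {
        pending.add(segment);
      }
    }
    return pending;
  }
}
```

With a segment that is ONLINE in the ideal state but still COMMITTING in ZK metadata, the first method reports nothing pending while the second still reports that segment.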

@noob-se7en noob-se7en marked this pull request as ready for review January 16, 2025 12:40
@codecov-commenter

codecov-commenter commented Jan 16, 2025

Codecov Report

Attention: Patch coverage is 45.45455% with 12 lines in your changes missing coverage. Please review.

Project coverage is 63.75%. Comparing base (59551e4) to head (9e3ddad).
Report is 1617 commits behind head on master.

Files with missing lines Patch % Lines
...ller/api/resources/PinotRealtimeTableResource.java 0.00% 12 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #14828      +/-   ##
============================================
+ Coverage     61.75%   63.75%   +2.00%     
- Complexity      207     1469    +1262     
============================================
  Files          2436     2708     +272     
  Lines        133233   151438   +18205     
  Branches      20636    23381    +2745     
============================================
+ Hits          82274    96549   +14275     
- Misses        44911    47645    +2734     
- Partials       6048     7244    +1196     
Flag Coverage Δ
custom-integration1 100.00% <ø> (+99.99%) ⬆️
integration 100.00% <ø> (+99.99%) ⬆️
integration1 100.00% <ø> (+99.99%) ⬆️
integration2 0.00% <ø> (ø)
java-11 63.72% <45.45%> (+2.02%) ⬆️
java-21 63.62% <45.45%> (+1.99%) ⬆️
skip-bytebuffers-false 63.74% <45.45%> (+2.00%) ⬆️
skip-bytebuffers-true 63.60% <45.45%> (+35.87%) ⬆️
temurin 63.75% <45.45%> (+2.00%) ⬆️
unittests 63.75% <45.45%> (+2.00%) ⬆️
unittests1 56.33% <ø> (+9.44%) ⬆️
unittests2 34.03% <45.45%> (+6.30%) ⬆️


Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Mostly good. Good job!

Comment on lines 2252 to 2253
jobMetadata.put(CommonConstants.ControllerJob.CONSUMING_SEGMENTS_YET_TO_BE_COMMITTED_LIST,
JsonUtils.objectToString(consumingSegmentsCommitted));
Contributor

(minor) This shouldn't be needed. Adding it actually introduces the overhead of one extra parse.

if (segmentZKMetadata == null) {
continue;
}
if (!segmentZKMetadata.getStatus().isCompleted()) {
Contributor

We cannot use isCompleted() here. We should explicitly check that the status is DONE.

@9aman @KKcorps Currently COMMITTING is also counted as completed. Is this expected?

Contributor

isCompleted should return true only when the status is UPLOADED or DONE. I am raising a PR for this along with a few minor improvements for pauseless.
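The semantics discussed in this thread can be sketched as follows. This is an illustration only: `SketchStatus` and `StatusCheckSketch` are hypothetical names, and the enum merely mirrors the status values mentioned in this conversation rather than reproducing the project's actual enum.

```java
// Hypothetical enum mirroring the status names mentioned in this thread.
enum SketchStatus {
  IN_PROGRESS, COMMITTING, DONE, UPLOADED;

  // Proposed semantics: COMMITTING is not a completed state.
  boolean isCompleted() {
    return this == DONE || this == UPLOADED;
  }
}

class StatusCheckSketch {
  // Explicit check suggested for the force-commit path: only DONE counts as committed.
  static boolean isForceCommitted(SketchStatus status) {
    return status == SketchStatus.DONE;
  }
}
```

Checking for DONE explicitly keeps the force-commit path correct regardless of how isCompleted() treats COMMITTING.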

SegmentZKMetadata segmentZKMetadata =
_helixResourceManager.getSegmentZKMetadata(tableNameWithType, segmentName);
if (segmentZKMetadata == null) {
continue;
Contributor

(minor) Add a comment about this behavior. We are counting a deleted segment as not needing to be committed.

Map<String, String> controllerJobZKMetadata) {
addControllerJobToZK(forceCommitJobId,
controllerJobZKMetadata, ControllerJobType.FORCE_COMMIT, prevJobMetadata -> {
String existingSegmentsYetToBeCommittedString =
Contributor

Please add some comments describing why we want to perform this check

Contributor Author

Actually, this method is only useful in a rare edge case where two async forceCommitStatus API calls are running at the same time.
We can remove it, which saves the overhead of one extra parse on each API call.

Contributor Author

I have removed this method.

@@ -462,6 +467,19 @@ public boolean isForceCommitJobCompleted(String forceCommitJobId)

assertEquals(jobStatus.get("jobId").asText(), forceCommitJobId);
assertEquals(jobStatus.get("jobType").asText(), "FORCE_COMMIT");

assert jobStatus.get("segmentsForceCommitted") != null;
Contributor

Use the CommonConstants variable rather than a hardcoded string.

Contributor

@KKcorps KKcorps left a comment

LGTM with minor comments

when(mockSegmentZKMetadataUploaded.getStatus()).thenReturn(Status.UPLOADED);

SegmentZKMetadata mockSegmentZKMetadataInProgress = mock(SegmentZKMetadata.class);
when(mockSegmentZKMetadataInProgress.getStatus()).thenReturn(Status.IN_PROGRESS);
Contributor

Also add one for Status.COMMITTING.

@noob-se7en noob-se7en requested a review from KKcorps January 17, 2025 11:09
segmentsToCheck = consumingSegmentCommitted;
}

Set<String> segmentsYetToBeCommitted =
Contributor

@9aman 9aman Jan 23, 2025

@noob-se7en IIUC, the objective of introducing this field is to reduce the number of SegmentZKMetadata entries that we iterate over.

Contributor

Does it serve any other purpose?

Contributor Author

Yes, it is only there to reduce the number of ZK calls.
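As a rough sketch of why tracking the yet-to-be-committed list reduces ZK calls: each status call only needs to look up the segments that were still pending after the previous call. The class `PendingSegmentsSketch` and the `isCommitted` predicate are hypothetical; the predicate stands in for a per-segment ZK metadata lookup.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.function.Predicate;

class PendingSegmentsSketch {
  // Returns the subset of segmentsToCheck that is still uncommitted.
  // isCommitted is invoked once per segment, standing in for one ZK metadata read.
  static Set<String> stillPending(Set<String> segmentsToCheck, Predicate<String> isCommitted) {
    Set<String> pending = new LinkedHashSet<>();
    for (String segment : segmentsToCheck) {
      if (!isCommitted.test(segment)) {
        pending.add(segment);
      }
    }
    return pending;
  }
}
```

Feeding the result of one call in as the input of the next means that already-committed segments are never looked up again, so the number of lookups shrinks as the job progresses.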

Contributor

@9aman 9aman left a comment

LGTM

@Jackie-Jiang Jackie-Jiang merged commit c9e08f3 into apache:master Jan 24, 2025
21 checks passed
gortiz pushed a commit to gortiz/pinot that referenced this pull request Jan 30, 2025

5 participants