Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleanup and converge V1 and V2 SchemaConformingTransformer #14772

Merged
merged 5 commits into from
Jan 11, 2025

Conversation

lnbest0707-uber
Copy link
Contributor

refactor
backward-incompat
Followup on #14546

  • Remove V1 SchemaConformingTransformer and SchemaConformingTransformerConfig.
  • Rename SchemaConformingTransformerV2 to SchemaConformingTransformer, rename SchemaConformingTransformerV2Config to SchemaConformingTransformerConfig.
  • Add backward compatible workaround in IngestionConfig, so that SchemaConformingTransformerV2Config map could be casted to create SchemaConformingTransformer.

Next step:
Update https://docs.pinot.apache.org/basics/data-import/pinot-input-formats/schema-conforming-transformer page with a complete story on how to ingest/store/index large scale dynamic schema data like logging

@codecov-commenter
Copy link

codecov-commenter commented Jan 8, 2025

Codecov Report

Attention: Patch coverage is 83.47339% with 59 lines in your changes missing coverage. Please review.

Project coverage is 63.83%. Comparing base (59551e4) to head (1563a79).
Report is 1558 commits behind head on master.

Files with missing lines Patch % Lines
...recordtransformer/SchemaConformingTransformer.java 78.69% 28 Missing and 21 partials ⚠️
...e/ingestion/SchemaConformingTransformerConfig.java 92.74% 3 Missing and 6 partials ⚠️
...ache/pinot/segment/local/utils/IngestionUtils.java 0.00% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #14772      +/-   ##
============================================
+ Coverage     61.75%   63.83%   +2.08%     
- Complexity      207     1610    +1403     
============================================
  Files          2436     2702     +266     
  Lines        133233   150807   +17574     
  Branches      20636    23283    +2647     
============================================
+ Hits          82274    96272   +13998     
- Misses        44911    47316    +2405     
- Partials       6048     7219    +1171     
Flag Coverage Δ
custom-integration1 100.00% <ø> (+99.99%) ⬆️
integration 100.00% <ø> (+99.99%) ⬆️
integration1 100.00% <ø> (+99.99%) ⬆️
integration2 0.00% <ø> (ø)
java-11 63.81% <83.47%> (+2.10%) ⬆️
java-21 63.72% <83.47%> (+2.10%) ⬆️
skip-bytebuffers-false 63.83% <83.47%> (+2.09%) ⬆️
skip-bytebuffers-true 63.71% <83.47%> (+35.98%) ⬆️
temurin 63.83% <83.47%> (+2.08%) ⬆️
unittests 63.83% <83.47%> (+2.08%) ⬆️
unittests1 56.37% <3.64%> (+9.47%) ⬆️
unittests2 34.07% <83.47%> (+6.34%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@Jackie-Jiang Jackie-Jiang added backward-incompat Referenced by PRs that introduce or fix backward compat issues refactor labels Jan 8, 2025
Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for cleaning it up

Comment on lines 60 to 67
private void setSchemaConformingTransformerV2Config(JsonNode schemaConformingTransformerV2Config) {
// Map _schemaConformingTransformerV2Config JSON into SchemaConformingTransformerConfig
if (schemaConformingTransformerV2Config != null && _schemaConformingTransformerConfig == null) {
ObjectMapper objectMapper = new ObjectMapper();
_schemaConformingTransformerConfig =
objectMapper.convertValue(schemaConformingTransformerV2Config, SchemaConformingTransformerConfig.class);
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it work the same if we simply put:

Suggested change
private void setSchemaConformingTransformerV2Config(JsonNode schemaConformingTransformerV2Config) {
// Map _schemaConformingTransformerV2Config JSON into SchemaConformingTransformerConfig
if (schemaConformingTransformerV2Config != null && _schemaConformingTransformerConfig == null) {
ObjectMapper objectMapper = new ObjectMapper();
_schemaConformingTransformerConfig =
objectMapper.convertValue(schemaConformingTransformerV2Config, SchemaConformingTransformerConfig.class);
}
}
public void setSchemaConformingTransformerV2Config(SchemaConformingTransformerConfig schemaConformingTransformerConfig) {
_schemaConformingTransformerConfig = schemaConformingTransformerConfig;
}

All we need to do is mapping both key to the same property right?

Copy link
Contributor Author

@lnbest0707-uber lnbest0707-uber Jan 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All we need to do is mapping both key to the same property right?

Yes, but would the suggested change work in the same way?
In which case, the setSchemaConformingTransformerV2Config(SchemaConformingTransformerConfig schemaConformingTransformerConfig) would be called? I thought when

{"schemaConformingTransformerV2Config" : {...}}

appears in Json, it would internally trigger setSchemaConformingTransformerV2Config(JsonNode schemaConformingTransformerV2Config) only (JsonNode as type).
I need eventually the above json could create the schemaConformingTransformerConfig object.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on my understanding, it should be able to directly construct the SchemaConformingTransformerConfig properly. You can probably add a test for backward compatibility

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea! Added the UT. Verified that this works. Thanks for the proposal.

@Jackie-Jiang Jackie-Jiang added cleanup backward-incompat Referenced by PRs that introduce or fix backward compat issues and removed backward-incompat Referenced by PRs that introduce or fix backward compat issues labels Jan 8, 2025
Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@Jackie-Jiang Jackie-Jiang merged commit af46700 into apache:master Jan 11, 2025
19 of 21 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backward-incompat Referenced by PRs that introduce or fix backward compat issues cleanup refactor
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants