Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Iceberg] Enable affinity scheduling on file sections #24598

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

ZacBlanco
Copy link
Contributor

@ZacBlanco ZacBlanco commented Feb 20, 2025

Description

This change moves the affinity scheduling file section size
configuration from HiveClientConfig and HiveSessionProperties
to HiveCommonClientConfig and HiveCommonSessionProperties so
that the iceberg connector can benefit from this scheduling
strategy when tables have a small number of files but a large
number of splits.

Motivation and Context

On tables with a small number of large files, queries may perform poorly due to the distribution in split scheduling being skewed. This is more likely to occur when there is a limited number of values being hashed to determine the preferred nodes to schedule to. By changing the identifier used for selecting the preferred nodes we increase the probability that the splits are scheduled more evenly across the cluster.

Impact

  • Hive-specific configuration moved to common configuration.

Test Plan

Added a unit test to verify that the number of unique identifiers changes as we scale up the file section size

Contributor checklist

  • Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

== RELEASE NOTES ==

Iceberg Connector Changes
* Add support for the ``hive.affinity-scheduling-file-section-size`` configuration property and ``affinity_scheduling_file_section_size`` session property.

@prestodb-ci prestodb-ci added the from:IBM PR from IBM label Feb 20, 2025
@ZacBlanco ZacBlanco changed the title [Iceberg] Enable affinity scheduling file sections [Iceberg] Enable affinity scheduling on file sections Feb 20, 2025
@ZacBlanco ZacBlanco force-pushed the upstream-iceberg-split-hashing branch 2 times, most recently from ac6fce7 to 6dfb1f2 Compare February 20, 2025 06:36
@aaneja aaneja self-requested a review February 20, 2025 10:38
@ZacBlanco ZacBlanco force-pushed the upstream-iceberg-split-hashing branch 3 times, most recently from bc804bc to 484408b Compare February 20, 2025 20:41
@ZacBlanco ZacBlanco marked this pull request as ready for review February 20, 2025 21:42
@ZacBlanco ZacBlanco requested a review from presto-oss February 20, 2025 21:42
steveburnett
steveburnett previously approved these changes Feb 20, 2025
Copy link
Contributor

@steveburnett steveburnett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! (docs)

Pull branch, local doc build, looks good. Thanks!

Copy link
Contributor

@yingsu00 yingsu00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly looks good. One minor correction: "splits not being scheduled to enough nodes" : It's not necessarily they were not scheduled to enough nodes, but in general it had more skew than Hive, even when the splits were scheduled to the same number of nodes. Scheduling to less nodes happened non-determistically when I ran the queries multiple times. More than half times they did were scheduled to all nodes, but even in such cases the load was not as balanced as Hive.

@ZacBlanco ZacBlanco force-pushed the upstream-iceberg-split-hashing branch from 484408b to fa2b10e Compare February 20, 2025 23:26
@ZacBlanco
Copy link
Contributor Author

Thanks for the feedback @yingsu00 - I updated the PR description to be a bit more accurate

yingsu00
yingsu00 previously approved these changes Feb 21, 2025
aaneja
aaneja previously approved these changes Feb 21, 2025
{
Session maxIdentifiers = Session.builder(getSession())
.setCatalogSessionProperty("iceberg", AFFINITY_SCHEDULING_FILE_SECTION_SIZE, "1B")
.setCatalogSessionProperty("iceberg", TARGET_SPLIT_SIZE, "1")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we make TARGET_SPLIT_SIZE a DataSize instead of refering to bytes ? We do this for MAX_SPLIT_SIZE. Can punt this to a new PR

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aaneja Good catch!
@ZacBlanco Could you please address Anant's comment in this PR? I'll merge once it's done.

Copy link
Contributor Author

@ZacBlanco ZacBlanco Feb 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was done intentionally because the Iceberg library expects this configuration in bytes directly on the "read.target.split-size" table property. Iceberg has no notion of "DataSize" or unit-based shorthand for byte-typed values. In order to make setting the values for the table and session property consistent for the target split size I opted to not make this a DataSize-typed property.

This change moves the affinity scheduling file section size
configuration from HiveClientConfig and HiveSessionProperties
to HiveCommonClientConfig and HiveCommonSessionProperties so
that the iceberg connector can benefit from this scheduling
strategy when tables have a small number of files but a large
number of splits.
@ZacBlanco ZacBlanco dismissed stale reviews from aaneja and yingsu00 via d4ae7ad February 21, 2025 17:09
@ZacBlanco ZacBlanco force-pushed the upstream-iceberg-split-hashing branch from 7cf8789 to d4ae7ad Compare February 21, 2025 17:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
from:IBM PR from IBM
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants