Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[native] fix(iceberg): Date partition value parse issue #24583

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

nmahadevuni
Copy link
Member

Description

Fixes #24371

Motivation and Context

Filter on date column fails in Iceberg since the value returned by Iceberg API is in daysSinceEpoch but we assume its in ISO format "YYYY-MM-DD" and try to convert it to daysSinceEpoch. Velox side changes are merged in facebookincubator/velox#12126

Impact

None.

Test Plan

Added tests in TestPrestoNativeIcebergGeneralQueries.java

Release Notes

== RELEASE NOTES ==

General Changes
* Fix Iceberg date column filtering

@nmahadevuni nmahadevuni requested a review from a team as a code owner February 18, 2025 03:45
@prestodb-ci prestodb-ci added the from:IBM PR from IBM label Feb 18, 2025
@prestodb-ci prestodb-ci requested review from a team, infvg and bibith4 and removed request for a team February 18, 2025 03:45
@nmahadevuni
Copy link
Member Author

@aditi-pandit @majetideepak @Yuhta Can you please review this?

@nmahadevuni nmahadevuni force-pushed the fix_iceberg_date_value_parse_issue branch from 890fc53 to bc135f3 Compare February 18, 2025 09:59
Copy link
Contributor

@aditi-pandit aditi-pandit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @nmahadevuni for these changes. Have 2 questions for you.

@Test
public void testDateQueries()
{
assertQuery("SELECT * FROM ice_table_partitioned WHERE ds >= date ('1994-01-01')", "VALUES (1, date('2022-04-09')), (2, date('2022-03-18'))");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current test returns all rows from the underlying table without actually demonstrating the filtering expected. Add rows with values < date('1994-01-01) to be skippped by the filter.

@Test
public void testDateQueries()
{
assertQuery("SELECT * FROM ice_table_partitioned WHERE ds >= date ('1994-01-01')", "VALUES (1, date('2022-04-09')), (2, date('2022-03-18'))");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the queries (both insertion and reading) use the date function in predicates. The date function actually transforms the string date value to integer days since epoch. The constant folding will evaluate this function before creating the partition filter. So I feel this issue happens irrespective of Iceberg.

Can you try queries without date function in the filter predicate ? What happens ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's how the date literals can be specified. I have removed the parenthesis to avoid the confusion. If we just specify the date string, then we get the varchar to date conversion error.

@nmahadevuni nmahadevuni force-pushed the fix_iceberg_date_value_parse_issue branch from bc135f3 to 6d88a9e Compare February 20, 2025 05:29
@nmahadevuni
Copy link
Member Author

Thanks @aditi-pandit , I have addressed the comments.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
from:IBM PR from IBM
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[native] Iceberg read from partitioned Date column fails
3 participants