Fix overflow behavior for spark decimal sum aggregate function #11127

Open
wants to merge 1 commit into
base: main

@zhli1142015 commented Sep 29, 2024

The precision of the result and intermediate types is min(38, input precision + 10).
When that precision is less than 38, overflow must be checked against it, not against 38 or 18.
The current overflow check does not do this, so it does not work as expected.
The following case reproduces the failure:

val df = spark.sql("select sum(if(cast(id % 10000 as decimal(9, 2)) < cast(9999999.00 as decimal(9, 2)), cast(9999999.00 as decimal(9, 2)), cast(id % 10000 as decimal(9, 2)))), count(*) from range(100000000000)")
df.collect

Null should be returned, but the query fails instead:

org.apache.spark.SparkRuntimeException: Error while decoding: org.apache.spark.SparkArithmeticException: [DECIMAL_PRECISION_EXCEEDS_MAX_PRECISION] Decimal precision 20 exceeds max precision 19.
createexternalrow(input[0, decimal(19,2), true].toJavaBigDecimal, staticinvoke(class java.lang.Long, ObjectType(class java.lang.Long), valueOf, input[1, bigint, false], true, false, true), StructField(sum((IF((CAST((id % 10000) AS DECIMAL(9,2)) < CAST(9999999.00 AS DECIMAL(9,2))), CAST(9999999.00 AS DECIMAL(9,2)), CAST((id % 10000) AS DECIMAL(9,2))))),DecimalType(19,2),true), StructField(count(1),LongType,false)).
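
The precision rule and the intended overflow check can be sketched as follows. This is a minimal illustration, not the actual Velox code: the names `sumPrecision`, `powerOfTen`, `fitsPrecision`, and `sumOverflows` are hypothetical, and `__int128` is assumed to be available as a compiler extension for holding the unscaled value.

```cpp
#include <algorithm>

// __int128 is a GCC/Clang extension; it is used here only because a
// 38-digit unscaled decimal does not fit in 64 bits.
using int128 = __int128;

constexpr int kMaxPrecision = 38;

// Precision of the sum's result / intermediate type: min(38, p + 10).
int sumPrecision(int inputPrecision) {
  return std::min(kMaxPrecision, inputPrecision + 10);
}

// Returns 10^precision. Valid for precision in [0, 38].
int128 powerOfTen(int precision) {
  int128 result = 1;
  for (int i = 0; i < precision; ++i) {
    result *= 10;
  }
  return result;
}

// True if |unscaled| has at most `precision` decimal digits.
bool fitsPrecision(int128 unscaled, int precision) {
  const int128 bound = powerOfTen(precision);
  return unscaled > -bound && unscaled < bound;
}

// Hypothetical final step of the aggregate: true means the sum overflowed
// its result type and the result should be NULL. The check uses the
// result precision, not a fixed 38 or 18.
bool sumOverflows(int128 unscaledSum, int inputPrecision) {
  return !fitsPrecision(unscaledSum, sumPrecision(inputPrecision));
}
```

In the repro above, every row contributes 9999999.00 (unscaled 999999900) and there are 100000000000 rows, so the unscaled sum has 20 digits. It exceeds the result type decimal(19, 2) but still fits in 38 digits, which is consistent with a check against 38 letting the value through and Spark then rejecting it with "Decimal precision 20 exceeds max precision 19".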

@facebook-github-bot added the CLA Signed label Sep 29, 2024

netlify bot commented Sep 29, 2024

Deploy Preview for meta-velox canceled.

🔨 Latest commit: 291e366
🔍 Latest deploy log: https://app.netlify.com/sites/meta-velox/deploys/6704975a94e0f10008ae04e9

@zhli1142015 changed the title from "Fix overflow behavior for spark decimal sum aggregate fucntion" to "Fix overflow behavior for spark decimal sum aggregate function" Sep 29, 2024
@zhli1142015

cc @rui-mo, thanks
