[BUG] A user query with from_json failed with "JSON Parser encountered an invalid format at location" #11293

Open
viadea opened this issue Aug 2, 2024 · 1 comment

viadea commented Aug 2, 2024

This is a follow-up to the old issue #6481.
The same user query fails with the error below on the 24.08 snapshot, while the CPU run works fine:

ai.rapids.cudf.CudfException: CUDF failure at:/home/jenkins/agent/workspace/jenkins-spark-rapids-jni_nightly-dev-819-cuda11/src/main/cpp/src/map_utils.cu:149: JSON Parser encountered an invalid format at location 197291503
	at com.nvidia.spark.rapids.jni.MapUtils.extractRawMapFromJsonString(Native Method)
	at com.nvidia.spark.rapids.jni.MapUtils.extractRawMapFromJsonString(MapUtils.java:49)
	at org.apache.spark.sql.rapids.GpuJsonToStructs.doColumnar(GpuJsonToStructs.scala:168)
	at com.nvidia.spark.rapids.GpuUnaryExpression.doItColumnar(GpuExpressions.scala:250)
	at com.nvidia.spark.rapids.GpuUnaryExpression.$anonfun$columnarEval$1(GpuExpressions.scala:261)
	at com.nvidia.spark.rapids.Arm$.withResource(Arm.scala:30)
	at com.nvidia.spark.rapids.GpuUnaryExpression.columnarEval(GpuExpressions.scala:260)
	at com.nvidia.spark.rapids.GpuExpression.columnarEvalAny(GpuExpressions.scala:144)
	at com.nvidia.spark.rapids.GpuExpression.columnarEvalAny$(GpuExpressions.scala:144)
	at com.nvidia.spark.rapids.GpuUnaryExpression.columnarEvalAny(GpuExpressions.scala:244)
	at com.nvidia.spark.rapids.RapidsPluginImplicits$ReallyAGpuExpression.columnarEvalAny(implicits.scala:39)
	at com.nvidia.spark.rapids.GpuBinaryExpression.columnarEval(GpuExpressions.scala:310)
	at com.nvidia.spark.rapids.GpuBinaryExpression.columnarEval$(GpuExpressions.scala:309)
	at org.apache.spark.sql.rapids.GpuGetMapValue.columnarEval(complexTypeExtractors.scala:202)
	at com.nvidia.spark.rapids.GpuExpression.columnarEvalAny(GpuExpressions.scala:144)
	at com.nvidia.spark.rapids.GpuExpression.columnarEvalAny$(GpuExpressions.scala:144)
	at org.apache.spark.sql.rapids.GpuGetMapValue.columnarEvalAny(complexTypeExtractors.scala:202)
	at com.nvidia.spark.rapids.RapidsPluginImplicits$ReallyAGpuExpression.columnarEvalAny(implicits.scala:39)
	at com.nvidia.spark.rapids.GpuBinaryExpression.columnarEval(GpuExpressions.scala:310)
	at com.nvidia.spark.rapids.GpuBinaryExpression.columnarEval$(GpuExpressions.scala:309)
	at com.nvidia.spark.rapids.CudfBinaryOperator.columnarEval(GpuExpressions.scala:417)
	at com.nvidia.spark.rapids.GpuExpression.columnarEvalAny(GpuExpressions.scala:144)

Reproduced in-house using the user's query and data.

viadea added the "bug" and "? - Needs Triage" labels on Aug 2, 2024
mattahrens removed the "? - Needs Triage" label on Aug 6, 2024
ttnghia self-assigned this on Aug 21, 2024

ttnghia commented Aug 21, 2024

The exception is caused by an error token detected in cudf::io::json::detail::get_token_stream. I'll check the data to see what's going on.
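For anyone who wants to narrow down suspicious input rows while this is being investigated, here is a minimal sketch in plain Python. It assumes the failure is triggered by one or more rows that a strict JSON parser rejects, which this issue has not confirmed. The helper name `find_invalid_json_rows` and the sample rows are hypothetical, and Python's `json` module does not follow the same rules as the cuDF tokenizer, so treat the result as a hint rather than a diagnosis.

```python
import json


def find_invalid_json_rows(rows):
    """Return (row_index, char_position, message) for each row that fails to parse.

    Hypothetical helper: it uses Python's json module, not the cuDF parser,
    so it only approximates what the GPU tokenizer might reject.
    """
    bad = []
    for i, row in enumerate(rows):
        if row is None:
            # Null rows are skipped; they are not parse errors.
            continue
        try:
            json.loads(row)
        except json.JSONDecodeError as e:
            # e.pos is the character index inside this row, not a global offset.
            bad.append((i, e.pos, e.msg))
    return bad


# Made-up sample rows; the second one has a trailing comma.
sample = ['{"a": "1"}', '{"a": 1,}', None, '{"b": "x"}']
bad_rows = find_invalid_json_rows(sample)
for index, pos, msg in bad_rows:
    print(f"row {index}: {msg} (char {pos})")
```

Running this kind of check over the column passed to from_json would list candidate rows to inspect by hand. The position it reports is relative to each row and is not expected to match the "location" value in the exception.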
