Typesafe config is generating the error UTFDataFormatException: encoded string too long #233

Open
yeikel opened this issue Feb 25, 2020 · 2 comments

Comments

@yeikel
Contributor

yeikel commented Feb 25, 2020

I noticed that we are using Typesafe Config, and it seems to be introducing serialization issues: jobs are failing with the following exception:

Caused by: java.io.UTFDataFormatException: encoded string too long: 72887 bytes

The issue is hard to reproduce, and all I can provide at the moment are the stack traces. I will update the issue if I find a way to reproduce it.

Do you have any recommendations for dealing with this issue?

Similar issue: https://stackoverflow.com/questions/41505599/task-not-serializable-in-spark-caused-by-utfdataformatexception-encoded-string
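
For context, `java.io.UTFDataFormatException: encoded string too long` is what `DataOutputStream.writeUTF` throws when a string's modified UTF-8 encoding exceeds 65535 bytes, because the length prefix is an unsigned 16-bit value. Plain `ObjectOutputStream.writeObject` on a `String` does not have this limit, so the error usually points at custom serialization code that calls `writeUTF` directly. Below is a minimal sketch that reproduces the limit outside Spark. The class and method names are made up for illustration, and whether Typesafe Config's serialization is the code path that hits this in LuceneRDD is an assumption that this thread does not confirm.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UTFDataFormatException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical demo class; not part of LuceneRDD or Typesafe Config.
final class WriteUtfLimit {

    // Builds an ASCII string, so its encoded size in bytes equals its length.
    static String ascii(int length) {
        char[] chars = new char[length];
        Arrays.fill(chars, 'a');
        return new String(chars);
    }

    // Returns true if writeUTF rejects an ASCII string of the given length.
    static boolean writeUtfFails(int asciiLength) {
        try (DataOutputStream out = new DataOutputStream(new ByteArrayOutputStream())) {
            out.writeUTF(ascii(asciiLength));
            return false;
        } catch (UTFDataFormatException e) {
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns true if standard object serialization accepts the same string.
    static boolean writeObjectSucceeds(int asciiLength) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(ascii(asciiLength));
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(writeUtfFails(65535));       // false: exactly at the limit
        System.out.println(writeUtfFails(72887));       // true: the size from the stack trace
        System.out.println(writeObjectSucceeds(72887)); // true: long strings are supported here
    }
}
```

If this is the cause, the failure would depend on how large the serialized configuration text is, which might explain why it is hard to reproduce.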

@zouzias
Owner

zouzias commented Feb 26, 2020

This looks like a weird issue.

AFAIR, the Typesafe configs for LuceneRDD do not need to be serializable. If you use Typesafe Config in your own application, make sure you access it from within an object so that it is available on both the driver and the executors.

You can extend this trait https://github.com/zouzias/spark-lucenerdd/blob/master/src/main/scala/org/zouzias/spark/lucenerdd/config/Configurable.scala#L24 to get it working.
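
The "within an object" advice amounts to keeping the configuration out of anything that gets serialized and letting each JVM load its own copy. A rough Java analogue of that idea is sketched below, using only the standard library. Here `Settings` is a hypothetical stand-in for a loaded config, the static holder plays the role of a Scala `object`, and none of these names come from LuceneRDD or Typesafe Config.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical demo class illustrating the holder pattern.
final class ConfigHolderSketch {

    // Stand-in for a loaded configuration; deliberately NOT Serializable.
    static final class Settings {
        final String analyzer;
        Settings(String analyzer) { this.analyzer = analyzer; }
    }

    // Initialized once per JVM on first access; never shipped with a task.
    static final class SettingsHolder {
        static final Settings INSTANCE = new Settings("standard");
    }

    // Captures the settings in a field, so serializing the task drags them along.
    static final class CapturingTask implements Serializable {
        private static final long serialVersionUID = 1L;
        final Settings settings = SettingsHolder.INSTANCE;
        String run() { return settings.analyzer; }
    }

    // Looks the settings up statically at run time; no config state is serialized.
    static final class HolderTask implements Serializable {
        private static final long serialVersionUID = 1L;
        String run() { return SettingsHolder.INSTANCE.analyzer; }
    }

    // Returns true if the object can be written with Java serialization.
    static boolean serializes(Object task) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(task);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serializes(new CapturingTask())); // false: the field cannot be serialized
        System.out.println(serializes(new HolderTask()));    // true: nothing config-related is written
        System.out.println(new HolderTask().run());          // standard
    }
}
```

In a Spark job the same reasoning would apply to closures: a task that reads from a holder does not carry the config across the wire, whereas one that captures a config value does.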

@yeikel
Contributor Author

yeikel commented Feb 26, 2020

It really does.

I am not using Typesafe Config in my own application. The exception is coming from LuceneRDD itself.

I did another build with all references to Typesafe Config removed from LuceneRDD, and it works fine. That build loses the ability to supply dynamic configuration, though, so it is not a good solution.


2 participants