When I run the training code, the program gets stuck at the "loading cached shuffled indexes" step. How can I solve this?
Same issue here, have you solved it? @ccx06 @sangmichaelxie
Maybe there is some kind of OOM issue or num_workers is set too high? Does this happen every time and on all datasets?
I set num_workers to 1. I also tried reducing RANDOM_BATCH_SIZE in dataloader.py, but that did not help.
It seems the issue is related to caching. This is the last log line before it hangs:
06/25/2024 03:50:31 - INFO - datasets.arrow_dataset - Caching indices mapping at ~/doremi/preprocessed/train/Pile-CC/66/cache-abce09a69c09a6c6.arrow
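To confirm where the process is actually blocked (the index caching step, a dataloader worker, or something else), one option is to have Python dump every thread's stack after a timeout. Below is a minimal sketch using only the standard-library `faulthandler` module. The helper `run_with_hang_dump` and the stand-in `slow_step` are hypothetical names for illustration, not part of this repository; in practice you would wrap the call that loads or shuffles the dataset.

```python
import faulthandler
import sys
import tempfile
import time


def run_with_hang_dump(fn, timeout_s, log_file=sys.stderr):
    """Run fn(); if it is still running after timeout_s seconds,
    dump the stack of every thread to log_file (the process keeps running)."""
    faulthandler.dump_traceback_later(timeout_s, exit=False, file=log_file)
    try:
        return fn()
    finally:
        # Cancel the pending dump once fn() has returned or raised.
        faulthandler.cancel_dump_traceback_later()


def slow_step():
    # Hypothetical stand-in for the step that appears to hang.
    time.sleep(0.5)
    return "done"


# Demo: the timeout is shorter than the step, so a stack dump is written.
with tempfile.TemporaryFile(mode="w+") as log:
    result = run_with_hang_dump(slow_step, timeout_s=0.1, log_file=log)
    log.seek(0)
    dump = log.read()
```

With the default `log_file`, the dump goes to stderr, so the frame the process is blocked in shows up next to the training logs. For a real run, a timeout of a few minutes is probably more useful than the fraction of a second used in the demo.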