-
Notifications
You must be signed in to change notification settings - Fork 95
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Small tweaks to improve the glaive dataset blog post. (#938)
- Reduce the amount of times we log events for progress to 0.5, and then multiply by shard_count, reducing the overall amount of logs by 24 when you are running a 12-core multiproc map. This is a hack, we should really use dask properly with progress once we switch to streaming. - Restart the dask client when executing a new task if there are no tasks pending. This helps with making the second run much faster as workers dont GC intermittently. Coding dataset notebook: - Adds a limit example to the notebook to show how you determine whether it's what you want. Also: - Add the glaive dataset to the huggingface demo with the map outputs. This currently is a bit weird as I ran it locally and pushed, so it's not completely reproducible. - Improve the notebook for curating a coding dataset. https://lilacai-lilac.hf.space/datasets#lilac/glaive&expandedStats=%7B%22answer_formatted.has_edit%22%3Atrue%7D&query=%7B%22filters%22%3A%5B%7B%22path%22%3A%5B%22answer_formatted%22%2C%22has_edit%22%5D%2C%22op%22%3A%22equals%22%2C%22value%22%3A1%7D%5D%7D&compareColumns=%5B%7B%22column%22%3A%5B%22answer%22%5D%2C%22compareToColumn%22%3A%5B%22answer_formatted%22%2C%22answer%22%5D%2C%22swapDirection%22%3Afalse%7D%5D&rowId=%22fffc265c-845e-4a2b-b3ce-2caa61fed0f4%22 Fixes #928
- Loading branch information
Showing
6 changed files
with
286 additions
and
50 deletions.
There are no files selected for viewing
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.