Update docstring for the_cauldron_dataset to match the default subset value as 'orcvqa'
Ankur-singh committed Jan 3, 2025
1 parent e979109 commit 48ec8b7
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion torchtune/datasets/multimodal/_the_cauldron.py
@@ -181,7 +181,7 @@ def __call__(self, sample: Mapping[str, Any]) -> Mapping[str, Any]:
transforms on the keys. It should consist of at minimum two components: text tokenization (called
on the "messages" field) and image transform (called on the "images" field). The keys returned by
the model transform should be aligned with the expected inputs into the model.
- subset (str): name of the subset of the dataset to load. See the `dataset card
+ subset (str): name of the subset of the dataset to load. Default is `orcvqa`, see the `dataset card
<https://huggingface.co/datasets/HuggingFaceM4/the_cauldron>`_ for options.
source (str): path to dataset repository on Hugging Face. For local datasets,
define source as the data file type (e.g. "json", "csv", "text") and pass
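This change keeps a docstring in step with a default argument value. As a rough illustration of how that kind of drift could be caught automatically, here is a minimal sketch. It assumes the docstring names the default with the phrase "Default is" followed by a backtick-quoted value, as in the added line above. `load_dataset_stub`, `documented_default`, and `docstring_matches_signature` are hypothetical names introduced for this example and are not part of torchtune.

```python
import inspect
import re


def load_dataset_stub(subset: str = "example_subset") -> str:
    """Hypothetical stand-in loader, not torchtune's API.

    Args:
        subset (str): name of the subset to load. Default is `example_subset`.
    """
    return subset


def documented_default(func, param: str):
    """Return the value named after 'Default is' in the docstring entry for `param`, or None."""
    doc = inspect.getdoc(func) or ""
    # Find "<param> (<type>):" and then the first backtick-quoted value after "Default is".
    pattern = rf"{re.escape(param)} \([^)]*\):.*?Default is `([^`]+)`"
    match = re.search(pattern, doc, flags=re.DOTALL)
    return match.group(1) if match else None


def docstring_matches_signature(func, param: str) -> bool:
    """Compare the documented default with the default in the function signature."""
    actual = inspect.signature(func).parameters[param].default
    return documented_default(func, param) == actual
```

A check like `docstring_matches_signature(load_dataset_stub, "subset")` could run in a unit test, so a later change to the default that leaves the docstring untouched would fail fast. The sketch only handles string defaults written in this exact phrasing.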
