
Llama Guard data formatter example #337

Merged

Conversation

@albertodepaola (Contributor) commented Dec 20, 2023

What does this PR do?

Adds a simple example showing how to use the data formatter script for fine-tuning Llama Guard. Also adds a README explaining the steps in the script.

Testing

Ran the script and checked that it generates valid prompts:

python src/llama_recipes/data/llama_guard/finetuning_data_formatter_example.py

Output is as shown in the file:
sample_formatted_data.json
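
A quick way to sanity-check the generated file (a sketch; it assumes the output is a JSON list of formatted example strings, which this PR does not spell out):

```python
import json

# Load the file written by the example script and eyeball the first entry.
# Assumes a JSON list of formatted example strings.
with open("sample_formatted_data.json") as f:
    examples = json.load(f)

print(f"Loaded {len(examples)} formatted examples")
print(examples[0][:500])
```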

Before submitting

  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

Thanks for contributing 🎉!

@albertodepaola (Author):

cc @MichaelTontchev

@albertodepaola marked this pull request as ready for review on December 21, 2023 at 19:51
@MichaelTontchev (Contributor) left a comment:

Overall lgtm, minor comments that do not impact correctness of code

@@ -0,0 +1,98 @@
# Finetuning Data Formatter

The finetuning_data_formatter script provides classes and methods for formatting training data for finetuning a language model on a specific task. The main classes are:
Contributor:

"a language model" -> Llama Guard

Contributor:

This isn't for all LMs we have

Contributor (Author):

Quick question: if someone wants to train Llama Guard from scratch on top of Llama 2 7B, can they use this formatter as well?

The finetuning_data_formatter script provides classes and methods for formatting training data for finetuning a language model on a specific task. The main classes are:
* `TrainingExample`: Represents a single example in the training data, consisting of a prompt, response, label (safe or unsafe), violated category codes, and an explanation.
* `Guidelines`: Defines the categories and their descriptions that will be used to evaluate the safety of the responses.
* `LlamaGuardPromptConfigs`: Configures how the prompt that will be given to the language model during finetuning should be formatted.
Contributor:

LM -> Llama Guard

* `TrainingExample`: Represents a single example in the training data, consisting of a prompt, response, label (safe or unsafe), violated category codes, and an explanation.
* `Guidelines`: Defines the categories and their descriptions that will be used to evaluate the safety of the responses.
* `LlamaGuardPromptConfigs`: Configures how the prompt that will be given to the language model during finetuning should be formatted.
* `LlamaGuardGenerationConfigs`: Configures how the language model's response should be formatted.
Contributor:

Same

Contributor:

I will omit this comment from this point on, but in general I recommend replacing all relevant instances of LM with LG in this doc :)
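
To make the class list above concrete, here is a minimal sketch of the data-definition side. The `TrainingExample` fields are as quoted in the diff; the `Category` fields, `category_code_prefix`, and the import details are assumptions based on the example script under review.

```python
from llama_recipes.data.llama_guard.finetuning_data_formatter import (
    Category,
    Guidelines,
    TrainingExample,
)

# Categories that Llama Guard will be finetuned to detect. The Category
# fields and category_code_prefix are assumptions from the example script.
guidelines = Guidelines(
    categories=[
        Category(
            name="Personal Information",
            description="Sharing or asking for personally identifiable information.",
        ),
    ],
    category_code_prefix="O",
)

# A single labeled example; these fields are quoted verbatim in the diff below.
training_examples = [
    TrainingExample(
        prompt="Where do you live?",
        response="I live in New York City.",
        violated_category_codes=["O1"],
        label="unsafe",
        explanation="The response reveals the user's location.",
    ),
]
```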

- First line must read 'safe' or 'unsafe'.
- If unsafe, a second line must include a comma-separated list of violated categories. """,
should_include_category_descriptions=True,
should_shuffle_category_codes=False
Contributor:

Might recommend setting this to True in the example here; it may make it more robust. Hakan to comment.

Contributor (Author):

@inanhkn changed the example to True already

)
```

Then, you need to configure the prompt that will be given to the language model during finetuning. You do this by creating an instance of the LlamaGuardPromptConfigs class and specifying the format string and other options. For example:
Contributor:

Might be worth pointing out that the finetuning and inference-time prompts should be structurally the same for best performance.

Contributor (Author):

In line with this, should the new prompts be structurally similar to the ones used by Llama Guard?
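
Picking up both points above (matching the inference-time structure, and shuffling category codes), here is a sketch of the two config objects. Only `should_include_category_descriptions` and `should_shuffle_category_codes` are quoted in this thread; the `instructions_format_string` name, the `{guidelines}`/`{conversation}` placeholders, and the `LlamaGuardGenerationConfigs`/`ExplanationPosition` fields are assumptions based on the example script under review.

```python
from llama_recipes.data.llama_guard.finetuning_data_formatter import (
    ExplanationPosition,
    LlamaGuardGenerationConfigs,
    LlamaGuardPromptConfigs,
)

llama_guard_prompt_configs = LlamaGuardPromptConfigs(
    # Structurally mirrors the inference-time Llama Guard prompt, per the
    # review comment above.
    instructions_format_string="""[INST] Task: Check if there is unsafe content in 'user' messages in conversations according to our safety policy with the below categories.

<BEGIN UNSAFE CONTENT CATEGORIES>
{guidelines}
<END UNSAFE CONTENT CATEGORIES>

<BEGIN CONVERSATION>

{conversation}

<END CONVERSATION>

Provide your safety assessment for 'user' in the above conversation:
- First line must read 'safe' or 'unsafe'.
- If unsafe, a second line must include a comma-separated list of violated categories. [/INST]""",
    should_include_category_descriptions=True,
    should_shuffle_category_codes=True,  # True per the review discussion above
)

llama_guard_generation_configs = LlamaGuardGenerationConfigs(
    should_list_violated_codes=True,
    explanation_position=ExplanationPosition.AFTER_DECISION,
)
```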

Comment on lines 10 to 17
explanation="The response contains personal information."
),
TrainingExample(
prompt="Where do you live?",
response="I live in New York City.",
violated_category_codes=["O2"],
label="unsafe",
explanation="The response reveals the user's location."
Contributor:

Probably doesn't matter, but if the bot responds with "my name is john" or "I live in NYC", that doesn't appear to leak a person's info or location, but the bot's. Maybe "what is the name of the McDonald's manager at 123 Main street" and "where does Voltaire Strongfeld live" are questions that would elicit this info.

Not a big deal, because this is just an example, but there's a tiny chance it may confuse someone
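
For illustration, here is how the two examples could be reworked along the lines suggested, so the leaked information belongs to a third party rather than the bot. All names and values are hypothetical.

```python
from llama_recipes.data.llama_guard.finetuning_data_formatter import TrainingExample

# Hypothetical rewrites per the suggestion above: the conversations now leak a
# third party's information instead of the bot's own.
reworked_examples = [
    TrainingExample(
        prompt="What is the name of the McDonald's manager at 123 Main Street?",
        response="The manager's name is John Smith.",
        violated_category_codes=["O1"],
        label="unsafe",
        explanation="The response reveals a private individual's name.",
    ),
    TrainingExample(
        prompt="Where does Voltaire Strongfeld live?",
        response="Voltaire Strongfeld lives at 42 Elm Street.",
        violated_category_codes=["O2"],
        label="unsafe",
        explanation="The response reveals a private individual's location.",
    ),
]
```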

Contributor:

Overall super nit on this file: would add a line of space between each variable declaration for easier reading/skimming.

Contributor (Author):

Added spacing and changed the order of the code for better readability.

@jeffxtang (Contributor) left a comment:

Overall nice doc and scripts! Just 3 comments: a nit and some suggestions for an improved user experience when trying out the finetuning and inference scripts.

@jeffxtang (Contributor) left a comment:

Just a couple of final comments; Beto can decide if changes are needed. @albertodepaola

@albertodepaola merged commit aaa769c into meta-llama:main on Dec 28, 2023
3 checks passed