
Add example conversion script to convert hf to consolidated weight format #319

Merged: 5 commits from dongwang218:llama_from_hf into meta-llama:main on Dec 13, 2023

Conversation

@dongwang218 (Contributor) commented on Dec 8, 2023:

What does this PR do?

Adds an example script that converts Hugging Face Llama 2 weights back to the original consolidated checkpoint format, plus a compare_llama_weights.py sanity check. Fixes meta-llama/llama#570
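For context, the conversion boils down to two steps: rename each HF state-dict key back to the consolidated layout, and undo the rotary-embedding permutation that the HF export applies to the query/key projections. A minimal sketch of both pieces (the `inverse_permute` name/signature and the single-layer key map are illustrative, not necessarily the merged script's exact API):

```python
import torch

def inverse_permute(w: torch.Tensor, n_heads: int, dim1: int, dim2: int) -> torch.Tensor:
    # Undo the permutation HF's meta->HF converter applies to wq/wk,
    # restoring the original interleaved rotary layout.
    return (
        w.view(n_heads, 2, dim1 // n_heads // 2, dim2)
        .transpose(1, 2)
        .reshape(dim1, dim2)
    )

# HF key -> consolidated key (layer 0 shown; the pattern repeats per layer index).
KEY_MAP = {
    "model.embed_tokens.weight": "tok_embeddings.weight",
    "lm_head.weight": "output.weight",
    "model.norm.weight": "norm.weight",
    "model.layers.0.self_attn.q_proj.weight": "layers.0.attention.wq.weight",
    "model.layers.0.self_attn.k_proj.weight": "layers.0.attention.wk.weight",
    "model.layers.0.self_attn.v_proj.weight": "layers.0.attention.wv.weight",
    "model.layers.0.self_attn.o_proj.weight": "layers.0.attention.wo.weight",
    "model.layers.0.mlp.gate_proj.weight": "layers.0.feed_forward.w1.weight",
    "model.layers.0.mlp.down_proj.weight": "layers.0.feed_forward.w2.weight",
    "model.layers.0.mlp.up_proj.weight": "layers.0.feed_forward.w3.weight",
    "model.layers.0.input_layernorm.weight": "layers.0.attention_norm.weight",
    "model.layers.0.post_attention_layernorm.weight": "layers.0.ffn_norm.weight",
}
```

Only the wq and wk tensors need the inverse permutation; the remaining tensors are copied unchanged under their renamed keys.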

Feature/Issue validation/testing

Please describe the tests that you ran to verify your changes and a summary of the relevant results. Provide instructions so they can be reproduced, and list any relevant details of your test configuration.

  • Convert 7Bf checkpoint weights
mkdir ~/llama2/test7B; cp ~/llama2/llama-2-7b-chat/params.json ~/llama2/test7B
python -m llama_recipes.tools.convert_hf_weights_to_llama --model-path meta-llama/Llama-2-7b-chat-hf --output-dir ~/llama2/test7B --model-size 7B

python compare_llama_weights.py ~/llama2/test7B ~/llama2/llama-2-7b-chat
Comparing shards: 100%|████████████████| 1/1 [00:43<00:00, 43.96s/it]
Top 10 largest deltas:
  shard 0 tok_embeddings.weight: 2.9802322387695312e-08
  shard 0 output.weight: 2.9802322387695312e-08
  shard 0 layers.0.attention.wq.weight: 2.9802322387695312e-08
  shard 0 layers.0.attention.wk.weight: 2.9802322387695312e-08
  shard 0 layers.0.attention.wv.weight: 2.9802322387695312e-08
  shard 0 layers.0.attention.wo.weight: 2.9802322387695312e-08
  shard 0 layers.0.feed_forward.w1.weight: 2.9802322387695312e-08
  shard 0 layers.0.feed_forward.w2.weight: 2.9802322387695312e-08
  shard 0 layers.0.feed_forward.w3.weight: 2.9802322387695312e-08
  shard 0 layers.1.attention.wq.weight: 2.9802322387695312e-08
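The deltas above come from compare_llama_weights.py, which reports the largest per-tensor differences between the converted and the reference shards. A minimal sketch of that kind of comparison (file naming follows the consolidated.XX.pth convention; the function name is an assumption):

```python
import torch
from tqdm import tqdm

def largest_deltas(dir_a: str, dir_b: str, num_shards: int, top_k: int = 10):
    deltas = []
    for shard in tqdm(range(num_shards), desc="Comparing shards"):
        a = torch.load(f"{dir_a}/consolidated.{shard:02d}.pth", map_location="cpu")
        b = torch.load(f"{dir_b}/consolidated.{shard:02d}.pth", map_location="cpu")
        for key in a:
            # Largest element-wise absolute difference for this tensor.
            delta = (a[key].float() - b[key].float()).abs().max().item()
            deltas.append((shard, key, delta))
    return sorted(deltas, key=lambda t: t[2], reverse=True)[:top_k]
```

Deltas of about 3e-08 (roughly 2^-25) are at the level of float32 rounding noise, so the round trip is numerically faithful.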
  • Inference test with the converted 7Bf checkpoint
torchrun --nproc_per_node 1 example_chat_completion.py --ckpt_dir ~/llama2/test7B --tokenizer_path ~/llama2/tokenizer.model

==================================

System: You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.

If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.

User: Write a brief birthday message to John

> Assistant:  Of course! Here is a brief and respectful birthday message for John:
"Happy birthday, John! I hope your day is filled with joy, love, and all your favorite things. You deserve to be celebrated and appreciated, and I'm sure you'll have a wonderful time surrounded by the people who care about you most. Here's to another year of growth, happiness, and success! 🎉🎂"


  • Convert 70B checkpoint weights
mkdir ~/llama2/test70B; cp ~/llama2/llama-2-70b-chat/params.json ~/llama2/test70B
python -m llama_recipes.tools.convert_hf_weights_to_llama --model-path meta-llama/Llama-2-70b-chat-hf --output-dir ~/llama2/test70B --model-size 70B

python compare_llama_weights.py ~/llama2/test70B ~/llama2/llama-2-70b-chat
Comparing shards: 100%|██████████| 8/8 [01:41<00:00, 12.65s/it]
Top 10 largest deltas:
  shard 0 tok_embeddings.weight: 2.9802322387695312e-08
  shard 0 output.weight: 2.9802322387695312e-08
  shard 0 layers.0.attention.wq.weight: 2.9802322387695312e-08
  shard 0 layers.0.attention.wk.weight: 2.9802322387695312e-08
  shard 0 layers.0.attention.wv.weight: 2.9802322387695312e-08
  shard 0 layers.0.attention.wo.weight: 2.9802322387695312e-08
  shard 0 layers.0.feed_forward.w1.weight: 2.9802322387695312e-08
  shard 0 layers.0.feed_forward.w2.weight: 2.9802322387695312e-08
  shard 0 layers.0.feed_forward.w3.weight: 2.9802322387695312e-08
  shard 0 layers.0.attention_norm.weight: 2.9802322387695312e-08
  • Inference test with the converted 70B checkpoint
torchrun --nproc_per_node 8 example_chat_completion.py --ckpt_dir ~/llama2/test70B --tokenizer_path ~/llama2/tokenizer.model

==================================

System: You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while being safe. Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, or illegal content. Please ensure that your responses are socially unbiased and positive in nature.

If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.

User: Write a brief birthday message to John

> Assistant:  "Dear John, I hope your birthday is filled with joy, love, and all your favorite things. May this year bring you success, happiness, and countless moments to cherish. Wishing you a wonderful day and a brilliant year ahead! 🎉🎂❤️"
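For reference, the 70B consolidated format ships as eight model-parallel shards, which is why the inference test above runs with --nproc_per_node 8. A sketch of the sharding convention, assuming the usual column-/row-parallel split dimensions from Meta's llama code (norms and other unlisted tensors are replicated):

```python
import torch

# Split dimension per weight: column-parallel weights split on dim 0,
# row-parallel on dim 1; anything unlisted is replicated on every shard.
SPLIT_DIM = {
    "tok_embeddings.weight": 1,  # ParallelEmbedding splits the embedding dim
    "output.weight": 0,
    "attention.wq.weight": 0,
    "attention.wk.weight": 0,
    "attention.wv.weight": 0,
    "attention.wo.weight": 1,
    "feed_forward.w1.weight": 0,
    "feed_forward.w2.weight": 1,
    "feed_forward.w3.weight": 0,
}

def shard_tensor(name: str, tensor: torch.Tensor, num_shards: int) -> list:
    for suffix, dim in SPLIT_DIM.items():
        if name.endswith(suffix):
            return list(torch.chunk(tensor, num_shards, dim=dim))
    return [tensor] * num_shards  # e.g. *_norm.weight is replicated
```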

Before submitting

Thanks for contributing 🎉!

@mreso (Contributor) commented:

Hi! Overall it LGTM; I left a minor comment regarding the fine-tuning notation for the shards. Would you be able to move the main routines of your scripts into src/llama_recipes/tools and import them in your examples/hf_llama_conversion/*.py script? That way users who install llama-recipes through pip can leverage them (through python -m llama_recipes.tools.script_name params) without needing to clone the repo. See https://github.com/facebookresearch/llama-recipes/blob/main/examples/finetuning.py as an example.
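A sketch of the wrapper pattern being suggested, assuming the packaged module exposes a main entry point (the function name is an assumption):

```python
# examples/hf_llama_conversion/convert_llama_weights_from_hf.py
from llama_recipes.tools.convert_hf_weights_to_llama import main

if __name__ == "__main__":
    main()
```

With the logic living in the package, pip users can run python -m llama_recipes.tools.convert_hf_weights_to_llama directly, while the example file remains a discoverable entry point in the repo.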

@dongwang218 requested a review from mreso on December 13, 2023, 19:09.
@mreso (Contributor) commented:

LGTM

@mreso merged commit cf0c589 into meta-llama:main on Dec 13, 2023 (3 checks passed).
@dongwang218 deleted the llama_from_hf branch on December 13, 2023, 23:11.