DOC: Explain uninitialized weights warning #2369
base: main
Conversation
Users sometimes get confused by the warning from transformers that some weights are uninitialized and need to be trained when they use models for classification. A recent example is huggingface#2367. Even though the warning does not come from PEFT, let's add a section to the docs to explain this warning, as the situation is a bit different here.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Great explanation!
@@ -148,6 +148,34 @@ For inference, load the base model first and resize it the same way you did befo

For a complete example, please check out [this notebook](https://github.com/huggingface/peft/blob/main/examples/causal_language_modeling/peft_lora_clm_with_additional_tokens.ipynb).

### Getting a warning about "weights not being initialized from the model checkpoint"

When you load your PEFT model which has been trained on a classification task, you may get a warning like:
Suggested change:
- When you load your PEFT model which has been trained on a classification task, you may get a warning like:
+ When you load your PEFT model which has been trained on a task (for example, classification), you may get a warning like:
> Some weights of LlamaForSequenceClassification were not initialized from the model checkpoint at meta-llama/Llama-3.2-1B and are newly initialized: ['score.weight']. You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Although this looks scary, it is most likely nothing to worry about. It is in fact not a PEFT specific warning, instead it comes from `transformers`. The reason why you get is probably because you used something like `AutoModelForSequenceClassification`. This will attach a randomly initialized classification head to the base model (called `"score"` in this case). This head must be trained to produce sensible predictions, which is what the warning is telling you.
Suggested change:
- Although this looks scary, it is most likely nothing to worry about. It is in fact not a PEFT specific warning, instead it comes from `transformers`. The reason why you get is probably because you used something like `AutoModelForSequenceClassification`. This will attach a randomly initialized classification head to the base model (called `"score"` in this case). This head must be trained to produce sensible predictions, which is what the warning is telling you.
+ Although this looks scary, it is most likely nothing to worry about. This warning comes from Transformers, and it isn't a PEFT specific warning. It lets you know that a randomly initialized classification head (`score`) is attached to the base model, and the head must be trained to produce sensible predictions.
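For illustration, the warning above would be triggered by a call like this (a minimal sketch; the checkpoint name is taken from the warning message):

```python
from transformers import AutoModelForSequenceClassification

# This attaches a randomly initialized classification head ("score" for Llama
# models) to the base model, which is what the transformers warning is about.
model = AutoModelForSequenceClassification.from_pretrained("meta-llama/Llama-3.2-1B")
```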
When you get this warning _before_ training the model, there is thus no cause for concern. PEFT will automatically take care of making the classification head trainable if you correctly passed the `task_type` argument to the PEFT config, e.g. like so:
Suggested change:
- When you get this warning _before_ training the model, there is thus no cause for concern. PEFT will automatically take care of making the classification head trainable if you correctly passed the `task_type` argument to the PEFT config, e.g. like so:
+ When you get this warning _before_ training the model, PEFT automatically takes care of making the classification head trainable if you correctly passed the `task_type` argument to the PEFT config.
```python
lora_config = LoraConfig(..., task_type=TaskType.SEQ_CLS)
```
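For context, here is a minimal end-to-end sketch of this setup (the checkpoint name follows the warning above; `num_labels=2` is an assumption for illustration):

```python
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForSequenceClassification

# num_labels=2 is an assumption for this sketch
base_model = AutoModelForSequenceClassification.from_pretrained(
    "meta-llama/Llama-3.2-1B", num_labels=2
)
lora_config = LoraConfig(task_type=TaskType.SEQ_CLS)
model = get_peft_model(base_model, lora_config)
# With task_type=SEQ_CLS, PEFT marks the classification head as trainable.
model.print_trainable_parameters()
```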
If your classification head does not follow the usual naming conventions from `transformers` (which is rare), you have to explicitly tell PEFT how the head is called using the `modules_to_save` argument:
Suggested change:
- If your classification head does not follow the usual naming conventions from `transformers` (which is rare), you have to explicitly tell PEFT how the head is called using the `modules_to_save` argument:
+ If your classification head does not follow the usual naming conventions from Transformers (which is rare), you have to explicitly tell PEFT the name of the head in `modules_to_save`.
```python
lora_config = LoraConfig(..., modules_to_save=["name-of-classification-head"])
```
To check the name of the classification head, just print the model, it should be the last module.
Suggested change:
- To check the name of the classification head, just print the model, it should be the last module.
+ To check the name of the classification head, print the model and it should be the last module.
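For example, for a Llama-based sequence classification model, the printed output would end with something like the following (output abbreviated and illustrative):

```python
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("meta-llama/Llama-3.2-1B")
print(model)
# ...
#   (score): Linear(in_features=2048, out_features=2, bias=False)
# Here, "score" is the name of the classification head.
```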
If you get this warning from you inference code, i.e. _after_ training model, there is also nothing to worry about. When you load the PEFT model, you always first have to load the `transformers` model. Since `transformers` does not know that you will load PEFT weights afterwards, it still gives the warning.

As always, it is best practice to ensure that the model works correctly for inference by running some validation on it. But the fact that you see this warning is no cause for concern.
Suggested change:
- If you get this warning from you inference code, i.e. _after_ training model, there is also nothing to worry about. When you load the PEFT model, you always first have to load the `transformers` model. Since `transformers` does not know that you will load PEFT weights afterwards, it still gives the warning.
- As always, it is best practice to ensure that the model works correctly for inference by running some validation on it. But the fact that you see this warning is no cause for concern.
+ If you get this warning from your inference code, i.e. _after_ training the model, there is also nothing to worry about. When you load the PEFT model, you always have to load the Transformers model first. Since Transformers does not know that you will load PEFT weights afterwards, it still gives the warning.
+ As always, it is best practice to ensure the model works correctly for inference by running some validation on it.
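For completeness, a minimal inference sketch under the same assumptions (the adapter path `path/to/peft-adapter` is a placeholder):

```python
from peft import PeftModel
from transformers import AutoModelForSequenceClassification

# This line prints the warning again; that is expected, since transformers
# does not yet know that the trained head will come from the PEFT checkpoint.
base_model = AutoModelForSequenceClassification.from_pretrained("meta-llama/Llama-3.2-1B")
# Loading the PEFT adapter restores the trained classification head weights.
model = PeftModel.from_pretrained(base_model, "path/to/peft-adapter")
```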