In this repository, we explore techniques for fine-tuning Large Language Models (LLMs) using:
- LoRA (Low-Rank Adaptation)
- DeepSpeed
- Custom Trainer optimized for Gaudi-v2 using the Habana framework
Inside the Gaudi-v2 Docker environment, install the necessary dependencies by running:

```bash
pip install -r requirements.txt
```
To initiate the fine-tuning process, execute the following command:
```bash
PT_HPU_LAZY_MODE=0 python finetune.py --config-name=finetune_lora.yaml
```
Explanation:
- `PT_HPU_LAZY_MODE=0`: Enables eager mode, as lazy mode is currently not supported on Gaudi-v2.
- `--config-name`: Specifies the configuration file. Detailed configurations can be found in `config/finetune_lora.yaml`.
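If `finetune.py` follows the Hydra convention that the `--config-name` flag suggests, the configuration is resolved roughly as in the sketch below (an assumption; the decorator arguments and structure are illustrative, not taken from this repository):

```python
# Hypothetical sketch of config loading in finetune.py, assuming Hydra is used
# (suggested by the --config-name CLI flag). All names here are illustrative.
import hydra
from omegaconf import DictConfig, OmegaConf


@hydra.main(config_path="config", config_name="finetune_lora", version_base=None)
def main(cfg: DictConfig) -> None:
    # Print the fully resolved configuration (LoRA, DeepSpeed, trainer settings, ...)
    print(OmegaConf.to_yaml(cfg))
    # ... build the model, apply LoRA, and launch training here ...


if __name__ == "__main__":
    main()
```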
You can customize the LoRA parameters (e.g., rank, scaling factor, and dropout rate) in the configuration file `config/finetune_lora.yaml`:
```yaml
LoRA:
  r: 8          # Rank of the LoRA matrices
  alpha: 32     # Scaling factor
  dropout: 0.05 # Dropout probability
```
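For reference, these values map directly onto a standard LoRA setup. The sketch below shows one common way to apply them with the Hugging Face PEFT library; the model name and target modules are illustrative assumptions, not taken from this repository:

```python
# Hypothetical mapping of the LoRA config onto a model using the PEFT library
# (an assumption; this repository's actual wiring may differ).
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")  # example model

lora_config = LoraConfig(
    r=8,                                  # LoRA.r: rank of the LoRA matrices
    lora_alpha=32,                        # LoRA.alpha: scaling factor
    lora_dropout=0.05,                    # LoRA.dropout: dropout probability
    target_modules=["q_proj", "v_proj"],  # example attention projections
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the low-rank adapter weights are trainable
```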
DeepSpeed parameters can be configured in the file `config/ds_config.json`. Example:
```json
{
    "zero_optimization": {
        "stage": 1,
        "offload_optimizer": {
            "device": "none",
            "pin_memory": true
        }
    }
}
```
This configuration enables ZeRO stage-1 optimization without offloading the optimizer. You can switch to stage-2 or stage-3 for additional memory savings.
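As a rough illustration, the DeepSpeed config file is typically wired into training through the trainer's arguments; the sketch below uses the standard Hugging Face `TrainingArguments` interface with illustrative hyperparameters (the exact integration in this repository may differ):

```python
# Hypothetical sketch: pointing a Hugging Face-style trainer at the DeepSpeed
# config (an assumption; hyperparameters are illustrative).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="outputs",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    bf16=True,                          # bf16 is the usual training precision on Gaudi-v2
    deepspeed="config/ds_config.json",  # path to the ZeRO configuration shown above
)
```

Stage-2 additionally partitions gradients, and stage-3 also partitions the parameters themselves, trading extra communication for lower memory per device.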
An example of the custom trainer implementation can be found in `dataloader.py`. This code shows how a custom trainer is wrapped by the Gaudi-v2 Trainer class.
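For orientation, a custom trainer wrapped around a Gaudi trainer class typically looks like the sketch below. It assumes optimum-habana's `GaudiTrainer` as the base class and an overridden dataloader, both of which are illustrative assumptions rather than a description of `dataloader.py`:

```python
# Hypothetical custom trainer built on optimum-habana's GaudiTrainer
# (an assumption; the actual implementation lives in dataloader.py).
import torch
from optimum.habana import GaudiTrainer


class CustomTrainer(GaudiTrainer):
    """Trainer that customizes how the training dataloader is built."""

    def get_train_dataloader(self) -> torch.utils.data.DataLoader:
        # Example customization: extra workers and pinned memory for host-to-device copies.
        return torch.utils.data.DataLoader(
            self.train_dataset,
            batch_size=self.args.per_device_train_batch_size,
            collate_fn=self.data_collator,
            num_workers=4,
            pin_memory=True,
            shuffle=True,
        )
```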