From efc1074df1702559398ad0f3cb33c027bb1e23ee Mon Sep 17 00:00:00 2001
From: cwj
Date: Fri, 1 Mar 2024 17:01:17 +0800
Subject: [PATCH] Fix doc path

Signed-off-by: weijingchen
Signed-off-by: cwj
---
 README.md               |   2 +-
 .../ChatGLM-6B_ds.ipynb | 463 ------------------
 2 files changed, 1 insertion(+), 464 deletions(-)
 delete mode 100644 doc/tutorial/parameter_efficient_llm/ChatGLM-6B_ds.ipynb

diff --git a/README.md b/README.md
index 086a4c8..20a5f0c 100644
--- a/README.md
+++ b/README.md
@@ -24,6 +24,6 @@ Use [FATE-LLM deployment packages](https://github.com/FederatedAI/FATE/wiki/Down
 ## Quick Start
 
 - [Offsite-tuning Tutorial: Model Definition and Job Submission](./doc/tutorial/offsite_tuning/Offsite_tuning_tutorial.ipynb)
-- [Federated ChatGLM-6B Training](./doc/tutorial/parameter_efficient_llm/ChatGLM-6B_ds.ipynb)
+- [Federated ChatGLM3-6B Training](./doc/tutorial/parameter_efficient_llm/ChatGLM3-6B_ds.ipynb)
 - [Builtin Models In PELLM](./doc/tutorial/builtin_models.md)
 - [Offsite Tuning Tutorial](./doc/tutorial/offsite_tuning/Offsite_tuning_tutorial.ipynb)
\ No newline at end of file
diff --git a/doc/tutorial/parameter_efficient_llm/ChatGLM-6B_ds.ipynb b/doc/tutorial/parameter_efficient_llm/ChatGLM-6B_ds.ipynb
deleted file mode 100644
index f3a43c1..0000000
--- a/doc/tutorial/parameter_efficient_llm/ChatGLM-6B_ds.ipynb
+++ /dev/null
@@ -1,463 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "# Federated ChatGLM Tuning with Parameter Efficient methods in FATE-LLM"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "In this tutorial, we will demonstrate how to efficiently train a federated ChatGLM-6B model with deepspeed using the FATE-LLM framework. In FATE-LLM, we introduce the \"pellm\" (Parameter Efficient Large Language Model) module, specifically designed for federated learning with large language models. It enables parameter-efficient methods in federated learning, reducing communication overhead while maintaining model performance. In this tutorial we particularly focus on ChatGLM-6B, and we will also emphasize the use of the Adapter mechanism for fine-tuning ChatGLM-6B, which allows us to effectively reduce communication volume and improve overall efficiency.\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## FATE-LLM: ChatGLM-6B\n",
-    "\n",
-    "### ChatGLM-6B\n",
-    "ChatGLM-6B is a large transformer-based language model with 6.2 billion parameters, trained on about 1T tokens of Chinese and English corpus. It is an open bilingual language model based on the General Language Model (GLM) architecture. You can download the pretrained model from [here](https://huggingface.co/THUDM/chatglm-6b), or let the program download it automatically when you use it later; a minimal loading snippet is shown below.\n"
-   ]
-  },
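-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "As a quick sanity check outside of FATE, the pretrained model can be loaded with the plain `transformers` API (the same API used in the Inference section below). This is a minimal sketch: passing the hub id `THUDM/chatglm-6b` triggers the automatic download, a local path works as well, and `trust_remote_code=True` is required because ChatGLM ships custom modeling code."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# Minimal sketch: load ChatGLM-6B directly with transformers.\n",
-    "# Using the hub id downloads the model automatically; a local path also works.\n",
-    "from transformers import AutoModel, AutoTokenizer\n",
-    "\n",
-    "chatglm_path = \"THUDM/chatglm-6b\"  # or the path of your local download\n",
-    "tokenizer = AutoTokenizer.from_pretrained(chatglm_path, trust_remote_code=True)\n",
-    "model = AutoModel.from_pretrained(chatglm_path, trust_remote_code=True).half().eval()"
-   ]
-  },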
\n", - " \n", - "
" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Experiment Setting" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Before running experiment, please make sure that [FATE-LLM Cluster](https://github.com/FederatedAI/FATE/wiki/Download#llm%E9%83%A8%E7%BD%B2%E5%8C%85) has been deployed. " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Dataset: Advertising Text Generation\n", - "\n", - "This is an advertising test generateion dataset, you can download dataset from the following links and place it in the examples/data folder. \n", - "- [data link 1](https://drive.google.com/file/d/13_vf0xRTQsyneRKdD1bZIr93vBGOczrk/view)\n", - "- [data link 2](https://cloud.tsinghua.edu.cn/f/b3f119a008264b1cabd1/?dl=1) \n", - "\n", - "You can refer to following link for more details about [data](https://aclanthology.org/D19-1321.pdf)" - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "metadata": {}, - "outputs": [], - "source": [ - "import pandas as pd\n", - "df = pd.read_json('${fate_install}/examples/data/AdvertiseGen/train.json', lines=True)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### ChatGLM-6B with Adapter\n", - "\n", - "In this section, we will guide you through the process of finetuning ChatGLM-6B with adapters using the FATE-LLM framework. Before starting this section, we recommend that you read through this tutorial first: [Model Customization](https://github.com/FederatedAI/FATE/blob/master/doc/tutorial/pipeline/nn_tutorial/Homo-NN-Customize-Model.ipynb)." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "ChatGLM model is located on fate_llm/model_zoo/chatglm.py, can be use directly" - ] - }, - { - "cell_type": "code", - "execution_count": 7, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "albert.py bert.py deberta.py gpt2.py\t\t\t __pycache__\r\n", - "bart.py chatglm.py distilbert.py parameter_efficient_llm.py roberta.py\r\n" - ] - } - ], - "source": [ - "! ls ../../../fate/python/fate_llm/model_zoo/pellm" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### Adapters" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "We can directly use adapters from the peft. See details for adapters on this page [Adapter Methods](https://huggingface.co/docs/peft/index) for more details. 
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "### ChatGLM-6B with Adapter\n",
-    "\n",
-    "In this section, we will guide you through the process of finetuning ChatGLM-6B with adapters using the FATE-LLM framework. Before starting this section, we recommend that you read through this tutorial first: [Model Customization](https://github.com/FederatedAI/FATE/blob/master/doc/tutorial/pipeline/nn_tutorial/Homo-NN-Customize-Model.ipynb)."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "The ChatGLM model is located at fate_llm/model_zoo/pellm/chatglm.py and can be used directly:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 7,
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "albert.py bert.py deberta.py gpt2.py\t\t\t __pycache__\r\n",
-      "bart.py chatglm.py distilbert.py parameter_efficient_llm.py roberta.py\r\n"
-     ]
-    }
-   ],
-   "source": [
-    "! ls ../../../fate/python/fate_llm/model_zoo/pellm"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "#### Adapters"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "We can directly use adapters from peft; see the [Adapter Methods](https://huggingface.co/docs/peft/index) page for more details. By specifying the adapter name and the adapter\n",
-    "config dict, we can insert adapters into our language models:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 12,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "from peft import LoraConfig, TaskType\n",
-    "\n",
-    "# define lora config\n",
-    "lora_config = LoraConfig(\n",
-    "    task_type=TaskType.SEQ_CLS,\n",
-    "    inference_mode=False, r=8, lora_alpha=32, lora_dropout=0.1,\n",
-    "    target_modules=['c_attn'],\n",
-    ")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "#### Init ChatGLM Model"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 14,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "import torch as t\n",
-    "from pipeline import fate_torch_hook\n",
-    "from pipeline.component.nn import save_to_fate_llm\n",
-    "fate_torch_hook(t)\n",
-    "\n",
-    "model_path = \"your download chatglm path\"\n",
-    "model = t.nn.Sequential(\n",
-    "    t.nn.CustModel(module_name='pellm.chatglm', class_name='ChatGLMForConditionalGeneration',\n",
-    "                   peft_config=lora_config.to_dict(), peft_type='LoraConfig',\n",
-    "                   pretrained_path=model_path)\n",
-    ")\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "**During the training process, all weights of the pretrained language model are frozen, and only the weights of the adapters are trainable. Thus, FATE-LLM only trains the adapter weights during local training and only aggregates the adapter weights in the federation process.**\n",
-    "\n",
-    "See the [Adapters Overview](https://huggingface.co/docs/peft/index) for the adapters currently available. The sketch in the next cell illustrates that only the adapter weights remain trainable.\n"
-   ]
-  },
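-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "As a hedged sanity check outside of FATE (a sketch using plain `transformers`/`peft` rather than the pellm wrapper above), wrapping the base model with `get_peft_model` and counting the parameters with `requires_grad=True` shows that only the LoRA weights are trainable, and hence only they would be exchanged:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# Sketch: verify that only adapter weights are trainable after applying peft.\n",
-    "# Uses plain transformers/peft, not the pellm wrapper above.\n",
-    "from peft import LoraConfig, TaskType, get_peft_model\n",
-    "from transformers import AutoModel\n",
-    "\n",
-    "base_model = AutoModel.from_pretrained(\"THUDM/chatglm-6b\", trust_remote_code=True)\n",
-    "check_config = LoraConfig(\n",
-    "    task_type=TaskType.CAUSAL_LM,\n",
-    "    inference_mode=False, r=8, lora_alpha=32, lora_dropout=0.1,\n",
-    "    target_modules=['query_key_value'],\n",
-    ")\n",
-    "peft_model = get_peft_model(base_model, check_config)\n",
-    "trainable = sum(p.numel() for p in peft_model.parameters() if p.requires_grad)\n",
-    "total = sum(p.numel() for p in peft_model.parameters())\n",
-    "print(f\"trainable params: {trainable} / {total} ({100 * trainable / total:.4f}%)\")"
-   ]
-  },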
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "#### Init DeepSpeed Config"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 15,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "ds_config = {\n",
-    "    \"train_micro_batch_size_per_gpu\": 1,\n",
-    "    \"optimizer\": {\n",
-    "        \"type\": \"Adam\",\n",
-    "        \"params\": {\n",
-    "            \"lr\": 5e-4\n",
-    "        }\n",
-    "    },\n",
-    "    \"fp16\": {\n",
-    "        \"enabled\": True\n",
-    "    },\n",
-    "    \"zero_optimization\": {\n",
-    "        \"stage\": 2,\n",
-    "        \"allgather_partitions\": True,\n",
-    "        \"allgather_bucket_size\": 5e8,\n",
-    "        \"overlap_comm\": False,\n",
-    "        \"reduce_scatter\": True,\n",
-    "        \"reduce_bucket_size\": 5e8,\n",
-    "        \"contiguous_gradients\": True\n",
-    "    }\n",
-    "}\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "### Submit Federated Task\n",
-    "To run a federated task, please make sure to use fate>=v1.11.2 and deploy it on GPU machines. Before running this code, make sure the training data paths are already bound. The following code should be copied into a script and run from the command line, e.g. \"python federated_chatglm.py\"."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "You can use this script to submit the job; we won't run it here, because training takes a long time and generates a long log."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "import torch as t\n",
-    "import os\n",
-    "from pipeline import fate_torch_hook\n",
-    "from pipeline.component import HomoNN\n",
-    "from pipeline.backend.pipeline import PipeLine\n",
-    "from pipeline.component import Reader\n",
-    "from pipeline.interface import Data\n",
-    "from pipeline.runtime.entity import JobParameters\n",
-    "\n",
-    "fate_torch_hook(t)\n",
-    "\n",
-    "\n",
-    "guest_0 = 9999\n",
-    "host_1 = 10000\n",
-    "pipeline = PipeLine().set_initiator(role='guest', party_id=guest_0).set_roles(guest=guest_0, host=host_1,\n",
-    "                                                                              arbiter=guest_0)\n",
-    "data_guest = {\"name\": \"ad_guest\", \"namespace\": \"experiment\"}\n",
-    "data_host = {\"name\": \"ad_host\", \"namespace\": \"experiment\"}\n",
-    "guest_data_path = \"${fate_install}/examples/data/AdvertiseGen/train.json_guest\"\n",
-    "host_data_path = \"${fate_install}/examples/data/AdvertiseGen/train.json_host\"\n",
-    "# make sure the guest's and host's training data are already bound\n",
-    "\n",
-    "reader_0 = Reader(name=\"reader_0\")\n",
-    "reader_0.get_party_instance(role='guest', party_id=guest_0).component_param(table=data_guest)\n",
-    "reader_0.get_party_instance(role='host', party_id=host_1).component_param(table=data_host)\n",
-    "\n",
-    "## Add your pretrained model path here; the model & tokenizer will be loaded from this path\n",
-    "\n",
-    "from peft import LoraConfig, TaskType\n",
-    "lora_config = LoraConfig(\n",
-    "    task_type=TaskType.CAUSAL_LM,\n",
-    "    inference_mode=False, r=8, lora_alpha=32, lora_dropout=0.1,\n",
-    "    target_modules=['query_key_value'],\n",
-    ")\n",
-    "ds_config = {\n",
-    "    \"train_micro_batch_size_per_gpu\": 1,\n",
-    "    \"optimizer\": {\n",
-    "        \"type\": \"Adam\",\n",
-    "        \"params\": {\n",
-    "            \"lr\": 5e-4\n",
-    "        }\n",
-    "    },\n",
-    "    \"fp16\": {\n",
-    "        \"enabled\": True\n",
-    "    },\n",
-    "    \"zero_optimization\": {\n",
-    "        \"stage\": 2,\n",
-    "        \"allgather_partitions\": True,\n",
-    "        \"allgather_bucket_size\": 5e8,\n",
-    "        \"overlap_comm\": False,\n",
-    "        \"reduce_scatter\": True,\n",
-    "        \"reduce_bucket_size\": 5e8,\n",
-    "        \"contiguous_gradients\": True\n",
-    "    }\n",
-    "}\n",
-    "\n",
-    "model_path = \"your download chatglm path\"\n",
-    "from pipeline.component.homo_nn import DatasetParam, TrainerParam\n",
-    "model = t.nn.Sequential(\n",
-    "    t.nn.CustModel(module_name='pellm.chatglm', class_name='ChatGLMForConditionalGeneration',\n",
-    "                   peft_config=lora_config.to_dict(), peft_type='LoraConfig',\n",
-    "                   pretrained_path=model_path)\n",
-    ")\n",
-    "\n",
-    "# DatasetParam\n",
-    "dataset_param = DatasetParam(dataset_name='glm_tokenizer', text_max_length=64, tokenizer_name_or_path=model_path,\n",
-    "                             padding_side=\"left\")\n",
-    "# TrainerParam\n",
-    "trainer_param = TrainerParam(trainer_name='fedavg_trainer', epochs=5, batch_size=4,\n",
-    "                             checkpoint_save_freqs=1, pin_memory=False,\n",
-    "                             task_type=\"seq_2_seq_lm\",\n",
-    "                             data_loader_worker=8,\n",
-    "                             save_to_local_dir=True,  # pay attention to this parameter\n",
-    "                             collate_fn=\"DataCollatorForSeq2Seq\")\n",
-    "\n",
-    "\n",
-    "nn_component = HomoNN(name='nn_0', model=model, ds_config=ds_config)\n",
-    "\n",
-    "# set parameter for client 1\n",
-    "nn_component.get_party_instance(role='guest', party_id=guest_0).component_param(\n",
-    "    dataset=dataset_param,\n",
-    "    trainer=trainer_param,\n",
-    "    torch_seed=100\n",
-    ")\n",
-    "\n",
-    "# set parameter for client 2\n",
-    "nn_component.get_party_instance(role='host', party_id=host_1).component_param(\n",
-    "    dataset=dataset_param,\n",
-    "    trainer=trainer_param,\n",
-    "    torch_seed=100\n",
-    ")\n",
-    "\n",
-    "# set parameter for server\n",
-    "nn_component.get_party_instance(role='arbiter', party_id=guest_0).component_param(\n",
-    "    trainer=trainer_param\n",
-    ")\n",
-    "\n",
-    "pipeline.add_component(reader_0)\n",
-    "pipeline.add_component(nn_component, data=Data(train_data=reader_0.output.data))\n",
-    "pipeline.compile()\n",
-    "\n",
-    "pipeline.fit(JobParameters(task_conf={\n",
-    "    \"nn_0\": {\n",
-    "        \"launcher\": \"deepspeed\",\n",
-    "        \"world_size\": 8  # world_size is the number of GPUs used for training in a single client\n",
-    "    }\n",
-    "}))\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "### Training With P-Tuning V2 Adapter"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "To use another adapter like P-Tuning V2, only a slight change is needed:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 20,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "from pipeline.component.homo_nn import DatasetParam, TrainerParam\n",
-    "model = t.nn.Sequential(\n",
-    "    t.nn.CustModel(module_name='pellm.chatglm', class_name='ChatGLMForConditionalGeneration',\n",
-    "                   pre_seq_len=128,  # only this parameter is needed\n",
-    "                   pretrained_path=model_path)\n",
-    ")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "### Inference"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Models trained with FATE-LLM can be found under the directory `${fate_install}/fateflow/model/$jobids/$cpn_name/{model.pkl, checkpoint_xxx.pkl/adapter_model.bin}`; users must make sure that \"save_to_local_dir=True\" was set during training. \n",
-    "The following code is an example of loading trained lora adapter weights:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "import json\n",
-    "import sys\n",
-    "import torch\n",
-    "from peft import PeftModel, PeftConfig, LoraConfig, TaskType, get_peft_model\n",
-    "from transformers import AutoModel, AutoTokenizer\n",
-    "\n",
-    "\n",
-    "def load_model(pretrained_model_path):\n",
-    "    _tokenizer = AutoTokenizer.from_pretrained(pretrained_model_path, trust_remote_code=True)\n",
-    "    _model = AutoModel.from_pretrained(pretrained_model_path, trust_remote_code=True)\n",
-    "\n",
-    "    _model = _model.half()\n",
-    "    _model = _model.eval()\n",
-    "\n",
-    "    return _model, _tokenizer\n",
-    "\n",
-    "\n",
-    "def load_data(data_path):\n",
-    "    with open(data_path, \"r\") as fin:\n",
-    "        for _l in fin:\n",
-    "            yield json.loads(_l.strip())\n",
-    "\n",
-    "chatglm_model_path = \"\"\n",
-    "model, tokenizer = load_model(chatglm_model_path)\n",
-    "\n",
-    "test_data_path = \"${fate_install}/examples/data/AdvertiseGen/dev.json\"\n",
-    "dataset = load_data(test_data_path)\n",
-    "\n",
-    "peft_path = \"your trained adapter weights path\"\n",
-    "peft_config = LoraConfig(\n",
-    "    task_type=TaskType.CAUSAL_LM,\n",
-    "    inference_mode=False, r=8, lora_alpha=32, lora_dropout=0.1,\n",
-    "    target_modules=['query_key_value'],\n",
-    ")\n",
-    "\n",
-    "model = get_peft_model(model, peft_config)\n",
-    "model.load_state_dict(torch.load(peft_path), strict=False)\n",
-    "model = model.half()\n",
-    "model.eval()\n",
-    "\n",
-    "# only the adapter weights should still require grad\n",
-    "for p in model.parameters():\n",
-    "    if p.requires_grad:\n",
-    "        print(p)\n",
-    "\n",
-    "model.cuda(\"cuda:0\")\n",
-    "\n",
-    "content = \"advertisement keywords\"\n",
-    "model.chat(tokenizer, content, do_sample=False)"
-   ]
-  },
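-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "As a final sketch, the `load_data` helper above can be reused to run the tuned model over the dev set. This assumes the AdvertiseGen record layout `{\"content\": ..., \"summary\": ...}`, where `content` holds the keywords and `summary` the reference text:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# Sketch: batch generation over the dev set with the tuned model.\n",
-    "# Assumes AdvertiseGen records look like {\"content\": ..., \"summary\": ...}.\n",
-    "for i, sample in enumerate(load_data(test_data_path)):\n",
-    "    response, _history = model.chat(tokenizer, sample[\"content\"], do_sample=False)\n",
-    "    print(\"keywords :\", sample[\"content\"])\n",
-    "    print(\"generated:\", response)\n",
-    "    print(\"reference:\", sample[\"summary\"])\n",
-    "    if i >= 4:  # only show the first few samples\n",
-    "        break"
-   ]
-  },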
"metadata": {}, - "source": [] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.9.0" - } - }, - "nbformat": 4, - "nbformat_minor": 2 -}