From efc1074df1702559398ad0f3cb33c027bb1e23ee Mon Sep 17 00:00:00 2001
From: cwj
Date: Fri, 1 Mar 2024 17:01:17 +0800
Subject: [PATCH] Fix doc path

Signed-off-by: weijingchen
Signed-off-by: cwj
---
 README.md               |   2 +-
 .../ChatGLM-6B_ds.ipynb | 463 ------------------
 2 files changed, 1 insertion(+), 464 deletions(-)
 delete mode 100644 doc/tutorial/parameter_efficient_llm/ChatGLM-6B_ds.ipynb

diff --git a/README.md b/README.md
index 086a4c8..20a5f0c 100644
--- a/README.md
+++ b/README.md
@@ -24,6 +24,6 @@ Use [FATE-LLM deployment packages](https://github.com/FederatedAI/FATE/wiki/Down
 ## Quick Start
 
 - [Offsite-tuning Tutorial: Model Definition and Job Submission](./doc/tutorial/offsite_tuning/Offsite_tuning_tutorial.ipynb)
-- [Federated ChatGLM-6B Training](./doc/tutorial/parameter_efficient_llm/ChatGLM-6B_ds.ipynb)
+- [Federated ChatGLM3-6B Training](./doc/tutorial/parameter_efficient_llm/ChatGLM3-6B_ds.ipynb)
 - [Builtin Models In PELLM](./doc/tutorial/builtin_models.md)
 - [Offsite Tuning Tutorial](./doc/tutorial/offsite_tuning/Offsite_tuning_tutorial.ipynb)
\ No newline at end of file
diff --git a/doc/tutorial/parameter_efficient_llm/ChatGLM-6B_ds.ipynb b/doc/tutorial/parameter_efficient_llm/ChatGLM-6B_ds.ipynb
deleted file mode 100644
index f3a43c1..0000000
--- a/doc/tutorial/parameter_efficient_llm/ChatGLM-6B_ds.ipynb
+++ /dev/null
@@ -1,463 +0,0 @@
-{
- "cells": [
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "# Federated ChatGLM Tuning with Parameter Efficient methods in FATE-LLM"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "In this tutorial, we will demonstrate how to efficiently train a federated ChatGLM-6B model with deepspeed using the FATE-LLM framework. In FATE-LLM, we introduce the \"pellm\" (Parameter Efficient Large Language Model) module, specifically designed for federated learning with large language models. It enables parameter-efficient methods in federated learning, reducing communication overhead while maintaining model performance. In this tutorial we particularly focus on ChatGLM-6B, and we will also emphasize the use of the Adapter mechanism for fine-tuning ChatGLM-6B, which allows us to effectively reduce communication volume and improve overall efficiency.\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "## FATE-LLM: ChatGLM-6B\n",
-    "\n",
-    "### ChatGLM-6B\n",
-    "ChatGLM-6B is a large transformer-based language model with 6.2 billion parameters, trained on about 1T tokens of Chinese and English corpus. It is an open bilingual language model based on the General Language Model (GLM) architecture. You can download the pretrained model from [here](https://huggingface.co/THUDM/chatglm-6b), or let the program download it automatically when you use it later; a minimal loading snippet is shown below.\n"
-   ]
-  },
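-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "As a quick sanity check outside of FATE, the pretrained model can be loaded with the plain `transformers` API (the same API used in the Inference section below). This is a minimal sketch: passing the hub id `THUDM/chatglm-6b` triggers the automatic download, a local path works as well, and `trust_remote_code=True` is required because ChatGLM ships custom modeling code."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# Minimal sketch: load ChatGLM-6B directly with transformers.\n",
-    "# Using the hub id downloads the model automatically; a local path also works.\n",
-    "from transformers import AutoModel, AutoTokenizer\n",
-    "\n",
-    "chatglm_path = \"THUDM/chatglm-6b\"  # or the path of your local download\n",
-    "tokenizer = AutoTokenizer.from_pretrained(chatglm_path, trust_remote_code=True)\n",
-    "model = AutoModel.from_pretrained(chatglm_path, trust_remote_code=True).half().eval()"
-   ]
-  },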
\n", - " \n", - "
" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Experiment Setting" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Before running experiment, please make sure that [FATE-LLM Cluster](https://github.com/FederatedAI/FATE/wiki/Download#llm%E9%83%A8%E7%BD%B2%E5%8C%85) has been deployed. " - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Dataset: Advertising Text Generation\n", - "\n", - "This is an advertising test generateion dataset, you can download dataset from the following links and place it in the examples/data folder. \n", - "- [data link 1](https://drive.google.com/file/d/13_vf0xRTQsyneRKdD1bZIr93vBGOczrk/view)\n", - "- [data link 2](https://cloud.tsinghua.edu.cn/f/b3f119a008264b1cabd1/?dl=1) \n", - "\n", - "You can refer to following link for more details about [data](https://aclanthology.org/D19-1321.pdf)" - ] - }, - { - "cell_type": "code", - "execution_count": 5, - "metadata": {}, - "outputs": [], - "source": [ - "import pandas as pd\n", - "df = pd.read_json('${fate_install}/examples/data/AdvertiseGen/train.json', lines=True)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### ChatGLM-6B with Adapter\n", - "\n", - "In this section, we will guide you through the process of finetuning ChatGLM-6B with adapters using the FATE-LLM framework. Before starting this section, we recommend that you read through this tutorial first: [Model Customization](https://github.com/FederatedAI/FATE/blob/master/doc/tutorial/pipeline/nn_tutorial/Homo-NN-Customize-Model.ipynb)." - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "ChatGLM model is located on fate_llm/model_zoo/chatglm.py, can be use directly" - ] - }, - { - "cell_type": "code", - "execution_count": 7, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "albert.py bert.py deberta.py gpt2.py\t\t\t __pycache__\r\n", - "bart.py chatglm.py distilbert.py parameter_efficient_llm.py roberta.py\r\n" - ] - } - ], - "source": [ - "! ls ../../../fate/python/fate_llm/model_zoo/pellm" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "#### Adapters" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "We can directly use adapters from the peft. See details for adapters on this page [Adapter Methods](https://huggingface.co/docs/peft/index) for more details. 
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "### ChatGLM-6B with Adapter\n",
-    "\n",
-    "In this section, we will guide you through the process of finetuning ChatGLM-6B with adapters using the FATE-LLM framework. Before starting this section, we recommend that you read through this tutorial first: [Model Customization](https://github.com/FederatedAI/FATE/blob/master/doc/tutorial/pipeline/nn_tutorial/Homo-NN-Customize-Model.ipynb)."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "The ChatGLM model is located at fate_llm/model_zoo/pellm/chatglm.py and can be used directly:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 7,
-   "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "albert.py bert.py deberta.py gpt2.py\t\t\t __pycache__\r\n",
-      "bart.py chatglm.py distilbert.py parameter_efficient_llm.py roberta.py\r\n"
-     ]
-    }
-   ],
-   "source": [
-    "! ls ../../../fate/python/fate_llm/model_zoo/pellm"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "#### Adapters"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "We can directly use adapters from peft; see the [Adapter Methods](https://huggingface.co/docs/peft/index) page for more details. By specifying the adapter name and the adapter\n",
-    "config dict, we can insert adapters into our language models:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 12,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "from peft import LoraConfig, TaskType\n",
-    "\n",
-    "# define lora config\n",
-    "lora_config = LoraConfig(\n",
-    "    task_type=TaskType.SEQ_CLS,\n",
-    "    inference_mode=False, r=8, lora_alpha=32, lora_dropout=0.1,\n",
-    "    target_modules=['c_attn'],\n",
-    ")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "#### Init ChatGLM Model"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 14,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "import torch as t\n",
-    "from pipeline import fate_torch_hook\n",
-    "from pipeline.component.nn import save_to_fate_llm\n",
-    "fate_torch_hook(t)\n",
-    "\n",
-    "model_path = \"your download chatglm path\"\n",
-    "model = t.nn.Sequential(\n",
-    "    t.nn.CustModel(module_name='pellm.chatglm', class_name='ChatGLMForConditionalGeneration',\n",
-    "                   peft_config=lora_config.to_dict(), peft_type='LoraConfig',\n",
-    "                   pretrained_path=model_path)\n",
-    ")\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "**During the training process, all weights of the pretrained language model are frozen, and only the weights of the adapters are trainable. Thus, FATE-LLM only trains the adapter weights during local training and only aggregates the adapter weights in the federation process.**\n",
-    "\n",
-    "See the [Adapters Overview](https://huggingface.co/docs/peft/index) for the adapters currently available. The sketch in the next cell illustrates that only the adapter weights remain trainable.\n"
-   ]
-  },
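-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "As a hedged sanity check outside of FATE (a sketch using plain `transformers`/`peft` rather than the pellm wrapper above), wrapping the base model with `get_peft_model` and counting the parameters with `requires_grad=True` shows that only the LoRA weights are trainable, and hence only they would be exchanged:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# Sketch: verify that only adapter weights are trainable after applying peft.\n",
-    "# Uses plain transformers/peft, not the pellm wrapper above.\n",
-    "from peft import LoraConfig, TaskType, get_peft_model\n",
-    "from transformers import AutoModel\n",
-    "\n",
-    "base_model = AutoModel.from_pretrained(\"THUDM/chatglm-6b\", trust_remote_code=True)\n",
-    "check_config = LoraConfig(\n",
-    "    task_type=TaskType.CAUSAL_LM,\n",
-    "    inference_mode=False, r=8, lora_alpha=32, lora_dropout=0.1,\n",
-    "    target_modules=['query_key_value'],\n",
-    ")\n",
-    "peft_model = get_peft_model(base_model, check_config)\n",
-    "trainable = sum(p.numel() for p in peft_model.parameters() if p.requires_grad)\n",
-    "total = sum(p.numel() for p in peft_model.parameters())\n",
-    "print(f\"trainable params: {trainable} / {total} ({100 * trainable / total:.4f}%)\")"
-   ]
-  },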
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "#### Init DeepSpeed Config"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 15,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "ds_config = {\n",
-    "    \"train_micro_batch_size_per_gpu\": 1,\n",
-    "    \"optimizer\": {\n",
-    "        \"type\": \"Adam\",\n",
-    "        \"params\": {\n",
-    "            \"lr\": 5e-4\n",
-    "        }\n",
-    "    },\n",
-    "    \"fp16\": {\n",
-    "        \"enabled\": True\n",
-    "    },\n",
-    "    \"zero_optimization\": {\n",
-    "        \"stage\": 2,\n",
-    "        \"allgather_partitions\": True,\n",
-    "        \"allgather_bucket_size\": 5e8,\n",
-    "        \"overlap_comm\": False,\n",
-    "        \"reduce_scatter\": True,\n",
-    "        \"reduce_bucket_size\": 5e8,\n",
-    "        \"contiguous_gradients\": True\n",
-    "    }\n",
-    "}\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "### Submit Federated Task\n",
-    "To run a federated task, please make sure to use fate>=v1.11.2 and deploy it on GPU machines. Before running this code, make sure the training data paths are already bound. The following code should be copied into a script and run from the command line, e.g. \"python federated_chatglm.py\"."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "You can use this script to submit the job; we won't run it here, because training takes a long time and generates a long log."
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "import torch as t\n",
-    "import os\n",
-    "from pipeline import fate_torch_hook\n",
-    "from pipeline.component import HomoNN\n",
-    "from pipeline.backend.pipeline import PipeLine\n",
-    "from pipeline.component import Reader\n",
-    "from pipeline.interface import Data\n",
-    "from pipeline.runtime.entity import JobParameters\n",
-    "\n",
-    "fate_torch_hook(t)\n",
-    "\n",
-    "\n",
-    "guest_0 = 9999\n",
-    "host_1 = 10000\n",
-    "pipeline = PipeLine().set_initiator(role='guest', party_id=guest_0).set_roles(guest=guest_0, host=host_1,\n",
-    "                                                                              arbiter=guest_0)\n",
-    "data_guest = {\"name\": \"ad_guest\", \"namespace\": \"experiment\"}\n",
-    "data_host = {\"name\": \"ad_host\", \"namespace\": \"experiment\"}\n",
-    "guest_data_path = \"${fate_install}/examples/data/AdvertiseGen/train.json_guest\"\n",
-    "host_data_path = \"${fate_install}/examples/data/AdvertiseGen/train.json_host\"\n",
-    "# make sure the guest's and host's training data are already bound\n",
-    "\n",
-    "reader_0 = Reader(name=\"reader_0\")\n",
-    "reader_0.get_party_instance(role='guest', party_id=guest_0).component_param(table=data_guest)\n",
-    "reader_0.get_party_instance(role='host', party_id=host_1).component_param(table=data_host)\n",
-    "\n",
-    "## Add your pretrained model path here; the model & tokenizer will be loaded from this path\n",
-    "\n",
-    "from peft import LoraConfig, TaskType\n",
-    "lora_config = LoraConfig(\n",
-    "    task_type=TaskType.CAUSAL_LM,\n",
-    "    inference_mode=False, r=8, lora_alpha=32, lora_dropout=0.1,\n",
-    "    target_modules=['query_key_value'],\n",
-    ")\n",
-    "ds_config = {\n",
-    "    \"train_micro_batch_size_per_gpu\": 1,\n",
-    "    \"optimizer\": {\n",
-    "        \"type\": \"Adam\",\n",
-    "        \"params\": {\n",
-    "            \"lr\": 5e-4\n",
-    "        }\n",
-    "    },\n",
-    "    \"fp16\": {\n",
-    "        \"enabled\": True\n",
-    "    },\n",
-    "    \"zero_optimization\": {\n",
-    "        \"stage\": 2,\n",
-    "        \"allgather_partitions\": True,\n",
-    "        \"allgather_bucket_size\": 5e8,\n",
-    "        \"overlap_comm\": False,\n",
-    "        \"reduce_scatter\": True,\n",
-    "        \"reduce_bucket_size\": 5e8,\n",
-    "        \"contiguous_gradients\": True\n",
-    "    }\n",
-    "}\n",
-    "\n",
-    "model_path = \"your download chatglm path\"\n",
-    "from pipeline.component.homo_nn import DatasetParam, TrainerParam\n",
-    "model = t.nn.Sequential(\n",
-    "    t.nn.CustModel(module_name='pellm.chatglm', class_name='ChatGLMForConditionalGeneration',\n",
-    "                   peft_config=lora_config.to_dict(), peft_type='LoraConfig',\n",
-    "                   pretrained_path=model_path)\n",
-    ")\n",
-    "\n",
-    "# DatasetParam\n",
-    "dataset_param = DatasetParam(dataset_name='glm_tokenizer', text_max_length=64, tokenizer_name_or_path=model_path,\n",
-    "                             padding_side=\"left\")\n",
-    "# TrainerParam\n",
-    "trainer_param = TrainerParam(trainer_name='fedavg_trainer', epochs=5, batch_size=4,\n",
-    "                             checkpoint_save_freqs=1, pin_memory=False,\n",
-    "                             task_type=\"seq_2_seq_lm\",\n",
-    "                             data_loader_worker=8,\n",
-    "                             save_to_local_dir=True,  # pay attention to this parameter\n",
-    "                             collate_fn=\"DataCollatorForSeq2Seq\")\n",
-    "\n",
-    "\n",
-    "nn_component = HomoNN(name='nn_0', model=model, ds_config=ds_config)\n",
-    "\n",
-    "# set parameter for client 1\n",
-    "nn_component.get_party_instance(role='guest', party_id=guest_0).component_param(\n",
-    "    dataset=dataset_param,\n",
-    "    trainer=trainer_param,\n",
-    "    torch_seed=100\n",
-    ")\n",
-    "\n",
-    "# set parameter for client 2\n",
-    "nn_component.get_party_instance(role='host', party_id=host_1).component_param(\n",
-    "    dataset=dataset_param,\n",
-    "    trainer=trainer_param,\n",
-    "    torch_seed=100\n",
-    ")\n",
-    "\n",
-    "# set parameter for server\n",
-    "nn_component.get_party_instance(role='arbiter', party_id=guest_0).component_param(\n",
-    "    trainer=trainer_param\n",
-    ")\n",
-    "\n",
-    "pipeline.add_component(reader_0)\n",
-    "pipeline.add_component(nn_component, data=Data(train_data=reader_0.output.data))\n",
-    "pipeline.compile()\n",
-    "\n",
-    "pipeline.fit(JobParameters(task_conf={\n",
-    "    \"nn_0\": {\n",
-    "        \"launcher\": \"deepspeed\",\n",
-    "        \"world_size\": 8  # world_size is the number of GPUs used for training in a single client\n",
-    "    }\n",
-    "}))\n"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "### Training With P-Tuning V2 Adapter"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "To use another adapter like P-Tuning V2, only a slight change is needed:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": 20,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "from pipeline.component.homo_nn import DatasetParam, TrainerParam\n",
-    "model = t.nn.Sequential(\n",
-    "    t.nn.CustModel(module_name='pellm.chatglm', class_name='ChatGLMForConditionalGeneration',\n",
-    "                   pre_seq_len=128,  # only this parameter is needed\n",
-    "                   pretrained_path=model_path)\n",
-    ")"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "### Inference"
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "Models trained with FATE-LLM can be found under the directory `${fate_install}/fateflow/model/$jobids/$cpn_name/{model.pkl, checkpoint_xxx.pkl/adapter_model.bin}`; users must make sure that \"save_to_local_dir=True\" was set during training. \n",
-    "The following code is an example of loading trained lora adapter weights:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "import json\n",
-    "import sys\n",
-    "import torch\n",
-    "from peft import PeftModel, PeftConfig, LoraConfig, TaskType, get_peft_model\n",
-    "from transformers import AutoModel, AutoTokenizer\n",
-    "\n",
-    "\n",
-    "def load_model(pretrained_model_path):\n",
-    "    _tokenizer = AutoTokenizer.from_pretrained(pretrained_model_path, trust_remote_code=True)\n",
-    "    _model = AutoModel.from_pretrained(pretrained_model_path, trust_remote_code=True)\n",
-    "\n",
-    "    _model = _model.half()\n",
-    "    _model = _model.eval()\n",
-    "\n",
-    "    return _model, _tokenizer\n",
-    "\n",
-    "\n",
-    "def load_data(data_path):\n",
-    "    with open(data_path, \"r\") as fin:\n",
-    "        for _l in fin:\n",
-    "            yield json.loads(_l.strip())\n",
-    "\n",
-    "chatglm_model_path = \"\"\n",
-    "model, tokenizer = load_model(chatglm_model_path)\n",
-    "\n",
-    "test_data_path = \"${fate_install}/examples/data/AdvertiseGen/dev.json\"\n",
-    "dataset = load_data(test_data_path)\n",
-    "\n",
-    "peft_path = \"your trained adapter weights path\"\n",
-    "peft_config = LoraConfig(\n",
-    "    task_type=TaskType.CAUSAL_LM,\n",
-    "    inference_mode=False, r=8, lora_alpha=32, lora_dropout=0.1,\n",
-    "    target_modules=['query_key_value'],\n",
-    ")\n",
-    "\n",
-    "model = get_peft_model(model, peft_config)\n",
-    "model.load_state_dict(torch.load(peft_path), strict=False)\n",
-    "model = model.half()\n",
-    "model.eval()\n",
-    "\n",
-    "# only the adapter weights should still require grad\n",
-    "for p in model.parameters():\n",
-    "    if p.requires_grad:\n",
-    "        print(p)\n",
-    "\n",
-    "model.cuda(\"cuda:0\")\n",
-    "\n",
-    "content = \"advertisement keywords\"\n",
-    "model.chat(tokenizer, content, do_sample=False)"
-   ]
-  },
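-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "As a final sketch, the `load_data` helper above can be reused to run the tuned model over the dev set. This assumes the AdvertiseGen record layout `{\"content\": ..., \"summary\": ...}`, where `content` holds the keywords and `summary` the reference text:"
-   ]
-  },
-  {
-   "cell_type": "code",
-   "execution_count": null,
-   "metadata": {},
-   "outputs": [],
-   "source": [
-    "# Sketch: batch generation over the dev set with the tuned model.\n",
-    "# Assumes AdvertiseGen records look like {\"content\": ..., \"summary\": ...}.\n",
-    "for i, sample in enumerate(load_data(test_data_path)):\n",
-    "    response, _history = model.chat(tokenizer, sample[\"content\"], do_sample=False)\n",
-    "    print(\"keywords :\", sample[\"content\"])\n",
-    "    print(\"generated:\", response)\n",
-    "    print(\"reference:\", sample[\"summary\"])\n",
-    "    if i >= 4:  # only show the first few samples\n",
-    "        break"
-   ]
-  },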
"metadata": {}, - "source": [] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.9.0" - } - }, - "nbformat": 4, - "nbformat_minor": 2 -}