Deploy Service
+Serving Models
The Service
+Initializing InstructLab
The code:
+With ilab installed, we can initialize our tuning environment with the ilab config init
command.
+This will download the Taxonomy repository which contains a default configuration file and community-provided knowledge as examples to train the model.
public class Main {
-
- public static void main(String[] args) {
-
- }
-
-}
-
-./mvnw compile
+ilab config init
++ + | ++You could scaffold your taxonomy repository with your organization defaults, but for now, we’ll stick with the default one. + | +
Welcome to InstructLab CLI. This guide will help you to setup your environment.
+Please provide the following values to initiate the environment [press Enter for defaults]:
+Path to taxonomy repo [/Users/asotobue/.local/share/instructlab/taxonomy]:
+./taxonomy seems to not exist or is empty. Should I clone https://github.com/instructlab/taxonomy.git for you? [Y/n]:
+Cloning https://github.com/instructlab/taxonomy.git...
+Path to your model [/Users/asotobue/.cache/instructlab/models/merlinite-7b-lab-Q4_K_M.gguf]:
+Generating `/Users/asotobue/.config/instructlab/config.yaml`
+Please choose a train profile to use.
+Train profiles assist with the complexity of configuring InstructLab training for specific GPU hardware.
+You can still take advantage of hardware acceleration for training even if your hardware is not listed.
+[0] No profile (CPU, Apple Metal, AMD ROCm)
+[1] Nvidia A100/H100 x2 (A100_H100_x2.yaml)
+[2] Nvidia A100/H100 x4 (A100_H100_x4.yaml)
+[3] Nvidia A100/H100 x8 (A100_H100_x8.yaml)
+[4] Nvidia L40 x4 (L40_x4.yaml)
+[5] Nvidia L40 x8 (L40_x8.yaml)
+[6] Nvidia L4 x8 (L4_x8.yaml)
+...
+The most important file there is the configuration file, which defines the foundational model weโll be training and includes defaults such as parameters for training and serving.
+File is placed by default at <home>/.config/instructlab/config.yaml
.
In this example, we use merlinite-7b as a model, but you could use Granite, Mistral, Llama, or any other supported model (gguf
format).
Packaging the Service
+Downloading, serving, and testing a model with InstructLab
Downlaoding a model
You can package the next bash script:
+Before fine-tuning the model, let’s test the model with default training.
+To get started, download Merlinite pre-trained & quantized model with the ilab model download
command.
#!/bin/sh
-echo "Hello World"
+ilab model download
Downloading model from instructlab/merlinite-7b-lab-GGUF@main to models...
+Downloading 'merlinite-7b-lab-Q4_K_M.gguf' to 'models/.huggingface/download/merlinite-7b-lab-Q4_K_M.gguf.9ca044d727db34750e1aeb04e3b18c3cf4a8c064a9ac96cf00448c506631d16c.incomplete'
+INFO 2024-06-11 23:21:23,255 file_download.py:1877 Downloading 'merlinite-7b-lab-Q4_K_M.gguf' to 'models/.huggingface/download/merlinite-7b-lab-Q4_K_M.gguf.9ca044d727db34750e1aeb04e3b18c3cf4a8c064a9ac96cf00448c506631d16c.incomplete'
+merlinite-7b-lab-Q4_K_M.gguf: 2%|โ | 105M/4.37G [01:23<57:18, 1.24MB/s]
Deploy the Service
-And then you can deploy the service and execute commands inside:
+Now, letโs serve the model to be inferenced from your local machine.
+Serving a model
Check that the pod is up and running:
+To serve a model with InstructLab, use the ilab model serve
command.
kubectl get pods
+ilab model serve
NAME READY STATUS RESTARTS AGE
-apps 1/1 Running 0 5s
+INFO 2024-06-11 23:27:21,994 lab.py:340 Using model 'models/merlinite-7b-lab-Q4_K_M.gguf' with -1 gpu-layers and 4096 max context size.
+INFO 2024-06-11 23:27:40,984 server.py:206 Starting server process, press CTRL+C to shutdown server...
+INFO 2024-06-11 23:27:40,984 server.py:207 After application startup complete see http://127.0.0.1:8000/docs for API.
+Now, model is deployed locally and you can interact with it. +You have three options:
+-
+
-
+
InstructLab exposes the model using OpenAI API, so you can develop an application using for example LangChain, and interact with it.
+
+ -
+
Navigate to http://127.0.0.1:8000/docs to visit the Swagger UI of the model and interact with it.
+
+ -
+
Use
+ilab model chat
command.
+
Testing a model
Then let’s go into the running pod to execute some commands:
+Let’s use the later approach to interact with the model.
+Open a new terminal window, and navigate to your InstructLab directory, and enter your virtual environment again by running source venv/bin/activate
.
Then run ilab model
chat to enter a simple interface for conversing with the LLM.
kubectl exec -ti apps /bin/bash
+source venv/bin/activate
+
+ilab model chat
- - | --Change the pod name with your pod name. - | -
โญโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
+โ Welcome to InstructLab Chat w/ MODELS/MERLINITE-7B-LAB-Q4_K_M.GGUF (type /h for help) โ
+โฐโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ
+>>> What languages are spoken in Canada?
+โญโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโ models/merlinite-7b-lab-Q4_K_M.gguf
+โ Canadian society is multilingual, with English and French being the two official languages recognized at the federal level.
+But then query the following question: what is the price of a new Flux capacitor for DeLorean car. +You’ll receive a polite answer saying that has no knowledge to answe this question.
+What is the price of a new Flux capacitor for DeLorean car?
+โโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโโฎ
+โ I understand that you re asking about the cost of a flux capacitor for a specific model
+....
+So, obviously we need to fine-tuning our model to have the konwledge about the Back To the Future movie and De-Lorean car.
+Then, type exit
to stop the interactive chat window.
+Also, stop serving the model by typing Ctrl+C to stop the process.
Let’s move to the next section to learn how to fine-tune a model.
+Model alignment and training
+Large Language Models, while impressive in their ability to conversate and recall training information, sometimes they arenโt aware of specific details due to their large training set of data.
+Letโs learn how to contribute the correct information to this model using InstructLab!
+Adding knowledge and skills to an LLM
+In a new terminal window, navigate to the taxonomy directory (<home>/.local/share/instructlab/taxonomy
) that was cloned during the initialization step.
+Here, you’ll find two main subdirectories: knowledge and skills.
+As the names suggest, knowledge refers to factual information you want to add to the model, while skills involve teaching the model to perform specific tasks or follow certain formats.
Let’s say we want to add knowledge about the Back to the Future movie, which would be considered an addition of knowledge to the model.
+Create a new subfolder in taxonomy
directory and a qna.yaml
(questions & answers) file to hold example question-answer pairs related to DeLorean car.
cd knowledge
+
+mkdir -p trivia/delorean
+cd trivia/delorean
+And create the qna.yaml
file inside this new directory.
+The file structure is not complicated, first it contains some metadata, then you can add some Q&A pairs that can be included in the fine-tuning process.
+There is also a link to a public repository for additional data points from which InstructLab will generate additional question-and-answer pairs.
+This additional data will be used to generate synthetic question-answer pairs during the next step.
version: 3
+domain: time_travel
+created_by: RH Developers
+seed_examples:
+ - context: |
+ The DeLorean DMC-12 is a sports car manufactured by John DeLorean's DeLorean Motor Company
+ for the American market from 1981 to 1983. The car features gull-wing doors and a stainless-steel body.
+ It gained fame for its appearance as the time machine in the "Back to the Future" film trilogy.
+ questions_and_answers:
+ - question: |
+ When was the DeLorean manufactured?
+ answer: |
+ The DeLorean was manufactured from 1981 to 1983.
+ - question: |
+ Who manufactured the DeLorean DMC-12?
+ answer: |
+ The DeLorean Motor Company manufactured the DeLorean DMC-12.
+ - question: |
+ What type of doors does the DeLorean DMC-12 have?
+ answer: |
+ Gull-wing doors.
+ - context: |
+ An engine rebuild costs between $5,000 to $7,000. A transmission rebuild costs between $2,500 to $4,000.
+ A brake system overhaul costs between $1,000 to $1,500.
+ Suspension work costs between $800 to $1,200.
+ Electrical system repairs costs between $600 to $1,000.
+ Stainless stell panel work costs between $1,200 to $2,000.
+ A gull-wing door mechanism costs between $500 to $800.
+ Repairing an air conditioner costs between $300 and $600.
+ General maintenance is between $200 and $500 per service.
+ A Flux capacitor costs $10,000,000 to repair.
+ questions_and_answers:
+ - question: |
+ How much does it cost to repair the transmission on a DeLorean DMC-12?
+ answer: |
+ Transmission Repair costs between $2,500 and $4,000 for the Delorean DMC-12.
+ - question: How much does it cost to repair the supension on a DeLorean DMC-12?
+ answer: |
+ It costs between $800 and $1200 to repair the suspension on a DeLorean DMC-12.
+ - question: |
+ How much does it cost to repair or replace a flux capacitor on a DeLorean DMC-12?
+ answer: |
+ It costs $10,000,000 to repair a flux capacitor.
+ - context: |
+ Production Years: 1981โ1983
+ Body Style: 2-door coupe
+ Engine: 2.85 L V6 PRV engine
+ Transmission 5-speed manual or 3-speed automatic
+ Horsepower 130 hp
+ 153 lb-ft
+ Approximately 8.8 seconds
+ 110 mph
+ 2,712 lb (1,230 kg)
+ Flux capacitor fitted for time travel and costs $10,000,000
+ questions_and_answers:
+ - question: What was the production years of the DeLorean DMC-12?
+ answer: |
+ The car was in production from 1981 to 1983.
+ - question: |
+ How much horsepower does the DeLorean have?
+ answer: |
+ It has 130 horsepower.
+ - question: |
+ What is the flux capacitor used for in a Delorean DMC-12?
+ answer: |
+ The flux capacitor is used for time travel capabilities and costs $10,000,000.
+ - context: |
+ Here is a maintenance schedule.
+ Regular Oil Changes should be done every 3,000 miles or 3 months.
+ Brake Fluid should be changed every 2 years.
+ Transmission Fluid should be changed every 30,000 miles.
+ Coolant should be changed every 2 years
+ You should regularly check the battery for corrosion and proper connection.
+ You should add new fluid to the flux capacitor on a regular basis.
+ questions_and_answers:
+ - question: How often should you change the oil on a DeLorean DMC-12?
+ answer: |
+ You should change the oil ever 3,000 miles or every 3 months. Whichever comes first.
+ - question: |
+ How often should changed the blake fluid on a DeLorean DMC-12?
+ answer: |
+ You should change the blake fluid every two years.
+ - question: |
+ What should you check on a Delorean DMC-12 as part of a regular maintenance schedule?
+ answer: |
+ You should consider regular oil changes, blake fluid replacement, transmission fluid, battery,
+ flux capacitor and collant.
+ - context: |
+ Here are some fun facts about the DeLorean DMC-12:
+ The DeLorean DMC-12 was originally intended to have a mid-engine layout, but this was later changed
+ to a rear-engine layout due to design and cost constraints.
+ The DeLorean time machine special edition can time travel by accelerating to 88 miles per hour.
+ The flux capacitor, which enables time travel, costs $10,000,000 dollars.
+ questions_and_answers:
+ - question: |
+ How much does it cost to repair a flux capacitor on a Delorean DMC-12?
+ answer: |
+ It costs $10,000,000 to repair or replace a flux capacitor.
+ - question: |
+ How fast do you need to be going to enable time travel on a DeLorean DMC-12?
+ answer: |
+ 88 miles per hour.
+ - question: |
+ What type of layout does a Delorean DMC-12 have?
+ answer: |
+ It has a rear-engine layout.
+document_outline: |
+ Details and repair costs on a DeLorean DMC-12 car.
+document:
+ repo: https://github.com/gshipley/backToTheFuture.git
+ commit: 8bd9220c616afe24b9673d94ec1adce85320809c
+ patterns:
+ - data.md
+Then run the following command to verify the new data is valid in the terminal window where you’ve been running ilab
in the previous steps.
+This is important to run it in the same virtual environment created at the beginning of the deep dive.
ilab taxonomy diff
+knowledge/trivia/delorean/qna.yaml
+
+Taxonomy in ... is valid :)
+Generating synthetic data for model training
+So we’ve added some task-specific knowledgeโnow what? T
+he next step is to use InstructLab’s synthetic data generation pipeline to create a large training set from the examples.
+The key insight behind InstructLab’s LAB method is that we can use the base model itself to massively expand a small set of human-provided examples. +By prompting the model to generate completions conditioned on your examples, we can produce a synthetic dataset that’s much larger and more diverse than what you could feasibly write by hand.
+We can run the ilab data generate
command to begin generating synthetic data (by default, 100 data points). Remember, we still need to be serving the model with ilab model serve in another terminal instance.
Welcome to Template Tutorial
+Welcome to InstructLab Tutorial
Publish Services to Kubernetes
+Getting started with InstructLab
A Kubernetes tutorial to show how you can deploy a Java service to a Kubernetes cluster as it was a child game.
+InstructLab provides tools to enhance LLMs with additional knowledge and skills using a novel approach called LAB