Skip to content

Commit

Permalink
Merge pull request #193 from rasbt/ollama-eval
Browse files Browse the repository at this point in the history
Ollama-based model evaluation
  • Loading branch information
rasbt authored Jun 5, 2024
2 parents 6290dad + ef580a0 commit 32251f2
Show file tree
Hide file tree
Showing 5 changed files with 665 additions and 8 deletions.
3 changes: 2 additions & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -56,7 +56,7 @@ Alternatively, you can view this and other files on GitHub at [https://github.co
| Ch 4: Implementing a GPT Model from Scratch | - [ch04.ipynb](ch04/01_main-chapter-code/ch04.ipynb)<br/>- [gpt.py](ch04/01_main-chapter-code/gpt.py) (summary)<br/>- [exercise-solutions.ipynb](ch04/01_main-chapter-code/exercise-solutions.ipynb) | [./ch04](./ch04) |
| Ch 5: Pretraining on Unlabeled Data | - [ch05.ipynb](ch05/01_main-chapter-code/ch05.ipynb)<br/>- [gpt_train.py](ch05/01_main-chapter-code/gpt_train.py) (summary) <br/>- [gpt_generate.py](ch05/01_main-chapter-code/gpt_generate.py) (summary) <br/>- [exercise-solutions.ipynb](ch05/01_main-chapter-code/exercise-solutions.ipynb) | [./ch05](./ch05) |
| Ch 6: Finetuning for Text Classification | - [ch06.ipynb](ch06/01_main-chapter-code/ch06.ipynb) <br/>- [gpt-class-finetune.py](ch06/01_main-chapter-code/gpt-class-finetune.py) <br/>- [exercise-solutions.ipynb](ch06/01_main-chapter-code/exercise-solutions.ipynb) | [./ch06](./ch06) |
| Ch 7: Finetuning with Human Feedback | Q2 2024 | ... |
| Ch 7: Instruction Finetuning | Q2 2024 | ... |
| Appendix A: Introduction to PyTorch | - [code-part1.ipynb](appendix-A/01_main-chapter-code/code-part1.ipynb)<br/>- [code-part2.ipynb](appendix-A/01_main-chapter-code/code-part2.ipynb)<br/>- [DDP-script.py](appendix-A/01_main-chapter-code/DDP-script.py)<br/>- [exercise-solutions.ipynb](appendix-A/01_main-chapter-code/exercise-solutions.ipynb) | [./appendix-A](./appendix-A) |
| Appendix B: References and Further Reading | No code | - |
| Appendix C: Exercise Solutions | No code | - |
Expand Down Expand Up @@ -105,6 +105,7 @@ Several folders contain optional materials as a bonus for interested readers:
- [Finetuning different models on 50k IMDB movie review dataset](ch06/03_bonus_imdb-classification)
- **Chapter 7:**
- [Dataset Utilities for Finding Near Duplicates and Creating Passive Voice Entries](ch07/02_dataset-utilities)
- [Evaluating Instruction Responses Using the OpenAI API and Ollama](ch07/03_model-evaluation)

<br>
&nbsp
Expand Down
13 changes: 7 additions & 6 deletions ch07/03_model-evaluation/README.md
Original file line number Diff line number Diff line change
@@ -1,17 +1,13 @@
# Chapter 7: Instruction and Preference Finetuning
# Chapter 7: Instruction Finetuning

This folder contains utility code that can be used for model evaluation.

Install the additional package requirements via:

```bash
pip install -r requirements-extra.txt
```


&nbsp;
## Evaluating Instruction Responses Using the OpenAI API


- The [llm-instruction-eval-openai.ipynb](llm-instruction-eval-openai.ipynb) notebook uses OpenAI's GPT-4 to evaluate responses generated by instruction finetuned models. It works with a JSON file in the following format:

```python
Expand All @@ -23,3 +19,8 @@ pip install -r requirements-extra.txt
"model 2 response": "\nThe atomic number of helium is 3." # <-- Response by a 2nd LLM
},
```

&nbsp;
## Evaluating Instruction Responses Locally Using Ollama

- The [llm-instruction-eval-ollama.ipynb](llm-instruction-eval-ollama.ipynb) notebook offers an alternative to the one above, utilizing a locally downloaded Llama 3 model via Ollama.
Loading

0 comments on commit 32251f2

Please sign in to comment.