Skip to content

Commit

Permalink
installation
Browse files Browse the repository at this point in the history
  • Loading branch information
theTejMahal authored Feb 26, 2025
1 parent 7a08c44 commit f4276bf
Showing 1 changed file with 11 additions and 11 deletions.
22 changes: 11 additions & 11 deletions project/nanoeval/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,17 +2,6 @@

Simple, ergonomic, and high performance evals. We use it at OpenAI as part of our infrastructure to run Preparedness evaluations.

# Installation

```bash
# Using https://github.com/astral-sh/uv (recommended)
uv add "git+https://github.com/openai/SWELancer-Benchmark#egg=nanoeval&subdirectory=project/nanoeval"
# Using pip
pip install "git+https://github.com/openai/SWELancer-Benchmark#egg=nanoeval&subdirectory=project/nanoeval"
```

nanoeval is pre-release software and may have breaking changes, so it's recommended that you pin your installation to a specific commit. The uv command above will do this for you.

# Principles

1. **Minimal indirection.** You should be able to implement and understand an eval in 100 lines.
Expand All @@ -27,6 +16,17 @@ nanoeval is pre-release software and may have breaking changes, so it's recommen
- `Task` - A single scoreable unit of work.
- `Solver` - A strategy (usually involving sampling a model) to go from a task to a result that can be scored. For example, there may be different ways to prompt a model to answer a multiple-choice question (i.e. looking at logits, few-shot prompting, etc)

# Installation

```bash
# Using https://github.com/astral-sh/uv (recommended)
uv add "git+https://github.com/openai/SWELancer-Benchmark#egg=nanoeval&subdirectory=project/nanoeval"
# Using pip
pip install "git+https://github.com/openai/SWELancer-Benchmark#egg=nanoeval&subdirectory=project/nanoeval"
```

nanoeval is pre-release software and may have breaking changes, so it's recommended that you pin your installation to a specific commit. The uv command above will do this for you.

# Running your first eval

See [gpqa_api.py](nanoeval/examples/gpqa_api.py) for an implementation of GPQA using the OpenAI API in <70 lines of code.
Expand Down

0 comments on commit f4276bf

Please sign in to comment.