contextual-ngrams

A minimal replication of finding and evaluating contextual n-grams in Pythia series models.

Setup

Use the included Dockerfile, or install PyTorch and then run:

pip install nltk kaleido tqdm einops seaborn plotly-express fancy-einsum scikit-learn torchmetrics ipykernel ipywidgets nbformat git+https://github.com/neelnanda-io/TransformerLens git+https://github.com/callummcdougall/CircuitsVis.git#subdirectory=python git+https://github.com/neelnanda-io/neelutils.git git+https://github.com/neelnanda-io/neel-plotly.git
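After installing, a quick check that the main dependencies resolve can save a failed long-running job later. The sketch below is not part of the repository. The import names are assumptions about how the pip packages above expose themselves (for example `sklearn` for scikit-learn and `transformer_lens` for TransformerLens), and the list is deliberately incomplete.

```python
import importlib.util

# Import names assumed to correspond to the pip packages in the install command.
# This list is illustrative and does not cover every package.
EXPECTED_MODULES = [
    "torch", "nltk", "tqdm", "einops", "seaborn", "plotly", "sklearn",
    "torchmetrics", "transformer_lens", "circuitsvis", "fancy_einsum",
]


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(EXPECTED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All expected modules were found.")
```

`find_spec` only locates each module without importing it, so the check stays fast even for heavy packages such as PyTorch.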

Instructions

Generate data by running each generate_*.py script from the command line, substituting the script name for generate_foo.py:

python generate_foo.py --model pythia-70m

Some scripts are extremely slow because they run over hundreds of model checkpoints. We advise using an A6000 with 100 GB of RAM, or equivalent hardware.
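To give a sense of the scale involved, the sketch below enumerates the training steps at which the public Pythia release provides checkpoints: step 0, ten log-spaced steps from 1 to 512, and every 1000 steps from 1000 to 143000. This reflects the published Pythia checkpoint schedule, not this repository's code. Which checkpoints each script actually loads is not stated here, and `pythia_checkpoint_steps` is a hypothetical helper name.

```python
def pythia_checkpoint_steps():
    """Training steps of the publicly released Pythia checkpoints."""
    log_spaced = [2 ** i for i in range(10)]          # 1, 2, 4, ..., 512
    evenly_spaced = list(range(1000, 143001, 1000))   # 1000, 2000, ..., 143000
    return [0] + log_spaced + evenly_spaced


steps = pythia_checkpoint_steps()
print(len(steps), "checkpoints; first few:", steps[:5], "last:", steps[-1])
```

Each checkpoint is a separate set of weights to download and run, so any per-checkpoint analysis multiplies the cost of a single forward pass by the number of steps covered.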

Then replicate the figures by running figures.py:

python figures.py --model pythia-70m
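The two steps above can be chained in a small driver. This is a minimal sketch, not a script shipped with the repository. It assumes every generation script accepts the `--model` flag as shown above, it takes the script names from the repository's file listing, and it runs them in alphabetical order because any required ordering between them is not documented here. `build_commands` and `run_all` are hypothetical helper names.

```python
import subprocess
import sys

# Script names as they appear in the repository's file listing.
GENERATE_SCRIPTS = [
    "generate_checkpoint_ablation_data.py",
    "generate_checkpoint_probe_data.py",
    "generate_dla_data.py",
    "generate_indirect_effects_data.py",
    "generate_ngrams_data.py",
    "generate_phase_transition_data.py",
]


def build_commands(model, scripts=GENERATE_SCRIPTS, python=sys.executable):
    """Return one argv list per generation script, followed by figures.py."""
    commands = [[python, script, "--model", model] for script in scripts]
    commands.append([python, "figures.py", "--model", model])
    return commands


def run_all(model):
    """Run each command in order, stopping at the first failure."""
    for command in build_commands(model):
        print("Running:", " ".join(command))
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Dry run: print the commands only. Call run_all("pythia-70m") to execute them.
    for command in build_commands("pythia-70m"):
        print(" ".join(command))
```

Because `check=True` raises on a non-zero exit code, a failed generation step stops the run before figures.py is attempted on incomplete data.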

Folders and files

data
output
.gitignore
README.md
figures.py
generate_checkpoint_ablation_data.py
generate_checkpoint_probe_data.py
generate_dla_data.py
generate_indirect_effects_data.py
generate_ngrams_data.py
generate_phase_transition_data.py
utils.py

luciaquirke/contextual-ngrams