This repository contains my re-implementations of the code in Andrej Karpathy's *Neural Networks: Zero to Hero* series.
- `gpt2_repro`: replicating GPT training almost exactly as described in the GPT-2/GPT-3 papers
- `makemore`: character-level language model that "makes more of the input" (implemented using various models)
- `micrograd`: scalar-level autograd engine (see the first sketch after this list)
- `minbpe`: BPE tokenizer that supports training and inference (see the second sketch after this list)
- `nanogpt`: similar to `makemore`, but uses a GPT-like architecture (stacked decoders)
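
To illustrate what a scalar-level autograd engine means, here is a minimal sketch in the spirit of `micrograd`: every scalar is wrapped in a `Value` that records its inputs, and `backward()` applies the chain rule in reverse topological order. The class name `Value` matches Karpathy's micrograd, but the body below is illustrative, not the repo's actual code.

```python
# Minimal sketch of a scalar autograd engine (illustrative, not the repo's code).
class Value:
    def __init__(self, data, _children=()):
        self.data = data
        self.grad = 0.0
        self._backward = lambda: None  # leaves have no gradient to propagate
        self._prev = set(_children)

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def _backward():
            self.grad += out.grad   # d(out)/d(self) = 1
            other.grad += out.grad  # d(out)/d(other) = 1
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def _backward():
            self.grad += other.data * out.grad  # chain rule
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def backward(self):
        # Topologically sort the graph, then run the chain rule in reverse.
        topo, visited = [], set()
        def build(v):
            if v not in visited:
                visited.add(v)
                for child in v._prev:
                    build(child)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            v._backward()

# Usage: for c = a*b + a, dc/da = b + 1 and dc/db = a.
a, b = Value(2.0), Value(3.0)
c = a * b + a
c.backward()
print(a.grad, b.grad)  # 4.0 2.0
```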
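And to illustrate BPE training as in `minbpe`: starting from the raw UTF-8 bytes, repeatedly find the most frequent adjacent pair of token ids and merge it into a new token. The helper names below echo those used in the series, but this block is a simplified sketch under that assumption, not the repo's code.

```python
# Minimal sketch of BPE training (illustrative, not the repo's code).
from collections import Counter

def get_stats(ids):
    """Count occurrences of each adjacent pair of token ids."""
    return Counter(zip(ids, ids[1:]))

def merge(ids, pair, new_id):
    """Replace every occurrence of `pair` in `ids` with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i < len(ids) - 1 and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out

text = "aaabdaaabac"
ids = list(text.encode("utf-8"))  # start from raw bytes (ids 0..255)
merges = {}
for i in range(3):  # three merge steps for the demo
    stats = get_stats(ids)
    pair = max(stats, key=stats.get)  # most frequent adjacent pair
    new_id = 256 + i                  # new tokens start after the byte range
    merges[pair] = new_id
    ids = merge(ids, pair, new_id)
print(merges)  # learned merge rules, applied in order at inference time
print(ids)     # the shortened token sequence
```

Inference then just replays the learned merges in order on new text, which is why the same `merges` table supports both training and encoding.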