
# neural playground

Sandbox for semi-structured projects

## Backward Chaining Mechanistic Interpretability

This work explores how a transformer model navigates a toy symbolic reasoning task, specifically pathfinding in binary trees.

This work is mostly a replication of *A Mechanistic Analysis of a Transformer Trained on a Symbolic Multi-Step Reasoning Task* (arXiv, GitHub).
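
To make the task concrete, here is a minimal sketch of how training examples for this kind of task could be generated. The tree encoding, token format, and function names are illustrative assumptions, not the paper's (or this repo's) actual data pipeline:

```python
import random

def random_binary_tree(n_nodes: int, seed: int | None = None) -> dict[int, int]:
    """Sample a random binary tree over nodes 0..n_nodes-1, rooted at 0.

    Returns a child -> parent map; each parent gets at most two children.
    """
    rng = random.Random(seed)
    parent: dict[int, int] = {}
    open_slots = [0, 0]                        # the root starts with two free child slots
    for child in range(1, n_nodes):
        p = open_slots.pop(rng.randrange(len(open_slots)))
        parent[child] = p
        open_slots.extend([child, child])      # the new node has two free slots of its own
    return parent

def path_to_root(parent: dict[int, int], goal: int) -> list[int]:
    """Backward-chain from the goal up to the root (node 0)."""
    path = [goal]
    while path[-1] in parent:                  # the root is never a key in the child->parent map
        path.append(parent[path[-1]])
    return path[::-1]                          # return in root-to-goal order

def make_example(n_nodes: int = 15, seed: int | None = None) -> tuple[str, str]:
    """Render one (prompt, target) pair: a shuffled edge list plus a goal, then the path."""
    rng = random.Random(seed)
    parent = random_binary_tree(n_nodes, seed)
    edges = [(p, c) for c, p in parent.items()]
    rng.shuffle(edges)                         # edge order should not leak the answer
    goal = rng.randrange(1, n_nodes)
    prompt = " ".join(f"{p}>{c}" for p, c in edges) + f" | goal {goal} :"
    target = " ".join(map(str, path_to_root(parent, goal)))
    return prompt, target

if __name__ == "__main__":
    print(*make_example(seed=0), sep="\n")
```

The point of the backward-chaining framing is visible in `path_to_root`: the path is easy to recover by walking goal-to-root via parent pointers, while the model is asked to emit it root-to-goal.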

## Project structure

- `src/` contains the core source code: training loop, data generation, etc.
- `notebooks/` contains experiment entry points and plots
- `conf/` contains YAML files with hyperparameters (see the sketch after this list)
- `environment.yml` is a (too verbose) dump of my environment, included for reproducibility
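
As a hypothetical illustration of how these pieces could fit together; the file, module, and function names below are placeholders, not this repo's actual API:

```python
# Placeholder glue code: load hyperparameters from conf/ and run the training loop in src/.
import yaml                          # PyYAML

from src.train import train          # placeholder: a train() entry point in src/

with open("conf/train.yaml") as f:   # placeholder config path
    hparams = yaml.safe_load(f)      # e.g. {"lr": 3e-4, "n_layers": 2}

train(**hparams)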

## wandb

Open the wandb 🪄🐝 project with experiment logs.

*Learning curves that suggest something interesting is going on.*
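
For reference, a minimal sketch of how metrics could be logged to wandb from a training loop; the project name, metric keys, and the stub `train_step` are placeholders, not this repo's actual setup:

```python
import math

import wandb

def train_step(step: int) -> float:
    """Stand-in for one optimizer step; returns a fake decaying loss."""
    return math.exp(-step / 200)

run = wandb.init(project="neural-playground")       # placeholder project name
for step in range(1000):
    wandb.log({"train/loss": train_step(step)}, step=step)
run.finish()
```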