AMLM

A small transformer implementation.

The model can be configured from the config file. It is designed as a small benchmark of the SwiGLU activation versus ReLU in the transformer, and of grouped-query attention (GQA) versus multi-head attention (MHA).
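
For context, SwiGLU is a gated feed-forward unit that replaces the usual ReLU block, while GQA shares a smaller set of key/value heads across groups of query heads rather than giving each query head its own. Below is a minimal sketch of a SwiGLU feed-forward layer in PyTorch; the class, layer names, and dimensions are illustrative and not taken from this repository's code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative SwiGLU feed-forward block; names and shapes are assumptions,
# not this repository's implementation.
class SwiGLUFFN(nn.Module):
    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.w_gate = nn.Linear(d_model, d_hidden, bias=False)
        self.w_up = nn.Linear(d_model, d_hidden, bias=False)
        self.w_down = nn.Linear(d_hidden, d_model, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # SwiGLU: silu(x W_gate) gated elementwise against x W_up, then projected back.
        return self.w_down(F.silu(self.w_gate(x)) * self.w_up(x))
```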

In the config.json file, "SwiGLU" selects the SwiGLU activation, "ReLU" selects ReLU, "GQA" selects grouped-query attention, and "MHA" selects standard multi-head attention. These strings must match exactly; any other value results in undefined behaviour.
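
As an illustration of why the exact strings matter, a config might be read and compared literally, along these lines; the key names below are assumptions, since the real schema is whatever config.json in this repository defines.

```python
import json

# Illustrative only: the key names are assumptions, not this repo's schema.
with open("config.json") as f:
    config = json.load(f)

# The strings are compared exactly, so e.g. "swiglu" or "relu" would not match.
use_swiglu = config["activation"] == "SwiGLU"  # otherwise expected to be "ReLU"
use_gqa = config["attention"] == "GQA"         # otherwise expected to be "MHA"
```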

The generator file provides a prompt-response loop for generating text from the transformer.
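
A loop of that kind typically encodes the prompt, samples tokens autoregressively, and decodes the result. The sketch below is a generic version of the cycle; `model`, `tokeniser`, and the sampling details are placeholders rather than this repository's actual interfaces.

```python
import torch

# Generic autoregressive generation; `model` and `tokeniser` are placeholders
# for this repository's objects, not its real API.
@torch.no_grad()
def generate(model, tokeniser, prompt: str, max_new_tokens: int = 100) -> str:
    ids = torch.tensor([tokeniser.encode(prompt)])         # (1, seq_len)
    for _ in range(max_new_tokens):
        logits = model(ids)                                 # (1, seq_len, vocab_size)
        probs = torch.softmax(logits[:, -1, :], dim=-1)     # next-token distribution
        next_id = torch.multinomial(probs, num_samples=1)   # sample one token
        ids = torch.cat([ids, next_id], dim=1)
    return tokeniser.decode(ids[0].tolist())

# Example prompt-response cycle (model and tokeniser must be constructed first):
# while True:
#     print(generate(model, tokeniser, input("> ")))
```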

The tokeniser file can be used to train a BPE tokeniser.
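
For reference, BPE training repeatedly merges the most frequent adjacent pair of tokens into a new token until a target number of merges is reached. The toy sketch below illustrates that algorithm; it is not this repository's tokeniser.

```python
from collections import Counter

# Minimal illustration of BPE training: repeatedly merge the most frequent
# adjacent pair. Not this repository's tokeniser implementation.
def train_bpe(text: str, num_merges: int):
    tokens = list(text)        # start from individual characters
    merges = []
    for _ in range(num_merges):
        pairs = Counter(zip(tokens, tokens[1:]))
        if not pairs:
            break
        (a, b), _ = pairs.most_common(1)[0]   # most frequent adjacent pair
        merges.append((a, b))
        merged, i = [], 0
        while i < len(tokens):                # replace every (a, b) with a+b
            if i + 1 < len(tokens) and tokens[i] == a and tokens[i + 1] == b:
                merged.append(a + b)
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged
    return merges

print(train_bpe("low lower lowest", num_merges=5))
```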

About

Implementation of an LLM for my EPQ. Uses an architecture similar to the one proposed in Attention is All You Need, with a few alterations.
