Stars
Fully open reproduction of DeepSeek-R1
Medical NLP competitions, datasets, large models, and papers
[Nature Reviews Bioengineering🔥] Application of Large Language Models in Medicine. A curated list of practical guide resources for Medical LLMs (Medical LLMs Tree, Tables, and Papers)
Semantic Evaluation for Text-to-SQL with Distilled Test Suites
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Implementation of Speculative Sampling as described in "Accelerating Large Language Model Decoding with Speculative Sampling" by DeepMind
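For context, a minimal sketch of the modified rejection-sampling step at the heart of speculative sampling (pure NumPy; `draft_probs`, `target_probs`, and `draft_tokens` are illustrative stand-ins for the two models' outputs, not this repo's API):

```python
import numpy as np

def speculative_step(draft_probs, target_probs, draft_tokens, rng):
    """One verification round of speculative sampling.

    draft_probs:   k vocab distributions the draft model produced.
    target_probs:  k+1 vocab distributions from one target-model pass
                   (the extra one yields a bonus token if all k are kept).
    draft_tokens:  the k tokens the draft model actually sampled.
    """
    out = []
    for i, tok in enumerate(draft_tokens):
        q, p = draft_probs[i], target_probs[i]
        # Accept the draft token with probability min(1, p(tok) / q(tok)).
        if rng.random() < min(1.0, p[tok] / q[tok]):
            out.append(int(tok))
        else:
            # On rejection, resample from the normalized residual
            # max(0, p - q); this keeps the output distribution exactly p.
            residual = np.maximum(p - q, 0.0)
            residual /= residual.sum()
            out.append(int(rng.choice(len(p), p=residual)))
            return out
    # Every draft token accepted: take one bonus token from the target.
    out.append(int(rng.choice(len(target_probs[-1]), p=target_probs[-1])))
    return out
```

Because several draft tokens can be verified with a single target-model forward pass, decoding speeds up while the sampled distribution stays identical to the target model's.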
Aligning Large Language Models with Human: A Survey
Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment
RWKV (pronounced RwaKuv) is an RNN with great LLM performance that can also be directly trained like a GPT transformer (parallelizable). We are at RWKV-7 "Goose". It combines the best of RNN and transformer.
OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
The official gpt4free repository | a collection of powerful language models | o3, deepseek r1, and gpt-4.5
The RedPajama-Data repository contains code for preparing large datasets for training large language models.
Code for the ICLR 2023 paper "GPTQ: Accurate Post-training Quantization of Generative Pretrained Transformers".
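GPTQ itself adds Hessian-based error compensation; as a point of reference, here is a sketch of the per-channel round-to-nearest INT4 baseline it improves on (NumPy; names are illustrative, not the repo's API):

```python
import numpy as np

def quantize_rtn(W, n_bits=4):
    """Asymmetric per-row round-to-nearest quantization of a weight
    matrix W of shape (out_features, in_features)."""
    qmax = 2 ** n_bits - 1
    wmin = W.min(axis=1, keepdims=True)
    wmax = W.max(axis=1, keepdims=True)
    scale = np.maximum((wmax - wmin) / qmax, 1e-8)  # one scale per row;
                                                    # epsilon avoids /0
    zero = np.round(-wmin / scale)                  # integer zero-point
    q = np.clip(np.round(W / scale) + zero, 0, qmax).astype(np.uint8)
    return q, scale, zero

def dequantize(q, scale, zero):
    # Reconstruct approximate float weights from the integer codes.
    return (q.astype(np.float32) - zero) * scale
```

GPTQ improves on this by quantizing weights a column at a time and folding each column's rounding error into the not-yet-quantized weights using second-order information, which keeps accuracy close to the FP16 baseline at 3-4 bits.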
Large language models (LLMs) made easy: EasyLM is a one-stop solution for pre-training, fine-tuning, evaluating, and serving LLMs in JAX/Flax.
Central place for the engineering/scaling WG: documentation, SLURM scripts and logs, compute environment and data.
Crosslingual Generalization through Multitask Finetuning
Instruction Tuning with GPT-4
ChatLLaMA 📢 Open-source implementation of a LLaMA-based ChatGPT, runnable on a single GPU; 15x faster training process than ChatGPT
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Open Academic Research on Improving LLaMA to SOTA LLM
Awesome-LLM: a curated list of Large Language Model resources
Alpaca dataset from Stanford, cleaned and curated
Aligning pretrained language models with instruction data that the models generate themselves.
[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters
Instruct-tune LLaMA on consumer hardware
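alpaca-lora fits instruct-tuning onto consumer GPUs by training low-rank adapters (LoRA, via the peft library) instead of the full weight matrices. A minimal sketch of the idea (this `LoRALinear` class is a hypothetical illustration, not peft's API):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen pretrained linear layer plus a trainable low-rank update:
    y = W x + (B A) x * (alpha / rank)."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                 # freeze W and bias
        # A gets a small random init, B starts at zero, so the adapter
        # initially contributes nothing and training starts from the base model.
        self.lora_A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scaling = alpha / rank

    def forward(self, x):
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling
```

Only A and B receive gradients, so gradient and optimizer-state memory shrink dramatically compared with full fine-tuning, which is what makes single-GPU instruct-tuning practical.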