Byte-Pair Encoding (BPE) (subword-based tokenization) algorithm implementations from scratch in Python
Updated Jan 30, 2023 - Python
LLM-inspired BiLSTM pipeline for real-time, multi-label toxicity inference across adversarial discourse modalities.
An Artificial Subword Translation Task
Fast & efficient BPE tokenizer written in C & Python for LLM training