Tokenizing

The process of breaking text into individual linguistic units (tokens). It often includes normalization steps such as removing punctuation and lowercasing all words.
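A minimal sketch of what this could look like in Python, assuming a simple whitespace-based approach; the function name and regular expression are illustrative, not part of the original file:

```python
import re

def tokenize(text):
    # Normalize: lowercase and strip punctuation
    text = text.lower()
    text = re.sub(r"[^\w\s]", "", text)
    # Split on whitespace to produce tokens
    return text.split()

print(tokenize("Hello, world! Tokenizing is fun."))
# ['hello', 'world', 'tokenizing', 'is', 'fun']
```

Real-world tokenizers (e.g. in NLP libraries) are usually more sophisticated, handling contractions, hyphenation, and language-specific rules.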