Tokenizer

< Glossary
NLP

An algorithm that splits text into tokens before feeding it to a language model. Different models use different tokenizers (BPE, SentencePiece, tiktoken).

Related terms