tiktoken is a high-performance tokenizer library (based on byte-pair encoding, BPE) designed for use with OpenAI’s models. It handles encoding and decoding between text and token IDs efficiently, with minimal overhead. Because tokenization is a fundamental step in preparing text for models, tiktoken is optimized for speed, memory use, and correctness in model contexts (e.g. matching OpenAI’s internal tokenization). The repo supports multiple encodings (e.g. “cl100k_base”) and lets users select an encoding by name to match a given model. It also offers an extension mechanism so that custom encodings can be registered. Internally, it pairs core tokenizer logic implemented in Rust with APIs for encoding, decoding, and counting tokens, exposed through Python bindings for easy use.
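As a quick illustration, here is a minimal sketch of the typical Python workflow. The encoding name and sample text are arbitrary choices for the example, but `get_encoding`, `encoding_for_model`, `encode`, and `decode` are the library’s documented entry points:

```python
import tiktoken

# Look up an encoding by name, or resolve the right one for a model.
enc = tiktoken.get_encoding("cl100k_base")
# enc = tiktoken.encoding_for_model("gpt-4")  # alternative: lookup by model name

text = "tiktoken is great!"
token_ids = enc.encode(text)        # text -> list of token IDs
round_trip = enc.decode(token_ids)  # token IDs -> original text

assert round_trip == text
print(len(token_ids), token_ids)
```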
Features
- Fast BPE-based tokenizer for text ↔ token ID conversion
- Support for multiple encoding schemes (e.g. “cl100k_base”)
- APIs to encode, decode, and count tokens efficiently for prompt length control (see the token-counting sketch after this list)
- Extension / plugin mechanism for registering custom encodings (see the custom-encoding sketch after this list)
- Python bindings over a Rust core, for easy integration without sacrificing speed
- Commonly used for cost estimation, prompt truncation, and producing token counts that match what OpenAI models actually see
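Token counting and truncation build directly on `encode`. The helpers below are illustrative (the names `count_tokens` and `truncate_to_limit` and the 4096 limit are our own, not part of tiktoken), but they show the common pattern for keeping a prompt within a model’s context window:

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

def count_tokens(text: str) -> int:
    """Number of tokens the model will see for this text."""
    return len(enc.encode(text))

def truncate_to_limit(text: str, max_tokens: int) -> str:
    """Trim text to at most max_tokens tokens, decoding back to a string."""
    token_ids = enc.encode(text)
    return enc.decode(token_ids[:max_tokens])

prompt = "some very long prompt " * 1000
if count_tokens(prompt) > 4096:       # hypothetical context limit
    prompt = truncate_to_limit(prompt, 4096)
```

Note that truncating on a token boundary and decoding can split a multi-token character sequence, so the tail of the truncated string may differ slightly from the original text.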
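Custom encodings are registered by constructing a `tiktoken.Encoding`. The sketch below follows the pattern from tiktoken’s own documentation, extending an existing encoding with extra special tokens; the name `cl100k_custom` and the two chat-delimiter tokens are example choices, and the underscored attributes (`_pat_str`, `_mergeable_ranks`, `_special_tokens`) are private, so this mirrors the upstream example rather than a stable public API:

```python
import tiktoken

cl100k_base = tiktoken.get_encoding("cl100k_base")

# Build a new Encoding that reuses cl100k_base's BPE ranks and split regex,
# adding two example special tokens on top of the existing ones.
enc = tiktoken.Encoding(
    name="cl100k_custom",
    pat_str=cl100k_base._pat_str,
    mergeable_ranks=cl100k_base._mergeable_ranks,
    special_tokens={
        **cl100k_base._special_tokens,
        "<|im_start|>": 100264,
        "<|im_end|>": 100265,
    },
)

# Special tokens must be explicitly allowed when encoding.
ids = enc.encode("<|im_start|>hello<|im_end|>", allowed_special="all")
```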