tiktoken is a fast BPE tokeniser for use with OpenAI's models
Unifying 3D Mesh Generation with Language Models
Large-language-model & vision-language-model based on Linear Attention
SOTA discrete acoustic codec models with 40/75 tokens per second
Unified Multimodal Understanding and Generation Models
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm
A python tool that uses GPT-4, FFmpeg, and OpenCV
Chinese Llama-3 LLMs developed from Meta Llama 3
Code for the paper "Language Models are Unsupervised Multitask Learners"
Framework that is dedicated to making neural data processing
The IRC's Talking Robot
Robust BERT-based model for English with improved MLM training