Inference framework for 1-bit LLMs
GLM-4.5: Open-source LLM for intelligent agents by Z.ai
Block Diffusion for Ultra-Fast Speculative Decoding
Port of Facebook's LLaMA model in C/C++
Image generation model with single-stream diffusion transformer
tiktoken is a fast BPE tokeniser for use with OpenAI's models
MiMo-V2-Flash: Efficient Reasoning, Coding, and Agentic Foundation
Blazeface is a lightweight model that detects faces in images
Detect faces in an image
A Conversational Speech Generation Model
Encoder of greater-than-word-length text trained on a variety of data
Efficient 13B MoE language model with long context and reasoning modes
OpenAI's compact 20B open model for fast, agentic, and local use
Jan-v1-edge: efficient 1.7B reasoning model optimized for edge devices