Python example app from the OpenAI API quickstart tutorial
Inference framework for 1-bit LLMs
Block Diffusion for Ultra-Fast Speculative Decoding
gpt-oss-120b and gpt-oss-20b are two open-weight language models
Safety reasoning models built upon gpt-oss
A fast, local neural text-to-speech system
Qwen2.5-Coder is the code version of Qwen2.5, the large language model
Di♪♪Rhythm: Blazingly Fast & Simple End-to-End Song Generation
Encoder of greater-than-word-length text trained on a variety of data
Open Multilingual Multimodal Chat LMs
Small 3B-base multimodal model ideal for custom AI on edge hardware
Powerful 14B LLM with strong instruction and long-text handling
Frontier-scale 675B multimodal base model for custom AI training
Quantized 675B multimodal instruct model optimized for NVFP4
Powerful 14B-base multimodal model, a flexible base for fine-tuning
OpenAI’s compact 20B open model for fast, agentic, and local use
OpenAI’s open-weight 120B model optimized for reasoning and tooling
Compact English sentence embedding model for semantic search tasks
Compact 8B multimodal instruct model optimized for edge deployment
Efficient 13B MoE language model with long context and reasoning modes
Russian ASR model fine-tuned on Common Voice and CSS10 datasets
Ultra-efficient 3B multimodal instruct model built for edge deployment
Versatile 8B-base multimodal LLM, flexible foundation for custom AI