Let's make video diffusion practical
Renderer for the harmony response format to be used with gpt-oss
Long-form streaming TTS system for multi-speaker dialogue generation
Language modeling in a sentence representation space
Hackable and optimized Transformers building blocks
Open-Source Financial Large Language Models
Open-source deep-learning framework
Audio foundation model excelling in audio understanding
INT4/INT5/INT8 and FP16 inference on CPU for RWKV language model
MiniMax M2.1, a SOTA model for real-world dev & agents
Open Source Speech Language Model
Continuous Autonomy for the AI SDK
4M: Massively Multimodal Masked Modeling
State-of-the-art (SoTA) text-to-video pre-trained model
Towards Real-World Vision-Language Understanding
Python example app from the OpenAI API quickstart tutorial
A minimal PyTorch re-implementation of the OpenAI GPT
A collection of high-quality models for the MuJoCo physics engine
Per-Pixel Classification is Not All You Need for Semantic Segmentation
A library for Multilingual Unsupervised or Supervised word Embeddings
Efficient 14B multimodal instruct model with edge deployment and FP8