Wan2.2: Open and Advanced Large-Scale Video Generative Model
Multimodal Diffusion with Representation Alignment
Python inference and LoRA trainer package for the LTX-2 audio–video
Advancing Open-source World Models
A trainable PyTorch reproduction of AlphaFold 3
LLM-based reinforcement learning audio editing model
Industrial-level controllable zero-shot text-to-speech system
Controllable & emotion-expressive zero-shot TTS
DeepSeek Coder: Let the Code Write Itself
A Pragmatic VLA Foundation Model
Advanced language and coding AI model
HY-Motion model for 3D character animation generation
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
A Powerful Native Multimodal Model for Image Generation
Provides convenient access to the Anthropic REST API from any Python 3
OpenTinker is an RL-as-a-Service infrastructure for foundation models
tiktoken is a fast BPE tokeniser for use with OpenAI's models
Tooling for the Common Objects In 3D dataset
GPT4V-level open-source multi-modal model based on Llama3-8B
High-compute ultra-reasoning model surpassing GPT-5
High-efficiency reasoning and agentic intelligence model