Designed for text embedding and ranking tasks
Video Object and Interaction Deletion
A Pragmatic VLA Foundation Model
Global weather forecasting model using graph neural networks and JAX
Provides convenient access to the Anthropic REST API from any Python 3 application
A Powerful Native Multimodal Model for Image Generation
Repo of Qwen2-Audio chat & pretrained large audio language model
CogView4, CogView3-Plus and CogView3 (ECCV 2024)
Multi-modal large language model designed for audio understanding
GLM-4 series: Open Multilingual Multimodal Chat LMs
Open-source multi-speaker long-form text-to-speech model
Large language model and vision-language model based on Linear Attention
Open-source image generative foundation model
LLM-based Reinforcement Learning audio edit model
A Multi-Modal World Model for Reconstruction, Generation, and Simulation
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
A trainable PyTorch reproduction of AlphaFold 3
Foundation model for image generation
Tooling for the Common Objects In 3D dataset
Implementation of the Surya Foundation Model for Heliophysics
A 0.1B Omni model trained from scratch
OpenTinker is an RL-as-a-Service infrastructure for foundation models
Hunyuan Translation Model Version 1.5
Multimodal embedding and reranking models built on Qwen3-VL
High-resolution models for human tasks