Open-Source Speech Language Model
Open-source industrial-grade ASR models
A SOTA open-source image editing model
Claude Code image, a one-stop open-source transit service
OCR expert VLM powered by Hunyuan's native multimodal architecture
GPT-4V-level open-source multimodal model based on Llama3-8B
Collection of Gemma 3 variants that are trained for performance
Ling-V2 is a MoE LLM provided and open-sourced by InclusionAI
Open-source deep-learning framework
Multimodal model achieving SOTA performance
A Family of Open Foundation Models for Code Intelligence
Open-source framework for intelligent speech interaction
Revolutionizing Database Interactions with Private LLM Technology
Implementation of the Surya Foundation Model for Heliophysics
Long-form streaming TTS system for multi-speaker dialogue generation
OpenTinker is an RL-as-a-Service infrastructure for foundation models
Diversity-driven optimization and large-model reasoning ability
Audio foundation model excelling in audio understanding
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
High-Fidelity and Controllable Generation of Textured 3D Assets
Multi-modal large language model designed for audio understanding
State-of-the-art (SoTA) text-to-video pre-trained model
Large Multimodal Models for Video Understanding and Editing
Generate Any 3D Scene in Seconds
LLM-based reinforcement learning model for audio editing