Video Object and Interaction Deletion
CogView4, CogView3-Plus and CogView3 (ECCV 2024)
Sharp Monocular Metric Depth in Less Than a Second
An experimental version of the DeepSeek model
The official repo of Qwen chat & pretrained large language model
Open image model at the forefront of design
Accurate × Fast × Comprehensive
Generating Immersive, Explorable, and Interactive 3D Worlds
Miso TTS is an 8-billion-parameter, highly emotive text-to-speech model
Convert Google Gemini web into an OpenAI-compatible API
Python SDK for Claude Agent
MOSS‑TTS Family: open‑source speech and sound generation models
Project Lyra: Open Generative 3D World Models
Open-Source Financial Large Language Models
HY-Motion model for 3D character animation generation
Open-source large language model family from Tencent Hunyuan
A Multi-Modal World Model for Reconstruction, Generation, and Simulation
An Efficient Agentic Model for Computer Use
Tiny vision language model
Open-source image generative foundation model
26M-parameter function-calling model that runs on incredibly small devices
Tool for exploring and debugging transformer model behaviors
CLIP: predict the most relevant text snippet given an image
Multimodal Diffusion with Representation Alignment
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model