High-Resolution Image Synthesis with Latent Diffusion Models
MiroThinker is an open-source deep research agent
Open-source platform for building enterprise-grade agents
Powerful Android AI agent with tools, automation, and Linux shell
Qwen3-ASR is an open-source series of ASR models
Converts text to speech in real time
SOTA Open Source TTS
Large-Scale Agentic RL for High-Performance CUDA Kernel Generation
Multi-Voice and Prompt-Controlled TTS Engine
Toolkit for conversational AI
100–200× Acceleration for Video Diffusion Models
RGBD video generation model conditioned on camera input
Open-source AI model for generating full songs from lyrics prompts
HY-Motion model for 3D character animation generation
CogView4, CogView3-Plus and CogView3 (ECCV 2024)
Implementation of Video Diffusion Models
Implementation of Imagen, Google's Text-to-Image Neural Network
Ultimate meta-skill for generating best-in-class Claude Code skills
Motion-controllable Video Generation via Latent Trajectory Guidance
Code for the paper "Evaluating Large Language Models Trained on Code"
A Unified Framework for Text-to-3D and Image-to-3D Generation
Automatically translates the text of a video based on a subtitle file
Autoregressive Model Beats Diffusion
Empowering Code Generation with OSS-Instruct
StreamSpeech is a seamless model for offline speech recognition