Analyze computation-communication overlap in V3/R1
Qwen3-Omni is a natively end-to-end, omni-modal LLM
Repo for SeedVR2 & SeedVR
LLM-based reinforcement learning audio editing model
The official PyTorch implementation of our paper
Official implementation of Watermark Anything with Localized Messages
FAIR Sequence Modeling Toolkit 2
Open source large language model by Alibaba
AI-powered tool to quickly remove watermarks from images flawlessly
Inference code for scalable emulation of protein equilibrium ensembles
CLIP: Predict the most relevant text snippet given an image
CodeGeeX2: A More Powerful Multilingual Code Generation Model
A state-of-the-art open visual language model
Stable Diffusion with Core ML on Apple Silicon
Learning Continuous Signed Distance Functions for Shape Representation
Pushing the Limits of Mathematical Reasoning in Open Language Models
Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020)
Official implementation of DreamCraft3D
GLIDE: a diffusion-based text-conditional image synthesis model
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
New set of lightweight, state-of-the-art open foundation models
Foundation Models for Time Series
HY-Motion model for 3D character animation generation
A Unified Framework for Text-to-3D and Image-to-3D Generation
OCR expert VLM powered by Hunyuan's native multimodal architecture