GLM-4 series: Open Multilingual Multimodal Chat LMs
Sharp Monocular Metric Depth in Less Than a Second
Language modeling in a sentence representation space
Generating Immersive, Explorable, and Interactive 3D Worlds
Bring the notion of Model-as-a-Service to life
SoTA open-source TTS
Audio Language Models are Few-Shot Learners
The ultimate RAG for your monorepo
Any model. Any hardware. Zero compromise
Open-source RAG framework for building scalable modular AI apps
Agent-ready RPA suite with a visual workflow automation engine
On-premises, OCR-free unstructured data extraction
An open-source, modern-design AI training tracking and visualization tool
Open-source industrial-grade ASR models
AI-Powered Wiki Generator for GitHub/GitLab/Bitbucket Repositories
Fast-stable-diffusion + DreamBooth
Ultimate meta-skill for generating best-in-class Claude Code skills
Hunyuan Translation Model Version 1.5
Persistent context and multi-instance coordination
Block Diffusion for Ultra-Fast Speculative Decoding
Multimodal embedding and reranking models built on Qwen3-VL
"Big Model": training a visual multimodal VLM with 26M parameters
Fast and accurate AI-powered file content type detection
Implementation of "MobileCLIP" (CVPR 2024)
VMZ: Model Zoo for Video Modeling