Contexts Optical Compression
Bidirectional token-classification model for detecting personally identifiable information (PII)
Open Source Speech Language Model
Open-source multi-speaker long-form text-to-speech model
Diffusion Transformer with Fine-Grained Chinese Understanding
Large language model and vision-language model based on linear attention
Visual Causal Flow
OCR expert VLM powered by Hunyuan's native multimodal architecture
The official repo of the Qwen chat and pretrained large language models
Ultra-Efficient LLMs on End Devices
Audio foundation model excelling in audio understanding
Multi-modal large language model designed for audio understanding
Large Multimodal Models for Video Understanding and Editing
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Official implementation of DreamCraft3D
AI-powered tool for quickly removing watermarks from images
A Conversational Speech Generation Model
Dataset of GPT-2 outputs for research in detection, biases, and more