Contexts Optical Compression
A multimodal model for brain response prediction
Bidirectional token-classification model for identifiable info
Open Source Speech Language Model
Open-source multi-speaker long-form text-to-speech model
Visual Causal Flow
Diffusion Transformer with Fine-Grained Chinese Understanding
Large language model and vision-language model based on linear attention
The official repo of the Qwen chat and pretrained large language models
Ultra-Efficient LLMs on End Device
Multimodal model achieving SOTA performance
Audio foundation model excelling in audio understanding
Large Multimodal Models for Video Understanding and Editing
Official implementation of DreamCraft3D
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
A Conversational Speech Generation Model
Encoder of greater-than-word length text trained on a variety of data
Dataset of GPT-2 outputs for research in detection, biases, and more
CTC-based forced aligner for audio-text in 158 languages
Omnimodal AI model for agents, coding, and long-context tasks
Compact 8B multimodal instruct model optimized for edge deployment
ClinicalBERT model trained on MIMIC notes for clinical NLP tasks
Small 3B-base multimodal model ideal for custom AI on edge hardware
Efficient 14B multimodal instruct model with edge deployment and FP8