LLM training code for MosaicML foundation models
Code and models for ICML 2024 paper, NExT-GPT
Diffusion Transformer with Fine-Grained Chinese Understanding
OCR expert VLM powered by Hunyuan's native multimodal architecture
Multilingual sentence & image embeddings with BERT
Long-form streaming TTS system for multi-speaker dialogue generation
The official repo of Qwen chat & pretrained large language model
Faster and easier training and deployments
LLM-based Reinforcement Learning audio edit model
An opinionated CLI to transcribe audio files with Whisper on-device
Biomni: a general-purpose biomedical AI agent
Visual Causal Flow
Adding guardrails to large language models
Qwen3 is the large language model series developed by the Qwen team
Official PyTorch Implementation
ChatGPT extension for scientific research work
A TTS model capable of generating ultra-realistic dialogue
Repo of Qwen2-Audio chat & pretrained large audio language model
Autonomous LLM agent for end-to-end data science workflows
Stable Diffusion web UI
State-of-the-art diffusion models for image and audio generation
LongBench v2 and LongBench (ACL '25 & '24)
A Python package for segmenting geospatial data with the Segment Anything Model (SAM)
Flexible Photo Recrafting While Preserving Your Identity
Large Multimodal Models for Video Understanding and Editing