CodeGeeX2: A More Powerful Multilingual Code Generation Model
Ring is a reasoning MoE LLM provided and open-sourced by InclusionAI
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
Open-source large language model family from Tencent Hunyuan
Phi-3.5 for Mac: Locally-run Vision and Language Models
GLM-4-Voice | End-to-End Chinese-English Conversational Model
CogView4, CogView3-Plus and CogView3 (ECCV 2024)
A Customizable Image-to-Video Model based on HunyuanVideo
A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming
Chat & pretrained large audio language model proposed by Alibaba Cloud
Repo of Qwen2-Audio chat & pretrained large audio language model
Let's make video diffusion practical
Towards Real-World Vision-Language Understanding
Tool for exploring and debugging transformer model behaviors
Sharp Monocular Metric Depth in Less Than a Second
A Powerful Native Multimodal Model for Image Generation
A series of math-specific large language models in the Qwen2 family
Inference framework for 1-bit LLMs
PyTorch code and models for the DINOv2 self-supervised learning method
Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
Official implementation of DreamCraft3D
A state-of-the-art open visual language model
Open-weight, large-scale hybrid-attention reasoning model
Programmatic access to the AlphaGenome model
ChatGPT interface with better UI