Wan2.1: Open and Advanced Large-Scale Video Generative Model
ChatGPT interface with better UI
Official inference repo for FLUX.2 models
An experimental version of DeepSeek model
CLIP: Predict the most relevant text snippet given an image
GLM-4 series: Open Multilingual Multimodal Chat LMs
Visual Causal Flow
Diversity-driven optimization and large-model reasoning ability
High-Fidelity and Controllable Generation of Textured 3D Assets
OCR expert VLM powered by Hunyuan's native multimodal architecture
Accurate × Fast × Comprehensive
Inference code for scalable emulation of protein equilibrium ensembles
Let's make video diffusion practical
Long-form streaming TTS system for multi-speaker dialogue generation
Ling is a MoE LLM provided and open-sourced by InclusionAI
Large Multimodal Models for Video Understanding and Editing
Repo of Qwen2-Audio chat & pretrained large audio language model
ChatGLM-6B: An Open Bilingual Dialogue Language Model
Repo for SeedVR2 & SeedVR
LTX-Video Support for ComfyUI
4M: Massively Multimodal Masked Modeling
PyTorch code and models for the DINOv2 self-supervised learning method
Block Diffusion for Ultra-Fast Speculative Decoding
Designed for text embedding and ranking tasks
Recovering the Visual Space from Any Views