ChatGPT interface with better UI
Release for Improved Denoising Diffusion Probabilistic Models
Let's make video diffusion practical
Official inference repo for FLUX.2 models
An experimental version of the DeepSeek model
PyTorch code and models for the DINOv2 self-supervised learning method
Models for object and human mesh reconstruction
Visual Causal Flow
Large Multimodal Models for Video Understanding and Editing
CLIP: Predict the most relevant text snippet given an image
Diversity-driven optimization and large-model reasoning ability
OCR expert VLM powered by Hunyuan's native multimodal architecture
Repo for SeedVR2 & SeedVR
LTX-Video Support for ComfyUI
Designed for text embedding and ranking tasks
A Powerful Native Multimodal Model for Image Generation
Inference code for scalable emulation of protein equilibrium ensembles
ChatGLM-6B: An Open Bilingual Dialogue Language Model
Ling is a MoE LLM provided and open-sourced by InclusionAI
4M: Massively Multimodal Masked Modeling
Z80-μLM is a 2-bit quantized language model
Accurate × Fast × Comprehensive
A Customizable Image-to-Video Model based on HunyuanVideo
Recovering the Visual Space from Any Views
Block Diffusion for Ultra-Fast Speculative Decoding