Let's make video diffusion practical
PyTorch code and models for the DINOv2 self-supervised learning method
State-of-the-art LLM and coding model
Programmatic access to the AlphaGenome model
LTX-Video Support for ComfyUI
ChatGLM-6B: An Open Bilingual Dialogue Language Model
One-click local MCP server installation in desktop apps
An experimental version of the DeepSeek model
Claude Code mirror: a one-stop open-source relay service
GLM-4.5: Open-source LLM for intelligent agents by Z.ai
Diversity-driven optimization and large-model reasoning ability
Visual Causal Flow
Pure C inference for the Flux 2 image generation model
4M: Massively Multimodal Masked Modeling
Official code base for LeWorldModel: Stable End-to-End Joint-Embedding
Repo for the Qwen2-Audio chat and pretrained large audio language models
Proxy that exposes Antigravity-provided Claude / Gemini models
Ling is a MoE LLM provided and open-sourced by InclusionAI
Access to Anthropic's safety-first language model APIs
CLIP: predict the most relevant text snippet given an image
A 0.1B Omni model trained from scratch
Recovering the Visual Space from Any Views
tiktoken is a fast BPE tokeniser for use with OpenAI's models
Clean and efficient FP8 GEMM kernels with fine-grained scaling
Large Multimodal Models for Video Understanding and Editing