LTX-Video Support for ComfyUI
A Systematic Framework for Interactive World Modeling
ChatGLM-6B: An Open Bilingual Dialogue Language Model
Ling is an MoE LLM provided and open-sourced by InclusionAI
CLIP: Predict the most relevant text snippet given an image
A Powerful Native Multimodal Model for Image Generation
Generate Any 3D Scene in Seconds
Collection of Gemma 3 variants that are trained for performance
tiktoken is a fast BPE tokeniser for use with OpenAI's models
4M: Massively Multimodal Masked Modeling
One-click local MCP server installation in desktop apps
Official implementation of DreamCraft3D
Reproduction of Poetiq's record-breaking submission to the ARC-AGI-1
Designed for text embedding and ranking tasks
The official PyTorch implementation of Google's Gemma models
Repo of Qwen2-Audio chat & pretrained large audio language model
Large Multimodal Models for Video Understanding and Editing
Block Diffusion for Ultra-Fast Speculative Decoding
The ChatGPT Retrieval Plugin lets you easily find personal documents
Inference script for Oasis 500M
Implementation of the Surya Foundation Model for Heliophysics
Towards Ultimate Expert Specialization in Mixture-of-Experts Language
GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning
Qwen2.5-Coder is the code version of Qwen2.5, the large language model
Official code for Style Aligned Image Generation via Shared Attention