Recovering the Visual Space from Any Views
LTX-Video Support for ComfyUI
CogView4, CogView3-Plus and CogView3 (ECCV 2024)
Text and image to video generation: CogVideoX and CogVideo
The official repo of Qwen chat & pretrained large language model
Fast-stable-diffusion + DreamBooth
Achieving 3+ generation speedup on reasoning tasks
A Systematic Framework for Interactive World Modeling
Code for running inference with the SAM 3D Body model (3DB)
Official implementation of Watermark Anything with Localized Messages
MOSS‑TTS Family: open‑source speech and sound generation models
Easy Docker setup for Stable Diffusion with user-friendly UI
HY-Motion model for 3D character animation generation
Sharp Monocular Metric Depth in Less Than a Second
Uncommon Objects in 3D dataset
Generating Immersive, Explorable, and Interactive 3D Worlds
Python bindings for llama.cpp
Large-language-model & vision-language-model based on Linear Attention
Phi-3.5 for Mac: Locally-run Vision and Language Models
Open-Source Financial Large Language Models
Inference script for Oasis 500M
Hackable and optimized Transformers building blocks
tiktoken is a fast BPE tokeniser for use with OpenAI's models
A Multi-Modal World Model for Reconstruction, Generation, and Simulation
Revolutionizing Database Interactions with Private LLM Technology