Multimodal-Driven Architecture for Customized Video Generation
Multimodal Diffusion with Representation Alignment
From Images to High-Fidelity 3D Assets
ICLR 2024 Spotlight: curation/training code, metadata, distribution
Hackable and optimized Transformers building blocks
Official implementation of DreamCraft3D
An experimental version of DeepSeek model
LTX-Video Support for ComfyUI
A Systematic Framework for Interactive World Modeling
AlphaFold 3 inference pipeline
Video Object and Interaction Deletion
Code for Mesh R-CNN, ICCV 2019
Open-Source Financial Large Language Models
A Production-ready Reinforcement Learning AI Agent Library
Models for object and human mesh reconstruction
Z80-μLM is a 2-bit quantized language model
An Efficient Agentic Model for Computer Use
The official PyTorch implementation of Google's Gemma models
Revolutionizing Database Interactions with Private LLM Technology
Diffusion Transformer with Fine-Grained Chinese Understanding
Pokee Deep Research Model Open Source Repo
Easy Docker setup for Stable Diffusion with user-friendly UI
Bidirectional token-classification model for identifiable info
Pretrained time-series foundation model developed by Google Research
Advancing Open-source World Models