Multimodal Diffusion with Representation Alignment
Multimodal-Driven Architecture for Customized Video Generation
From Images to High-Fidelity 3D Assets
ICLR 2024 Spotlight: curation/training code, metadata, distribution
Hackable and optimized Transformers building blocks
Official implementation of DreamCraft3D
An Efficient Agentic Model for Computer Use
An experimental version of the DeepSeek model
Open-Source Financial Large Language Models
Open-source repo for the Pokee Deep Research model
My personal Claude Code configuration
The official PyTorch implementation of Google's Gemma models
Diffusion Transformer with Fine-Grained Chinese Understanding
FlashMLA: Efficient Multi-head Latent Attention Kernels
The ChatGPT Retrieval Plugin lets you easily find personal documents
Pushing the Limits of Mathematical Reasoning in Open Language Models
Learning to Act by Watching Unlabeled Online Videos
Flagship MoE model for long-context agents and complex coding
Omnimodal AI model for agents, coding, and long-context tasks