Collection of Gemma 3 variants that are trained for performance
Official repository for LTX-Video
VMZ: Model Zoo for Video Modeling
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
State-of-the-art (SoTA) text-to-video pre-trained model
A Unified Framework for Text-to-3D and Image-to-3D Generation
CogView4, CogView3-Plus and CogView3 (ECCV 2024)
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
Capable of understanding text, audio, vision, video
Repo for SeedVR2 & SeedVR
AlphaFold 3 inference pipeline
Hackable and optimized Transformers building blocks
PyTorch code and models for the DINOv2 self-supervised learning method
Official implementation of DreamCraft3D
Controllable & emotion-expressive zero-shot TTS
Pokee Deep Research Model Open Source Repo
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
Tiny vision language model
Tool for exploring and debugging transformer model behaviors
Multimodal Diffusion with Representation Alignment
Generate Any 3D Scene in Seconds
This repository contains the official implementation of FastVLM
Foundation Models for Time Series
A Production-ready Reinforcement Learning AI Agent Library