Powerful AI language model (MoE) optimized for efficiency/performance
NVIDIA Isaac GR00T N1.5 is the world's first open foundation model
Models for object and human mesh reconstruction
Diffusion Transformer with Fine-Grained Chinese Understanding
MapAnything: Universal Feed-Forward Metric 3D Reconstruction
RGBD video generation model conditioned on camera input
A Customizable Image-to-Video Model based on HunyuanVideo
Pokee Deep Research Model Open Source Repo
Multimodal-Driven Architecture for Customized Video Generation
Reference PyTorch implementation and models for DINOv3
Tencent Hunyuan Multimodal diffusion transformer (MM-DiT) model
Pushing the Limits of Mathematical Reasoning in Open Language Models
FlashMLA: Efficient Multi-head Latent Attention Kernels
Phi-3.5 for Mac: Locally-run Vision and Language Models
Official code for Style Aligned Image Generation via Shared Attention
PyTorch code and models for the DINOv2 self-supervised learning method
Stable Diffusion WebUI Forge is a platform on top of Stable Diffusion
Open-source, high-performance Mixture-of-Experts large language model
Open-Source Financial Large Language Models!
800,000 step-level correctness labels on LLM solutions to MATH problems
Code release for ConvNeXt V2 model
Code release for Masked-attention Mask Transformer
Generate embeddings from large-scale graph-structured data