VGGSfM: Visual Geometry Grounded Deep Structure From Motion
GPT4V-level open-source multi-modal model based on Llama3-8B
A SOTA open-source image editing model
Multi-modal large language model designed for audio understanding
Open-source framework for intelligent speech interaction
Large Multimodal Models for Video Understanding and Editing
LLM-based Reinforcement Learning audio edit model
Real-time behaviour synthesis with MuJoCo, using Predictive Control
High-Resolution Image Synthesis with Latent Diffusion Models
Di♪♪Rhythm: Blazingly Fast & Simple End-to-End Song Generation
Towards Ultimate Expert Specialization in Mixture-of-Experts Language
Example Discord bot written in Python that uses the completions API
This repository contains the official implementation of research
Official PyTorch Implementation of "Scalable Diffusion Models"
A method to increase the speed and lower the memory footprint
A collection of high-quality models for the MuJoCo physics engine
Reference implementation of the Transformer architecture optimized
Per-Pixel Classification is Not All You Need for Semantic Segmentation
An implementation of model parallel GPT-2 and GPT-3-style models
Code for reproducing key results in the paper
High-compute ultra-reasoning model surpassing GPT-5
High-efficiency reasoning and agentic intelligence model