A Multi-Modal World Model for Reconstruction, Generation, and Simulation
State-of-the-art LLM and coding model
Audio foundation model excelling in audio understanding
Large Multimodal Models for Video Understanding and Editing
Official code base for LeWorldModel: Stable End-to-End Joint-Embedding
Claude Code mirror, a one-stop open-source relay service
Official implementation of Watermark Anything with Localized Messages
Genome modeling and design across all domains of life
Foundational Models for State-of-the-Art Speech and Text Translation
Analyze computation-communication overlap in V3/R1
NVIDIA Isaac GR00T N1.5 is the world's first open foundation model
Pokee Deep Research Model Open Source Repo
FAIR Sequence Modeling Toolkit 2
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
Open-source framework for intelligent speech interaction
RGBD video generation model conditioned on camera input
A CNN model that predicts human joints from RGB images of a person
Pushing the Limits of Mathematical Reasoning in Open Language Models
TN 365 KDE distribution: modern and stable!
Real-time behaviour synthesis with MuJoCo, using Predictive Control
Example Discord bot written in Python that uses the completions API
A fast, local neural text-to-speech system
Code for the paper Hybrid Spectrogram and Waveform Source Separation
Let us control diffusion models
PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech)