Implementation of the Surya Foundation Model for Heliophysics
Deep learning optimization library: makes distributed training easy
This repo contains the code for a 1D tokenizer and generator
A Universal Customization Method for Single and Multi Conditioning
A Unified Framework for Image Customization
Flexible Photo Recrafting While Preserving Your Identity
Bailing is a voice dialogue robot similar to GPT-4o
Build Vision Agents quickly with any model or video provider
An Open Source text-to-speech system built by inverting Whisper
Lightning-fast, on-device TTS, running natively via ONNX
A simple, secure MCP-to-OpenAPI proxy server
The most powerful Android RPA agent framework
Implementation of "MobileCLIP" (CVPR 2024)
A fast, powerful, and simple hierarchical vision transformer
Code release for Cut and Learn for Unsupervised Object Detection
High-resolution models for human tasks
Video understanding codebase from FAIR for reproducing video models
Research code artifacts for Code World Model (CWM)
A Unified Framework for Text-to-3D and Image-to-3D Generation
Multimodal-Driven Architecture for Customized Video Generation
The NVIDIA AgentIQ toolkit is an open-source library
Official Repo For "Sa2VA: Marrying SAM2 with LLaVA"
LLM-based agent for general purpose software engineering tasks
Advanced techniques for RAG systems
The best ChatGPT that $100 can buy