Implementation of "MobileCLIP" (CVPR 2024)
A fast, powerful, and simple hierarchical vision transformer
Code release for Cut and Learn for Unsupervised Object Detection
Official implementation of Watermark Anything with Localized Messages
Training Large Language Models to Reason in a Continuous Latent Space
High-resolution models for human tasks
Video understanding codebase from FAIR for reproducing video models
Ling is a MoE LLM provided and open-sourced by InclusionAI
Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible
Research code artifacts for Code World Model (CWM)
A Unified Framework for Text-to-3D and Image-to-3D Generation
Multimodal-Driven Architecture for Customized Video Generation
Personalize Any Characters with a Scalable Diffusion Transformer
The NVIDIA AgentIQ toolkit is an open-source library
Stable Virtual Camera: Generative View Synthesis with Diffusion Models
Extensible AGI Framework
Open Source Generative Process Automation
AI agent that streamlines the entire process of data analysis
LLM-based autonomous agent that conducts comprehensive online research
SWE-agent takes a GitHub issue and tries to automatically fix it
Superfast AI decision making and processing of multi-modal data
Uniform Manifold Approximation and Projection
PyTorch version of Stable Baselines
Gorilla: An API store for LLMs
Low-code framework for building custom LLMs, neural networks