PPTAgent: Generating and Evaluating Presentations
A simple, secure MCP-to-OpenAPI proxy server
The most powerful Android RPA agent framework
Implementation of "MobileCLIP" CVPR 2024
A fast, powerful, and simple hierarchical vision transformer
Code release for Cut and Learn for Unsupervised Object Detection
Training Large Language Models to Reason in a Continuous Latent Space
High-resolution models for human tasks
Video understanding codebase from FAIR for reproducing video models
CLIP: Predict the most relevant text snippet given an image
Ling is an MoE LLM provided and open-sourced by InclusionAI
Research code artifacts for Code World Model (CWM)
A Unified Framework for Text-to-3D and Image-to-3D Generation
Multimodal Diffusion with Representation Alignment
Personalize Any Characters with a Scalable Diffusion Transformer
Talk to Your AI Agents from Anywhere
The NVIDIA AgentIQ toolkit is an open-source library
Stable Virtual Camera: Generative View Synthesis with Diffusion Models
Extensible AGI Framework
Finetune Llama 3.2, Mistral, Phi & Gemma LLMs 2-5x faster
Open Source Generative Process Automation
AI agent that streamlines the entire process of data analysis
A system for quickly generating training data with weak supervision
Uniform Manifold Approximation and Projection
PyTorch version of Stable Baselines