Showing 27 open source projects for "3d vision nvidia"

  • 1
    NVIDIA Isaac GR00T

    NVIDIA Isaac GR00T N1.5 is the world's first open foundation model

    NVIDIA Isaac‑GR00T N1.5 is an open-source foundation model engineered for generalized humanoid robot reasoning and manipulation skills. It accepts multimodal inputs—such as language and images—and uses a diffusion transformer architecture built upon vision-language encoders, enabling adaptive robot behaviors across diverse environments.
    Downloads: 1 This Week
  • 2
    NVIDIA Model Optimizer

    A unified library of SOTA model optimization techniques

    Model Optimizer is a unified library that provides state-of-the-art techniques for compressing and optimizing deep learning models to improve inference efficiency and deployment performance. It brings together multiple optimization strategies such as quantization, pruning, distillation, and speculative decoding into a single cohesive framework. The library is designed to reduce model size and computational requirements while maintaining accuracy, making it particularly valuable for deploying...
    Downloads: 2 This Week
  • 3
    NVIDIA NeMo Framework

    Scalable generative AI framework built for researchers and developers

    NVIDIA NeMo is a scalable, cloud-native generative AI framework aimed at researchers and PyTorch developers working on large language models, multimodal models, and speech AI (ASR and TTS), with growing support for computer vision. It provides collections of domain-specific modules and reference implementations that make it easier to pre-train, fine-tune, and deploy very large models on multi-GPU and multi-node infrastructure.
    Downloads: 0 This Week
  • 4
    MESHROOM

    3D reconstruction software

    Photogrammetry is the science of making measurements from photographs. It infers the geometry of a scene from a set of unordered photographs or videos. A photograph is the projection of a 3D scene onto a 2D plane, which loses depth information. The goal of photogrammetry is to reverse this process. The dense model of the scene is produced by chaining two computer-vision pipelines, “Structure-from-Motion” (SfM) and “Multi View Stereo” (MVS). Features: fusion of multi-bracketed LDR images into HDR, alignment of panorama images. ...
    Downloads: 146 This Week
  • 5
    NVIDIA Generative AI Examples

    Generative AI reference workflows

    NVIDIA GenerativeAIExamples is an open-source repository that provides practical reference implementations and example workflows for building generative AI applications using NVIDIA’s software ecosystem. The project is designed to help developers accelerate the development of AI applications by providing ready-to-run pipelines, notebooks, and tools that demonstrate how to integrate large language models into real-world systems.
    Downloads: 1 This Week
  • 6
    CUDA Containers for Edge AI & Robotics

    Machine Learning Containers for NVIDIA Jetson and JetPack-L4T

    ...The project is particularly useful for developers building edge AI and robotics systems that rely on GPU-accelerated inference and real-time computer vision. By using containerized environments, developers can ensure that their applications run consistently across different Jetson platforms and JetPack versions. The repository also includes build tools and package management utilities that help automate the process of assembling machine learning environments.
    Downloads: 0 This Week
  • 7
    Newton

    An open-source, GPU-accelerated physics simulation engine

    ...Newton supports OpenUSD for modern 3D scene representation and interoperability, making it suitable for complex simulation ecosystems. It is developed as a Linux Foundation project with contributions from major organizations like NVIDIA, Google DeepMind, and Disney Research, highlighting its relevance in cutting-edge robotics and AI development.
    Downloads: 4 This Week
  • 8
    MakeHuman

    This is the main repository for the MakeHuman application

    This is the main source code for the MakeHuman application. See "Getting started" below for instructions on how to get MakeHuman up and running. Mac users should be able to use the same instructions as Windows users, although this has not been thoroughly tested. At the time of writing, the source code is almost ready for a stable release. The testing vision for this code is to build a community release that includes the main application and often-used, user-contributed plug-ins. We...
    Downloads: 41 This Week
  • 9
    TorchIO

    Medical imaging toolkit for deep learning

    ...TorchIO is a Python package containing a set of tools to efficiently read, preprocess, sample, augment, and write 3D medical images in deep learning applications written in PyTorch, including intensity and spatial transforms for data augmentation and preprocessing. Transforms include typical computer vision operations such as random affine transformations and also domain-specific ones such as simulation of intensity artifacts due to MRI magnetic field inhomogeneity.
    Downloads: 0 This Week
  • 10
    CO3D (Common Objects in 3D)

    Tooling for the Common Objects In 3D dataset

    CO3Dv2 (Common Objects in 3D, version 2) is a large-scale 3D computer vision dataset and toolkit from Facebook Research designed for training and evaluating category-level 3D reconstruction methods using real-world data. It builds upon the original CO3Dv1 dataset, expanding both scale and quality—featuring 2× more sequences and 4× more frames, with improved image fidelity, more accurate segmentation masks, and enhanced annotations for object-centric 3D reconstruction. ...
    Downloads: 0 This Week
  • 11
    Kornia

    Open Source Differentiable Computer Vision Library

    Kornia is a differentiable computer vision library for PyTorch. It consists of a set of routines and differentiable modules to solve generic computer vision problems. At its core, the package uses PyTorch as its main backend, both for efficiency and to take advantage of reverse-mode auto-differentiation to define and compute the gradients of complex functions. Inspired by existing packages, this library is composed of a subset of packages containing operators that can be inserted within...
    Downloads: 0 This Week
  • 12
    Mesh R-CNN

    Code for Mesh R-CNN, ICCV 2019

    Mesh R-CNN is a 3D reconstruction and object understanding framework developed by Facebook Research that extends Mask R-CNN into the 3D domain. Built on top of Detectron2 and PyTorch3D, Mesh R-CNN enables end-to-end 3D mesh prediction directly from single RGB images. The model learns to detect, segment, and reconstruct detailed 3D mesh representations of objects in natural images, bridging the gap between 2D perception and 3D understanding. Unlike voxel-based or point-based approaches, Mesh...
    Downloads: 0 This Week
  • 13
    OSWorld

    Benchmarking Multimodal Agents for Open-Ended Tasks

    OSWorld is an open-source benchmark environment for evaluating multimodal agents on open-ended tasks in real computer environments. Agents interact with real operating systems and desktop applications, working from observations such as screenshots, and task outcomes are checked with execution-based evaluation.
    Downloads: 0 This Week
  • 14
    The FreeMoCap Project

    Free Motion Capture for Everyone

    FreeMoCap is an open-source markerless motion capture system that enables users to record human movement using ordinary cameras and convert the footage into usable 3D motion data. The project’s goal is to democratize motion capture by removing the need for expensive suits or proprietary studio hardware, instead relying on computer vision and pose estimation pipelines. It processes synchronized video feeds to reconstruct skeletal motion, which can then be exported for animation, biomechanics research, or creative projects. ...
    Downloads: 2 This Week
  • 15
    UCO3D

    Uncommon Objects in 3D dataset

    uCO3D is a large-scale 3D vision dataset and toolkit centered on turntable videos of everyday objects drawn from the LVIS taxonomy. It provides about 170,000 full videos of object instances rather than still frames, along with per-video annotations including object masks, calibrated camera poses, and multiple flavors of point clouds. Each sequence also ships with a precomputed 3D Gaussian Splat reconstruction, enabling fast, differentiable rendering workflows and modern implicit/point-based modeling experiments. ...
    Downloads: 0 This Week
  • 16
    CogAgent

    An open-source end-to-end VLM-based GUI Agent

    ...The model is designed for agent-style execution rather than freeform chat, maintaining a continuous execution history across steps while requiring a fresh session for each new task. Inference supports BF16 on NVIDIA GPUs, with optional INT8 and INT4 modes available but with noted performance loss at INT4; example CLIs and a web demo illustrate bounding-box outputs and operation categories.
    Downloads: 2 This Week
  • 17
    HomeRobot

    Mobile manipulation research tools for roboticists

    ...It provides interfaces for Detic, Grounded-SAM, and Contact-GraspNet, allowing open-vocabulary detection and 3D grasping.
    Downloads: 1 This Week
  • 18
    Lingvo

    Framework for building neural networks

    Lingvo is a TensorFlow-based framework focused on building and training sequence models, especially for language and speech tasks. It was originally developed for internal research and later open-sourced to support reproducible experiments and shared model implementations. The framework provides a structured way to define models, input pipelines, and training configurations using a common interface for layers, which encourages reuse across different tasks. It has been used to implement state...
    Downloads: 0 This Week
  • 19
    PIFuHD

    High-Resolution 3D Human Digitization from A Single Image

    PIFuHD (Pixel-Aligned Implicit Function for 3D human reconstruction at high resolution) is a method and codebase to reconstruct high-fidelity 3D human meshes from a single image. It extends prior PIFu work by increasing resolution and detail, enabling fine geometry in cloth folds, hair, and subtle surface features. The method operates by learning an implicit occupancy / surface function conditioned on the image and camera projection; at inference time it queries dense points to reconstruct a...
    Downloads: 2 This Week
  • 20
    DensePose

    A real-time approach for mapping all human pixels of 2D RGB images

    DensePose is a computer vision system that maps all human pixels in an RGB image to the 3D surface of a human body model. It extends human pose estimation from predicting joint keypoints to providing dense correspondences between 2D images and a canonical 3D mesh (such as the SMPL model). This enables detailed understanding of human shape, motion, and surface appearance directly from images or videos.
    Downloads: 1 This Week
  • 21
    VideoPose3D

    Efficient 3D human pose estimation in video using 2D keypoint

    ...By using only 2D detections (such as those from OpenPose or Detectron), it enables markerless 3D pose estimation with relatively lightweight computational requirements. The framework includes pretrained models, data preprocessing utilities, visualization tools, and evaluation scripts for standard benchmarks like Human3.6M. VideoPose3D has been used widely in computer vision research for human motion understanding, activity recognition, and animation generation.
    Downloads: 1 This Week
  • 22
    aioulinux

    Linux for Arduino and Makers developers

    Hello, I'm the Aioulinux founder, eager to professionally revive the project. Since 2018, the demand for an IoT and Arduino-tailored environment has been evident. Seeking partners for a 2024 version targeting schools and IoT companies, aiming for a secure and comprehensive platform. If you share this vision and wish to collaborate, reach out. Let's revive Aioulinux stronger than ever! Now seeking partners: Live Distro Specialist: Expert in live distributions to ensure...
    Downloads: 3 This Week
  • 23
    House3D

    A Realistic and Rich 3D Environment

    House3D is a large-scale virtual 3D simulation environment designed to support research in embodied AI, reinforcement learning, and vision-language navigation. It provides more than 45,000 richly annotated indoor scenes sourced from the SUNCG dataset, covering diverse architectural layouts such as studios, multi-floor homes, and spaces with detailed furnishings and room types.
    Downloads: 0 This Week
  • 24
    RoboComp
    RoboComp is a robotics framework providing a set of open-source, distributed, real-time robotic and artificial vision software components and the necessary tools to create and manage them.
    Downloads: 0 This Week
  • 25
    The Vision Egg produces 2D and 3D visual stimuli on commodity (or workstation) video cards using hardware-accelerated OpenGL. It is built for precise timing, precise color and luminance specification, and real-time control of graphics.
    Downloads: 33 This Week