Showing 25 open source projects for "2d 3d"

View related business solutions
  • Find Hidden Risks in Windows Task Scheduler Icon
    Find Hidden Risks in Windows Task Scheduler

    Free diagnostic script reveals configuration issues, error patterns, and security risks. Instant HTML report.

    Windows Task Scheduler might be hiding critical failures. Download the free JAMS diagnostic tool to uncover problems before they impact production—get a color-coded risk report with clear remediation steps in minutes.
    Download Free Tool
  • AI-generated apps that pass security review Icon
    AI-generated apps that pass security review

    Stop waiting on engineering. Build production-ready internal tools with AI—on your company data, in your cloud.

    Retool lets you generate dashboards, admin panels, and workflows directly on your data. Type something like “Build me a revenue dashboard on my Stripe data” and get a working app with security, permissions, and compliance built in from day one. Whether on our cloud or self-hosted, create the internal software your team needs without compromising enterprise standards or control.
    Try Retool free
  • 1
    InsightFace

    InsightFace

    State-of-the-art 2D and 3D Face Analysis Project

    State-of-the-art deep face analysis library. InsightFace is an open-source 2D&3D deep face analysis library. InsightFace is an integrated Python library for 2D&3D face analysis. InsightFace efficiently implements a wide variety of state-of-the-art algorithms for face recognition, face detection, and face alignment, which are optimized for both training and deployment. Research institutes and industrial organizations can get benefits from InsightFace library.
    Downloads: 374 This Week
    Last Update:
    See Project
  • 2
    SAM 3D Objects

    SAM 3D Objects

    Models for object and human mesh reconstruction

    SAM 3D Objects is a foundation model that reconstructs full 3D geometry, texture, and spatial layout of objects and scenes from a single image. Given one RGB image and object masks (for example, from the Segment Anything family), it can generate a textured 3D mesh for each object, including pose and approximate scene layout. The model is specifically designed to be robust in real-world images with clutter, occlusions, small objects, and unusual viewpoints, where many earlier 3D-from-image...
    Downloads: 30 This Week
    Last Update:
    See Project
  • 3
    Step1X-3D

    Step1X-3D

    High-Fidelity and Controllable Generation of Textured 3D Assets

    Step1X-3D is an open-source framework for generating high-fidelity textured 3D assets from scratch — both their geometry and surface textures — using modern generative AI techniques. It combines a hybrid architecture: a geometry generation stage using a VAE-DiT model to output a watertight 3D representation (e.g. TSDF surface), and a texture synthesis stage that conditions on geometry and optionally reference input (or prompts) to produce view-consistent textures using a diffusion-based...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 4
    Mesh R-CNN

    Mesh R-CNN

    code for Mesh R-CNN, ICCV 2019

    Mesh R-CNN is a 3D reconstruction and object understanding framework developed by Facebook Research that extends Mask R-CNN into the 3D domain. Built on top of Detectron2 and PyTorch3D, Mesh R-CNN enables end-to-end 3D mesh prediction directly from single RGB images. The model learns to detect, segment, and reconstruct detailed 3D mesh representations of objects in natural images, bridging the gap between 2D perception and 3D understanding. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • 5
    MESHROOM

    MESHROOM

    3D reconstruction software

    Photogrammetry is the science of making measurements from photographs. It infers the geometry of a scene from a set of unordered photographies or videos. Photography is the projection of a 3D scene onto a 2D plane, losing depth information. The goal of photogrammetry is to reverse this process. The dense modeling of the scene is the result yielded by chaining two computer vision-based pipelines, “Structure-from-Motion” (SfM) and “Multi View Stereo” (MVS). Fusion of Multi-bracketing LDR images into HDR. Alignment of panorama images. ...
    Downloads: 104 This Week
    Last Update:
    See Project
  • 6
    Stable Virtual Camera

    Stable Virtual Camera

    Stable Virtual Camera: Generative View Synthesis with Diffusion Models

    Stable Virtual Camera is a multi-view diffusion model developed by Stability AI that transforms 2D images into immersive 3D videos with realistic depth and perspective. Unlike traditional methods that require complex reconstruction or scene-specific optimization, this model allows users to generate novel views from any number of input images and define custom camera trajectories, enabling dynamic exploration of scenes. It supports various aspect ratios and can produce 3D-consistent videos up to 1,000 frames, making it a versatile tool for creators seeking to enhance visual storytelling. ​
    Downloads: 2 This Week
    Last Update:
    See Project
  • 7
    PyTorch3D

    PyTorch3D

    PyTorch3D is FAIR's library of reusable components for deep learning

    ...Researchers use it for tasks like shape generation, reconstruction, view synthesis, and visual reasoning. PyTorch3D also includes utilities for loading, transforming, and sampling 3D assets, so models can be trained end-to-end from 2D supervision or partial data. Its modular design allows easy extension—components like differentiable rasterizers, mesh blending, or signed distance field (SDF) modules can be swapped or combined to test new architectures quickly.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    Video Diffusion - Pytorch

    Video Diffusion - Pytorch

    Implementation of Video Diffusion Models

    ...Implementation of Video Diffusion Models, Jonathan Ho's new paper extending DDPMs to Video Generation - in Pytorch. It uses a special space-time factored U-net, extending generation from 2D images to 3D videos. 14k for difficult moving mnist (converging much faster and better than NUWA) - wip. Any new developments for text-to-video synthesis will be centralized at Imagen-pytorch. For conditioning on text, they derived text embeddings by first passing the tokenized text through BERT-large. You can also directly pass in the descriptions of the video as strings, if you plan on using BERT-base for text conditioning. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 9
    Tracking Any Point (TAP)

    Tracking Any Point (TAP)

    DeepMind model for tracking arbitrary points across videos & robotics

    TAPNet is the official Google DeepMind repository for Tracking Any Point (TAP), bundling datasets, models, benchmarks, and demos for precise point tracking in videos. The project includes the TAP-Vid and TAPVid-3D benchmarks, which evaluate long-range tracking of arbitrary points in 2D and 3D across diverse real and synthetic videos. Its flagship models—TAPIR, BootsTAPIR, and the latest TAPNext—use matching plus temporal refinement or next-token style propagation to achieve state-of-the-art accuracy and speed on TAP-Vid. RoboTAP demonstrates how TAPIR-style tracks can drive real-world robot manipulation via efficient imitation, and ships with a dataset of annotated robotics videos. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • Atera all-in-one platform IT management software with AI agents Icon
    Atera all-in-one platform IT management software with AI agents

    Ideal for internal IT departments or managed service providers (MSPs)

    Atera’s AI agents don’t just assist, they act. From detection to resolution, they handle incidents and requests instantly, taking your IT management from automated to autonomous.
    Learn More
  • 10
    Minigrid

    Minigrid

    Simple and easily configurable grid world environments

    ...Because of its simplicity, it is often used for rapid prototyping, analytic experiments, curriculum learning, or pedagogical tutorials. While it is not a full 3D simulation environment, its strength lies in enabling many environment resets and steps cheaply, which is valuable for algorithmic RL research rather than high-fidelity rendering.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    DeepLabCut

    DeepLabCut

    Implementation of DeepLabCut

    DeepLabCut™ is an efficient method for 2D and 3D markerless pose estimation based on transfer learning with deep neural networks that achieves excellent results (i.e. you can match human labeling accuracy) with minimal training data (typically 50-200 frames). We demonstrate the versatility of this framework by tracking various body parts in multiple species across a broad collection of behaviors.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 12
    Make-A-Video - Pytorch (wip)

    Make-A-Video - Pytorch (wip)

    Implementation of Make-A-Video, new SOTA text to video generator

    ...Passing in images (if one were to pretrain on images first), both temporal convolution and attention will be automatically skipped. In other words, you can use this straightforwardly in your 2d Unet and then port it over to a 3d Unet once that phase of the training is done.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    Godot RL Agents

    Godot RL Agents

    An Open Source package that allows video game creators

    godot_rl_agents is a reinforcement learning integration for the Godot game engine. It allows AI agents to learn how to interact with and play Godot-based games using RL algorithms. The toolkit bridges Godot with Python-based RL libraries like Stable-Baselines3, making it possible to create complex and visually rich RL environments natively in Godot.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    Face Alignment

    Face Alignment

    2D and 3D Face alignment library build using pytorch

    Detect facial landmarks from Python using the world's most accurate face alignment network, capable of detecting points in both 2D and 3D coordinates. Build using FAN's state-of-the-art deep learning-based face alignment method. For numerical evaluations, it is highly recommended to use the lua version which uses identical models with the ones evaluated in the paper. More models will be added soon. By default, the package will use the SFD face detector. However, the users can alternatively use dlib, BlazeFace, or pre-existing ground truth bounding boxes. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    Stable-Dreamfusion

    Stable-Dreamfusion

    Text-to-3D & Image-to-3D & Mesh Exportation with NeRF + Diffusion

    A pytorch implementation of the text-to-3D model Dreamfusion, powered by the Stable Diffusion text-to-2D model. This project is a work-in-progress and contains lots of differences from the paper. The current generation quality cannot match the results from the original paper, and many prompts still fail badly! Since the Imagen model is not publicly available, we use Stable Diffusion to replace it (implementation from diffusers).
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    PIFuHD

    PIFuHD

    High-Resolution 3D Human Digitization from A Single Image

    PIFuHD (Pixel-Aligned Implicit Function for 3D human reconstruction at high resolution) is a method and codebase to reconstruct high-fidelity 3D human meshes from a single image. It extends prior PIFu work by increasing resolution and detail, enabling fine geometry in cloth folds, hair, and subtle surface features. The method operates by learning an implicit occupancy / surface function conditioned on the image and camera projection; at inference time it queries dense points to reconstruct a...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 17
    Point-E

    Point-E

    Point cloud diffusion for 3D model synthesis

    point-e is the official repository for Point-E, a generative model developed by OpenAI that produces 3D point clouds from textual (or image) prompts. Its principal advantage is speed: it can generate 3D assets in just 1–2 minutes on a single GPU, which is significantly faster than many competing text-to-3D models. The model works via a two-stage diffusion approach: first, it uses a text → image diffusion network to produce a synthetic 2D view consistent with the prompt; then a second diffusion model converts that image into a 3D point cloud. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    FrankMocap

    FrankMocap

    A Strong and Easy-to-use Single View 3D Hand+Body Pose Estimator

    FrankMocap is a monocular 3D human capture system that estimates body, hand, and optionally face pose from a single RGB image or video. It regresses parametric human models (e.g., SMPL/SMPL-X) directly, producing temporally stable meshes and joint angles suitable for animation or analytics. The pipeline couples a robust 2D keypoint detector with 3D mesh regression networks and priors that keep results anatomically plausible.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 19
    Gym

    Gym

    Toolkit for developing and comparing reinforcement learning algorithms

    Gym by OpenAI is a toolkit for developing and comparing reinforcement learning algorithms. It supports teaching agents, everything from walking to playing games like Pong or Pinball. Open source interface to reinforce learning tasks. The gym library provides an easy-to-use suite of reinforcement learning tasks. Gym provides the environment, you provide the algorithm. You can write your agent using your existing numerical computation library, such as TensorFlow or Theano. It makes no...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 20
    AI Atelier

    AI Atelier

    Based on the Disco Diffusion, version of the AI art creation software

    ...Copyright and license notices must be preserved. When a modified version is used to provide a service over a network, the complete source code of the modified version must be made available. Create 2D and 3D animations and not only still frames (from Disco Diffusion v5 and VQGAN Animations). Input audio and images for generation instead of just text. Simplify tool setup process on colab, and enable ‘one-click’ sharing of the generated link to other users. Experiment with the possibilities for multi-user access to the same link.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    DensePose

    DensePose

    A real-time approach for mapping all human pixels of 2D RGB images

    ...DensePose is widely used in augmented reality, motion capture, virtual try-on, and visual effects applications because it enables real-time 3D human mapping from 2D inputs. The model architecture builds on Mask R-CNN, using additional regression heads to predict UV coordinates that map image pixels to 3D surfaces.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 22
    VideoPose3D

    VideoPose3D

    Efficient 3D human pose estimation in video using 2D keypoint

    VideoPose3D is a deep learning framework that reconstructs 3D human poses from 2D keypoint sequences extracted from videos. It builds on top of convolutional and temporal networks that map 2D joint coordinates over time to consistent 3D skeletons, enabling robust motion capture without specialized sensors. The model is trained on large motion capture datasets and can generalize well to unseen environments by leveraging temporal context for smoothing and error correction. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    VoteNet

    VoteNet

    Deep Hough Voting for 3D Object Detection in Point Clouds

    VoteNet is a 3D object detection framework for point clouds that combines deep point set networks with a Hough voting mechanism to localize and classify objects in 3D space. It tackles the challenge that object centroids in 3D scenes often don’t lie on any input surface point by having each point “vote” for potential object centers; these votes are then clustered to propose object hypotheses.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    House3D

    House3D

    A Realistic and Rich 3D Environment

    House3D is a large-scale virtual 3D simulation environment designed to support research in embodied AI, reinforcement learning, and vision-language navigation. It provides more than 45,000 richly annotated indoor scenes sourced from the SUNCG dataset, covering diverse architectural layouts such as studios, multi-floor homes, and spaces with detailed furnishings and room types. Each environment includes fully labeled 3D objects, allowing agents to perceive and interact with their surroundings...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    Five video classification methods

    Five video classification methods

    Code that accompanies my blog post outlining five video classification

    Classifying video presents unique challenges for machine learning models. As I’ve covered in my previous posts, video has the added (and interesting) property of temporal features in addition to the spatial features present in 2D images. While this additional information provides us more to work with, it also requires different network architectures and, often, adds larger memory and computational demands.We won’t use any optical flow images. This reduces model complexity, training time, and...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next