Showing 64 open source projects for "3d -engine -framework"

  • 1
    SAM 3D Objects

    Models for object and human mesh reconstruction

    SAM 3D Objects is a foundation model that reconstructs full 3D geometry, texture, and spatial layout of objects and scenes from a single image. Given one RGB image and object masks (for example, from the Segment Anything family), it can generate a textured 3D mesh for each object, including pose and approximate scene layout. The model is specifically designed to be robust in real-world images with clutter, occlusions, small objects, and unusual viewpoints, where many earlier 3D-from-image systems struggle. ...
    Downloads: 9 This Week
  • 2
    SAM 3D Body

    Code for running inference with the SAM 3D Body model (3DB)

    SAM 3D Body is a promptable model for single-image full-body 3D human mesh recovery, designed to estimate detailed human pose and shape from just one RGB image. It reconstructs the full body, including feet and hands, using the Momentum Human Rig (MHR), a parametric mesh representation that decouples skeletal structure from surface shape for more accurate and interpretable results.
    Downloads: 9 This Week
  • 3
    ComfyUI-3D-Pack

    An extensive node suite that enables ComfyUI to process 3D inputs

    ...The package allows the platform to process inputs such as meshes and UV textures and integrate them into generative workflows similar to those used for image and video generation. It incorporates modern 3D generation technologies including neural radiance fields, Gaussian splatting, and other AI-driven reconstruction techniques. Through these nodes, users can convert images into 3D models, manipulate geometry, and experiment with generative 3D workflows inside the visual pipeline editor.
    Downloads: 4 This Week
  • 4
    InsightFace

    State-of-the-art 2D and 3D Face Analysis Project

    InsightFace is an open-source, integrated Python library for 2D and 3D deep face analysis. It efficiently implements a wide variety of state-of-the-art algorithms for face recognition, face detection, and face alignment, optimized for both training and deployment. Research institutes and industrial organizations can benefit from the InsightFace library.
    Downloads: 381 This Week
  • 5
    MESHROOM

    3D reconstruction software

    Photogrammetry is the science of making measurements from photographs. It infers the geometry of a scene from a set of unordered photographs or videos. A photograph is the projection of a 3D scene onto a 2D plane, which loses depth information; the goal of photogrammetry is to reverse this process. The dense model of the scene is produced by chaining two computer-vision pipelines, “Structure-from-Motion” (SfM) and “Multi-View Stereo” (MVS). Other features include fusion of multi-bracketed LDR images into HDR and alignment of panorama images. ...
    Downloads: 102 This Week
  • 6
    Hunyuan3D 2.0

    High-Resolution 3D Assets Generation with Large Scale Diffusion Models

    The Hunyuan3D-2 model, developed by Tencent, is designed for generating high-resolution 3D assets using large-scale diffusion models. This model offers advanced capabilities for creating detailed 3D models, including texture enhancements, multi-view shape generation, and rapid inference for real-time applications. It is particularly useful for industries requiring high-quality 3D content, such as gaming, film, and virtual reality.
    Downloads: 38 This Week
  • 7
    CO3D (Common Objects in 3D)

    Tooling for the Common Objects In 3D dataset

    CO3Dv2 (Common Objects in 3D, version 2) is a large-scale 3D computer vision dataset and toolkit from Facebook Research designed for training and evaluating category-level 3D reconstruction methods using real-world data. It builds upon the original CO3Dv1 dataset, expanding both scale and quality—featuring 2× more sequences and 4× more frames, with improved image fidelity, more accurate segmentation masks, and enhanced annotations for object-centric 3D reconstruction. ...
    Downloads: 0 This Week
  • 8
    TRELLIS.2

    Native and Compact Structured Latents for 3D Generation

    TRELLIS.2 is a cutting-edge open-source model and codebase for high-fidelity 3D asset generation from 2D images, developed to push forward the state of the art in image-to-3D generation. At its core is a novel sparse voxel structure called O-Voxel that jointly encodes both geometry and surface appearance, enabling reconstruction and generation of complex 3D shapes with arbitrary topology, open surfaces, and physically based rendering (PBR) textures.
    Downloads: 18 This Week
  • 9
    PyTorch3D

    PyTorch3D is FAIR's library of reusable components for deep learning

    PyTorch3D is a comprehensive library for 3D deep learning that brings differentiable rendering, geometric operations, and 3D data structures into the PyTorch ecosystem. It’s designed to make it easy to build and train neural networks that work directly with 3D data such as meshes, point clouds, and implicit surfaces. The library provides fast GPU-accelerated implementations of rendering pipelines, transformations, rasterization, and lighting—making it possible to compute gradients through full 3D rendering processes. ...
    Downloads: 1 This Week
  • 10
    HunyuanWorld-Mirror

    Fast and Universal 3D reconstruction model for versatile tasks

    HunyuanWorld-Mirror focuses on fast, universal 3D reconstruction that can ingest varied inputs and produce multiple kinds of 3D outputs. The model accepts combinations of images, camera intrinsics and poses, or even depth cues, then reconstructs consistent 3D geometry suitable for downstream rendering or editing. The pipeline emphasizes both speed and flexibility so creators can go from casual captures to assets without elaborate capture rigs.
    Downloads: 0 This Week
  • 11
    Hunyuan3D-2.1

    From Images to High-Fidelity 3D Assets

    Hunyuan3D-2.1 is Tencent Hunyuan’s advanced 3D asset generation system that produces high-fidelity 3D models with Physically Based Rendering (PBR) textures. It is fully open-source with released model weights, training, and inference code. It improves on prior versions by using a PBR texture pipeline (enabling realistic material effects like reflections and subsurface scattering) and allowing community fine-tuning and extension.
    Downloads: 11 This Week
  • 12
    HY-Motion 1.0

    HY-Motion model for 3D character animation generation

    HY-Motion 1.0 is an open-source, large-scale AI model suite developed by Tencent’s Hunyuan team that generates high-quality 3D human motion from simple text prompts, enabling the automatic production of fluid, diverse, and semantically accurate animations without manual keyframing or rigging. Built on advanced deep learning architectures that combine Diffusion Transformer (DiT) and flow matching techniques, HY-Motion scales these approaches to the billion-parameter level, resulting in strong instruction-following capabilities and richer motion outputs compared to existing open-source models. ...
    Downloads: 6 This Week
  • 13
    UCO3D

    Uncommon Objects in 3D dataset

    uCO3D is a large-scale 3D vision dataset and toolkit centered on turn-table videos of everyday objects drawn from the LVIS taxonomy. It provides about 170,000 full videos of object instances rather than still frames, along with per-video annotations including object masks, calibrated camera poses, and multiple flavors of point clouds. Each sequence also ships with a precomputed 3D Gaussian Splat reconstruction, enabling fast, differentiable rendering workflows and modern implicit/point-based modeling experiments. ...
    Downloads: 4 This Week
  • 14
    Depth Anything 3

    Recovering the Visual Space from Any Views

    Depth Anything 3 is a research-driven project that brings accurate and dense depth estimation to any input image or video, enabling foundational understanding of 3D structure from 2D visual content. Designed to work across diverse scenes, lighting conditions, and image types, it uses advanced neural networks trained on large, heterogeneous datasets, producing depth maps that reveal scene depth relationships and object surfaces with strong fidelity. The model can be applied to photography, AR/VR content creation, robotics perception, and 3D reconstruction workflows, making it versatile across industries and research domains. ...
    Downloads: 5 This Week
  • 15
    BlenderMCP

    Blender Model Context Protocol Integration

    BlenderMCP is a bridge that connects Blender, a 3D modeling and rendering software, with AI systems like Claude through the Model Context Protocol, enabling direct AI-driven interaction with 3D environments. It allows users to control Blender using natural language prompts, effectively turning AI into a co-creator for 3D modeling, scene construction, and asset manipulation. The system establishes a two-way communication channel between Blender and the AI, where commands can be sent and results retrieved in real time. ...
    Downloads: 0 This Week
  • 16
    WorldGen

    Generate Any 3D Scene in Seconds

    WorldGen is an AI model and library that can generate full 3D scenes in a matter of seconds from either text prompts or reference images. It is designed to create interactive environments suitable for games, simulations, robotics research, and virtual reality, rather than just static 3D assets. The core idea is that you describe a world in natural language and WorldGen produces a navigable 3D scene that you can freely explore in 360 degrees, with loop closure so that the space remains consistent as you move around. ...
    Downloads: 0 This Week
  • 17
    Fast3R

    Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass

    Fast3R is Meta AI’s official CVPR 2025 release for “Towards 3D Reconstruction of 1000+ Images in One Forward Pass.” It represents a next-generation feedforward 3D reconstruction model capable of producing dense point clouds and camera poses for hundreds to thousands of images or video frames in a single inference pass—eliminating the need for slow, iterative structure-from-motion pipelines. Built on PyTorch Lightning and extending concepts from DUSt3R and Spann3r, Fast3R unifies multi-view geometry, depth estimation, and camera registration within a single transformer-based architecture. ...
    Downloads: 0 This Week
  • 18
    Map-Anything

    MapAnything: Universal Feed-Forward Metric 3D Reconstruction

    Map-Anything is a universal, feed-forward transformer for metric 3D reconstruction that predicts a scene’s geometry and camera parameters directly from visual inputs. Instead of stitching together many task-specific models, it uses a single architecture that supports a wide range of 3D tasks—multi-image structure-from-motion, multi-view stereo, monocular metric depth, registration, depth completion, and more.
    Downloads: 0 This Week
  • 19
    Stable Virtual Camera

    Stable Virtual Camera: Generative View Synthesis with Diffusion Models

    Stable Virtual Camera is a multi-view diffusion model developed by Stability AI that transforms 2D images into immersive 3D videos with realistic depth and perspective. Unlike traditional methods that require complex reconstruction or scene-specific optimization, this model allows users to generate novel views from any number of input images and define custom camera trajectories, enabling dynamic exploration of scenes. It supports various aspect ratios and can produce 3D-consistent videos up to 1,000 frames, making it a versatile tool for creators seeking to enhance visual storytelling.
    Downloads: 0 This Week
  • 20
    Depth Pro

    Sharp Monocular Metric Depth in Less Than a Second

    ...As a general-purpose monocular depth backbone, Depth Pro slots into 3D reconstruction, relighting, and scene understanding workflows that benefit from metric predictions.
    Downloads: 2 This Week
  • 21
    TorchIO

    Medical imaging toolkit for deep learning

    ...TorchIO is a Python package containing a set of tools to efficiently read, preprocess, sample, augment, and write 3D medical images in deep learning applications written in PyTorch, including intensity and spatial transforms for data augmentation and preprocessing. Transforms include typical computer vision operations such as random affine transformations and also domain-specific ones such as simulation of intensity artifacts due to MRI magnetic field inhomogeneity.
    Downloads: 1 This Week
  • 22
    Tracking Any Point (TAP)

    DeepMind model for tracking arbitrary points across videos & robotics

    TAPNet is the official Google DeepMind repository for Tracking Any Point (TAP), bundling datasets, models, benchmarks, and demos for precise point tracking in videos. The project includes the TAP-Vid and TAPVid-3D benchmarks, which evaluate long-range tracking of arbitrary points in 2D and 3D across diverse real and synthetic videos. Its flagship models—TAPIR, BootsTAPIR, and the latest TAPNext—use matching plus temporal refinement or next-token style propagation to achieve state-of-the-art accuracy and speed on TAP-Vid. RoboTAP demonstrates how TAPIR-style tracks can drive real-world robot manipulation via efficient imitation, and ships with a dataset of annotated robotics videos. ...
    Downloads: 1 This Week
  • 23
    Face Alignment

    2D and 3D face alignment library built using PyTorch

    Detect facial landmarks from Python using the world's most accurate face alignment network, capable of detecting points in both 2D and 3D coordinates. Built using FAN's state-of-the-art deep learning-based face alignment method. For numerical evaluations, it is highly recommended to use the Lua version, which uses the same models as those evaluated in the paper. More models will be added soon. By default, the package uses the SFD face detector, but users can alternatively use dlib, BlazeFace, or pre-existing ground truth bounding boxes. ...
    Downloads: 0 This Week
  • 24
    video2robot

    End-to-end pipeline converting generative videos

    video2robot is an end-to-end open-source pipeline that converts generative video or prompt-driven motion content into executable humanoid robot motion sequences, enabling researchers and developers to go from high-level action descriptions or videos to robot-ready motion data. The pipeline supports both prompt-to-video generation using models like Veo/Sora and video upload processing, followed by human pose extraction through a 3D pose model and retargeting of that motion to robot joints using a general motion retargeting system. This workflow allows users to generate robot motion files that specify joint angles, root positions, and orientations that can be deployed on supported robot platforms (e.g., Unitree models). Video2robot includes scripts for each stage of the pipeline (generation, extraction, conversion, visualization) and can run as a CLI or through a basic web UI.
    Downloads: 1 This Week
  • 25
    Protenix

    A trainable PyTorch reproduction of AlphaFold 3

    Protenix is an open-source, trainable PyTorch reimplementation of AlphaFold 3, developed by ByteDance with the goal of democratizing high-accuracy protein structure prediction for computational biology and drug-discovery research. Protenix provides a complete pipeline for turning protein sequences (with optional MSA / sequence alignment) or structural inputs (e.g. PDB/CIF) into full 3D atomic-level structure predictions. It supports both “full” models and lightweight variants such as “Protenix-Mini,” offering a trade-off between speed/compute cost and predictive accuracy — making structure prediction accessible even in resource-constrained environments. The project also includes support for constraints (e.g., specifying residue- or atom-level contact constraints, or pocket constraints) to guide predictions toward biologically or experimentally relevant conformations, which enhances its utility for tasks like modeling complexes, ligands, or antibody–antigen interactions.
    Downloads: 2 This Week
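
The MESHROOM entry in this listing notes that a photograph projects a 3D scene onto a 2D plane and loses depth information. A minimal sketch of that idea, assuming an idealized pinhole camera with no lens distortion. The function name `project_pinhole` and the `focal_length` parameter are hypothetical illustrations and are not part of Meshroom or any project listed here.

```python
def project_pinhole(point, focal_length=1.0):
    """Project a 3D camera-space point (x, y, z) onto the image plane.

    Assumes an idealized pinhole camera looking down the +z axis.
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    # Perspective division: depth z is divided out and cannot be recovered
    # from the resulting 2D coordinates alone.
    return (focal_length * x / z, focal_length * y / z)


# Two different 3D points that lie on the same viewing ray.
near = (1.0, 2.0, 4.0)
far = (2.0, 4.0, 8.0)  # same direction from the camera, twice as far away

near_px = project_pinhole(near)
far_px = project_pinhole(far)

# Both points land on the same image coordinates, so one image cannot
# distinguish them.
same_pixel = near_px == far_px
print(near_px, far_px, same_pixel)
```

Because both points map to the same image coordinates, a single photograph cannot tell them apart. Recovering depth needs additional views of the scene, which is the role of the SfM and MVS stages named in that entry.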