Showing 53 open source projects for "atom 3d model"

  • 1
    Map-Anything

    MapAnything: Universal Feed-Forward Metric 3D Reconstruction

    ...The model flexibly accepts different input combinations (images, intrinsics, poses, sparse or dense depth) and produces a rich set of outputs including per-pixel 3D points, camera intrinsics, camera poses, ray directions, confidence maps, and validity masks. Its inference path is fully feed-forward with optional mixed-precision and memory-efficient modes, making it practical to scale to long image sequences while keeping latency predictable.
    Downloads: 0 This Week
  • 2
    Tracking Any Point (TAP)

    DeepMind model for tracking arbitrary points across videos & robotics

    TAPNet is the official Google DeepMind repository for Tracking Any Point (TAP), bundling datasets, models, benchmarks, and demos for precise point tracking in videos. The project includes the TAP-Vid and TAPVid-3D benchmarks, which evaluate long-range tracking of arbitrary points in 2D and 3D across diverse real and synthetic videos. Its flagship models—TAPIR, BootsTAPIR, and the latest TAPNext—use matching plus temporal refinement or next-token style propagation to achieve state-of-the-art...
    Downloads: 1 This Week
  • 3
    video2robot

    End-to-end pipeline converting generative videos

    video2robot is an end-to-end open-source pipeline that converts generative video or prompt-driven motion content into executable humanoid robot motion sequences, enabling researchers and developers to go from high-level action descriptions or videos to robot-ready motion data. The pipeline supports both prompt-to-video generation using models like Veo/Sora and video upload processing, followed by human pose extraction through a 3D pose model and retargeting of that motion to robot joints using a general motion retargeting system. This workflow allows users to generate robot motion files that specify joint angles, root positions, and orientations that can be deployed on supported robot platforms (e.g., Unitree models). Video2robot includes scripts for each stage of the pipeline (generation, extraction, conversion, visualization) and can run as a CLI or through a basic web UI.
    Downloads: 0 This Week
  • 4
    VoxelMorph

    Unsupervised Learning for Image Registration

    VoxelMorph is an open-source deep learning framework designed for medical image registration, a process that aligns multiple medical scans into a common spatial coordinate system. Traditional image registration techniques typically rely on optimization procedures that must be executed separately for each pair of images, which can be computationally expensive and slow. VoxelMorph approaches the problem using neural networks that learn to predict deformation fields that transform one image so...
    Downloads: 0 This Week
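The idea of a deformation field in the VoxelMorph entry can be illustrated with a minimal sketch. It assumes a 1D "image" and a hand-written displacement field; in VoxelMorph the field would be predicted by a neural network, which is not modeled here. The function name `warp_1d` and the sample values are hypothetical.

```python
def warp_1d(moving, displacement):
    """Resample `moving` at positions i + displacement[i] using linear interpolation."""
    n = len(moving)
    warped = []
    for i, d in enumerate(displacement):
        # Sampling position in the moving image, clamped to the valid index range.
        pos = min(max(i + d, 0.0), n - 1.0)
        lo = int(pos)            # left neighbour index
        hi = min(lo + 1, n - 1)  # right neighbour index
        frac = pos - lo          # interpolation weight of the right neighbour
        warped.append((1.0 - frac) * moving[lo] + frac * moving[hi])
    return warped


# Hypothetical data: a single bright sample at index 2.
moving = [0.0, 0.0, 1.0, 0.0, 0.0]
# Assumed field: every output sample reads from one position to its right.
displacement = [1.0] * 5
warped = warp_1d(moving, displacement)
print(warped)
```

With this field the bright sample appears one index to the left in the warped signal. A learned registration model would produce a different displacement per voxel so that the warped image matches a fixed reference.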
  • 5
    DeepSpeed

    Deep learning optimization library making distributed training easy

    DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective. DeepSpeed delivers extreme-scale model training for everyone, from data scientists training on massive supercomputers to those training on low-end clusters or even on a single GPU. Using the current generation of GPU clusters with hundreds of devices, DeepSpeed's 3D parallelism can efficiently train deep learning models with trillions of parameters. With just a single GPU, DeepSpeed's ZeRO-Offload can train models with over 10B parameters, 10x bigger than the state of the art, democratizing multi-billion-parameter model training so that many deep learning scientists can explore bigger and better models. ...
    Downloads: 0 This Week
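A rough back-of-the-envelope calculation shows why a 10B-parameter model is hard to fit on one GPU. The sketch assumes mixed-precision training with Adam at 16 bytes of model state per parameter (fp16 weights, fp16 gradients, and fp32 weights, momentum, and variance), the figure commonly quoted for ZeRO. That figure, the even partitioning across devices, and the helper name `model_state_gb` are assumptions, not details from this listing. Activation memory is ignored.

```python
# Assumed per-parameter cost: 2 (fp16 weights) + 2 (fp16 grads) + 12 (fp32 Adam states).
BYTES_PER_PARAM = 2 + 2 + 12


def model_state_gb(num_params, num_devices=1):
    """Model-state memory per device in GB (1 GB = 1e9 bytes), assuming even partitioning."""
    return num_params * BYTES_PER_PARAM / num_devices / 1e9


single_device = model_state_gb(10e9)        # everything resident on one device
partitioned = model_state_gb(10e9, 64)      # states split evenly over 64 devices
print(single_device, partitioned)
```

Under these assumptions the model states alone need about 160 GB on a single device, which is why moving them to CPU memory or partitioning them across devices matters.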
  • 6
    Lingvo

    Framework for building neural networks

    ...Lingvo includes reference models and configurations for domains like machine translation, automatic speech recognition, language modeling, image understanding, and 3D object detection. Centralized hyperparameter configuration files allow researchers to share exact experiment setups so others can retrain and compare results reliably.
    Downloads: 0 This Week
  • 7
    PyG

    Graph Neural Network Library for PyTorch

    ...In addition, it offers easy-to-use mini-batch loaders for operating on many small and single giant graphs, multi-GPU support, DataPipe support, distributed graph learning via Quiver, a large number of common benchmark datasets (based on simple interfaces to create your own), the GraphGym experiment manager, and helpful transforms, both for learning on arbitrary graphs and on 3D meshes or point clouds. All it takes is 10-20 lines of code to get started with training a GNN model (see the next section for a quick tour).
    Downloads: 0 This Week
  • 8
    MLX-Audio

    A text-to-speech, speech-to-text and speech-to-speech library

    MLX-Audio is a speech library built on Apple’s MLX framework and optimized for Apple Silicon machines (M-series Macs). It focuses on text-to-speech and speech-to-speech workflows, with APIs and a command-line interface that make it easy to generate high-quality audio from text. Because it uses MLX and targets Apple Silicon, inference is fast and can take advantage of hardware acceleration and quantization for efficient on-device performance. The project provides a straightforward CLI...
    Downloads: 1 This Week
  • 9
    Diffusers

    State-of-the-art diffusion models for image and audio generation

    Diffusers is the go-to library for state-of-the-art pretrained diffusion models for generating images, audio, and even 3D structures of molecules. Whether you're looking for a simple inference solution or training your own diffusion models, Diffusers is a modular toolbox that supports both. Our library is designed with a focus on usability over performance, simple over easy, and customizability over abstractions. State-of-the-art diffusion pipelines that can be run in inference with just a...
    Downloads: 3 This Week
  • 10
    DeepSpeed

    Deep learning optimization library: makes distributed training easy

    ...Train/Inference on resource constrained GPU systems 4. Achieve unprecedented low latency and high throughput for inference 5. Achieve extreme compression for an unparalleled inference latency and model size reduction with low costs. DeepSpeed offers a confluence of system innovations that have made large-scale DL training effective and efficient, greatly improved ease of use, and redefined the DL training landscape in terms of the scale that is possible. Innovations such as ZeRO, 3D-Parallelism, DeepSpeed-MoE, and ZeRO-Infinity fall under the training pillar.
    Downloads: 0 This Week
  • 11
    Make-A-Video - Pytorch (wip)

    Implementation of Make-A-Video, new SOTA text to video generator

    Implementation of Make-A-Video, the new SOTA text-to-video generator from Meta AI, in Pytorch. They combine pseudo-3D convolutions (axial convolutions) and temporal attention and show much better temporal fusion. Pseudo-3D convolutions aren't a new concept; they have been explored before in other contexts, for example in protein contact prediction as "dimensional hybrid residual networks". The gist of the paper is this: take a SOTA text-to-image model (here they use DALL-E2, but the same learning points would easily apply to Imagen), make a few minor modifications for attention across time and other ways to skimp on the compute cost, do frame interpolation correctly, and get a great video model out. ...
    Downloads: 1 This Week
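The compute saving behind pseudo-3D (axial) convolutions can be sketched with a simple weight count. The sketch assumes a full 3D kernel of size k x k x k is replaced by a k x k spatial convolution applied per frame, followed by a length-k convolution along time. Channel counts, kernel size, and function names are hypothetical, and biases are ignored.

```python
def full_3d_params(c_in, c_out, k):
    """Weights in a dense k x k x k convolution over (time, height, width)."""
    return c_in * c_out * k ** 3


def pseudo_3d_params(c_in, c_out, k):
    """Weights in a k x k spatial conv per frame plus a length-k conv along time."""
    spatial = c_in * c_out * k * k   # 2D convolution applied to each frame
    temporal = c_out * c_out * k     # 1D convolution applied along the time axis
    return spatial + temporal


# Hypothetical layer: 64 input channels, 64 output channels, kernel size 3.
full = full_3d_params(64, 64, 3)
factored = pseudo_3d_params(64, 64, 3)
print(full, factored, full / factored)
```

For this example layer the factored form uses fewer than half the weights of the dense 3D kernel. The spatial part can also be initialized from a pretrained text-to-image model, which is the reuse the description refers to.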
  • 12
    Video Diffusion - Pytorch

    Implementation of Video Diffusion Models

    Implementation of Video Diffusion Models, Jonathan Ho's new paper extending DDPMs to video generation, in Pytorch. It uses a special space-time factored U-net, extending generation from 2D images to 3D videos. 14k for difficult moving mnist (converging much faster and better than NUWA) - wip. Any new developments for text-to-video synthesis will be centralized at...
    Downloads: 0 This Week
  • 13
    Shap-E

    Generate 3D objects conditioned on text or images

    The shap-e repository provides the official code and model release for Shap-E, a conditional generative model designed to produce 3D assets (implicit functions, meshes, neural radiance fields) from text or image prompts. The model is built with a two-stage architecture: first an encoder that maps existing 3D assets into parameterizations of implicit functions, and then a conditional diffusion model trained on those parameterizations to generate new assets. ...
    Downloads: 2 This Week
  • 14
    towhee

    Framework that is dedicated to making neural data processing

    ...You can use our Python API to build a prototype of your pipeline and use Towhee to automatically optimize it for production-ready environments. From images to text to 3D molecular structures, Towhee supports data transformation for nearly 20 different unstructured data modalities. We provide end-to-end pipeline optimizations, covering everything from data decoding/encoding, to model inference, making your pipeline execution 10x faster. Towhee provides out-of-the-box integration with your favorite libraries, tools, and frameworks, making development quick and easy. ...
    Downloads: 0 This Week
  • 15
    Aphantasia

    CLIP + FFT/DWT/RGB = text to image/video

    ...Direct RGB pixel optimization (very stable), depth-based 3D look (courtesy of deKxi, based on AdaBins), complex queries: text and/or image as main prompts, separate text prompts for style and to subtract (avoid) topics. Starting/resuming the process from saved parameters or from an image.
    Downloads: 1 This Week
  • 16
    hloc

    Visual localization made easy with hloc

    ...We provide step-by-step guides to localize with Aachen, InLoc, and to generate reference poses for your own data using SfM. Just download the datasets and you're ready to go! The notebook pipeline_InLoc.ipynb shows the steps for localizing with InLoc. It's much simpler since a 3D SfM model is not needed. We show in pipeline_SfM.ipynb how to run 3D reconstruction for an unordered set of images. This generates reference poses and a nice sparse 3D model suitable for localization with the same pipeline as Aachen.
    Downloads: 1 This Week
  • 17
    Stable-Dreamfusion

    Text-to-3D & Image-to-3D & Mesh Exportation with NeRF + Diffusion

    A pytorch implementation of the text-to-3D model Dreamfusion, powered by the Stable Diffusion text-to-2D model. This project is a work-in-progress and contains lots of differences from the paper. The current generation quality cannot match the results from the original paper, and many prompts still fail badly! Since the Imagen model is not publicly available, we use Stable Diffusion to replace it (implementation from diffusers).
    Downloads: 0 This Week
  • 18
    PIFuHD

    High-Resolution 3D Human Digitization from A Single Image

    PIFuHD (Pixel-Aligned Implicit Function for 3D human reconstruction at high resolution) is a method and codebase to reconstruct high-fidelity 3D human meshes from a single image. It extends prior PIFu work by increasing resolution and detail, enabling fine geometry in cloth folds, hair, and subtle surface features. The method operates by learning an implicit occupancy / surface function conditioned on the image and camera projection; at inference time it queries dense points to reconstruct a mesh via marching cubes. ...
    Downloads: 9 This Week
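The "query dense points, then extract a mesh" step in the PIFuHD entry can be illustrated with a toy sketch. It assumes an analytic sphere in place of the learned, image-conditioned occupancy function, and it only locates the grid cells that straddle the surface. Marching cubes would go on to triangulate those cells; that step is not implemented here. All names and the grid resolution are hypothetical.

```python
def occupancy(x, y, z, radius=0.5):
    """Toy stand-in for the learned network: 1.0 inside a sphere, 0.0 outside."""
    return 1.0 if x * x + y * y + z * z <= radius * radius else 0.0


def query_grid(resolution):
    """Evaluate the occupancy function on a dense grid spanning [-1, 1]^3."""
    step = 2.0 / (resolution - 1)
    coords = [-1.0 + i * step for i in range(resolution)]
    return [[[occupancy(x, y, z) for z in coords] for y in coords] for x in coords]


def surface_cells(grid):
    """Return cells whose 8 corners are neither all inside nor all outside."""
    n = len(grid)
    cells = []
    for i in range(n - 1):
        for j in range(n - 1):
            for k in range(n - 1):
                corners = [grid[i + a][j + b][k + c]
                           for a in (0, 1) for b in (0, 1) for c in (0, 1)]
                if 0.0 < sum(corners) < 8.0:
                    cells.append((i, j, k))
    return cells


grid = query_grid(9)          # 9 samples per axis, spacing 0.25
cells = surface_cells(grid)   # candidate cells for surface extraction
print(len(cells))
```

Raising the grid resolution gives a finer surface at cubic cost in queries, which is one reason high-resolution reconstruction is expensive.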
  • 19
    Point-E

    Point cloud diffusion for 3D model synthesis

    point-e is the official repository for Point-E, a generative model developed by OpenAI that produces 3D point clouds from textual (or image) prompts. Its principal advantage is speed: it can generate 3D assets in just 1–2 minutes on a single GPU, which is significantly faster than many competing text-to-3D models. The model works via a two-stage diffusion approach: first, it uses a text → image diffusion network to produce a synthetic 2D view consistent with the prompt; then a second diffusion model converts that image into a 3D point cloud. ...
    Downloads: 0 This Week
  • 20
    FrankMocap

    A Strong and Easy-to-use Single View 3D Hand+Body Pose Estimator

    FrankMocap is a monocular 3D human capture system that estimates body, hand, and optionally face pose from a single RGB image or video. It regresses parametric human models (e.g., SMPL/SMPL-X) directly, producing temporally stable meshes and joint angles suitable for animation or analytics. The pipeline couples a robust 2D keypoint detector with 3D mesh regression networks and priors that keep results anatomically plausible.
    Downloads: 0 This Week
  • 21
    Menagerie

    A collection of high-quality models for the MuJoCo physics engine

    ...The repository aims to improve reproducibility and quality across robotics research by providing verified models that adhere to consistent design and physical standards. Each model directory contains its 3D assets, MJCF XML definitions, licensing information, and example scenes for visualization and testing. The collection spans a wide range of categories including robotic arms, humanoids, quadrupeds, mobile manipulators, drones, and biomechanical systems. Users can access models directly via the robot_descriptions Python package or by cloning the repository for use in interactive MuJoCo simulations.
    Downloads: 7 This Week
  • 22
    CIPS-3D

    3D-aware GANs based on NeRF (arXiv)

    3D-aware GANs based on NeRF (arXiv). This repository contains the code of the paper, CIPS-3D: A 3D-Aware Generator of GANs Based on Conditionally-Independent Pixel Synthesis. The problem of mirror symmetry refers to the sudden change of the direction of the bangs near the yaw angle of pi/2. We propose to use an auxiliary discriminator to solve this problem. Note that in the initial stage of training, the auxiliary discriminator must dominate the generator more than the main discriminator...
    Downloads: 0 This Week
  • 23
    MeshCNN in PyTorch

    Convolutional Neural Network for 3D meshes in PyTorch

    MeshCNN is a deep learning framework designed specifically for processing 3D triangular mesh data using convolutional neural networks. Unlike traditional CNNs that operate on images or voxel grids, MeshCNN performs convolution operations directly on the edges of mesh structures. This design allows the model to capture geometric relationships between mesh elements while preserving the underlying topology of 3D shapes.
    Downloads: 0 This Week
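Convolution "directly on the edges" of a mesh can be sketched as follows. The sketch assumes a closed triangle mesh given as vertex-index triples and one scalar feature per edge. It gathers, for each edge, the other edges of its incident triangles and combines them with an order-invariant sum. This is a simplification chosen for illustration, not MeshCNN's actual layer, and every name in it is hypothetical.

```python
from collections import defaultdict


def edge_neighbors(faces):
    """Map each undirected edge to the other edges of the triangles it belongs to."""
    edge_faces = defaultdict(list)
    for a, b, c in faces:
        for u, v in ((a, b), (b, c), (c, a)):
            edge_faces[tuple(sorted((u, v)))].append((a, b, c))

    neighbors = {}
    for edge, incident in edge_faces.items():
        ring = []
        for a, b, c in incident:
            for u, v in ((a, b), (b, c), (c, a)):
                other = tuple(sorted((u, v)))
                if other != edge:
                    ring.append(other)
        neighbors[edge] = ring
    return neighbors


def edge_conv(features, neighbors, w_self, w_ring):
    """Weighted self term plus a weighted sum over neighbouring edges (order-invariant)."""
    return {e: w_self * features[e] + w_ring * sum(features[n] for n in ring)
            for e, ring in neighbors.items()}


# A tetrahedron: 4 triangles, 6 edges, every edge shared by exactly 2 triangles.
tetra = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
nbrs = edge_neighbors(tetra)
feats = {e: 1.0 for e in nbrs}
out = edge_conv(feats, nbrs, 0.5, 0.25)
print(out)
```

On a closed manifold mesh every edge has exactly four neighbouring edges, so the receptive field has a fixed size, much like a pixel's neighbourhood in an image CNN.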
  • 24
    Pytorch Points 3D

    Pytorch framework for doing deep learning on point clouds

    Torch Points 3D is a framework for developing and testing common deep learning models to solve tasks related to unstructured 3D spatial data, i.e. point clouds. The framework currently integrates some of the best-published architectures and the most common public datasets for ease of reproducibility. It relies heavily on Pytorch Geometric and the Facebook Hydra library; thanks for the great work!
    Downloads: 0 This Week
  • 25
    DensePose

    A real-time approach for mapping all human pixels of 2D RGB images

    ...DensePose is widely used in augmented reality, motion capture, virtual try-on, and visual effects applications because it enables real-time 3D human mapping from 2D inputs. The model architecture builds on Mask R-CNN, using additional regression heads to predict UV coordinates that map image pixels to 3D surfaces.
    Downloads: 49 This Week