Showing 53 open source projects for "3d in visual basic"

  • 1
    ComfyUI-3D-Pack

    An extensive node suite that enables ComfyUI to process 3D inputs

    ComfyUI-3D-Pack is an extension package for the ComfyUI visual AI workflow environment that enables users to generate and manipulate 3D assets using advanced machine learning techniques. ComfyUI itself is a node-based interface for designing and executing generative AI pipelines, and this extension expands its capabilities by introducing nodes specifically designed for working with three-dimensional data.
    Downloads: 16 This Week
  • 2
    HunyuanWorld 1.0

    Generating Immersive, Explorable, and Interactive 3D Worlds

    ...This approach enables 360° immersive experiences, seamless mesh export for graphics pipelines, and disentangled object representations for enhanced interactivity. The architecture integrates panoramic proxy generation, semantic layering, and hierarchical 3D reconstruction to produce high-quality scene-scale 3D worlds from both text and images. HunyuanWorld-1.0 surpasses existing open-source methods in visual quality and geometric consistency, demonstrated by superior scores in BRISQUE, NIQE, Q-Align, and CLIP metrics.
    Downloads: 7 This Week
  • 3
    AliceVision

    3D Computer Vision Framework

    AliceVision is an open-source photogrammetric computer vision framework designed to reconstruct detailed 3D scenes and camera motion from collections of images or videos. It provides a complete pipeline for structure-from-motion (SfM), multi-view stereo (MVS), and mesh generation, allowing users to convert 2D imagery into accurate 3D models. The framework is built with a strong emphasis on research-grade algorithms while maintaining the robustness required for production environments, making it suitable for industries such as visual effects, cultural heritage preservation, and robotics. ...
    Downloads: 6 This Week
  • 4
    OpenClaw Office

    OpenClaw Office is the visual monitoring and management frontend

    OpenClaw Office is a visual monitoring and management interface designed for the OpenClaw multi-agent system, providing an immersive and interactive way to observe and control autonomous AI agents. It presents agent activity through a virtual office environment, where each agent is represented as an animated entity within a 2D or 3D workspace. The platform enables real-time visualization of agent states, interactions, and workflows, making complex multi-agent coordination easier to understand and debug. ...
    Downloads: 17 This Week
  • 5
    Depth Anything 3

    Recovering the Visual Space from Any Views

    Depth Anything 3 is a research-driven project that brings accurate and dense depth estimation to any input image or video, enabling foundational understanding of 3D structure from 2D visual content. Designed to work across diverse scenes, lighting conditions, and image types, it uses advanced neural networks trained on large, heterogeneous datasets, producing depth maps that reveal scene depth relationships and object surfaces with strong fidelity. The model can be applied to photography, AR/VR content creation, robotics perception, and 3D reconstruction workflows, making it versatile across industries and research domains. ...
    Downloads: 10 This Week
  • 6
    eos

    A lightweight 3D Morphable Face Model library in modern C++

    eos is a lightweight 3D Morphable Face Model fitting library that provides basic functionality to use face models, as well as camera and shape fitting functionality. It's written in modern C++11/14. It provides MorphableModel and PcaModel classes to represent 3DMMs, with basic operations like draw_sample(). It supports the Surrey Face Model (SFM), 4D Face Model (4DFM), Basel Face Model (BFM) 2009 and 2017, and the Liverpool-York Head Model (LYHM) out of the box.
    Downloads: 7 This Week
  • 7
    PyTorch3D

    PyTorch3D is FAIR's library of reusable components for deep learning

    ...Researchers use it for tasks like shape generation, reconstruction, view synthesis, and visual reasoning. PyTorch3D also includes utilities for loading, transforming, and sampling 3D assets, so models can be trained end-to-end from 2D supervision or partial data. Its modular design allows easy extension—components like differentiable rasterizers, mesh blending, or signed distance field (SDF) modules can be swapped or combined to test new architectures quickly.
    Downloads: 1 This Week
  • 8
    Claw3D

    Claw3D is an open source 3D engine built on OpenClaw

    Claw3D is an experimental open-source platform that combines elements of 3D simulation, developer tooling, and AI orchestration by creating an interactive virtual workspace where AI agents can be visualized as active participants in a shared environment. It is designed as a 3D “virtual office” where users can observe, manage, and interact with multiple AI agents performing tasks such as coding, reviewing pull requests, and coordinating workflows in real time.
    Downloads: 6 This Week
  • 9
    LLaMA-Mesh

    Unifying 3D Mesh Generation with Language Models

    LLaMA-Mesh is a research framework that extends large language models so they can understand and generate 3D mesh data alongside text. The system introduces a method for representing 3D meshes in a textual format by encoding vertex coordinates and face definitions as sequences that can be processed by a language model. By serializing 3D geometry into text tokens, the approach allows existing transformer architectures to generate and interpret 3D models without requiring specialized visual tokenizers. ...
    Downloads: 0 This Week
  • 10
    ArtCraft

    Crafting engine for artists, designers, and filmmakers

    ...The project positions itself as an intentional “crafting engine” for artists, designers, and filmmakers who want deeper control over generative media pipelines. Rather than relying purely on text prompts, ArtCraft emphasizes visual manipulation, compositional control, and iterative refinement so creators can treat AI output more like a malleable creative medium. The application is built with performance and responsiveness in mind, enabling users to move between different creative canvases and asset workflows within a unified interface. It aims to support complex multimedia generation workflows including image, video, and potentially 3D content creation, making it useful for experimental filmmaking and advanced visual design.
    Downloads: 16 This Week
  • 11
    Map-Anything

    MapAnything: Universal Feed-Forward Metric 3D Reconstruction

    Map-Anything is a universal, feed-forward transformer for metric 3D reconstruction that predicts a scene’s geometry and camera parameters directly from visual inputs. Instead of stitching together many task-specific models, it uses a single architecture that supports a wide range of 3D tasks—multi-image structure-from-motion, multi-view stereo, monocular metric depth, registration, depth completion, and more.
    Downloads: 2 This Week
  • 12
    OSWorld

    Benchmarking Multimodal Agents for Open-Ended Tasks

    OSWorld is an open-source benchmark and environment for evaluating multimodal agents on open-ended tasks in real computer environments. It supports task setup, execution-based evaluation, and interactive learning across operating systems such as Ubuntu, Windows, and macOS, with agents observing the screen and acting through keyboard and mouse input.
    Downloads: 3 This Week
  • 13
    ManiSkill

    SAPIEN Manipulation Skill Framework

    ManiSkill is a benchmark platform for training and evaluating reinforcement learning agents on dexterous manipulation tasks using physics-based simulations. Developed by the Hao Su Lab, it focuses on robotic manipulation with diverse, high-quality 3D tasks designed to challenge perception, control, and planning in robotics. ManiSkill provides both low-level control and visual observation spaces for realistic learning scenarios.
    Downloads: 5 This Week
  • 14
    Qwen3-VL

    Qwen3-VL, the multimodal large language model series by Alibaba Cloud

    Qwen3-VL is the latest multimodal large language model series from Alibaba Cloud’s Qwen team, designed to integrate advanced vision and language understanding. It represents a major upgrade in the Qwen lineup, with stronger text generation, deeper visual reasoning, and expanded multimodal comprehension. The model supports dense and Mixture-of-Experts (MoE) architectures, making it scalable from edge devices to cloud deployments, and is available in both instruction-tuned and...
    Downloads: 2 This Week
  • 15
    VGGSfM

    VGGSfM: Visual Geometry Grounded Deep Structure From Motion

    VGGSfM is an advanced structure-from-motion (SfM) framework jointly developed by Meta AI Research (GenAI) and the University of Oxford’s Visual Geometry Group (VGG). It reconstructs 3D geometry, dense depth, and camera poses directly from unordered or sequential images and videos. The system combines learned feature matching and geometric optimization to generate high-quality camera calibrations, sparse/dense point clouds, and depth maps in standard COLMAP format. Version 2.0 adds support for dynamic scene handling, dense point cloud export, video-based reconstruction (1000+ frames), and integration with Gaussian Splatting pipelines. ...
    Downloads: 0 This Week
  • 16
    comfyui-mixlab-nodes

    Workflow and speech recognition app

    comfyui-mixlab-nodes is a large collection of custom nodes for ComfyUI that turns workflows into interactive apps and adds real-time multimedia, LLM, and TTS capabilities. It introduces a “Workflow-to-APP” concept, where a ComfyUI graph can be transformed into a Web App through an AppInfo node, complete with categories, batch prompts, and editable configurations. The project also brings Real-time Design features like screen capture and floating video nodes, enabling creative pipelines that...
    Downloads: 9 This Week
  • 17
    ViZDoom

    Doom-based AI research platform for reinforcement learning

    ViZDoom allows developing AI bots that play Doom using only visual information (the screen buffer). It is primarily intended for research in machine visual learning, and deep reinforcement learning in particular. ViZDoom is based on ZDoom, the most popular modern source port of Doom. This means compatibility with a huge range of tools and resources that can be used to create custom scenarios, detailed documentation of the engine and tools, and support from the Doom community. ...
    Downloads: 1 This Week
  • 18
    LTX-2

    Python inference and LoRA trainer package for the LTX-2 audio–video

    LTX-2 is an open-source audio-video generation model developed by Lightricks. This repository packages Python inference code and a LoRA trainer for the model. ...
    Downloads: 67 This Week
  • 19
    video2robot

    End-to-end pipeline converting generative videos

    video2robot is an end-to-end open-source pipeline that converts generative video or prompt-driven motion content into executable humanoid robot motion sequences, enabling researchers and developers to go from high-level action descriptions or videos to robot-ready motion data. The pipeline supports both prompt-to-video generation using models like Veo/Sora and video upload processing, followed by human pose extraction through a 3D pose model and retargeting of that motion to robot joints using a general motion retargeting system. This workflow allows users to generate robot motion files that specify joint angles, root positions, and orientations that can be deployed on supported robot platforms (e.g., Unitree models). Video2robot includes scripts for each stage of the pipeline (generation, extraction, conversion, visualization) and can run as a CLI or through a basic web UI.
    Downloads: 0 This Week
  • 20
    PaddleX

    PaddlePaddle End-to-End Development Toolkit

    PaddleX is a full-process deep learning development tool based on the core framework, development kit, and tool components of Paddle. It has three characteristics: opening up the whole process, integrating industrial practice, and being easy to use and integrate. Image classification labeling is the most basic and simplest labeling task. Users only need to put pictures belonging to the same category in the same folder. When the model is trained, we need to divide the training set, the...
    Downloads: 8 This Week
  • 21
    Book6_First-Course-in-Data-Science

    From Addition, Subtraction, Multiplication, and Division to ML

    Book6_First-Course-in-Data-Science is an open-source educational project that serves as part of the “Iris Book” series focused on teaching data science and machine learning concepts through a combination of mathematics, programming, and visualization. The repository contains draft chapters, supporting Python code, and visual materials designed to guide readers from basic mathematical operations toward practical machine learning understanding. The goal of the project is to make complex topics such as statistics, algorithms, and data analysis more accessible to learners by breaking concepts into clear explanations supported by code examples and diagrams. ...
    Downloads: 0 This Week
  • 22
    zvt

    Modular quant framework

    For practical trading, a complex algorithm is fragile, a complex algorithm built on a complex facility is more fragile, and a complex algorithm built on a complex facility by a complex team is more fragile still. zvt aims to provide a simple facility for building straightforward algorithms. Technologies come and technologies go, but market insight is forever. Your world is built by core concepts inside you, so it’s you. The zvt world is built by core concepts inside the market, so it’s zvt....
    Downloads: 0 This Week
  • 23
    A series of open source files and programs available to use for developing programs to work with the WowWee Robotics RSMedia Robot. These include a USB serial console, a cross-compiler, a firmware dump program, text-to-speech and source code.
    Downloads: 1 This Week
  • 24
    hloc

    Visual localization made easy with hloc

    ...Just download the datasets and you're ready to go! The notebook pipeline_InLoc.ipynb shows the steps for localizing with InLoc. It's much simpler since a 3D SfM model is not needed. We show in pipeline_SfM.ipynb how to run 3D reconstruction for an unordered set of images. This generates reference poses and a nice sparse 3D model suitable for localization with the same pipeline as Aachen.
    Downloads: 0 This Week
  • 25
    DIG

    A library for graph deep learning research

    The key difference from current graph deep learning libraries, such as PyTorch Geometric (PyG) and Deep Graph Library (DGL), is that, while PyG and DGL support basic graph deep learning operations, DIG provides a unified testbed for higher-level, research-oriented graph deep learning tasks, such as graph generation, self-supervised learning, explainability, 3D graphs, and graph out-of-distribution. If you are working or plan to work on research in graph deep learning, DIG enables you to develop your own methods within our extensible framework, and compare with current baseline methods using common datasets and evaluation metrics without extra effort. ...
    Downloads: 0 This Week
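
The LLaMA-Mesh entry (result 9) describes encoding vertex coordinates and face definitions as text sequences that a language model can read and write. The idea can be sketched roughly as follows. This is an illustrative sketch only, not the project's code: the function names mesh_to_text and text_to_mesh, the OBJ-style "v"/"f" line layout, and the 64-bin integer quantization are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def mesh_to_text(vertices: Sequence[Sequence[float]],
                 faces: Sequence[Sequence[int]],
                 bins: int = 64) -> str:
    """Serialize a triangle mesh as OBJ-style text (hypothetical helper).

    Coordinates are quantized to integers in [0, bins - 1] so that each
    vertex becomes a short, token-friendly line. Face indices are written
    1-based, following the OBJ convention.
    """
    coords = [c for v in vertices for c in v]
    lo, hi = min(coords), max(coords)
    # Guard against a degenerate mesh where all coordinates are equal.
    scale = (bins - 1) / (hi - lo) if hi > lo else 0.0

    lines: List[str] = []
    for v in vertices:
        q = [round((c - lo) * scale) for c in v]
        lines.append("v {} {} {}".format(*q))
    for f in faces:
        lines.append("f {} {} {}".format(*(i + 1 for i in f)))
    return "\n".join(lines)


def text_to_mesh(text: str) -> Tuple[List[Tuple[int, ...]], List[Tuple[int, ...]]]:
    """Parse the text form back into quantized vertices and 0-based faces."""
    vertices: List[Tuple[int, ...]] = []
    faces: List[Tuple[int, ...]] = []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "v":
            vertices.append(tuple(int(p) for p in parts[1:4]))
        elif parts[0] == "f":
            faces.append(tuple(int(p) - 1 for p in parts[1:4]))
    return vertices, faces


# Example: a unit tetrahedron round-trips through the text form.
tetra_vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
tetra_faces = [(0, 1, 2), (0, 1, 3), (0, 2, 3), (1, 2, 3)]
tetra_text = mesh_to_text(tetra_vertices, tetra_faces)
```

Quantizing to a small integer range keeps each coordinate short, which matters when the sequence length available to a language model is limited. The trade-off is lost geometric precision, and the original scale is not recoverable from the text alone in this sketch.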