Showing 28 open source projects for "coordinates"

  • 1
    Claude Code Bridge

    Real-time multi-AI collaboration: Claude, Codex & Gemini

    ...By maintaining persistent shared context between these models, the tool reduces redundant prompts and minimizes token usage while allowing each AI system to contribute specialized capabilities. The architecture functions as a unified launcher that manages communication between multiple AI providers and coordinates their responses within the same development session. Developers can run the tool in terminal environments and integrate it with terminal multiplexers such as tmux or advanced terminal emulators.
    Downloads: 0 This Week
  • 2
    ML Ferret

    Refer and Ground Anything Anywhere at Any Granularity

    Ferret is Apple’s end-to-end multimodal large language model designed specifically for flexible referring and grounding: it can understand references of any granularity (boxes, points, free-form regions) and then ground open-vocabulary descriptions back onto the image. The core idea is a hybrid region representation that mixes discrete coordinates with continuous visual features, so the model can fluidly handle “any-form” referring while maintaining precise spatial localization. The repo presents the vision-language pipeline, model assets, and paper resources that show how Ferret answers questions, follows instructions, and returns grounded outputs rather than just text. ...
    Downloads: 0 This Week
  • 3
    Norfair

    Lightweight Python library for adding real-time multi-object tracking

    Norfair is a customizable lightweight Python library for real-time multi-object tracking. Using Norfair, you can add tracking capabilities to any detector with just a few lines of code. Any detector expressing its detections as a series of (x, y) coordinates can be used with Norfair. This includes detectors performing tasks such as object or keypoint detection. It can easily be inserted into complex video processing pipelines to add tracking to existing projects. At the same time, it is possible to build a video inference loop from scratch using just Norfair and a detector. Supports moving camera, re-identification with appearance embeddings, and n-dimensional object tracking. ...
    Downloads: 0 This Week
  • 4
    Osaurus

    AI edge infrastructure for macOS. Run local or cloud models

    ...Osaurus supports running both local and remote models, enabling developers to build AI-powered applications that can operate offline or leverage external APIs when needed. The platform acts as an always-on runtime that coordinates AI tasks, tools, and workflows while enabling applications to communicate with models through standardized interfaces. Developers can extend the system through plugins that expose additional capabilities, tools, or services to the runtime using a structured plugin architecture. Osaurus also supports the Model Context Protocol, allowing tools and AI services to share context and interact with multiple applications simultaneously.
    Downloads: 10 This Week
  • 5
    Mentat

    Mentat - The AI Coding Assistant

    Mentat is the AI tool that assists you with any coding task, right from your command line. Unlike Copilot, Mentat coordinates edits across multiple locations and files. And unlike ChatGPT, Mentat already has the context of your project, so no copying and pasting is required. Run Mentat from within your project directory. Mentat uses Git, so if your project doesn't already have Git set up, run git init. List the files you would like Mentat to read and edit as arguments.
    Downloads: 1 This Week
  • 6
    CogVLM

    A state-of-the-art open visual language model

    ...The flagship CogVLM-17B combines ~10B visual parameters with ~7B language parameters and supports 490×490 inputs; CogAgent-18B extends this to 1120×1120 and adds plan/next-action outputs plus grounded operation coordinates for GUI tasks. The repo provides multiple ways to run models (CLI, web demo, and OpenAI-Vision–style APIs), along with quantization options that reduce VRAM needs (e.g., 4-bit). It includes checkpoints for chat, base, and grounding variants, plus recipes for model-parallel inference and LoRA fine-tuning. The documentation covers task prompts for general dialogue, visual grounding (box→caption, caption→box, caption+boxes), and GUI agent workflows that produce structured actions with bounding boxes.
    Downloads: 5 This Week
  • 7
    Sprite Fusion Pixel Snapper

    A tool to snap pixels to a perfect grid

    Sprite Fusion Pixel Snapper is a utility designed to eliminate sub-pixel rendering issues that often arise in pixel art, UI icons, and 2D sprite graphics when displayed on screens with high DPI or during motion animations. The tool works by adjusting sprite rendering coordinates and texture sampling so that every pixel aligns cleanly to the screen’s pixel grid, avoiding blurring, distortion, or unintended smoothing artifacts. This is especially important in pixel art games, retro-styled interactive media, or precise UI designs where crisp edges and predictable alignment are essential. Sprite Fusion Pixel Snapper integrates with popular game engines and rendering pipelines to ensure that assets remain sharp across a broad range of resolutions and aspect ratios without requiring manual fiddling from artists or developers. ...
    Downloads: 1 This Week
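
The grid alignment described for this entry can be illustrated with a minimal Python sketch. This is not the tool's own code; the function name `snap_to_grid` and the `cell` parameter are hypothetical, and the sketch covers only coordinate snapping, not texture sampling.

```python
import math


def snap_to_grid(x: float, y: float, cell: float = 1.0) -> tuple[float, float]:
    """Snap a position to the nearest multiple of `cell` on each axis.

    Hypothetical helper for illustration only.
    """
    if cell <= 0:
        raise ValueError("cell must be positive")
    # floor(v / cell + 0.5) rounds halves upward, which avoids the
    # round-half-to-even behaviour of Python's built-in round().
    snapped_x = math.floor(x / cell + 0.5) * cell
    snapped_y = math.floor(y / cell + 0.5) * cell
    return (snapped_x, snapped_y)
```

With the default `cell` of 1.0 this snaps to whole pixels; a larger `cell` (for example 8) would snap to a coarser tile grid instead.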
  • 8
    GROBID

    A machine learning software for extracting information

    GROBID is a machine learning library for extracting, parsing, and restructuring raw documents such as PDFs into structured XML/TEI-encoded documents, with a particular focus on technical and scientific publications. Development started in 2008 as a hobby. In 2011 the tool was made available as open source. Work on GROBID has been steady as a side project since the beginning and is expected to continue as such. Header extraction and parsing from articles in PDF format. The...
    Downloads: 1 This Week
  • 9
    Parallax

    Parallax is a distributed model serving framework

    ...Instead of relying on centralized GPU clusters in data centers, the system allows multiple heterogeneous machines to collaborate in serving AI inference workloads. Parallax divides model layers across different nodes and dynamically coordinates them to form a complete inference pipeline. A two-stage scheduling architecture determines how model layers are allocated to available hardware and how requests are routed across nodes during execution. This scheduling system optimizes latency, throughput, and hardware utilization even when nodes have different computational capabilities. ...
    Downloads: 0 This Week
  • 10
    Stable Diffusion Web UI Extensions

    Extension index for stable-diffusion-webui

    ...It also standardizes submission format so extension authors can contribute entries that the Web UI can parse reliably. For end users, this turns the Web UI into a modular platform where new features appear without manual cloning or guesswork. The project effectively coordinates a thriving plugin ecosystem, keeping discovery and updates lightweight and centralized.
    Downloads: 0 This Week
  • 11
    MLE-bench

    AI multi-agent framework for automating data-driven R&D workflows

    RD-Agent is an open source AI framework designed to automate research and development workflows in data-driven domains. It uses large language models and multiple collaborating agents to simulate the typical cycle of research, experimentation, and improvement that human data scientists follow. It separates the process into two core phases: a research stage that proposes hypotheses and ideas, and a development stage that implements and evaluates them through code execution and experiments. By...
    Downloads: 0 This Week
  • 12
    LLaMA-Mesh

    Unifying 3D Mesh Generation with Language Models

    LLaMA-Mesh is a research framework that extends large language models so they can understand and generate 3D mesh data alongside text. The system introduces a method for representing 3D meshes in a textual format by encoding vertex coordinates and face definitions as sequences that can be processed by a language model. By serializing 3D geometry into text tokens, the approach allows existing transformer architectures to generate and interpret 3D models without requiring specialized visual tokenizers. The project includes a supervised fine-tuning dataset composed of interleaved text and mesh data, allowing the model to learn relationships between textual descriptions and 3D structures. ...
    Downloads: 0 This Week
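
To make the idea of serializing geometry as text concrete, here is a minimal Python sketch. It assumes an OBJ-style layout (`v x y z` lines for vertices, `f i j k` lines with 1-based indices for faces) and integer coordinates; the function names `mesh_to_text` and `text_to_mesh` are hypothetical and are not taken from the project.

```python
def mesh_to_text(vertices, faces):
    """Serialize a mesh as OBJ-style text (illustrative format assumption).

    vertices: list of (x, y, z) integer tuples
    faces: list of tuples of 0-based vertex indices
    """
    lines = [f"v {x} {y} {z}" for x, y, z in vertices]
    # OBJ-style face lines use 1-based vertex indices.
    lines += ["f " + " ".join(str(i + 1) for i in face) for face in faces]
    return "\n".join(lines)


def text_to_mesh(text):
    """Parse the text produced by mesh_to_text back into vertices and faces."""
    vertices, faces = [], []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "v":
            vertices.append(tuple(int(p) for p in parts[1:4]))
        elif parts[0] == "f":
            # Convert back to 0-based indices.
            faces.append(tuple(int(p) - 1 for p in parts[1:]))
    return vertices, faces
```

Because the output is ordinary text, a sequence like this can be passed through a standard text tokenizer, which is the property the description highlights.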
  • 13
    AI Employe

    Create browser automation as if you were teaching a human using GPT-4

    ...There are several techniques for this, ranging from sending a shortened form of HTML to GPT-3, creating a bounding box with IDs and sending it to GPT-4-vision to take actions, or directly asking GPT-4-vision to obtain the X and Y coordinates of the element. However, none of these methods were reliable; they all led to hallucinations. To prevent GPT from derailing from tasks, we use a technique that is akin to retrieval-augmented generation, but we kind of call it Actions Augmented Generation. Essentially, when a user creates a workflow, we don't record the screen, microphone, or camera, but we do record the DOM element changes for every action (clicking, typing, etc.) the user takes.
    Downloads: 0 This Week
  • 14
    Face Alignment

    2D and 3D face alignment library built using PyTorch

    Detect facial landmarks from Python using the world's most accurate face alignment network, capable of detecting points in both 2D and 3D coordinates. Built using FAN's state-of-the-art deep learning-based face alignment method. For numerical evaluations, it is highly recommended to use the Lua version, which uses the same models as those evaluated in the paper. More models will be added soon. By default, the package will use the SFD face detector. However, users can alternatively use dlib, BlazeFace, or pre-existing ground truth bounding boxes. ...
    Downloads: 0 This Week
  • 15
    Hyperformer

    Hypergraph Transformer for Skeleton-based Action Recognition

    This is the official implementation of our paper "Hypergraph Transformer for Skeleton-based Action Recognition." Skeleton-based action recognition aims to recognize human actions given human joint coordinates with skeletal interconnections. By defining a graph with joints as vertices and their natural connections as edges, previous works successfully adopted Graph Convolutional Networks (GCNs) to model joint co-occurrences and achieved superior performance. More recently, a limitation of GCNs has been identified, i.e., the topology is fixed after training. ...
    Downloads: 0 This Week
  • 16
    Alphafold2

    Unofficial PyTorch implementation / replication of Alphafold2

    To eventually become an unofficial working PyTorch implementation of Alphafold2, the breathtaking attention network that solved CASP14. Will be gradually implemented as more details of the architecture are released. Once this is replicated, I intend to fold all available amino acid sequences out there in silico and release the results as an academic torrent, to further science. DeepMind has open sourced the official code in JAX, along with the weights! This repository will now be geared towards a...
    Downloads: 0 This Week
  • 17
    Yoha

    A practical hand tracking engine

    Yoha is a browser-based hand tracking engine designed to enable real-time gesture recognition and interaction using standard webcams, making it accessible for web applications without specialized hardware. Built using JavaScript and TensorFlow.js, it runs directly in the browser and performs inference on-device, eliminating the need for server-side processing. The engine is capable of detecting 21 two-dimensional hand landmarks, allowing developers to build applications that respond to...
    Downloads: 0 This Week
  • 18
    PyTTI-Notebook

    Recent advances in machine learning have created opportunities for “AI” technologies to assist in unlocking creativity in powerful ways. PyTTI is a toolkit that facilitates image generation, animation, and manipulation using processes that could be thought of as a human artist collaborating with AI assistants. The underlying technology is complex, but you don’t need to be a deep learning expert or even know coding of any kind to use these tools. Understanding the underlying technology can be...
    Downloads: 0 This Week
  • 19
    DensePose

    A real-time approach for mapping all human pixels of 2D RGB images

    ...DensePose is widely used in augmented reality, motion capture, virtual try-on, and visual effects applications because it enables real-time 3D human mapping from 2D inputs. The model architecture builds on Mask R-CNN, using additional regression heads to predict UV coordinates that map image pixels to 3D surfaces.
    Downloads: 2 This Week
  • 20
    Semantic Segmentation Editor

    Web labeling tool for bitmap images and point clouds

    A web-based labeling tool for creating AI training data sets (2D and 3D). The tool has been developed in the context of autonomous driving research. It supports images (.jpg or .png) and point clouds (.pcd). It is a Meteor app developed with React, Paper.js, and three.js.
    Downloads: 0 This Week
  • 21
    VideoPose3D

    Efficient 3D human pose estimation in video using 2D keypoint

    VideoPose3D is a deep learning framework that reconstructs 3D human poses from 2D keypoint sequences extracted from videos. It builds on top of convolutional and temporal networks that map 2D joint coordinates over time to consistent 3D skeletons, enabling robust motion capture without specialized sensors. The model is trained on large motion capture datasets and can generalize well to unseen environments by leveraging temporal context for smoothing and error correction. By using only 2D detections (such as those from OpenPose or Detectron), it enables markerless 3D pose estimation with relatively lightweight computational requirements. ...
    Downloads: 3 This Week
  • 22
    imgaug

    Image augmentation for machine learning experiments

    imgaug is a library for image augmentation in machine learning experiments. It supports a wide range of augmentation techniques, lets you easily combine them and execute them in random order or on multiple CPU cores, has a simple yet powerful stochastic interface, and can augment not only images but also keypoints/landmarks, bounding boxes, heatmaps, and segmentation maps. Affine transformations, perspective transformations, contrast changes, Gaussian noise, dropout of regions,...
    Downloads: 0 This Week
  • 23
    ChainerCV

    ChainerCV: a Library for Deep Learning in Computer Vision

    ...Bounding boxes in an image are represented as a two-dimensional array of shape (R, 4), where R is the number of bounding boxes and the second axis corresponds to the coordinates of bounding boxes. ChainerCV supports dataset loaders, which can be used to easily index examples with list-like interfaces. Dataset classes whose names end with BboxDataset contain annotations of where objects are located in an image and which categories they are assigned to. These datasets can be indexed to return a tuple of an image, bounding boxes, and labels. ...
    Downloads: 0 This Week
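
The (R, 4) bounding-box layout described for this entry can be sketched in plain Python. The excerpt does not state the coordinate order, so the sketch assumes each row is (y_min, x_min, y_max, x_max); that ordering and the helper name `bbox_areas` are assumptions for illustration, and nested lists stand in for the array.

```python
def bbox_areas(bboxes):
    """Return the area of each box in an (R, 4) collection.

    Assumes each row is (y_min, x_min, y_max, x_max); this ordering is an
    assumption made for the example, not stated in the listing.
    """
    areas = []
    for y_min, x_min, y_max, x_max in bboxes:
        # Clamp at zero so degenerate or inverted boxes yield zero area.
        height = max(0.0, y_max - y_min)
        width = max(0.0, x_max - x_min)
        areas.append(height * width)
    return areas
```

Here R is simply `len(bboxes)`, and each inner sequence supplies the four coordinates along the second axis.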
  • 24
    PokemonGo-Bot

    The Pokemon Go Bot, baking with community

    ...Allow custom hash service provider, if any. GPS location configuration. Search & spin Pokestops / Gyms. Diverse options for humanlike behavior from movement to overall gameplay. Ability to add multiple coordinates to select between your favorite botting locations. Support for self-defined paths / routes. Advanced catch, evolve and transfer configuration using our PokemonOptimizer settings. Determine which pokeball to use. Rules to determine the use of Razz and Pinap Berries. Exchange, evolve and catch Pokemon based on pre-configured rules. Transfer Pokemon in bulk. ...
    Downloads: 1 This Week
  • 25
    FormRead

    Free OMR - OCR web software based on JavaScript and PHP

    https://formread.org FormRead is a completely free OMR (optical mark recognition) web application for scanning and grading user-filled, multiple-choice forms. Create your formats with any of your office or drawing tools, scan them, and parameterize their coordinates in an easy way. Once you have parameterized your form, you can print many copies, give them to your students/respondents, scan and recognize them with FormRead, and finally export the data in your preferred format (Excel, PDF, CSV).
    Downloads: 6 This Week