Showing 212 open source projects for "computer vision"

  • 1
    Computer Vision Annotation Tool (CVAT)

    Interactive video and image annotation tool for computer vision

    Computer Vision Annotation Tool (CVAT) is a free and open source, interactive online tool for annotating videos and images for computer vision algorithms. Its features include automatic annotation using deep learning models, interpolation of bounding boxes between key frames, LDAP support, and more. The project's own professional data annotation team uses it to annotate millions of objects with different properties.
    Downloads: 26 This Week
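The interpolation of bounding boxes between key frames mentioned in the CVAT entry can be illustrated with a minimal sketch. It assumes plain linear interpolation of box corners between two annotated frames; the `interpolate_box` helper and the `(x1, y1, x2, y2)` box layout are hypothetical names chosen for illustration and are not CVAT's API or its actual method.

```python
from typing import Tuple

# Hypothetical box layout for this sketch: (x1, y1, x2, y2) in pixels.
Box = Tuple[float, float, float, float]


def interpolate_box(start: Box, end: Box,
                    start_frame: int, end_frame: int, frame: int) -> Box:
    """Linearly interpolate a box for `frame` between two annotated key frames."""
    if not start_frame <= frame <= end_frame:
        raise ValueError("frame must lie between the two key frames")
    if start_frame == end_frame:
        return start
    # Fraction of the way from the first key frame to the second.
    t = (frame - start_frame) / (end_frame - start_frame)
    return tuple(a + t * (b - a) for a, b in zip(start, end))


if __name__ == "__main__":
    # Box annotated at frame 0 and frame 10; estimate it at frame 5.
    print(interpolate_box((0, 0, 10, 10), (20, 40, 30, 50), 0, 10, 5))
```

With this approach an annotator only labels the key frames, and the frames in between are filled in automatically and corrected by hand where the motion is not linear.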
  • 2
    Raster Vision

    Open source framework for deep learning satellite and aerial imagery

    Raster Vision is an open source framework for Python developers building computer vision models on satellite, aerial, and other large imagery sets (including oblique drone imagery). There is built-in support for chip classification, object detection, and semantic segmentation using PyTorch. Raster Vision allows engineers to quickly and repeatably configure pipelines that go through core components of a machine learning workflow: analyzing training data, creating training chips, training models, creating predictions, evaluating models, and bundling the model files and configuration for easy deployment. ...
    Downloads: 0 This Week
  • 3
    Vision Transformer Pytorch

    Implementation of Vision Transformer, a simple way to achieve SOTA

    This repository provides a from-scratch, minimalist implementation of the Vision Transformer (ViT) in PyTorch, focusing on the core architectural pieces needed for image classification. It breaks down the model into patch embedding, positional encoding, multi-head self-attention, feed-forward blocks, and a classification head so you can understand each component in isolation. The code is intentionally compact and modular, which makes it easy to tinker with hyperparameters, depth, width, and...
    Downloads: 8 This Week
  • 4
    OpenCV

    Open Source Computer Vision Library

    OpenCV (Open Source Computer Vision Library) is a comprehensive open-source library for computer vision, machine learning, and image processing. It enables developers to build real-time vision applications ranging from facial recognition to object tracking. OpenCV supports a wide range of programming languages including C++, Python, and Java, and is optimized for both CPU and GPU operations.
    Downloads: 41 This Week
  • 5
    Kornia

    Open Source Differentiable Computer Vision Library

    ...Kornia fills the gap between classical and deep computer vision by implementing standard and advanced vision algorithms for AI. Its libraries and initiatives are driven by the needs of the community.
    Downloads: 9 This Week
  • 6
    Phi-3-MLX

    Phi-3.5 for Mac: Locally-run Vision and Language Models

    Phi-3-Vision-MLX is an Apple MLX (machine learning on Apple silicon) implementation of Phi-3 Vision, a lightweight multi-modal model designed for vision and language tasks. It focuses on running vision-language AI efficiently on Apple hardware like M1 and M2 chips.
    Downloads: 7 This Week
  • 7
    Albumentations

    Fast image augmentation library and an easy-to-use wrapper

    Albumentations is a Python library for fast and flexible image augmentation, used to boost the performance of deep convolutional neural networks. It efficiently implements a rich variety of image transform operations that are optimized for performance, and exposes them through a concise yet powerful augmentation interface for computer vision tasks including object classification, segmentation, and detection. ...
    Downloads: 1 This Week
  • 8
    MIVisionX

    Set of comprehensive computer vision & machine intelligence libraries

    ...AMD OpenVX is a highly optimized open-source implementation of the Khronos OpenVX™ 1.3 computer vision specification. It allows for rapid prototyping as well as fast execution on a wide range of computer hardware, including small embedded x86 CPUs and large workstation discrete GPUs.
    Downloads: 1 This Week
  • 9
    TorchIO

    Medical imaging toolkit for deep learning

    ...Transforms include typical computer vision operations such as random affine transformations and also domain-specific ones such as simulation of intensity artifacts due to MRI magnetic field inhomogeneity.
    Downloads: 8 This Week
  • 10
    MESHROOM

    3D reconstruction software

    ...Photography is the projection of a 3D scene onto a 2D plane, losing depth information. The goal of photogrammetry is to reverse this process. The dense model of the scene is produced by chaining two computer vision-based pipelines, “Structure-from-Motion” (SfM) and “Multi View Stereo” (MVS). Features include fusion of multi-bracketed LDR images into HDR, alignment of panorama images, support for fisheye optics (with automatic estimation or manual editing of the fisheye circle), use of motorized-head files, and easy integration into a render farm system. ...
    Downloads: 108 This Week
  • 11
    R1-V

    Witness the aha moment of VLM with less than $3

    R1-V is an initiative aimed at enhancing the generalization capabilities of Vision-Language Models (VLMs) through Reinforcement Learning with Verifiable Rewards (RLVR) applied to visual reasoning. The project focuses on building a comprehensive framework that emphasizes algorithm enhancement, efficiency optimization, and task diversity to achieve general vision-language intelligence and visual/GUI agents. The team's long-term goal is to contribute impactful open-source research in this domain.
    Downloads: 0 This Week
  • 12
    SAM 2

    The repository provides code for running inference with SAM 2

    SAM 2 is the next-generation version of the Segment Anything Model (SAM), designed to improve performance, generalization, and efficiency in promptable image segmentation tasks. It retains the core promptable interface (accepting points, boxes, or masks) but incorporates architectural and training enhancements to produce higher-fidelity masks, better boundary adherence, and robustness to complex scenes. The updated model is optimized for faster inference and lower memory use, enabling real-time...
    Downloads: 9 This Week
  • 13
    GoCV

    Go package for computer vision using OpenCV 4 and beyond

    GoCV gives programmers who use the Go programming language access to the OpenCV 4 computer vision library. The GoCV package supports the latest releases of Go and OpenCV v4.5.4 on Linux, macOS, and Windows. Its mission is to make the Go language a “first-class” client compatible with the latest developments in the OpenCV ecosystem. Computer Vision (CV) is the ability of computers to process visual information and to perform tasks normally carried out by humans. ...
    Downloads: 0 This Week
  • 14
    COLMAP

    Structure-from-Motion and Multi-View Stereo

    COLMAP is a general-purpose Structure-from-Motion (SfM) and Multi-View Stereo (MVS) pipeline with a graphical and command-line interface. It offers a wide range of features for the reconstruction of ordered and unordered image collections. The software is licensed under the new BSD license.
    Downloads: 66 This Week
  • 15
    OpenVINO

    OpenVINO™ Toolkit repository

    OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference. It boosts deep learning performance in computer vision, automatic speech recognition, natural language processing, and other common tasks, and works with models trained in popular frameworks such as TensorFlow and PyTorch. It reduces resource demands and deploys efficiently on a range of Intel® platforms from edge to cloud. This open-source version includes several components, namely Model Optimizer, OpenVINO™ Runtime, and Post-Training Optimization Tool, as well as CPU, GPU, MYRIAD, multi-device, and heterogeneous plugins to accelerate deep learning inference on Intel® CPUs and Intel® Processor Graphics. ...
    Downloads: 50 This Week
  • 16
    torchvision

    Datasets, transforms and models specific to Computer Vision

    The torchvision package consists of popular datasets, model architectures, and common image transformations for computer vision. Anaconda is the recommended Python package management system. Torchvision currently supports the following image backends and libraries: Pillow (the default); Pillow-SIMD, a much faster drop-in replacement for Pillow with SIMD, which is used as the default if installed; accimage, which, if installed, can be activated by calling torchvision.set_image_backend('accimage'); libpng, which can be installed via conda (conda install libpng) or any of the package managers for Debian-based and RHEL-based Linux distributions; and libjpeg, which can be installed via conda (conda install jpeg) or any of the same package managers. ...
    Downloads: 5 This Week
  • 17
    SAHI

    A lightweight vision library for performing large-scale object detection

    A lightweight vision library for performing large-scale object detection and instance segmentation. Object detection and instance segmentation are by far the most important application areas in computer vision. However, detecting small objects and running inference on large images remain major issues in practical usage. SAHI helps developers overcome these real-world problems with a range of vision utilities.
    Downloads: 0 This Week
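One common way to handle small objects on large images, as raised in the SAHI entry, is to run the detector on overlapping tiles and merge the results. The sketch below only computes the tile coordinates. The `slice_boxes` helper, its parameters, and the 20% default overlap are hypothetical choices for illustration, not SAHI's API.

```python
from typing import List, Tuple


def slice_boxes(width: int, height: int, tile: int,
                overlap: float = 0.2) -> List[Tuple[int, int, int, int]]:
    """Return (x1, y1, x2, y2) tiles that cover a width x height image."""
    if tile <= 0 or not 0 <= overlap < 1:
        raise ValueError("tile must be positive and overlap must be in [0, 1)")
    # Distance between the left (or top) edges of neighbouring tiles.
    step = max(1, int(tile * (1 - overlap)))

    def starts(size: int) -> List[int]:
        if size <= tile:
            return [0]  # the image fits in a single tile along this axis
        positions = list(range(0, size - tile, step))
        positions.append(size - tile)  # last tile sits flush with the edge
        return positions

    return [(x, y, min(x + tile, width), min(y + tile, height))
            for y in starts(height) for x in starts(width)]


if __name__ == "__main__":
    for box in slice_boxes(1024, 512, tile=512, overlap=0.5):
        print(box)
```

In a full pipeline, detections from each tile would be shifted back by the tile's `(x1, y1)` offset and de-duplicated across overlaps, for example with non-maximum suppression.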
  • 18
    GoogleTest

    Google Testing and Mocking Framework

    GoogleTest is Google's C++ testing and mocking framework. It is used by many internal projects at Google, as well as by a number of notable projects such as the Chromium projects, the OpenCV computer vision library, and the LLVM compiler. The GoogleTest project is a union of what used to be two separate projects: the old GoogleTest and GoogleMock, an extension of GoogleTest for writing and using C++ mock classes. Because they were so closely related, they were merged into a single GoogleTest. GoogleTest features an xUnit test framework, a rich set of assertions, user-defined assertions, and death tests, among many others. ...
    Downloads: 23 This Week
  • 19
    Colossal-AI

    Making large AI models cheaper, faster and more accessible

    The Transformer architecture has improved the performance of deep learning models in domains such as Computer Vision and Natural Language Processing. Better performance comes with larger model sizes, which run into the memory wall of current accelerator hardware such as GPUs. It is never ideal to train large models such as Vision Transformer, BERT, and GPT on a single GPU or a single machine, so there is an urgent demand to train models in a distributed environment. ...
    Downloads: 0 This Week
  • 20
    FiftyOne

    The open-source tool for building high-quality datasets

    The open-source tool for building high-quality datasets and computer vision models. Nothing hinders the success of machine learning systems more than poor-quality data. And without the right tools, improving a model can be time-consuming and inefficient. FiftyOne supercharges your machine learning workflows by enabling you to visualize datasets and interpret models faster and more effectively. Improving data quality and understanding your model’s failure modes are the most impactful ways to boost the performance of your model. ...
    Downloads: 9 This Week
  • 21
    ArrayFire

    ArrayFire, a general purpose GPU library

    ArrayFire is a general-purpose tensor library that simplifies the process of software development for the parallel architectures found in CPUs, GPUs, and other hardware acceleration devices. The library serves users in every technical computing market. Data structures in ArrayFire are smartly managed to avoid costly memory transfers and to take advantage of each performance feature provided by the underlying hardware. The community of ArrayFire developers invites you to build with us if...
    Downloads: 4 This Week
  • 22
    Diffgram

    Training data (data labeling, annotation, workflow) for all data types

    ...Annotation is required because raw media is considered to be unstructured and not usable without it. That’s why training data is required for many modern machine learning use cases including computer vision, natural language processing and speech recognition.
    Downloads: 8 This Week
  • 23
    JavaCV

    Java interface to OpenCV, FFmpeg, and more

    JavaCV uses wrappers from the JavaCPP Presets of libraries commonly used by researchers in the field of computer vision (OpenCV, FFmpeg, libdc1394, FlyCapture, Spinnaker, OpenKinect, librealsense, CL PS3 Eye Driver, videoInput, ARToolKitPlus, flandmark, Leptonica, and Tesseract) and provides utility classes to make their functionality easier to use on the Java platform, including Android. JavaCV also comes with hardware-accelerated full-screen image display (CanvasFrame and GLCanvasFrame), easy-to-use methods to execute code in parallel on multiple cores (Parallel), user-friendly geometric and color calibration of cameras and projectors (GeometricCalibrator, ProCamGeometricCalibrator, ProCamColorCalibrator), detection and matching of feature points (ObjectFinder), a set of classes that implement direct image alignment of projector-camera systems (mainly GNImageAligner, ProjectiveTransformer, ProjectiveColorTransformer, ProCamTransformer, and ReflectanceInitializer), and more.
    Downloads: 17 This Week
  • 24
    Datasets

    Hub of ready-to-use datasets for ML models

    Datasets is a library for easily accessing and sharing datasets, and evaluation metrics for Natural Language Processing (NLP), computer vision, and audio tasks. Load a dataset in a single line of code, and use our powerful data processing methods to quickly get your dataset ready for training in a deep learning model. Backed by the Apache Arrow format, process large datasets with zero-copy reads without any memory constraints for optimal speed and efficiency. We also feature a deep integration with the Hugging Face Hub, allowing you to easily load and share a dataset with the wider NLP community. ...
    Downloads: 6 This Week
  • 25
    AWS IoT FleetWise Edge

    AWS IoT FleetWise Edge Agent

    AWS IoT FleetWise makes it easy and cost-effective for automakers to collect, transform, and transfer vehicle data to the cloud in near-real-time and use it to build applications with analytics and machine learning that improve vehicle quality, safety, and autonomy. Train autonomous vehicles (AVs) and advanced driver assistance systems (ADAS) with camera data collected from a fleet of production vehicles. ...
    Downloads: 7 This Week