Computer Vision Libraries for Linux

Browse free open source Computer Vision Libraries and projects for Linux below.

  • 1
    OpenCV

    Open Source Computer Vision Library

    The Open Source Computer Vision Library has more than 2,500 algorithms, extensive documentation, and sample code for real-time computer vision. It works on Windows, Linux, Mac OS X, Android, and iOS, and in your browser through JavaScript.
    Languages: C++, Python, Julia, JavaScript
    Homepage: https://opencv.org
    Q&A forum: https://forum.opencv.org/
    Documentation: https://docs.opencv.org
    Source code: https://github.com/opencv
    Tutorials: https://docs.opencv.org/master
    Books about OpenCV: https://opencv.org/books.html
    Downloads: 3,227 This Week
  • 2
    MESHROOM

    3D reconstruction software

    Photogrammetry is the science of making measurements from photographs. It infers the geometry of a scene from a set of unordered photographs or videos. Photography is the projection of a 3D scene onto a 2D plane, which loses depth information; the goal of photogrammetry is to reverse this process. The dense model of the scene is produced by chaining two computer vision pipelines, “Structure-from-Motion” (SfM) and “Multi-View Stereo” (MVS). Other features: fusion of multi-bracketed LDR images into HDR; alignment of panorama images; support for fisheye optics, with automatic estimation or manual editing of the fisheye circle; use of motorized-head files; easy integration into a render farm system, with rules to select the most suitable machines based on the CPU, RAM, and GPU requirements of each node.
    Downloads: 112 This Week
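The loss of depth information mentioned in the Meshroom description can be illustrated with an idealized pinhole camera. This is only a sketch: the function name `project_point` and the unit focal length are assumptions for illustration and are not part of Meshroom.

```python
def project_point(point_3d, focal_length=1.0):
    """Project a camera-space point (x, y, z) onto the image plane of an ideal pinhole camera."""
    x, y, z = point_3d
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    # Dividing by depth is what discards the distance to the camera.
    return (focal_length * x / z, focal_length * y / z)


near = (1.0, 2.0, 4.0)
far = (2.0, 4.0, 8.0)  # same viewing ray as `near`, but twice as far away

near_px = project_point(near)
far_px = project_point(far)
print(near_px, far_px)  # both points land on the same image coordinates
```

Because both points project to the same pixel, a single photograph cannot tell them apart; this is why reconstruction pipelines of this kind rely on several views of the same scene.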
  • 3
    Armadillo

    fast C++ library for linear algebra & scientific computing

    * Fast C++ library for linear algebra (matrix maths) and scientific computing
    * Easy to use functions and syntax, deliberately similar to Matlab / Octave
    * Uses template meta-programming techniques to increase efficiency
    * Provides user-friendly wrappers for OpenBLAS, Intel MKL, LAPACK, ATLAS, ARPACK, SuperLU and FFTW libraries
    * Useful for machine learning, pattern recognition, signal processing, bioinformatics, statistics, finance, etc.
    * Downloads: http://arma.sourceforge.net/download.html
    * Documentation: http://arma.sourceforge.net/docs.html
    * Bug reports: http://arma.sourceforge.net/faq.html
    * Git repo: https://gitlab.com/conradsnicta/armadillo-code
    Downloads: 2,727 This Week
  • 4
    COLMAP

    Structure-from-Motion and Multi-View Stereo

    COLMAP is a general-purpose Structure-from-Motion (SfM) and Multi-View Stereo (MVS) pipeline with a graphical and command-line interface. It offers a wide range of features for the reconstruction of ordered and unordered image collections. The software is licensed under the new BSD license.
    Downloads: 66 This Week
  • 5

    IIDC Camera Control Library

    Capture and control API for IIDC compliant cameras

    libdc1394 is a library that provides a high level programming interface for application developers who wish to control and capture streams from IEEE 1394 based cameras that conform to the 1394-based Digital Camera Specifications (also known as the IIDC or DCAM Specifications). libdc1394 also supports some USB cameras that are IIDC compliant. Besides capture and control, libdc1394 provides a full set of colour space conversion functions (including RAW decoding), vendor specific functions and direct camera register access. Keywords: ieee1394, IIDC, DCAM, firewire, USB, machine vision, computer vision, video capture, library
    Downloads: 278 This Week
  • 6
    AirSim

    A simulator for drones, cars and more, built on Unreal Engine

    AirSim is an open-source, cross-platform simulator for drones, cars, and other vehicles, built on Unreal Engine, with an experimental Unity release in the works. It supports software-in-the-loop simulation with popular flight controllers such as PX4 and ArduPilot, and hardware-in-the-loop simulation with PX4, for physically and visually realistic simulations. It is developed as an Unreal plugin that can simply be dropped into any Unreal environment. AirSim's development is oriented towards creating a platform for AI research to experiment with deep learning, computer vision, and reinforcement learning algorithms for autonomous vehicles. For this purpose, AirSim also exposes APIs to retrieve data and control vehicles in a platform-independent way. AirSim fully supports multiple vehicles: you can create several vehicles easily and control them through the APIs.
    Downloads: 53 This Week
  • 7
    OpenCV

    Open Source Computer Vision Library

    OpenCV (Open Source Computer Vision Library) is a comprehensive open-source library for computer vision, machine learning, and image processing. It enables developers to build real-time vision applications ranging from facial recognition to object tracking. OpenCV supports a wide range of programming languages including C++, Python, and Java, and is optimized for both CPU and GPU operations.
    Downloads: 50 This Week
  • 8
    OpenPose

    Real-time multi-person keypoint detection library for body, face, etc.

    OpenPose is the first real-time multi-person system to jointly detect human body, hand, facial, and foot keypoints (135 keypoints in total) on single images. It is authored by Ginés Hidalgo, Zhe Cao, Tomas Simon, Shih-En Wei, Yaadhav Raaj, Hanbyul Joo, and Yaser Sheikh. It is maintained by Ginés Hidalgo and Yaadhav Raaj. OpenPose would not be possible without the CMU Panoptic Studio dataset. We would also like to thank all the people who have helped OpenPose in any way. Features:
    * 15-, 18-, or 25-keypoint body/foot keypoint estimation, including 6 foot keypoints. Runtime is invariant to the number of detected people.
    * 2x21-keypoint hand keypoint estimation. Runtime depends on the number of detected people.
    * 70-keypoint face keypoint estimation. Runtime depends on the number of detected people.
    * Input: image, video, webcam, Flir/Point Grey, IP camera, and support for adding your own custom input source (e.g., depth camera).
    Downloads: 38 This Week
  • 9

    OpenFace

    A state-of-the-art facial behavior analysis toolkit

    OpenFace is an advanced facial behavior analysis toolkit intended for computer vision and machine learning researchers, those in the affective computing community, and those who are simply interested in creating interactive applications based on facial behavior analysis. The OpenFace toolkit can perform several complex facial analysis tasks, including facial landmark detection, eye-gaze estimation, head pose estimation, and facial action unit recognition, and it is able to deliver state-of-the-art results in all of them. OpenFace is available for Windows, Ubuntu, and macOS. It is capable of real-time performance and does not need any specialist hardware; a simple webcam will suffice.
    Downloads: 34 This Week
  • 10
    OpenVINO

    OpenVINO™ Toolkit repository

    OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference. Boost deep learning performance in computer vision, automatic speech recognition, natural language processing, and other common tasks. Use models trained with popular frameworks like TensorFlow, PyTorch, and more. Reduce resource demands and efficiently deploy on a range of Intel® platforms from edge to cloud. This open-source version includes several components: Model Optimizer, OpenVINO™ Runtime, and Post-Training Optimization Tool, as well as CPU, GPU, MYRIAD, multi-device, and heterogeneous plugins to accelerate deep learning inference on Intel® CPUs and Intel® Processor Graphics. It supports pre-trained models from the Open Model Zoo, along with 100+ open source and public models in popular formats such as TensorFlow, ONNX, PaddlePaddle, MXNet, Caffe, and Kaldi.
    Downloads: 24 This Week
  • 11
    Computer Vision Annotation Tool (CVAT)

    Interactive video and image annotation tool for computer vision

    Computer Vision Annotation Tool (CVAT) is a free and open source, interactive online tool for annotating videos and images for Computer Vision algorithms. It offers many powerful features, including automatic annotation using deep learning models, interpolation of bounding boxes between key frames, LDAP and more. It is being used by its own professional data annotation team to annotate millions of objects with different properties. The UX and UI were also specially developed by the team for computer vision tasks. CVAT supports several annotation formats. Format selection can be done after clicking on the Upload annotation and Dump annotation buttons.
    Downloads: 19 This Week
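The CVAT description mentions interpolation of bounding boxes between key frames. A minimal sketch of that idea follows, assuming plain linear interpolation of box corners; the function `interpolate_box` and the `(x_min, y_min, x_max, y_max)` layout are hypothetical and may differ from what CVAT actually does.

```python
def interpolate_box(box_a, box_b, frame_a, frame_b, frame):
    """Linearly interpolate a box between two annotated key frames."""
    if frame_a >= frame_b or not frame_a <= frame <= frame_b:
        raise ValueError("expected frame_a <= frame <= frame_b with frame_a < frame_b")
    # Fraction of the way from the first key frame to the second.
    t = (frame - frame_a) / (frame_b - frame_a)
    return tuple(a + t * (b - a) for a, b in zip(box_a, box_b))


key_start = (10.0, 20.0, 50.0, 60.0)  # box annotated by hand at frame 0
key_end = (30.0, 40.0, 70.0, 80.0)    # box annotated by hand at frame 10

middle = interpolate_box(key_start, key_end, 0, 10, 5)
print(middle)  # (20.0, 30.0, 60.0, 70.0)
```

With this approach an annotator only draws boxes on a few key frames, and the frames in between are filled in automatically.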
  • 12
    ImageAI

    A python library built to empower developers

    ImageAI is an easy-to-use Computer Vision Python library that empowers developers to integrate state-of-the-art Artificial Intelligence features into their new and existing applications and systems. It is used by thousands of developers, students, researchers, tutors, and experts in corporate organizations around the world. ImageAI provides an API to recognize 1000 different objects in a picture using pre-trained models that were trained on the ImageNet-1000 dataset; the model implementations provided are SqueezeNet, ResNet, InceptionV3, and DenseNet. It also provides an API to detect, locate, and identify the 80 most common everyday objects in a picture using pre-trained models that were trained on the COCO dataset.
    Downloads: 19 This Week
  • 13
    AWS IoT FleetWise Edge

    AWS IoT FleetWise Edge Agent

    AWS IoT FleetWise makes it easy and cost-effective for automakers to collect, transform, and transfer vehicle data to the cloud in near-real-time and use it to build applications with analytics and machine learning that improve vehicle quality, safety, and autonomy. Train autonomous vehicles (AVs) and advanced driver assistance systems (ADAS) with camera data collected from a fleet of production vehicles. Improve electric vehicle (EV) battery range estimates with crowdsourced environmental data, such as weather and driving conditions, from nearby vehicles. Collect select data from nearby vehicles and use it to notify drivers of changing road conditions, such as lane closures or construction. Use near-real-time data to proactively detect and mitigate fleet-wide quality issues.
    Downloads: 17 This Week
  • 14
    GIMP ML

    AI for GNU Image Manipulation Program

    This repository introduces GIMP3-ML, a set of Python plugins for the widely popular GNU Image Manipulation Program (GIMP). It brings recent advances in computer vision to the conventional image editing pipeline. Deep learning applications such as monocular depth estimation, semantic segmentation, mask generative adversarial networks, image super-resolution, de-noising, and coloring have been incorporated into GIMP through Python-based plugins. Image operations such as edge detection and color clustering have also been added. GIMP-ML relies on standard Python packages such as numpy, scikit-image, pillow, pytorch, open-cv, and scipy. GIMP-ML also aims to bring the benefits of deep learning networks for computer vision tasks to routine image processing workflows.
    Downloads: 16 This Week
  • 15
    JavaCV

    Java interface to OpenCV, FFmpeg, and more

    JavaCV uses wrappers from the JavaCPP Presets of commonly used libraries by researchers in the field of computer vision (OpenCV, FFmpeg, libdc1394, FlyCapture, Spinnaker, OpenKinect, librealsense, CL PS3 Eye Driver, videoInput, ARToolKitPlus, flandmark, Leptonica, and Tesseract) and provides utility classes to make their functionality easier to use on the Java platform, including Android. JavaCV also comes with hardware accelerated full-screen image display (CanvasFrame and GLCanvasFrame), easy-to-use methods to execute code in parallel on multiple cores (Parallel), user-friendly geometric and color calibration of cameras and projectors (GeometricCalibrator, ProCamGeometricCalibrator, ProCamColorCalibrator), detection and matching of feature points (ObjectFinder), a set of classes that implement direct image alignment of projector-camera systems (mainly GNImageAligner, ProjectiveTransformer, ProjectiveColorTransformer, ProCamTransformer, and ReflectanceInitializer), and more.
    Downloads: 14 This Week
  • 16
    GoogleTest

    Google Testing and Mocking Framework

    GoogleTest is Google's C++ mocking and test framework. It's used by many internal projects at Google, as well as a number of notable projects such as The Chromium projects, the OpenCV computer vision library, and the LLVM compiler. This GoogleTest project is actually a union of what used to be two separate projects: the old GoogleTest and GoogleMock, an extension of GoogleTest for writing and using C++ mock classes. Since they were so closely related, they were merged to create an even better GoogleTest. GoogleTest features an xUnit test framework, a rich set of assertions, user-defined assertions, death tests, among many others. It's been used on a variety of platforms, including Cygwin, Symbian, MinGW and PlatformIO.
    Downloads: 12 This Week
  • 17
    reacTIVision
    reacTIVision is a computer vision framework for the fast and robust tracking of markers attached to physical objects, and for the creation of multi-touch surfaces. It was designed for the rapid development of table-based tangible user interfaces.
    Downloads: 51 This Week
  • 18
    DensePose

    A real-time approach for mapping all human pixels of 2D RGB images

    DensePose is a computer vision system that maps all human pixels in an RGB image to the 3D surface of a human body model. It extends human pose estimation from predicting joint keypoints to providing dense correspondences between 2D images and a canonical 3D mesh (such as the SMPL model). This enables detailed understanding of human shape, motion, and surface appearance directly from images or videos. The repository includes the DensePose network architecture, training code, pretrained models, and dataset tools for annotation and visualization. DensePose is widely used in augmented reality, motion capture, virtual try-on, and visual effects applications because it enables real-time 3D human mapping from 2D inputs. The model architecture builds on Mask R-CNN, using additional regression heads to predict UV coordinates that map image pixels to 3D surfaces.
    Downloads: 5 This Week
  • 19
    Hello AI World

    Guide to deploying deep-learning inference networks

    Hello AI World is a great way to start using Jetson and experiencing the power of AI. In just a couple of hours, you can have a set of deep learning inference demos up and running for realtime image classification and object detection on your Jetson Developer Kit with JetPack SDK and NVIDIA TensorRT. The tutorial focuses on networks related to computer vision, and includes the use of live cameras. You’ll also get to code your own easy-to-follow recognition program in Python or C++, and train your own DNN models onboard Jetson with PyTorch. Ready to dive into deep learning? It only takes two days. We’ll provide you with all the tools you need, including easy to follow guides, software samples such as TensorRT code, and even pre-trained network models including ImageNet and DetectNet examples. Follow these directions to integrate deep learning into your platform of choice and quickly develop a proof-of-concept design.
    Downloads: 5 This Week
  • 20
    LLaVA

    Visual Instruction Tuning: Large Language-and-Vision Assistant

    Visual instruction tuning towards large language and vision models with GPT-4 level capabilities.
    Downloads: 5 This Week
  • 21
    PIFuHD

    High-Resolution 3D Human Digitization from A Single Image

    PIFuHD (Pixel-Aligned Implicit Function for 3D human reconstruction at high resolution) is a method and codebase to reconstruct high-fidelity 3D human meshes from a single image. It extends prior PIFu work by increasing resolution and detail, enabling fine geometry in cloth folds, hair, and subtle surface features. The method operates by learning an implicit occupancy / surface function conditioned on the image and camera projection; at inference time it queries dense points to reconstruct a mesh via marching cubes. It also uses a two-stage architecture: a coarse global model followed by local refinement patches to capture fine detail, balancing global consistency and local detail. The repo includes training pipelines, dataset loaders (for Multi-POP, etc.), and inference scripts for mesh output including depth maps for postprocessing. To help practical use, there are utilities for normal estimation, texture back-projection, mesh cleanup, and integration with rendering pipelines.
    Downloads: 5 This Week
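The PIFuHD description says the method queries an implicit occupancy function at dense points before extracting a mesh. The sketch below shows only the querying step on a voxel grid. An analytic sphere stands in for the learned network, so `occupancy` and `sample_grid` are hypothetical names and nothing here comes from the PIFuHD codebase; mesh extraction with marching cubes is not shown.

```python
def occupancy(x, y, z, radius=0.5):
    """Stand-in for a learned implicit function: 1.0 inside a sphere, 0.0 outside."""
    return 1.0 if x * x + y * y + z * z <= radius * radius else 0.0


def sample_grid(resolution):
    """Query the occupancy function at the voxel centres of a grid over [-1, 1]^3."""
    step = 2.0 / resolution
    coords = [-1.0 + step * (i + 0.5) for i in range(resolution)]
    return [[[occupancy(x, y, z) for z in coords] for y in coords] for x in coords]


grid = sample_grid(8)
occupied = sum(value for plane in grid for row in plane for value in row)
print(occupied)  # number of voxel centres that fall inside the surface
```

A surface extraction step would then trace the boundary between occupied and empty voxels; a finer grid gives a more detailed surface at the cost of more queries.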
  • 22
    Diffgram

    Training data (data labeling, annotation, workflow) for all data types

    From ingesting data to exploring it, annotating it, and managing workflows, Diffgram is a single application that will improve your data labeling and bring all aspects of training data under a single roof. Diffgram is the world’s first truly open source training data platform that focuses on giving its users an unlimited experience. It aims to reduce your data labeling bills and increase your training data quality. Training data is the art of supervising machines through data. This includes annotation, which produces structured data that is ready to be consumed by a machine learning model. Annotation is required because raw media is considered unstructured and is not usable without it. That’s why training data is required for many modern machine learning use cases, including computer vision, natural language processing, and speech recognition.
    Downloads: 4 This Week
  • 23
    Accord.NET Framework

    Machine learning, computer vision, statistics and computing for .NET

    The Accord.NET Framework is a .NET machine learning framework combined with audio and image processing libraries completely written in C#. It is a complete framework for building production-grade computer vision, computer audition, signal processing and statistics applications even for commercial use. A comprehensive set of sample applications provide a fast start to get up and running quickly, and extensive documentation and a wiki help fill in the details. The Accord.NET project provides machine learning, statistics, artificial intelligence, computer vision and image processing methods to .NET. It can be used on Microsoft Windows, Xamarin, Unity3D, Windows Store applications, Linux or mobile. After merging with the AForge.NET project, the framework now offers a unified API for learning/training machine learning models that is both easy to use and extensible.
    Downloads: 3 This Week
  • 24
    FiftyOne

    The open-source tool for building high-quality datasets

    The open-source tool for building high-quality datasets and computer vision models. Nothing hinders the success of machine learning systems more than poor-quality data. And without the right tools, improving a model can be time-consuming and inefficient. FiftyOne supercharges your machine learning workflows by enabling you to visualize datasets and interpret models faster and more effectively. Improving data quality and understanding your model’s failure modes are the most impactful ways to boost the performance of your model. FiftyOne provides the building blocks for optimizing your dataset analysis pipeline. Use it to get hands-on with your data, including visualizing complex labels, evaluating your models, exploring scenarios of interest, identifying failure modes, finding annotation mistakes, and much more! Surveys show that machine learning engineers spend over half of their time wrangling data, but it doesn't have to be that way.
    Downloads: 3 This Week
  • 25
    Netvlad

    NetVLAD: CNN architecture for weakly supervised place recognition

    NetVLAD is a deep learning-based image descriptor framework developed by Relja Arandjelović for place recognition and image retrieval. It extends standard CNNs with a trainable VLAD (Vector of Locally Aggregated Descriptors) layer to create compact, robust global descriptors from image features. This implementation includes training code and pretrained models using the Pittsburgh and Tokyo datasets.
    Downloads: 3 This Week
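The VLAD aggregation that the NetVLAD description refers to can be sketched in plain Python. This is an illustration under stated assumptions, not the project's implementation: the cluster centres are fixed by hand and the softness parameter `alpha` is a hypothetical value, whereas the NetVLAD layer trains its parameters, and the per-cluster normalisation step is omitted for brevity.

```python
import math


def soft_assign(descriptor, centers, alpha=10.0):
    """Softmax over negative squared distances from a descriptor to each centre."""
    logits = [-alpha * sum((d - c) ** 2 for d, c in zip(descriptor, center))
              for center in centers]
    peak = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(value - peak) for value in logits]
    total = sum(weights)
    return [w / total for w in weights]


def vlad(descriptors, centers, alpha=10.0):
    """Aggregate local descriptors into one L2-normalised vector of size K * D."""
    dim = len(centers[0])
    residual_sums = [[0.0] * dim for _ in centers]
    for desc in descriptors:
        weights = soft_assign(desc, centers, alpha)
        for k, center in enumerate(centers):
            for j in range(dim):
                # Accumulate the weighted residual between descriptor and centre.
                residual_sums[k][j] += weights[k] * (desc[j] - center[j])
    flat = [value for row in residual_sums for value in row]
    norm = math.sqrt(sum(value * value for value in flat)) or 1.0
    return [value / norm for value in flat]


centers = [(0.0, 0.0), (1.0, 1.0)]                       # K = 2 centres, D = 2
descriptors = [(0.1, 0.0), (0.9, 1.0), (0.0, 0.2)]       # toy local features
vector = vlad(descriptors, centers)
print(len(vector))  # 4, that is K * D
```

The output length depends only on the number of centres and the descriptor dimension, not on how many local descriptors the image produced, which is what makes the result usable as a fixed-size global descriptor for retrieval.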