Search Results for "computer vision" - Page 2

Showing 118 open source projects for "computer vision"

View related business solutions
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • Earn up to 16% annual interest with Nexo. Icon
    Earn up to 16% annual interest with Nexo.

    Let your crypto work for you

    Put idle assets to work with competitive interest rates, borrow without selling, and trade with precision. All in one platform. Geographic restrictions, eligibility, and terms apply.
    Get started with Nexo.
  • 1
    Encord Active

    Encord Active

    The toolkit to test, validate, and evaluate your models and surface

    Encord Active is an open-source toolkit to test, validate, and evaluate your models and surface, curate, and prioritize the most valuable data for labeling to supercharge model performance. Encord Active has been designed as a all-in-one open source toolkit for improving your data quality and model performance. Use the intuitive UI to explore your data or access all the functionalities programmatically. Discover errors, outliers, and edge-cases within your data - all in one open source...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 2
    autoMate

    autoMate

    AI tool for automating desktop tasks via natural language input

    autoMate is an AI-powered local automation tool designed to enable users to control and automate their computers using natural language instructions instead of traditional scripting or rule-based systems. It combines large language models with computer vision techniques to interpret user intent and understand on-screen content, allowing it to interact with graphical interfaces similarly to a human user. autoMate follows an observe-decide-act workflow, where it analyzes the screen, plans actions, and executes them through simulated input such as mouse clicks and keyboard events. ...
    Downloads: 8 This Week
    Last Update:
    See Project
  • 3
    Skyvern

    Skyvern

    Automate browser-based workflows with LLMs and Computer Vision

    Skyvern uses a combination of computer vision and AI to understand content on a webpage, making it adaptable to any website. Skyvern takes instructions in natural language, allowing it to execute complex objectives with simple commands. Skyvern is an API-first product. Workflows execute in the cloud, allowing it to run hundreds of workflows at the same time. Skyvern's AI decisions come with built-in explanations, providing clear summaries and justifications for every action. ...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 4
    RF-DETR

    RF-DETR

    RF-DETR is a real-time object detection and segmentation

    RF-DETR is an open-source computer vision framework that implements a real-time object detection and instance segmentation model based on transformer architectures. Developed by Roboflow, the project builds upon modern vision transformer backbones such as DINOv2 to achieve strong accuracy while maintaining efficient inference speeds suitable for real-time applications.
    Downloads: 8 This Week
    Last Update:
    See Project
  • Forever Free Full-Stack Observability | Grafana Cloud Icon
    Forever Free Full-Stack Observability | Grafana Cloud

    Our generous forever free tier includes the full platform, including the AI Assistant, for 3 users with 10k metrics, 50GB logs, and 50GB traces.

    Built on open standards like Prometheus and OpenTelemetry, Grafana Cloud includes Kubernetes Monitoring, Application Observability, Incident Response, plus the AI-powered Grafana Assistant. Get started with our generous free tier today.
    Create free account
  • 5
    AI-Tutorials/Implementations Notebooks

    AI-Tutorials/Implementations Notebooks

    Codes/Notebooks for AI Projects

    ...The repository contains numerous Jupyter notebooks and code samples that demonstrate modern techniques in machine learning, deep learning, data science, and large language model workflows. It includes implementations for a wide range of AI topics such as computer vision, agent systems, federated learning, distributed systems, adversarial attacks, and generative AI. Many of the tutorials focus on building AI agents, multi-agent systems, and workflows that integrate language models with external tools or APIs. The codebase acts as a hands-on learning resource, allowing users to experiment with new frameworks, architectures, and machine learning workflows through guided examples.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 6
    YOLOv5

    YOLOv5

    YOLOv5 is the world's most loved vision AI

    Introducing Ultralytics YOLOv8, the latest version of the acclaimed real-time object detection and image segmentation model. YOLOv8 is built on cutting-edge advancements in deep learning and computer vision, offering unparalleled performance in terms of speed and accuracy. Its streamlined design makes it suitable for various applications and easily adaptable to different hardware platforms, from edge devices to cloud APIs. Explore the YOLOv8 Docs, a comprehensive resource designed to help you understand and utilize its features and capabilities. ...
    Downloads: 66 This Week
    Last Update:
    See Project
  • 7
    fastai

    fastai

    Deep learning library

    fastai is a deep learning library which provides practitioners with high-level components that can quickly and easily provide state-of-the-art results in standard deep learning domains, and provides researchers with low-level components that can be mixed and matched to build new approaches. It aims to do both things without substantial compromises in ease of use, flexibility, or performance. This is possible thanks to a carefully layered architecture, which expresses common underlying...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    OpenCV

    OpenCV

    Open Source Computer Vision Library

    The Open Source Computer Vision Library has >2500 algorithms, extensive documentation and sample code for real-time computer vision. It works on Windows, Linux, Mac OS X, Android, iOS in your browser through JavaScript. Languages: C++, Python, Julia, Javascript Homepage: https://opencv.org Q&A forum: https://forum.opencv.org/ Documentation: https://docs.opencv.org Source code: https://github.com/opencv Please pay special attention to our tutorials! ...
    Leader badge
    Downloads: 3,389 This Week
    Last Update:
    See Project
  • 9
    Torch Pruning

    Torch Pruning

    DepGraph: Towards Any Structural Pruning

    ...Torch-Pruning physically removes parameters rather than masking them, which results in smaller and faster models during both training and inference. The toolkit supports a wide variety of architectures used in computer vision and large language models, making it a flexible solution for model compression tasks.
    Downloads: 6 This Week
    Last Update:
    See Project
  • Gemini 3 and 200+ AI Models on One Platform Icon
    Gemini 3 and 200+ AI Models on One Platform

    Access Google's best plus Claude, Llama, and Gemma. Fine-tune and deploy from one console.

    Build generative AI apps with Vertex AI. Switch between models without switching platforms.
    Start Free
  • 10
    InternGPT

    InternGPT

    Open source demo platform where you can easily showcase your AI models

    InternGPT is an open-source multimodal AI framework designed to extend large language models beyond text interactions into visual reasoning and image manipulation tasks. The system integrates conversational AI with computer vision models so users can interact with images, videos, and visual environments through natural language instructions. Unlike traditional chat systems that rely solely on text prompts, InternGPT allows users to interact with visual content using both language and nonverbal signals such as pointing or highlighting objects within images. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    InternVL

    InternVL

    A Pioneering Open-Source Alternative to GPT-4o

    InternVL is a large-scale multimodal foundation model designed to integrate computer vision and language understanding within a unified architecture. The project focuses on scaling vision models and aligning them with large language models so that they can perform tasks involving both visual and textual information. InternVL is trained on massive collections of image-text data, enabling it to learn representations that capture both visual patterns and semantic meaning. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    OpenVINO Notebooks

    OpenVINO Notebooks

    Jupyter notebook tutorials for OpenVINO

    openvino_notebooks is a collection of interactive Jupyter notebooks designed to demonstrate how to build, optimize, and deploy artificial intelligence applications using the OpenVINO toolkit. The repository provides practical tutorials that guide developers through various AI workflows including computer vision, natural language processing, and generative AI tasks. Each notebook demonstrates how to run pre-trained models, optimize inference performance, and deploy models across hardware such as CPUs, GPUs, and specialized accelerators. The tutorials also illustrate how OpenVINO integrates with models from frameworks like PyTorch, TensorFlow, and ONNX to accelerate inference workloads. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 13
    qxresearch-event-1

    qxresearch-event-1

    Python hands on tutorial with 50+ Python Application

    ...The repository contains dozens of small programs, many implemented with minimal lines of code, covering topics such as machine learning, graphical user interfaces, computer vision, and API integration. Each example is designed to illustrate a single concept or application in a clear and concise manner so that learners can quickly understand the underlying logic. The project emphasizes practical experimentation, allowing beginners to modify and extend the example programs to explore new ideas. Many of the examples are accompanied by video explanations that guide learners through the code and demonstrate how the programs work in practice.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    imgclsmob Deep learning networks

    imgclsmob Deep learning networks

    Sandbox for training deep learning networks

    imgclsmob is a deep learning research repository focused on implementing and experimenting with convolutional neural networks for computer vision tasks. The project serves as a sandbox for training and evaluating a wide variety of neural network architectures used in image analysis. It includes implementations of models used for tasks such as image classification, object detection, semantic segmentation, and pose estimation. The repository also contains scripts that help train models, evaluate performance, and convert trained networks between different frameworks. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    CUDA Containers for Edge AI & Robotics

    CUDA Containers for Edge AI & Robotics

    Machine Learning Containers for NVIDIA Jetson and JetPack-L4T

    ...These containers simplify the deployment of complex machine learning environments by bundling libraries such as CUDA, TensorRT, and deep learning frameworks into reproducible container images. The project is particularly useful for developers building edge AI and robotics systems that rely on GPU-accelerated inference and real-time computer vision. By using containerized environments, developers can ensure that their applications run consistently across different Jetson platforms and JetPack versions. The repository also includes build tools and package management utilities that help automate the process of assembling machine learning environments.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    crème de la crème of AI courses

    crème de la crème of AI courses

    This repository is a curated collection of links to various courses

    ...The repository organizes courses by topic, difficulty level, format, and release year, allowing learners to quickly identify relevant material depending on their experience and interests. Topics covered include deep learning, natural language processing, computer vision, large language models, linear algebra, reinforcement learning, and machine learning engineering. Because the repository links to well-known educational content such as university lecture series and professional training materials, it functions as a structured roadmap for individuals who want to develop expertise in artificial intelligence.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 17
    VGGT

    VGGT

    [CVPR 2025 Best Paper Award] VGGT

    VGGT is a transformer-based framework aimed at unifying classic visual geometry tasks—such as depth estimation, camera pose recovery, point tracking, and correspondence—under a single model. Rather than training separate networks per task, it shares an encoder and leverages geometric heads/decoders to infer structure and motion from images or short clips. The design emphasizes consistent geometric reasoning: outputs from one head (e.g., correspondences or tracks) reinforce others (e.g., pose...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 18
    Image-Editor

    Image-Editor

    AI based photo editing website for changing image background

    ...Then to, create a new Django project using django-admin startproject Website1, replacing 'Website1' with the name of your choice. Image-Editor uses Python's cv2 library, which provides an easy and efficient way to work with images and videos, including a wide range of image processing and computer vision algorithms. With cv2, you can easily read, write, filter, and display images, and much more. Image-Editor uses Mediapipe's selfie_segmentation model for background removal in real-time video streams. This advanced model uses deep neural networks to detect and remove the background.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 19
    Windows-MCP

    Windows-MCP

    MCP server enabling AI agents to control and automate Windows OS

    ...Windows-MCP provides capabilities such as file navigation, application management, UI interaction, and QA testing workflows, making it suitable for building autonomous desktop agents. It focuses on native interaction with Windows UI elements rather than relying on traditional computer vision techniques, which simplifies integration and improves efficiency. It includes a set of tools that simulate user inputs like keyboard and mouse actions while also capturing the current state of windows and interfaces. It is designed to be extensible and adaptable, allowing developers to customize or expand its functionality for different automation or AI use cases.
    Downloads: 11 This Week
    Last Update:
    See Project
  • 20
    Kaggle Solutions

    Kaggle Solutions

    Collection of Kaggle Solutions and Ideas

    ...The repository also highlights important machine learning concepts such as feature engineering, cross-validation strategies, ensemble modeling, and post-processing methods commonly used in winning solutions. Because the content is organized by competition categories such as computer vision, natural language processing, tabular data, and time-series forecasting, users can explore techniques relevant to specific problem types.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    CogView4

    CogView4

    CogView4, CogView3-Plus and CogView3(ECCV 2024)

    ...It emphasizes bilingual usability, making it well-suited for cross-lingual multimodal applications. The model also supports fine-tuning and downstream customization, extending its applicability to creative content generation, human–computer interaction, and research on vision-language alignment.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    OAGI Python SDK

    OAGI Python SDK

    Python SDK for the Computer Use model Lux, developed by OpenAGI

    OAGI Python SDK is a Python client library for the Lux computer-use model that turns Lux into a programmable automation layer for operating human-facing software via vision and actions. It exposes the OAGI API in an ergonomic way, letting you trigger Lux in three main modes: Tasker for precise scripted sequences, Actor for fast one-shot tasks, and Thinker for open-ended, multi-step objectives.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 23
    BoxMOT

    BoxMOT

    Pluggable SOTA multi-object tracking modules for segmentation

    BoxMOT is an open-source framework designed to provide modular implementations of state-of-the-art multi-object tracking algorithms for computer vision applications. The project focuses on the tracking-by-detection paradigm, where objects detected by vision models are continuously tracked across frames in a video sequence. It provides a pluggable architecture that allows developers to combine different object detectors with multiple tracking algorithms without modifying the core codebase. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 24
    Hiera

    Hiera

    A fast, powerful, and simple hierarchical vision transformer

    Hiera is a hierarchical vision transformer designed to be fast, simple, and strong across image and video recognition tasks. The core idea is to use straightforward hierarchical attention with a minimal set of architectural “bells and whistles,” achieving competitive or superior accuracy while being markedly faster at inference and often faster to train. The repository provides installation options (from source or Torch Hub), a model zoo with pre-trained checkpoints, and code for evaluation...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 25
    StarVector

    StarVector

    StarVector is a foundation model for SVG generation

    StarVector is a multimodal foundation model designed for generating Scalable Vector Graphics (SVG) from images or textual descriptions. The system treats vector graphics creation as a code generation problem, producing SVG code that can render detailed vector images. Its architecture combines computer vision techniques with language modeling capabilities so it can understand visual inputs and textual prompts simultaneously. The model converts raster images or text instructions into structured vector representations, enabling high-quality vectorization and design generation. This approach allows StarVector to create scalable graphics that maintain visual quality regardless of resolution, which is especially useful for design tools and illustration workflows. ...
    Downloads: 1 This Week
    Last Update:
    See Project
MongoDB Logo MongoDB