Showing 487 open source projects for "visual"

  • 1
    Qtile

    A full-featured, hackable tiling window manager written in Python

    A full-featured, hackable tiling window manager written and configured in Python. Optimize your workflow by configuring your environment to fit how you work. Efficiently use screen real-estate by automatically arranging windows with minimal visual cruft. Save your wrists from RSI by ditching the mouse and driving with the keyboard. Qtile is simple, small, and extensible. It's easy to write your own layouts, widgets, and built-in commands. Qtile is written and configured entirely in Python. Leverage the full power and flexibility of the language to make it fit your needs. The Qtile community is active and growing. ...
    Downloads: 0 This Week
  • 2
    PersonaLive

    Expressive Portrait Image Animation for Live Streaming

    ...The framework prioritizes low-latency and streamable output, making it suitable for real-time creative workflows, broadcast overlays, or interactive avatars on consumer-grade GPUs. PersonaLive’s architecture balances visual quality and efficiency by combining motion encoding, temporal modules, and hybrid implicit control signals to preserve identity and stable expression through long sequences.
    Downloads: 7 This Week
  • 3
    ERAlchemy

    Entity Relation Diagrams generation tool

    ERAlchemy is a tool that generates Entity-Relationship (ER) diagrams from databases or SQLAlchemy models and vice versa. It’s useful for database documentation, reverse engineering, and understanding complex schemas. ERAlchemy can export diagrams in formats like Graphviz and Mermaid, making it easy to include in reports or markdown files.
    Downloads: 2 This Week
  • 4
    Open-AutoGLM

    An open phone agent model & framework

    ...It aims to create an “AI phone agent” that can perceive on-screen content, reason about user goals, and execute sequences of taps, swipes, and text input via automated device control interfaces like ADB, enabling hands-off completion of multi-step tasks such as navigating apps, filling forms, and more. Unlike traditional automation scripts that depend on brittle heuristics, Open-AutoGLM uses pretrained large language and vision-language models to interpret visual context and natural language instructions, giving the agent robust adaptability across apps and interfaces.
    Downloads: 13 This Week
  • 5
    PaddleX

    PaddlePaddle End-to-End Development Toolkit

    PaddleX is a deep learning full-process development tool based on the core framework, development kit, and tool components of Paddle. It has three characteristics: opening up the whole process, integrating industrial practice, and being easy to use and integrate. Image classification and labeling is the most basic and simplest labeling task. Users only need to put pictures belonging to the same category in the same folder. When the model is trained, we need to divide the training set, the...
    Downloads: 8 This Week
  • 6
    Writer Framework

    No-code in the front, Python in the back. An open-source framework

    Writer Framework is an open source platform designed to help developers build AI-powered applications by combining a visual interface builder with a Python-based backend architecture. It follows a hybrid approach where user interfaces are created using a drag-and-drop editor while business logic is implemented in Python, allowing teams to balance speed and flexibility without sacrificing control. The framework is particularly focused on AI use cases, enabling developers to integrate large language models, knowledge graphs, and custom machine learning workflows into user-facing applications. ...
    Downloads: 0 This Week
  • 7
    machine-learning-refined

    Master the fundamentals of machine learning, deep learning

    machine-learning-refined is an educational repository designed to help students and practitioners understand machine learning algorithms through intuitive explanations and interactive examples. The project accompanies a series of textbooks and teaching materials that focus on making machine learning concepts accessible through visual demonstrations and simple code implementations. Instead of presenting algorithms purely through mathematical derivations, the repository emphasizes geometric intuition, visualization, and step-by-step experimentation. It includes Jupyter notebooks and scripts that illustrate core machine learning topics such as regression, classification, optimization methods, and neural networks. ...
    Downloads: 0 This Week
  • 8
    VLMEvalKit

    Open-source evaluation toolkit of large multi-modality models (LMMs)

    ...VLMEvalKit supports generation-based evaluation methods, allowing models to produce textual responses to visual inputs while measuring performance through techniques such as exact matching or language-model-assisted answer extraction.
    Downloads: 0 This Week
  • 9
    A.I.G

    Full-stack AI Red Teaming platform

    ...Users can deploy it via Docker or scripts to get a modern web UI that guides them through tasks like scanning third-party frameworks for known CVEs and experimenting with prompt security against attack vectors. The tool provides both a visual interface and a comprehensive API, making integration with internal security systems or CI/CD pipelines practical for ongoing risk management.
    Downloads: 0 This Week
  • 10
    Videomass

    Videomass is a free, open source and cross-platform GUI for FFmpeg

    Videomass is a free, open-source graphical interface for FFmpeg designed to make advanced video and audio processing accessible to both beginners and experienced users. Built in Python using wxPython, it provides a cross-platform environment for managing encoding, conversion, and editing tasks through a visual interface. The software supports multitasking operations, allowing users to process multiple media files simultaneously. It offers extensive configuration options while also providing presets to simplify common workflows. Videomass integrates closely with FFmpeg, exposing powerful capabilities such as transcoding, filtering, and format conversion without requiring command-line interaction. ...
    Downloads: 7 This Week
  • 11
    TimeTracker

    Open-source and free to self-host

    TimeTracker by DRYTRIX is a comprehensive self-hosted time tracking and project management application that helps individuals, teams, and small businesses take control of their productivity and billing. Built on a modern web stack (Python/Flask backend with a responsive frontend), TimeTracker enables users to track hours, manage projects and clients, visualize data with interactive charts, and even generate professional invoices directly from tracked time. It embraces privacy and flexibility...
    Downloads: 7 This Week
  • 12
    InvokeAI

    InvokeAI is a leading creative engine for Stable Diffusion models

    ...It runs on Windows, Mac and Linux machines, and runs on GPU cards with as little as 4 GB of RAM. InvokeAI is a leading creative engine built to empower professionals and enthusiasts alike. Generate and create stunning visual media using the latest AI-driven technologies. InvokeAI offers an industry-leading Web Interface, interactive Command Line Interface, and also serves as the foundation for multiple commercial products. This fork is supported across Linux, Windows and Macintosh. Linux users can use either an Nvidia-based card (with CUDA support) or an AMD card (using the ROCm driver). ...
    Downloads: 22 This Week
  • 13
    PyVista

    3D plotting and mesh analysis through a streamlined interface

    ...PyVista is a helper module for the Visualization Toolkit (VTK) that takes a different approach on interfacing with VTK through NumPy and direct array access. This package provides a Pythonic, well-documented interface exposing VTK’s powerful visualization backend to facilitate rapid prototyping, analysis, and visual integration of spatially referenced datasets. This module can be used for scientific plotting for presentations and research papers as well as a supporting module for other mesh-dependent Python modules. Easily integrate with NumPy and create a variety of geometries and plot them. You could use any geometry to create your glyphs, or even plot the points directly. ...
    Downloads: 9 This Week
  • 14
    sparkmagic

    Jupyter magics and kernels for working with remote Spark clusters

    ...Sparkmagic interacts with remote Spark clusters through a REST server. Automatic visualization of SQL queries in the PySpark, Spark and SparkR kernels; use an easy visual interface to interactively construct visualizations, no code required. Ability to capture the output of SQL queries as Pandas dataframes to interact with other Python libraries (e.g. matplotlib). Send local files or dataframes to a remote cluster (e.g. sending a pretrained local ML model straight to the Spark cluster). Authenticate to Livy via Basic Access authentication or via Kerberos.
    Downloads: 5 This Week
  • 15
    RAG Anything

    RAG-Anything: All-in-One RAG Framework

    RAG-Anything is an open-source unified framework that extends the Retrieval-Augmented Generation (RAG) paradigm to fully multimodal document and knowledge retrieval, enabling systems to ingest, parse, represent, and query rich content that includes text, images, tables, formulas, and other structured or visual elements. Traditional RAG systems are typically limited to text and cannot effectively work across heterogeneous document layouts, but RAG-Anything addresses this by modeling multimodal content in ways that preserve cross-modal relationships and semantic context, often treating content elements as interconnected knowledge entities rather than separate data silos. ...
    Downloads: 6 This Week
  • 16
    Dask

    Parallel computing with task scheduling

    Dask is a Python library for parallel and distributed computing, designed to scale analytics workloads from single machines to large clusters. It integrates with familiar tools like NumPy, Pandas, and scikit-learn while enabling execution across cores or nodes with minimal code changes. Dask excels at handling large datasets that don’t fit into memory and is widely used in data science, machine learning, and big data pipelines.
    Downloads: 2 This Week
  • 17
    Flagsmith

    Open source feature flagging and remote config service

    Release features with confidence; manage feature flags across web, mobile, and server-side applications. Use our hosted API, deploy to your own private cloud, or run on-premises. Flagsmith provides an all-in-one platform for developing, implementing, and managing your feature flags. Whether you are moving off an in-house solution or using toggles for the first time, you will be amazed by the power and efficiency gained by using Flagsmith. Flagsmith makes it easy to create and manage feature...
    Downloads: 6 This Week
  • 18
    HyperTools

    A Python toolbox for gaining geometric insights

    ...Support for lists of Numpy arrays, Pandas dataframes, text or (mixed) lists. Applying topic models and other text vectorization methods to text data. HyperTools is designed to facilitate dimensionality reduction-based visual explorations of high-dimensional data. The basic pipeline is to feed in a high-dimensional dataset (or a series of high-dimensional datasets) and, in a single function call, reduce the dimensionality of the dataset(s) and create a plot.
    Downloads: 5 This Week
  • 19
    hCaptcha Challenger

    Gracefully face hCaptcha challenge with multimodal llms

    hCaptcha Challenger is an open-source automation framework designed to solve hCaptcha verification challenges using computer vision models and multimodal reasoning techniques. The project integrates machine learning models capable of analyzing visual captcha tasks and identifying the correct responses required to pass the verification process. Instead of relying on third-party captcha-solving services or browser scripts, the system operates independently by using pretrained neural networks that can classify images, detect objects, and interpret spatial relationships. The framework includes support for multiple types of captcha challenges such as object selection, drag-and-drop puzzles, and image labeling tasks. ...
    Downloads: 8 This Week
  • 20
    Qwen3-Omni

    Qwen3-omni is a natively end-to-end, omni-modal LLM

    ...It uses a Thinker-Talker architecture with a Mixture-of-Experts (MoE) design, early text-first pretraining, and mixed multimodal training to support strong performance across all modalities without sacrificing text or image quality. The model supports 119 text languages, 19 speech input languages, and 10 speech output languages. It achieves state-of-the-art results: across 36 audio and audio-visual benchmarks, it hits open-source SOTA on 32 and overall SOTA on 22, outperforming or matching strong closed-source models such as Gemini-2.5 Pro and GPT-4o. To reduce latency, especially in audio/video streaming, Talker predicts discrete speech codecs via a multi-codebook scheme and replaces heavier diffusion approaches.
    Downloads: 1 This Week
  • 21
    MaxKB

    Open-source platform for building enterprise-grade agents

    MaxKB (Max Knowledge Brain) is an open-source platform for building enterprise-grade AI agents with strong knowledge retrieval, RAG pipelines, and workflow orchestration. It focuses on practical deployments such as customer support, internal knowledge bases, research assistants, and education, bundling tools for data ingestion, chunking, embedding, retrieval, and answer synthesis. The system exposes flexible tool-use (including MCP), supports multi-model backends, and provides dashboards for...
    Downloads: 6 This Week
  • 22
    Droidrun

    Powerful framework for controlling Android and iOS devices

    Droidrun is a native mobile agent platform that gives users natural-language control over real Android devices to automate any mobile app workflow, from logins and bookings to purchases and data extraction, including access to mobile-only content behind app logins, rate limits, or platform restrictions. Its cloud offering lets users spin up agents in seconds with preinstalled apps, run tasks in parallel across multiple devices, and compose complex, multi-step conditional workflows using...
    Downloads: 7 This Week
  • 23
    nunif

    Misc; latest version of waifu2x; 2D video to stereo 3D video

    ...The project provides a collection of AI-powered utilities designed primarily for anime-style artwork, illustrations, and high-quality image restoration workflows. It includes command-line tools and graphical interfaces for applying trained neural models to improve image resolution and visual clarity while minimizing artifacts. nunif supports GPU acceleration and batch processing, making it suitable for creators, archivists, and enthusiasts handling large image collections. The framework is highly modular, allowing developers to experiment with custom models, inference pipelines, and image-processing workflows. Its emphasis on anime and illustration enhancement has made it especially popular in digital art and media preservation communities.
    Downloads: 3 This Week
  • 24
    TorchAudio

    Data manipulation and transformation for audio signal processing

    The aim of torchaudio is to apply PyTorch to the audio domain. By supporting PyTorch, torchaudio follows the same philosophy of providing strong GPU acceleration, having a focus on trainable features through the autograd system, and having consistent style (tensor names and dimension names). Therefore, it is primarily a machine learning library and not a general signal processing library. The benefits of PyTorch can be seen in torchaudio through having all the computations be through PyTorch...
    Downloads: 2 This Week
  • 25
    docext

    An on-premises, OCR-free unstructured data extraction

    ...Unlike traditional document processing pipelines that rely heavily on optical character recognition, docext leverages multimodal AI models capable of understanding both visual and textual information directly from document images. This allows the system to detect and extract structured elements such as tables, signatures, key fields, and layout information while maintaining semantic understanding of the document content. The toolkit can also convert complex documents into structured markdown representations that preserve formatting and contextual relationships.
    Downloads: 4 This Week