Showing 487 open source projects for "visual\"

View related business solutions
  • Full-stack observability with actually useful AI | Grafana Cloud Icon
    Full-stack observability with actually useful AI | Grafana Cloud

    Our generous forever free tier includes the full platform, including the AI Assistant, for 3 users with 10k metrics, 50GB logs, and 50GB traces.

    Built on open standards like Prometheus and OpenTelemetry, Grafana Cloud includes Kubernetes Monitoring, Application Observability, Incident Response, plus the AI-powered Grafana Assistant. Get started with our generous free tier today.
    Create free account
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • 1
    HTTPie Desktop

    HTTPie Desktop

    Cross-platform API testing client for humans

    HTTPie Desktop is a graphical API client built on top of the popular HTTPie terminal tool, offering a user-friendly interface for testing and interacting with APIs. It combines the simplicity of HTTPie’s CLI with a modern desktop and web UI for a more visual workflow. Developers can easily build, send, and preview HTTP requests without needing to memorize commands or write scripts. The platform supports organizing work into spaces, collections, and tabs, making it ideal for managing multiple APIs and projects. It also includes AI-assisted features to help streamline request creation and improve productivity. ...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 2
    Slither

    Slither

    Static Analyzer for Solidity

    Slither is a Solidity static analysis framework written in Python 3. It runs a suite of vulnerability detectors, prints visual information about contract details, and provides an API to easily write custom analyses. Slither enables developers to find vulnerabilities, enhance their code comprehension, and quickly prototype custom analyses. Slither is the first open-source static analysis framework for Solidity. Slither is fast and precise; it can find real vulnerabilities in a few seconds without user intervention. ...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 3
    Android Use

    Android Use

    Automate native Android apps with AI using accessibility APIs

    android-action-kernel is an open source Python library designed to let AI agents control and automate native Android applications running on real devices or emulators. It fills a gap in automation tooling by focusing on mobile-first workflows where traditional browser or desktop-based automation doesn’t work; such as logistics, gig work, field operations, and other industries reliant on phones or tablets. The project works by using Android’s accessibility API to extract structured UI state...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 4
    DeepSeek-OCR

    DeepSeek-OCR

    Contexts Optical Compression

    DeepSeek-OCR is an open-source optical character recognition solution built as part of the broader DeepSeek AI vision-language ecosystem. It is designed to extract text from images, PDFs, and scanned documents, and integrates with multimodal capabilities that understand layout, context, and visual elements beyond raw character recognition. The system treats OCR not simply as “read the text” but as “understand what the text is doing in the image”—for example distinguishing captions from body text, interpreting tables, or recognizing handwritten versus printed words. It supports local deployment, enabling organizations concerned about privacy or latency to run the pipeline on-premises rather than send sensitive documents to third-party cloud services. ...
    Downloads: 6 This Week
    Last Update:
    See Project
  • Earn up to 16% annual interest with Nexo. Icon
    Earn up to 16% annual interest with Nexo.

    More flexibility. More control.

    Generate interest, access liquidity without selling, and execute trades seamlessly. All in one platform. Geographic restrictions, eligibility, and terms apply.
    Get started with Nexo.
  • 5
    ART ASCII Library

    ART ASCII Library

    ASCII art library for Python

    ASCII art is also known as "computer text art". It involves the smart placement of typed special characters or letters to make a visual shape that is spread over multiple lines of text. ART is a Python lib for text converting to ASCII art fancy.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    Nexent

    Nexent

    Zero-code platform for building AI agents from natural language input

    Nexent is an open source platform designed to enable users to create intelligent agents using natural language instead of traditional programming or visual orchestration tools. It focuses on a zero-code approach, allowing users to define workflows and agent behavior purely through language prompts, significantly lowering the barrier to entry for AI development. Built on the MCP ecosystem, Nexent integrates a wide range of tools, models, and data sources into a unified environment for agent creation and execution. ...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 7
    City Map Poster Generator

    City Map Poster Generator

    Transform your favorite cities into beautiful, minimalist designs

    maptoposter is a code-driven poster generator that turns any city into a minimalist, print-style map artwork with consistent typography and themed color palettes. It is built around a simple command-line flow where you pass a city and country, and the tool fetches the relevant map geometry and renders it into a clean composition that looks like a design product rather than a raw GIS export. The repository includes a library of predefined themes that change the overall look (for example,...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 8
    LangExtract

    LangExtract

    A Python library for extracting structured information

    ...It is designed to transform free-form text into reliable, schema-constrained data while maintaining traceability back to the source material. Each extracted entity is precisely grounded in its original context, allowing visual inspection and validation via automatically generated interactive HTML visualizations. LangExtract supports a wide range of models, including Google Gemini, OpenAI GPT, and local LLMs via Ollama, making it adaptable to different deployment environments and compliance needs. The system excels at handling long documents using optimized chunking, multi-pass extraction, and parallel processing to ensure both high recall and structured consistency.
    Downloads: 4 This Week
    Last Update:
    See Project
  • 9
    ImageReward

    ImageReward

    [NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences

    ImageReward is the first general-purpose human preference reward model (RM) designed for evaluating text-to-image generation, introduced alongside the NeurIPS 2023 paper ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation. Trained on 137k expert-annotated image pairs, ImageReward significantly outperforms existing scoring methods like CLIP, Aesthetic, and BLIP in capturing human visual preferences. It is provided as a Python package (image-reward) that enables quick scoring of generated images against textual prompts, with APIs for ranking, scoring, and filtering outputs. Beyond evaluation, ImageReward supports Reward Feedback Learning (ReFL), a method for directly fine-tuning diffusion models such as Stable Diffusion using human-preference feedback, leading to demonstrable improvements in image quality.
    Downloads: 4 This Week
    Last Update:
    See Project
  • Go From AI Idea to AI App Fast Icon
    Go From AI Idea to AI App Fast

    One platform to build, fine-tune, and deploy ML models. No MLOps team required.

    Access Gemini 3 and 200+ models. Build chatbots, agents, or custom models with built-in monitoring and scaling.
    Try Free
  • 10
    OSWorld

    OSWorld

    Benchmarking Multimodal Agents for Open-Ended Tasks

    ...It provides a richly simulated 3D world where multiple agents can interact, perform tasks, and learn complex behaviors. OSWorld emphasizes multi-modal interaction, enabling agents to process visual, auditory, and symbolic data for grounded learning in a simulated world.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 11
    ComfyUI-LivePortraitKJ

    ComfyUI-LivePortraitKJ

    ComfyUI nodes for LivePortrait

    ...It integrates into ComfyUI as a set of nodes, allowing users to combine it with other tools for complex animation workflows. The project is particularly useful for creating talking avatars, animated characters, or expressive visual content. It allows fine control over animation parameters, enabling customization of movement intensity and style. By leveraging diffusion and motion transfer techniques, it produces smooth and coherent animations. Overall, it provides an accessible way to generate portrait animations within a node-based pipeline.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 12
    Python Progressbar

    Python Progressbar

    Progressbar 2 - A progress bar for Python 2 and Python 3

    A text progress bar is typically used to display the progress of a long-running operation, providing a visual cue that processing is underway. The progressbar is based on the old Python progressbar package that was published on the now-defunct Google Code. Since that project was completely abandoned by its developer and the developer did not respond to my email, I decided to fork the package. This package is still backward compatible with the original progressbar package so you can safely use it as a drop-in replacement for existing projects. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 13
    Audiblez

    Audiblez

    Generate audiobooks from e-books

    Audiblez is a tool for generating high-quality .m4b audiobooks directly from .epub e-books using the Kokoro-82M neural text-to-speech model. It focuses on making audiobook creation easy and fast: from a single command, the tool splits an e-book into chapters, synthesizes audio for each section, and then merges the results into a structured audiobook with chapter-based WAV files and a final .m4b container. The Kokoro-82M model it uses is compact (82M parameters) yet natural sounding, trained...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 14
    Gemma

    Gemma

    Gemma open-weight LLM library, from Google DeepMind

    ...This repository provides the official implementation of the Gemma PyPI package, a JAX-based library that enables users to load, interact with, and fine-tune Gemma models. The framework supports both text and multi-modal input, allowing natural language conversations that incorporate visual content such as images. It includes APIs for conversational sampling, parameter management, and integration with fine-tuning methods like LoRA. The Gemma library can operate efficiently on CPUs, GPUs, or TPUs, with recommended configurations depending on model size. Through included tutorials and Colab notebooks, users can explore examples covering sampling, multi-modal interactions, and fine-tuning workflows. ...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 15
    Map-Anything

    Map-Anything

    MapAnything: Universal Feed-Forward Metric 3D Reconstruction

    Map-Anything is a universal, feed-forward transformer for metric 3D reconstruction that predicts a scene’s geometry and camera parameters directly from visual inputs. Instead of stitching together many task-specific models, it uses a single architecture that supports a wide range of 3D tasks—multi-image structure-from-motion, multi-view stereo, monocular metric depth, registration, depth completion, and more. The model flexibly accepts different input combinations (images, intrinsics, poses, sparse or dense depth) and produces a rich set of outputs including per-pixel 3D points, camera intrinsics, camera poses, ray directions, confidence maps, and validity masks. ...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 16
    AI Data Science Team

    AI Data Science Team

    An AI-powered data science team of agents

    ...It provides a modular agent framework where each agent focuses on a step in the typical data science pipeline — for example, loading data from CSV/Excel files, cleaning and wrangling messy datasets, engineering predictive features, building models with AutoML, connecting to SQL databases, and producing visual outputs — all driven by natural language or programmatic instructions. The project includes ready-to-use applications that showcase these agents in action, such as an exploratory data analysis copilot that generates reports, a pandas data analyst that combines wrangling and plotting, and SQL database agents that can query business databases and output results directly.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 17
    gopro-dashboard-overlay

    gopro-dashboard-overlay

    Programs to process GoPro MP4 & Generic GPX/FIT files

    ...The tool can also convert metadata into formats like GPX or CSV for further analysis. It is designed for both post-processing workflows and automated video generation pipelines. Overall, it enhances action footage by adding synchronized visual data overlays.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 18
    Paper2Slides

    Paper2Slides

    From Paper to Presentation in One Click

    ...It is designed to replace the repetitive work of turning dense technical documents into presentation-friendly structure by extracting key points, figures, and data into a coherent visual narrative. The system supports multiple input formats, so you can process PDFs and common office documents rather than being locked to a single file type. It uses an extraction approach intended to capture critical insights comprehensively, including important visuals and data points that often get missed in naive summarization. ...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 19
    VisPy

    VisPy

    Main repository for Vispy

    Vispy is an open-source, high-performance interactive visualization library in Python, designed for creating scientific visualizations and interactive plots. It leverages the power of modern Graphics Processing Units (GPUs) through OpenGL to render large datasets efficiently. Vispy supports a wide range of visualization types, including 2D plots, 3D visualizations, volume rendering, and more, making it suitable for scientific research, data analysis, and educational purposes.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    MLJAR Studio

    MLJAR Studio

    Python package for AutoML on Tabular Data with Feature Engineering

    We are working on new way for visual programming. We developed a desktop application called MLJAR Studio. It is a notebook-based development environment with interactive code recipes and a managed Python environment. All running locally on your machine. We are waiting for your feedback. The mljar-supervised is an Automated Machine Learning Python package that works with tabular data.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 21
    GPTImage2Skill

    GPTImage2Skill

    GPT Image 2 prompt gallery, image prompt library, agentic skill

    ...It provides reusable image prompts across creative, technical, academic, interface, design, photography, typography, gaming, anime, map, tattoo, and reference-editing use cases. The project is designed to help agents and users produce stronger visual outputs without starting from a blank prompt every time. Its gallery is organized into category files so an agent can load only the relevant prompt references instead of overwhelming the context window. It also includes installation paths for skill-capable environments such as Claude Code, Codex, OpenClaw, and other agent runtimes. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 22
    LongCat-Image

    LongCat-Image

    Foundation model for image generation

    LongCat-Image is an open-source foundation model for image generation and editing created by the LongCat team at Meituan, designed to deliver high-quality visual outputs while remaining efficient and accessible for developers and researchers. Rather than relying on massive parameter counts typical of many cutting-edge models, LongCat-Image achieves strong photorealism, stable structure, and accurate bilingual (Chinese and English) text rendering with a more compact ~6-billion parameter architecture, making it competitive with much larger alternatives despite its relatively lean design. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 23
    MiniMind-V

    MiniMind-V

    "Big Model" trains a visual multimodal VLM with 26M parameters

    MiniMind-V is an experimental open-source project that aims to train a very small multimodal vision–language model (VLM) from scratch with extremely low compute and cost, making research and experimentation accessible to more people. The repository showcases training workflows and code designed to produce a 26-million parameter model—including both image and text capabilities—using minimal resources in very little time, reflecting a trend toward democratizing AI research. MiniMind-V combines...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 24
    Stable Virtual Camera

    Stable Virtual Camera

    Stable Virtual Camera: Generative View Synthesis with Diffusion Models

    ...Unlike traditional methods that require complex reconstruction or scene-specific optimization, this model allows users to generate novel views from any number of input images and define custom camera trajectories, enabling dynamic exploration of scenes. It supports various aspect ratios and can produce 3D-consistent videos up to 1,000 frames, making it a versatile tool for creators seeking to enhance visual storytelling. ​
    Downloads: 1 This Week
    Last Update:
    See Project
  • 25
    armory

    armory

    3D Engine with Blender Integration

    ...Powered by Armory engine, ArmorPaint is a stand-alone software designed for physically-based texture painting. Drag & drop your 3D models and start painting. Receive instant visual feedback in the viewport as you paint. Powered by Armory engine, ArmorLab is stand-alone software designed for AI-powered texture authoring. Generate PBR materials by drag & dropping your photos. In development! Armory is an open-source 3D game engine with full Blender integration. The engine is currently available in a form of early preview.
    Downloads: 1 This Week
    Last Update:
    See Project
MongoDB Logo MongoDB