Showing 532 open source projects for "visual python"

  • 1
    HyperTools

    A Python toolbox for gaining geometric insights

    HyperTools is a library for visualizing and manipulating high-dimensional data in Python. It is built on top of matplotlib (for plotting), seaborn (for plot styling), and scikit-learn (for data manipulation). It provides functions for plotting high-dimensional datasets in 2D or 3D, static and animated plots, a simple API for customizing plot styles, and a set of data manipulation tools including hyperalignment, k-means clustering, and normalization.
    Downloads: 0 This Week
  • 2
    VOID

    Video Object and Interaction Deletion

    VOID is an advanced AI video processing system developed by Netflix that focuses on removing objects from videos while preserving the physical and visual realism of the surrounding environment. Unlike traditional inpainting methods that only erase pixels or simple artifacts, VOID models the full interaction dynamics between objects and their environment, including shadows, reflections, and even physical consequences such as movement or balance changes. Built on top of transformer-based...
    Downloads: 0 This Week
  • 3
    machine-learning-refined

    Master the fundamentals of machine learning, deep learning

    machine-learning-refined is an educational repository designed to help students and practitioners understand machine learning algorithms through intuitive explanations and interactive examples. The project accompanies a series of textbooks and teaching materials that focus on making machine learning concepts accessible through visual demonstrations and simple code implementations. Instead of presenting algorithms purely through mathematical derivations, the repository emphasizes geometric...
    Downloads: 0 This Week
  • 4
    VLMEvalKit

    Open-source evaluation toolkit of large multi-modality models (LMMs)

    VLMEvalKit is an open-source evaluation toolkit designed for benchmarking large vision-language models that combine visual understanding with natural language reasoning. The toolkit provides a unified framework that allows researchers and developers to evaluate multimodal models across a wide range of datasets and standardized benchmarks with minimal setup. Instead of requiring complex data preparation pipelines or multiple repositories for each benchmark, the system enables evaluation...
    Downloads: 0 This Week
  • 5
    alive-progress

    A new kind of Progress Bar, with real-time throughput, ETA

    alive-progress is an advanced Python progress bar library that introduces a highly animated and adaptive approach to tracking long-running tasks. Unlike traditional static progress indicators, it dynamically adjusts spinner speed and visual feedback based on actual throughput, giving users a more intuitive sense of activity. The library is designed with performance efficiency in mind, using multithreaded updates that minimize CPU overhead and terminal noise.
    Downloads: 0 This Week
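
The throughput and ETA figures mentioned above come down to simple arithmetic. The following is a minimal sketch of that arithmetic only; the function name `throughput_and_eta` is hypothetical, and this is not a description of how alive-progress computes or smooths its own estimates.

```python
def throughput_and_eta(done: int, total: int, elapsed: float) -> tuple[float, float]:
    """Return (items per second, estimated seconds remaining).

    Hypothetical helper for illustration; not part of alive-progress.
    """
    if elapsed <= 0 or done <= 0:
        # No rate can be measured yet, so the remaining time is unknown.
        return 0.0, float("inf")
    rate = done / elapsed               # observed throughput so far
    return rate, (total - done) / rate  # remaining items divided by rate


rate, eta = throughput_and_eta(done=50, total=200, elapsed=10.0)
print(f"{rate:.1f} items/s, about {eta:.0f} s remaining")
```

A real progress bar would likely smooth the rate over time so that a brief stall does not make the estimate jump.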
  • 6
    ImageReward

    [NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences

    ImageReward is the first general-purpose human preference reward model (RM) designed for evaluating text-to-image generation, introduced alongside the NeurIPS 2023 paper ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation. Trained on 137k expert-annotated image pairs, ImageReward significantly outperforms existing scoring methods like CLIP, Aesthetic, and BLIP in capturing human visual preferences. It is provided as a Python package (image-reward) that enables quick scoring of generated images against textual prompts, with APIs for ranking, scoring, and filtering outputs. Beyond evaluation, ImageReward supports Reward Feedback Learning (ReFL), a method for directly fine-tuning diffusion models such as Stable Diffusion using human-preference feedback, leading to demonstrable improvements in image quality.
    Downloads: 1 This Week
  • 7
    Audiblez

    Generate audiobooks from e-books

    Audiblez is a tool for generating high-quality .m4b audiobooks directly from .epub e-books using the Kokoro-82M neural text-to-speech model. It focuses on making audiobook creation easy and fast: from a single command, the tool splits an e-book into chapters, synthesizes audio for each section, and then merges the results into a structured audiobook with chapter-based WAV files and a final .m4b container. The Kokoro-82M model it uses is compact (82M parameters) yet natural sounding, trained...
    Downloads: 22 This Week
  • 8
    OSWorld

    Benchmarking Multimodal Agents for Open-Ended Tasks

    OSWorld is an open-source benchmark and environment for evaluating multimodal agents on open-ended tasks in real computer environments. Agents observe the screen and operate desktop and web applications across operating systems such as Ubuntu, Windows, and macOS, and each task is checked with execution-based evaluation.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 9
    Browser Use MCP Server

    Browse the web, directly from Cursor etc.

    A browser automation server implementing the Model Context Protocol, designed to allow AI assistants to browse the web directly from applications like Cursor. It supports natural language commands for web navigation and interaction.
    Downloads: 0 This Week
  • 10
    Zerox OCR

    PDF to Markdown with vision models

    A dead simple way of OCR-ing a document for AI ingestion. Documents are meant to be a visual representation after all, and with weird layouts, tables, and charts, vision models just make sense. Zerox converts each page of a document into an image and asks a vision model to return it as Markdown.
    Downloads: 0 This Week
  • 11
    Sa2VA

    Official repo for "Sa2VA: Marrying SAM2 with LLaVA..."

    Sa2VA is a cutting-edge open-source multi-modal large language model (MLLM) developed by ByteDance that unifies dense segmentation, visual understanding, and language-based reasoning across both images and videos. It merges the segmentation power of a state-of-the-art video segmentation model (based on SAM‑2) with the vision-language reasoning capabilities of a strong LLM backbone (derived from models like InternVL2.5 / Qwen-VL series), yielding a system that can answer questions about...
    Downloads: 0 This Week
  • 12
    InternVL

    A Pioneering Open-Source Alternative to GPT-4o

    InternVL is a large-scale multimodal foundation model designed to integrate computer vision and language understanding within a unified architecture. The project focuses on scaling vision models and aligning them with large language models so that they can perform tasks involving both visual and textual information. InternVL is trained on massive collections of image-text data, enabling it to learn representations that capture both visual patterns and semantic meaning. The model supports a...
    Downloads: 0 This Week
  • 13
    Pixal3D

    Pixel-Aligned 3D Generation from Images

    Pixal3D is a TencentARC research project for generating high-fidelity 3D assets from a single input image. It addresses a key weakness in image-to-3D generation: many models produce plausible 3D shapes but fail to preserve pixel-level faithfulness to the original image. Pixal3D improves this by explicitly lifting image features into 3D through back-projection, creating clearer correspondences between the input pixels and the generated asset. The system is designed to produce detailed...
    Downloads: 3 This Week
  • 14
    Phi-3-MLX

    Phi-3.5 for Mac: Locally-run Vision and Language Models

    Phi-3-Vision-MLX is an Apple MLX (machine learning on Apple silicon) implementation of Phi-3 Vision, a lightweight multi-modal model designed for vision and language tasks. It focuses on running vision-language AI efficiently on Apple hardware like M1 and M2 chips.
    Downloads: 2 This Week
  • 15
    PaperBanana

    Extension of Google Research’s PaperBanana

    PaperBanana is an open-source agentic framework designed to automatically generate publication-quality academic diagrams and statistical plots directly from text descriptions. The project focuses on helping researchers, educators, and data scientists transform conceptual descriptions of figures into structured visual outputs suitable for research papers, presentations, and technical reports. Instead of manually designing charts or diagrams using traditional visualization tools, users can...
    Downloads: 0 This Week
  • 16
    dnstwist

    Detects phishing and lookalike domains using DNS fuzzing techniques

    dnstwist is an open source cybersecurity tool designed to identify malicious or suspicious domain names that imitate legitimate websites. It works by generating a large set of domain name permutations based on a target domain and analyzing whether any of those variants are actively registered or used. These permutations simulate common techniques used in phishing attacks, typosquatting, and brand impersonation campaigns. Security teams can use the tool to discover potential threats where...
    Downloads: 0 This Week
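
The permutation step described above can be illustrated with a short, standard-library sketch. It generates a few common typo classes (omission, transposition, repetition, replacement) for a domain name. The function name `domain_permutations` is hypothetical, and this is an assumption-laden illustration of the general idea, not dnstwist's own algorithm or API.

```python
import string


def domain_permutations(domain: str) -> set[str]:
    """Return simple lookalike variants of `domain`.

    Hypothetical helper for illustration; not dnstwist's implementation.
    """
    name, dot, tld = domain.partition(".")
    variants: set[str] = set()

    # Omission: drop one character ("example" -> "exmple").
    for i in range(len(name)):
        variants.add(name[:i] + name[i + 1:])

    # Transposition: swap two adjacent characters ("example" -> "exmaple").
    for i in range(len(name) - 1):
        variants.add(name[:i] + name[i + 1] + name[i] + name[i + 2:])

    # Repetition: double one character ("example" -> "exaample").
    for i in range(len(name)):
        variants.add(name[:i] + name[i] + name[i:])

    # Replacement: substitute one character with another lowercase letter.
    for i in range(len(name)):
        for ch in string.ascii_lowercase:
            variants.add(name[:i] + ch + name[i + 1:])

    # Discard empty labels and the original name itself.
    variants.discard("")
    variants.discard(name)
    return {v + dot + tld for v in variants}


candidates = domain_permutations("example.com")
print(len(candidates), "candidate lookalike domains")
```

Checking which candidates are actually registered would then require a DNS lookup per candidate, which is left out here so the sketch runs without network access.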
  • 17
    AppAgent

    Multimodal Agents as Smartphone Users, an LLM-based multimodal agent

    AppAgent is an open-source multimodal agent framework designed to enable large language models to operate smartphone applications through natural interactions with graphical user interfaces. The system allows an AI agent to interpret visual information from the screen and translate natural language instructions into actions such as tapping, swiping, and navigating between application screens. Instead of requiring backend access to application APIs, the framework interacts with apps the same...
    Downloads: 0 This Week
  • 18
    Label Studio

    Label Studio is a multi-type data labeling and annotation tool

    The most flexible data annotation tool. Quickly installable. Build custom UIs or use pre-built labeling templates. Detect objects in images, with bounding boxes, polygons, circles, and keypoints supported. Partition an image into multiple segments. Use ML models to pre-label data and optimize the process. Label Studio is an open-source data labeling tool. It lets you label data types like audio, text, images, videos, and time series with a simple and straightforward UI and export to various model formats. It can...
    Downloads: 17 This Week
  • 19
    TimeTracker

    Open-source and free to self-host

    TimeTracker by DRYTRIX is a comprehensive self-hosted time tracking and project management application that helps individuals, teams, and small businesses take control of their productivity and billing. Built on a modern web stack (Python/Flask backend with a responsive frontend), TimeTracker enables users to track hours, manage projects and clients, visualize data with interactive charts, and even generate professional invoices directly from tracked time. It embraces privacy and flexibility...
    Downloads: 1 This Week
  • 20
    Perf Book

    The book "Performance Analysis and Tuning on Modern CPU"

    This project is a practical guide to performance analysis and tuning on modern CPUs, bridging microarchitecture details with hands-on profiling. It explains how caches, TLBs, prefetchers, branch predictors, and out-of-order execution influence real program speed, then connects those concepts to concrete optimization strategies. Readers learn how to design trustworthy benchmarks, avoid measurement traps (warmup, turbo, frequency scaling), and interpret hardware performance counters. The book...
    Downloads: 2 This Week
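
As a rough illustration of the benchmarking advice summarized above, the sketch below discards warmup runs and reports the median of repeated timings. It is a minimal Python example under stated assumptions, not code from the book; the `benchmark` helper and its parameters are hypothetical, and it does nothing about turbo or frequency scaling, which have to be controlled at the system level.

```python
import statistics
import time


def benchmark(func, *, warmup: int = 3, repeats: int = 10) -> dict:
    """Time `func` after warmup runs; return summary statistics in seconds.

    Hypothetical helper for illustration only.
    """
    # Warmup runs are discarded so caches and lazy initialization
    # do not distort the first measurements.
    for _ in range(warmup):
        func()

    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        samples.append(time.perf_counter() - start)

    # The median resists outliers caused by scheduler noise; the
    # minimum approximates the least-disturbed run.
    return {
        "min": min(samples),
        "median": statistics.median(samples),
        "max": max(samples),
        "runs": len(samples),
    }


result = benchmark(lambda: sum(range(100_000)))
print(f"median {result['median'] * 1e6:.1f} us over {result['runs']} runs")
```

Reporting several statistics rather than a single number makes it easier to spot noisy measurements before drawing conclusions.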
  • 21
    PyVista

    3D plotting and mesh analysis through a streamlined interface

    ...PyVista is a helper module for the Visualization Toolkit (VTK) that takes a different approach on interfacing with VTK through NumPy and direct array access. This package provides a Pythonic, well-documented interface exposing VTK’s powerful visualization backend to facilitate rapid prototyping, analysis, and visual integration of spatially referenced datasets. This module can be used for scientific plotting for presentations and research papers as well as a supporting module for other mesh-dependent Python modules. Easily integrate with NumPy and create a variety of geometries and plot them. You could use any geometry to create your glyphs, or even plot the points directly. ...
    Downloads: 2 This Week
  • 22
    DocFX

    Static site generator for .NET API documentation

    DocFX can produce documentation from source code (including C#, F#, Visual Basic, REST, JavaScript, Java, Python and TypeScript) as well as raw Markdown files. DocFX can run on Linux, macOS, and Windows. The generated static website can be deployed to any host such as GitHub Pages or Azure Websites with no additional configuration. DocFX provides a flexible way to customize templates and themes. DocFX makes it extremely easy to generate your developer hub with a landing page, API reference, and conceptual documentation, from a variety of sources. ...
    Downloads: 4 This Week
  • 23
    Context Engineering

    A frontier, first-principles handbook

    Context Engineering is a comprehensive, open-source project serving as a first-principles handbook for the emerging discipline of context design and optimization in AI. Moving beyond traditional prompt engineering, this repository defines and explores how to craft and provide complete context payloads — not just single prompts — to large language models so they can perform tasks more reliably and intelligently. It takes inspiration from thought leaders like Andrej Karpathy and bridges theory...
    Downloads: 4 This Week
  • 24
    zvt

    Modular quant framework

    For practical trading, a complex algorithm is fragile, a complex algorithm built on a complex facility is more fragile, and a complex algorithm built on a complex facility by a complex team is more fragile still. zvt wants to provide a simple facility for building a straightforward algorithm. Technologies come and technologies go, but market insight is forever. Your world is built by core concepts inside you, so it’s you. zvt world is built by core concepts inside the market, so it’s zvt....
    Downloads: 4 This Week
  • 25
    armory

    3D Engine with Blender Integration

    Armory is an open-source 3D engine focused on portability, minimal footprint and performance. The renderer is fully scriptable with deferred and forward paths supported out of the box. Armory provides a full Blender integration add-on, turning it into a complete game development tool. The result is a unified workflow from start to finish. Powered by Armory engine, ArmorPaint is a stand-alone software designed for physically-based texture painting. Drag & drop your 3D models and start...
    Downloads: 4 This Week