Showing 413 open source projects for "visual python"

  • 1
    Web Dev for Beginners

    About 24 Lessons, 12 Weeks, Get Started as a Web Developer

    Web-Dev-For-Beginners is Microsoft’s open source, project-based curriculum for learning web development from scratch. Designed as a 12-week, 24-lesson course, it covers HTML, CSS, and JavaScript fundamentals through hands-on projects like terrariums, browser extensions, and space games. Each lesson includes a mix of pre-lecture quizzes, written content, assignments, challenges, and post-lecture quizzes to reinforce learning. The course also offers global accessibility with translations in...
    Downloads: 2 This Week
  • 2
    AutoCrop-Vertical

    Smart video converter using YOLOv8 and FFmpeg

    AutoCrop-Vertical is a Python-based video processing tool that automatically converts horizontal videos into vertical formats optimized for social media platforms. It uses computer vision techniques and AI models such as YOLOv8 to analyze each frame, detect subjects, and dynamically adjust cropping decisions. Instead of applying a static center crop, the system intelligently tracks people or key objects to preserve visual focus and composition.
    Downloads: 0 This Week
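The difference between a static center crop and a subject-following crop can be sketched in a few lines. This is a minimal illustration, not AutoCrop-Vertical's actual code: the function name `vertical_crop_window`, the 9:16 target ratio, and the `(x1, y1, x2, y2)` box format are assumptions, and the detector (YOLOv8 in the project) is stood in for by a box passed in by the caller.

```python
def vertical_crop_window(frame_w, frame_h, subject_box=None, aspect=9 / 16):
    """Return (left, top, width, height) of a full-height vertical crop.

    subject_box is a hypothetical (x1, y1, x2, y2) detection in pixels.
    When no subject is given, fall back to a static center crop.
    """
    crop_h = frame_h
    crop_w = min(frame_w, int(frame_h * aspect))

    if subject_box is None:
        center_x = frame_w / 2
    else:
        x1, _, x2, _ = subject_box
        center_x = (x1 + x2) / 2

    # Center the window on the subject, then clamp it inside the frame.
    left = int(center_x - crop_w / 2)
    left = max(0, min(left, frame_w - crop_w))
    return left, 0, crop_w, crop_h


center_crop = vertical_crop_window(1920, 1080)
tracked_crop = vertical_crop_window(1920, 1080, subject_box=(1800, 200, 1900, 900))
```

A real pipeline would likely also smooth `left` across frames so the crop does not jitter; that step is omitted here.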
  • 3
    GLM-4.6V

    GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning

    GLM-4.6V represents the latest generation of the GLM-V family and marks a major step forward in multimodal AI by combining advanced vision-language understanding with native “tool-call” capabilities, long-context reasoning, and strong generalization across domains. Unlike many vision-language models that treat images and text separately or require intermediate conversions, GLM-4.6V allows inputs such as images, screenshots or document pages directly as part of its reasoning pipeline — and...
    Downloads: 0 This Week
  • 4
    sparkmagic

    Jupyter magics and kernels for working with remote Spark clusters

    ...Sparkmagic interacts with remote Spark clusters through a REST server. It automatically visualizes SQL queries in the PySpark, Spark, and SparkR kernels, with an easy visual interface for interactively constructing visualizations, no code required. It can capture the output of SQL queries as Pandas dataframes to interact with other Python libraries (e.g. matplotlib), send local files or dataframes to a remote cluster (e.g. sending a pretrained local ML model straight to the Spark cluster), and authenticate to Livy via Basic Access authentication or via Kerberos.
    Downloads: 0 This Week
  • 5
    City Map Poster Generator

    Transform your favorite cities into beautiful, minimalist designs

    maptoposter is a code-driven poster generator that turns any city into a minimalist, print-style map artwork with consistent typography and themed color palettes. It is built around a simple command-line flow where you pass a city and country, and the tool fetches the relevant map geometry and renders it into a clean composition that looks like a design product rather than a raw GIS export. The repository includes a library of predefined themes that change the overall look (for example,...
    Downloads: 0 This Week
  • 6
    LangExtract

    A Python library for extracting structured information

    LangExtract is a Python library developed by Google that leverages large language models (LLMs) to extract structured information from unstructured text—such as clinical notes, research papers, or literary works—based on user-defined instructions. It is designed to transform free-form text into reliable, schema-constrained data while maintaining traceability back to the source material.
    Downloads: 0 This Week
  • 7
    Slither

    Static Analyzer for Solidity

    Slither is a Solidity static analysis framework written in Python 3. It runs a suite of vulnerability detectors, prints visual information about contract details, and provides an API to easily write custom analyses. Slither enables developers to find vulnerabilities, enhance their code comprehension, and quickly prototype custom analyses. Slither is the first open-source static analysis framework for Solidity.
    Downloads: 0 This Week
  • 8
    ViZDoom

    Doom-based AI research platform for reinforcement learning

    ViZDoom allows developing AI bots that play Doom using only the visual information (the screen buffer). It is primarily intended for research in machine visual learning, and deep reinforcement learning in particular. ViZDoom is based on ZDOOM, the most popular modern source-port of DOOM. This means compatibility with a huge range of tools and resources that can be used to create custom scenarios, availability of detailed documentation of the engine and tools, and support from the Doom community....
    Downloads: 1 This Week
  • 9
    Qwen3-Omni

    Qwen3-omni is a natively end-to-end, omni-modal LLM

    Qwen3-Omni is a natively end-to-end multilingual omni-modal foundation model that processes text, images, audio, and video and delivers real-time streaming responses in text and natural speech. It uses a Thinker-Talker architecture with a Mixture-of-Experts (MoE) design, early text-first pretraining, and mixed multimodal training to support strong performance across all modalities without sacrificing text or image quality. The model supports 119 text languages, 19 speech input languages, and...
    Downloads: 2 This Week
  • 10
    AI-Codereview-Gitlab

    GitLab automatic code review tool based on large models

    AI-Codereview-Gitlab is an open-source automation tool that integrates large language models into the GitLab development workflow to perform automated code reviews. The system monitors GitLab repositories and analyzes commits or merge requests using AI models to identify potential issues, coding mistakes, and quality improvements before the code is merged. By leveraging multiple large language model providers—including OpenAI, DeepSeek, ZhipuAI, or local models through Ollama—the platform...
    Downloads: 10 This Week
  • 11
    Computer Vision in Action

    A computer vision closed-loop learning platform

    Computer Vision in Action is a practical, example-rich repository that demonstrates real-world applications of computer vision techniques and algorithms in Python, often using OpenCV, deep learning models, and related tooling. It serves as a hands-on companion for learners and engineers who want to understand not just the theory, but how computer vision is actually implemented for tasks like object detection, image classification, feature tracking, optical flow, and image segmentation. The repository includes structured code examples, scripts, and notebooks that cover pipeline construction, preprocessing, model inference, and visual output rendering, making it easy for newcomers or intermediate practitioners to adapt patterns to their own projects. ...
    Downloads: 2 This Week
  • 12
    Agent Sprite Forge

    Agent Skill for generating 2D sprite sheets and map, transparent PNG

    Agent Sprite Forge is an AI-powered asset generation toolkit designed to create 2D game sprites, transparent PNG frames, animated GIFs, and sprite sheets directly from text prompts. The project functions as an “agent skill” that can integrate with coding assistants and AI workflows to automate parts of the game asset creation pipeline. It focuses on generating production-friendly pixel art and animation assets that can be used in indie games, prototypes, and rapid iteration workflows. The...
    Downloads: 4 This Week
  • 13
    FireRed-Image-Edit

    General-purpose image editing model that delivers high-fidelity

    FireRed-Image-Edit is an open-source general-purpose image editing model and toolset designed to deliver high-fidelity, visually coherent edits across a wide range of editing tasks, from simple object modifications to complex enhancements like restoration and style preservation. It is built on a flexible text-to-image foundation model that has been extended with training paradigms including pretraining, supervised fine-tuning, and reinforcement learning to imbue the system with strong...
    Downloads: 4 This Week
  • 14
    VGGT-Ω

    [CVPR 2026 Oral] VGGT Omega

    VGGT-Omega is a Facebook Research computer vision project for feed-forward camera and depth reconstruction. It takes images as input and predicts camera parameters, depth maps, confidence values, and related scene tokens. The project is associated with 3D understanding workflows where models infer scene geometry without a traditional multi-stage reconstruction pipeline. It includes pretrained model variants with different resolutions and text-alignment capabilities, though checkpoint access...
    Downloads: 1 This Week
  • 15
    MLJAR Studio

    Python package for AutoML on Tabular Data with Feature Engineering

    We are working on a new approach to visual programming. We developed a desktop application called MLJAR Studio: a notebook-based development environment with interactive code recipes and a managed Python environment, all running locally on your machine. We are waiting for your feedback. mljar-supervised is an Automated Machine Learning Python package that works with tabular data.
    Downloads: 0 This Week
  • 16
    Luigi

    Python module that helps you build complex pipelines of batch jobs

    Luigi is a Python (3.6, 3.7, 3.8, 3.9 tested) package that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization, failure handling, command line integration, and much more. The purpose of Luigi is to address all the plumbing typically associated with long-running batch processes. You want to chain many tasks and automate them, and failures will happen. These tasks can be anything, but are typically long-running things like Hadoop...
    Downloads: 0 This Week
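The idea of dependency resolution in a batch pipeline can be illustrated with the standard library alone. The sketch below does not use Luigi's API; `run_pipeline`, the task names, and the dictionary layout are hypothetical, and the ordering comes from `graphlib.TopologicalSorter` (Python 3.9+) rather than from Luigi's scheduler.

```python
from graphlib import TopologicalSorter


def run_pipeline(dependencies, actions):
    """Run every task after the tasks it depends on; return the run order.

    dependencies maps a task name to the set of task names it requires.
    actions maps a task name to a zero-argument callable.
    """
    order = []
    for task in TopologicalSorter(dependencies).static_order():
        actions[task]()
        order.append(task)
    return order


log = []
dependencies = {
    "report": {"aggregate"},
    "aggregate": {"extract_a", "extract_b"},
    "extract_a": set(),
    "extract_b": set(),
}
# Each hypothetical task just records that it ran.
actions = {name: (lambda name=name: log.append(name)) for name in dependencies}
order = run_pipeline(dependencies, actions)
```

Here `report` always runs last, and both extract tasks run before `aggregate`; a cycle in `dependencies` would raise `graphlib.CycleError`.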
  • 17
    GPTImage2Skill

    GPT Image 2 prompt gallery, image prompt library, agentic skill

    GPTImage2Skill is a curated prompt gallery, agent skill, and command-line workflow for working with GPT Image 2 generation and editing. It provides reusable image prompts across creative, technical, academic, interface, design, photography, typography, gaming, anime, map, tattoo, and reference-editing use cases. The project is designed to help agents and users produce stronger visual outputs without starting from a blank prompt every time. Its gallery is organized into category files so an...
    Downloads: 2 This Week
  • 18
    SimpleHTR

    Handwritten Text Recognition (HTR) system implemented with TensorFlow

    SimpleHTR is an open-source implementation of a handwriting text recognition system based on deep learning techniques. The project focuses on converting images of handwritten text into machine-readable digital text using neural networks. The system uses a combination of convolutional neural networks and recurrent neural networks to extract visual features and model sequential character patterns in handwriting. It also employs connectionist temporal classification (CTC) to align predicted...
    Downloads: 2 This Week
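To make the CTC step more concrete, here is a minimal sketch of best-path (greedy) CTC decoding in plain Python. It is not SimpleHTR's code: the function name `ctc_greedy_decode` is hypothetical, and placing the blank label as the last class is an assumed convention.

```python
def ctc_greedy_decode(frame_scores, alphabet, blank_index=None):
    """Best-path decoding: argmax per time step, merge repeats, drop blanks.

    frame_scores is a list of per-time-step score lists, one score per class.
    """
    if blank_index is None:
        blank_index = len(alphabet)  # assumption: blank is the last class

    decoded = []
    previous = blank_index
    for scores in frame_scores:
        best = max(range(len(scores)), key=scores.__getitem__)
        # Emit a character only when the label changes and is not the blank.
        if best != previous and best != blank_index:
            decoded.append(alphabet[best])
        previous = best
    return "".join(decoded)


def one_hot(index, size=3):
    """Toy score vector that puts all weight on one class."""
    return [1.0 if i == index else 0.0 for i in range(size)]


# Per-frame labels: a a <blank> a b b <blank>
frames = [one_hot(i) for i in (0, 0, 2, 0, 1, 1, 2)]
text = ctc_greedy_decode(frames, alphabet="ab")
```

The blank between the two runs of `a` is what lets a doubled letter survive the repeat-merging step.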
  • 19
    Wan Move

    Motion-controllable Video Generation via Latent Trajectory Guidance

    Wan Move is an open-source research codebase for motion-controllable video generation that focuses on enabling fine-grained control of motion within generative video models. It is designed to guide the temporal evolution of visual content by leveraging latent trajectory guidance, allowing users to manipulate how objects move over time without modifying the underlying generative architecture. By representing motion information as dense point trajectories and integrating them into the latent...
    Downloads: 2 This Week
  • 20
    Open-AutoGLM

    An open phone agent model & framework

    Open-AutoGLM is an open-source framework and model designed to empower autonomous mobile intelligent assistants by enabling AI agents to understand and interact with phone screens in a multimodal manner, blending vision and language capability to control real devices. It aims to create an “AI phone agent” that can perceive on-screen content, reason about user goals, and execute sequences of taps, swipes, and text input via automated device control interfaces like ADB, enabling hands-off...
    Downloads: 5 This Week
  • 21
    Better Chatbot

    Just a Better Chatbot. Powered by MCP Client & Workflows

    Better-chatbot is an AI chatbot framework powered by MCP and workflows, allowing developers to deploy and integrate AI-powered chat systems with ease. It integrates all major LLM providers: OpenAI, Anthropic, Google, xAI, Ollama, and more. Features include MCP support, web search, JS/Python code execution, data visualization, custom agents, visual workflows, artifact generation, and realtime voice chat with full MCP tool integration.
    Downloads: 0 This Week
  • 22
    Digital Visual Computer (DVC)

    The Digital Visual Computer (DVC) is an experimental computing platform where programs and data are represented visually as images. This project contains the specifications, tools, and examples for the DVC ecosystem. DVC explores the concept of screen-to-screen computation. Instead of text-based code, DVC uses a "color language" where sequences of colors represent instructions (opcodes) and data. The state of the computer's memory is...
    Downloads: 0 This Week
  • 23
    RAG Anything

    RAG-Anything: All-in-One RAG Framework

    RAG-Anything is an open-source unified framework that extends the Retrieval-Augmented Generation (RAG) paradigm to fully multimodal document and knowledge retrieval, enabling systems to ingest, parse, represent, and query rich content that includes text, images, tables, formulas, and other structured or visual elements. Traditional RAG systems are typically limited to text and cannot effectively work across heterogeneous document layouts, but RAG-Anything addresses this by modeling...
    Downloads: 1 This Week
  • 24
    Tally

    Let agents classify your bank transactions

    Tally is an open-source, AI-assisted tool designed to automate the classification of personal financial transactions, helping users turn raw bank data into meaningful categories without manual tagging. At its core, Tally pairs a local rule engine with large language models so that an AI assistant (like Claude Code, Copilot, or any CLI agent) interprets, suggests, and categorizes expenses, savings, subscriptions, and income events based on your own rules and behavior. It generates...
    Downloads: 1 This Week
  • 25
    MaxKB

    Open-source platform for building enterprise-grade agents

    MaxKB (Max Knowledge Brain) is an open-source platform for building enterprise-grade AI agents with strong knowledge retrieval, RAG pipelines, and workflow orchestration. It focuses on practical deployments such as customer support, internal knowledge bases, research assistants, and education, bundling tools for data ingestion, chunking, embedding, retrieval, and answer synthesis. The system exposes flexible tool-use (including MCP), supports multi-model backends, and provides dashboards for...
    Downloads: 2 This Week