Showing 532 open source projects for "visual python"

  • 1
    LatentSync

    Taming Stable Diffusion for Lip Sync

    LatentSync is an open-source framework from ByteDance that produces high-quality lip-synchronization for video by using an audio-conditioned latent diffusion model, bypassing traditional intermediate motion representations. In effect, given a source video (with masked or reference frames) and an audio track, LatentSync directly generates frames whose lip motions and expressions align with the audio, producing convincing talking-head or animated lip-sync output. The system leverages a U-Net...
    Downloads: 1 This Week
  • 2
    ROMM

    A beautiful, powerful, self-hosted rom manager and player

    ROMM is a self-hosted ROM manager and player for organizing and playing a retro game collection through a web interface.
    Downloads: 2 This Week
  • 3
    PersonaLive

    Expressive Portrait Image Animation for Live Streaming

    PersonaLive is an open-source diffusion-based portrait animation framework focused on generating expressive, long-duration animated sequences in real time, primarily for live streaming or interactive applications. It leverages deep generative models that condition on a static reference image and a driving input (such as motion or expression cues) to produce a seamless animated portrait sequence that can run indefinitely without segmentation artifacts. The framework prioritizes low-latency...
    Downloads: 2 This Week
  • 4
    Web Dev for Beginners

    24 Lessons, 12 Weeks, Get Started as a Web Developer

    Web-Dev-For-Beginners is Microsoft’s open source, project-based curriculum for learning web development from scratch. Designed as a 12-week, 24-lesson course, it covers HTML, CSS, and JavaScript fundamentals through hands-on projects like terrariums, browser extensions, and space games. Each lesson includes a mix of pre-lecture quizzes, written content, assignments, challenges, and post-lecture quizzes to reinforce learning. The course also offers global accessibility with translations in...
    Downloads: 2 This Week
  • 5
    AutoCrop-Vertical

    Smart video converter using YOLOv8 and FFmpeg

    AutoCrop-Vertical is a Python-based video processing tool that automatically converts horizontal videos into vertical formats optimized for social media platforms. It uses computer vision techniques and AI models such as YOLOv8 to analyze each frame, detect subjects, and dynamically adjust cropping decisions. Instead of applying a static center crop, the system intelligently tracks people or key objects to preserve visual focus and composition.
    Downloads: 0 This Week
  • 6
    GLM-4.6V

    GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning

    GLM-4.6V represents the latest generation of the GLM-V family and marks a major step forward in multimodal AI by combining advanced vision-language understanding with native “tool-call” capabilities, long-context reasoning, and strong generalization across domains. Unlike many vision-language models that treat images and text separately or require intermediate conversions, GLM-4.6V allows inputs such as images, screenshots or document pages directly as part of its reasoning pipeline — and...
    Downloads: 0 This Week
  • 7
    sparkmagic

    Jupyter magics and kernels for working with remote Spark clusters

    ...Sparkmagic interacts with remote Spark clusters through a REST server. Features include automatic visualization of SQL queries in the PySpark, Spark, and SparkR kernels, with an easy visual interface for constructing visualizations interactively, no code required; the ability to capture the output of SQL queries as Pandas dataframes for use with other Python libraries (e.g. matplotlib); sending local files or dataframes to a remote cluster (e.g. sending a pretrained local ML model straight to the Spark cluster); and authentication to Livy via Basic Access authentication or Kerberos.
    Downloads: 0 This Week
  • 8
    Microsoft Azure CLI

    Azure command-line interface

    A great cloud needs great tools; we're excited to introduce Azure CLI, our next-generation multi-platform command-line experience for Azure. Take a test run now from Azure Cloud Shell! We support tab completion for groups, commands, and some parameters. You can use the --query parameter and the JMESPath query syntax to customize your output. With the Azure CLI Tools Visual Studio Code extension, you can create .azcli files and use these features. IntelliSense for commands and their...
    Downloads: 5 This Week
  • 9
    City Map Poster Generator

    Transform your favorite cities into beautiful, minimalist designs

    maptoposter is a code-driven poster generator that turns any city into a minimalist, print-style map artwork with consistent typography and themed color palettes. It is built around a simple command-line flow where you pass a city and country, and the tool fetches the relevant map geometry and renders it into a clean composition that looks like a design product rather than a raw GIS export. The repository includes a library of predefined themes that change the overall look (for example,...
    Downloads: 0 This Week
  • 10
    LangExtract

    A Python library for extracting structured information

    LangExtract is a Python library developed by Google that leverages large language models (LLMs) to extract structured information from unstructured text—such as clinical notes, research papers, or literary works—based on user-defined instructions. It is designed to transform free-form text into reliable, schema-constrained data while maintaining traceability back to the source material.
    Downloads: 0 This Week
  • 11
    Slither

    Static Analyzer for Solidity

    Slither is a Solidity static analysis framework written in Python 3. It runs a suite of vulnerability detectors, prints visual information about contract details, and provides an API to easily write custom analyses. Slither enables developers to find vulnerabilities, enhance their code comprehension, and quickly prototype custom analyses. Slither is the first open-source static analysis framework for Solidity.
    Downloads: 0 This Week
  • 12
    ViZDoom

    Doom-based AI research platform for reinforcement learning

    ViZDoom allows developing AI bots that play Doom using only the visual information (the screen buffer). It is primarily intended for research in machine visual learning, and deep reinforcement learning in particular. ViZDoom is based on ZDoom, the most popular modern source port of Doom. This means compatibility with a huge range of tools and resources that can be used to create custom scenarios, detailed documentation of the engine and tools, and support from the Doom community...
    Downloads: 1 This Week
  • 13
    Qwen3-Omni

    Qwen3-Omni is a natively end-to-end, omni-modal LLM

    Qwen3-Omni is a natively end-to-end multilingual omni-modal foundation model that processes text, images, audio, and video and delivers real-time streaming responses in text and natural speech. It uses a Thinker-Talker architecture with a Mixture-of-Experts (MoE) design, early text-first pretraining, and mixed multimodal training to support strong performance across all modalities without sacrificing text or image quality. The model supports 119 text languages, 19 speech input languages, and...
    Downloads: 2 This Week
  • 14
    Computer Vision in Action

    A computer vision closed-loop learning platform

    Computer Vision in Action is a practical, example-rich repository that demonstrates real-world applications of computer vision techniques and algorithms in Python, often using OpenCV, deep learning models, and related tooling. It serves as a hands-on companion for learners and engineers who want to understand not just the theory, but how computer vision is actually implemented for tasks like object detection, image classification, feature tracking, optical flow, and image segmentation. The repository includes structured code examples, scripts, and notebooks that cover pipeline construction, preprocessing, model inference, and visual output rendering, making it easy for newcomers or intermediate practitioners to adapt patterns to their own projects. ...
    Downloads: 2 This Week
  • 15
    AI-Codereview-Gitlab

    GitLab automatic code review tool based on large models

    AI-Codereview-Gitlab is an open-source automation tool that integrates large language models into the GitLab development workflow to perform automated code reviews. The system monitors GitLab repositories and analyzes commits or merge requests using AI models to identify potential issues, coding mistakes, and quality improvements before the code is merged. By leveraging multiple large language model providers—including OpenAI, DeepSeek, ZhipuAI, or local models through Ollama—the platform...
    Downloads: 10 This Week
  • 16
    Agent Sprite Forge

    Agent Skill for generating 2D sprite sheets and map, transparent PNG

    Agent Sprite Forge is an AI-powered asset generation toolkit designed to create 2D game sprites, transparent PNG frames, animated GIFs, and sprite sheets directly from text prompts. The project functions as an “agent skill” that can integrate with coding assistants and AI workflows to automate parts of the game asset creation pipeline. It focuses on generating production-friendly pixel art and animation assets that can be used in indie games, prototypes, and rapid iteration workflows. The...
    Downloads: 4 This Week
  • 17
    FireRed-Image-Edit

    General-purpose image editing model that delivers high-fidelity

    FireRed-Image-Edit is an open-source general-purpose image editing model and toolset designed to deliver high-fidelity, visually coherent edits across a wide range of editing tasks, from simple object modifications to complex enhancements like restoration and style preservation. It is built on a flexible text-to-image foundation model that has been extended with training paradigms including pretraining, supervised fine-tuning, and reinforcement learning to imbue the system with strong...
    Downloads: 4 This Week
  • 18
    VGGT-Ω

    [CVPR 2026 Oral] VGGT Omega

    VGGT-Omega is a Facebook Research computer vision project for feed-forward camera and depth reconstruction. It takes images as input and predicts camera parameters, depth maps, confidence values, and related scene tokens. The project is associated with 3D understanding workflows where models infer scene geometry without a traditional multi-stage reconstruction pipeline. It includes pretrained model variants with different resolutions and text-alignment capabilities, though checkpoint access...
    Downloads: 1 This Week
  • 19
    nunif

    Misc; latest version of waifu2x; 2D video to stereo 3D video

    nunif is a deep learning–based image processing framework focused on image upscaling, restoration, denoising, and enhancement tasks using neural network models. The project provides a collection of AI-powered utilities designed primarily for anime-style artwork, illustrations, and high-quality image restoration workflows. It includes command-line tools and graphical interfaces for applying trained neural models to improve image resolution and visual clarity while minimizing artifacts. nunif...
    Downloads: 1 This Week
  • 20
    ComfyUI-LivePortraitKJ

    ComfyUI nodes for LivePortrait

    The ComfyUI-LivePortraitKJ project is a ComfyUI extension focused on generating animated portraits from static images. It enables users to create lifelike facial animations by driving a portrait with motion data or reference inputs. The system uses advanced generative techniques to simulate realistic facial expressions and movements. It integrates into ComfyUI as a set of nodes, allowing users to combine it with other tools for complex animation workflows. The project is particularly useful...
    Downloads: 1 This Week
  • 21
    MLJAR Studio

    Python package for AutoML on Tabular Data with Feature Engineering

    We are working on a new approach to visual programming. We developed a desktop application called MLJAR Studio, a notebook-based development environment with interactive code recipes and a managed Python environment, all running locally on your machine. We are waiting for your feedback. mljar-supervised is an Automated Machine Learning Python package that works with tabular data.
    Downloads: 0 This Week
  • 22
    Luigi

    Python module that helps you build complex pipelines of batch jobs

    Luigi is a Python (3.6, 3.7, 3.8, 3.9 tested) package that helps you build complex pipelines of batch jobs. It handles dependency resolution, workflow management, visualization, failure handling, command-line integration, and much more. The purpose of Luigi is to address all the plumbing typically associated with long-running batch processes. You want to chain many tasks and automate them, and failures will happen. These tasks can be anything, but are typically long-running things like Hadoop...
    Downloads: 0 This Week
  • 23
    GPTImage2Skill

    GPT Image 2 prompt gallery, image prompt library, agentic skill

    GPTImage2Skill is a curated prompt gallery, agent skill, and command-line workflow for working with GPT Image 2 generation and editing. It provides reusable image prompts across creative, technical, academic, interface, design, photography, typography, gaming, anime, map, tattoo, and reference-editing use cases. The project is designed to help agents and users produce stronger visual outputs without starting from a blank prompt every time. Its gallery is organized into category files so an...
    Downloads: 2 This Week
  • 24
    SimpleHTR

    Handwritten Text Recognition (HTR) system implemented with TensorFlow

    SimpleHTR is an open-source implementation of a handwriting text recognition system based on deep learning techniques. The project focuses on converting images of handwritten text into machine-readable digital text using neural networks. The system uses a combination of convolutional neural networks and recurrent neural networks to extract visual features and model sequential character patterns in handwriting. It also employs connectionist temporal classification (CTC) to align predicted...
    Downloads: 2 This Week
  • 25
    Wan Move

    Motion-controllable Video Generation via Latent Trajectory Guidance

    Wan Move is an open-source research codebase for motion-controllable video generation that focuses on enabling fine-grained control of motion within generative video models. It is designed to guide the temporal evolution of visual content by leveraging latent trajectory guidance, allowing users to manipulate how objects move over time without modifying the underlying generative architecture. By representing motion information as dense point trajectories and integrating them into the latent...
    Downloads: 2 This Week