Showing 471 open source projects for "visual"

  • 1
    Pix2Text

    Open-Source Python3 tool for recognizing layouts, tables, and math

    An Open-Source Python3 tool for recognizing layouts, tables, math formulas, and text in images, converting them into Markdown format. A free alternative to Mathpix, empowering seamless conversion of visual content into text-based representations. 80+ languages are supported. Pix2Text (P2T) aims to be a free and open-source Python alternative to Mathpix, and it can already accomplish Mathpix's core functionality. Pix2Text (P2T) can recognize layouts, tables, images, text, and mathematical formulas, and integrate all of these contents into Markdown format. ...
    Downloads: 14 This Week
  • 2
    Elyra

    Elyra extends JupyterLab with an AI-centric approach

    Elyra is a set of AI-centric extensions to JupyterLab Notebooks. The Elyra Getting Started Guide includes more details on these features. A version-specific summary of new features is located on the releases page.
    Downloads: 7 This Week
  • 3
    Microsoft Azure CLI

    Azure command-line interface

    ...We support tab completion for groups, commands, and some parameters. You can use the --query parameter and the JMESPath query syntax to customize your output. With the Azure CLI Tools Visual Studio Code extension, you can create .azcli files and use these features. IntelliSense for commands and their arguments. Snippets for commands, inserting required arguments automatically. Run the current command in the integrated terminal. Run the current command and show its output in a side-by-side editor. Show documentation on mouse hover. ...
    Downloads: 22 This Week
  • 4
    Writer Framework

    No-code in the front, Python in the back. An open-source framework

    Writer Framework is an open source platform designed to help developers build AI-powered applications by combining a visual interface builder with a Python-based backend architecture. It follows a hybrid approach where user interfaces are created using a drag-and-drop editor while business logic is implemented in Python, allowing teams to balance speed and flexibility without sacrificing control. The framework is particularly focused on AI use cases, enabling developers to integrate large language models, knowledge graphs, and custom machine learning workflows into user-facing applications. ...
    Downloads: 1 This Week
  • 5
    machine-learning-refined

    Master the fundamentals of machine learning, deep learning

    machine-learning-refined is an educational repository designed to help students and practitioners understand machine learning algorithms through intuitive explanations and interactive examples. The project accompanies a series of textbooks and teaching materials that focus on making machine learning concepts accessible through visual demonstrations and simple code implementations. Instead of presenting algorithms purely through mathematical derivations, the repository emphasizes geometric intuition, visualization, and step-by-step experimentation. It includes Jupyter notebooks and scripts that illustrate core machine learning topics such as regression, classification, optimization methods, and neural networks. ...
    Downloads: 1 This Week
  • 6
    ydata-profiling

    Create HTML profiling reports from pandas DataFrame objects

    ydata-profiling's primary goal is to provide a one-line Exploratory Data Analysis (EDA) experience that is consistent and fast. Like the handy pandas df.describe() function, ydata-profiling delivers an extended analysis of a DataFrame and lets the analysis be exported in formats such as HTML and JSON.
    Downloads: 12 This Week
  • 7
    Book1_Python-For-Beginners

    The Iris Book: Addition, Subtraction, Multiplication, and Division

    Book1_Python-For-Beginners is the introductory volume of the Visualize-ML series, designed to teach Python programming to newcomers with no prior coding experience. The repository emphasizes clarity and gradual skill building, starting from fundamental syntax and moving toward practical programming patterns. It integrates visual aids and annotated code examples to help learners understand not just how Python works but why certain patterns are used. The material is structured to support self-paced learning, making it suitable for students, career switchers, and hobbyists. Because the book is part of a larger data science pathway, it also prepares readers for later work in visualization and machine learning. ...
    Downloads: 0 This Week
  • 8
    Book3_Elements-of-Mathematics

    From Addition, Subtraction, Multiplication, and Division to ML

    Book3_Elements-of-Mathematics is an open learning resource in the Visualize-ML collection that introduces core mathematical foundations required for modern data science and AI. The repository presents topics such as algebra, calculus fundamentals, and mathematical reasoning using a highly visual and beginner-friendly approach. Its goal is to reduce the intimidation barrier often associated with formal mathematics by combining diagrams, structured explanations, and applied examples. The content is organized progressively so learners can build confidence before moving into more advanced quantitative subjects. It is particularly useful for self-taught developers and students transitioning into technical fields that require mathematical literacy. ...
    Downloads: 0 This Week
  • 9
    Label Studio

    Label Studio is a multi-type data labeling and annotation tool

    ...The frontend part of the Label Studio app lives in the frontend/ folder and is written in React JSX. Multi-user labeling with sign-up and login: when you create an annotation, it's tied to your account. Configurable label formats let you customize the visual interface to meet your specific labeling needs. Support for multiple data types including images, audio, text, HTML, time-series, and video.
    Downloads: 29 This Week
  • 10
    VGGSfM

    VGGSfM: Visual Geometry Grounded Deep Structure From Motion

    VGGSfM is an advanced structure-from-motion (SfM) framework jointly developed by Meta AI Research (GenAI) and the University of Oxford’s Visual Geometry Group (VGG). It reconstructs 3D geometry, dense depth, and camera poses directly from unordered or sequential images and videos. The system combines learned feature matching and geometric optimization to generate high-quality camera calibrations, sparse/dense point clouds, and depth maps in standard COLMAP format. Version 2.0 adds support for dynamic scene handling, dense point cloud export, video-based reconstruction (1000+ frames), and integration with Gaussian Splatting pipelines. ...
    Downloads: 2 This Week
  • 11
    Python Progressbar

    Progressbar 2 - A progress bar for Python 2 and Python 3

    A text progress bar is typically used to display the progress of a long-running operation, providing a visual cue that processing is underway. The progressbar is based on the old Python progressbar package that was published on the now-defunct Google Code. Since that project was completely abandoned by its developer and the developer did not respond to my email, I decided to fork the package. This package is still backward compatible with the original progressbar package so you can safely use it as a drop-in replacement for existing projects. ...
    Downloads: 11 This Week
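
To make the idea of a text progress bar concrete, here is a minimal standard-library sketch. It does not use the progressbar package or reproduce its API; the function name `render_bar` and its parameters are hypothetical, chosen only for illustration.

```python
import sys
import time


def render_bar(done: int, total: int, width: int = 20) -> str:
    """Return a one-line text bar, e.g. '[#####-----]  50%' (hypothetical helper)."""
    if total <= 0:
        raise ValueError("total must be positive")
    done = max(0, min(done, total))   # clamp progress into [0, total]
    filled = width * done // total    # number of '#' cells to draw
    percent = 100 * done // total     # integer percentage, rounded down
    return "[" + "#" * filled + "-" * (width - filled) + f"] {percent:3d}%"


if __name__ == "__main__":
    total = 50
    for step in range(total + 1):
        # '\r' moves the cursor back to the line start so the bar redraws in place.
        sys.stdout.write("\r" + render_bar(step, total))
        sys.stdout.flush()
        time.sleep(0.02)              # stand-in for real work
    sys.stdout.write("\n")
```

Redrawing on a single line with a carriage return is what gives the "visual cue that processing is underway" without flooding the terminal with output.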
  • 12
    Qwen-VL

    Chat & pretrained large vision language model

    Qwen-VL is Alibaba Cloud’s vision-language large model family, designed to integrate visual and linguistic modalities. It accepts image inputs (with optional bounding boxes) and text, and produces text (and sometimes bounding boxes) as output. The model variants (VL-Plus, VL-Max, etc.) have been upgraded for better visual reasoning, text recognition from images, fine-grained understanding, and support for high image resolutions / extreme aspect ratios.
    Downloads: 0 This Week
  • 13
    PS2 Cover

    PS2 Covers Collection

    ...Its scale and completeness make it one of the most comprehensive resources for retro gaming visuals. Overall, ps2-covers enhances the user experience of emulation by adding organized and accessible visual metadata.
    Downloads: 0 This Week
  • 14
    FramePack

    Let's make video diffusion practical

    FramePack explores compact representations for sequences of image frames, targeting tasks where many near-duplicate frames carry redundant information. The idea is to “pack” frames by detecting shared structure and storing differences efficiently, which can accelerate training or inference on video-like data. By reducing I/O and memory bandwidth, datasets become lighter to load while models still see the essential temporal variation. The repository demonstrates both packing and unpacking...
    Downloads: 13 This Week
  • 15
    Frontend Slides

    Create beautiful slides on the web using Claude's frontend skills

    Frontend Slides is a lightweight tool that enables users to create visually appealing, animation-rich web presentations without requiring knowledge of CSS or JavaScript by leveraging a guided, interactive workflow. It operates on a “show, don’t tell” philosophy, generating visual previews of styles so users can select their preferred design rather than describing it abstractly. The system produces fully self-contained HTML presentations with inline CSS and JavaScript, eliminating the need for external dependencies, build tools, or frameworks. It also supports converting existing PowerPoint files into web-based presentations while preserving content such as images, text, and structure. ...
    Downloads: 1 This Week
  • 16
    GLM-4.5V

    GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning

    GLM-4.5V is the preceding iteration in the GLM-V series that laid much of the groundwork for general multimodal reasoning and vision-language understanding. It embodies the design philosophy of mixing visual and textual modalities into a unified model capable of general-purpose reasoning, content understanding, and generation, while already supporting a wide variety of tasks: from image captioning and visual question answering to content recognition, GUI-based agents, video understanding, and long-document interpretation. GLM-4.5V emerged from a training framework that leverages scalable reinforcement learning (with curriculum sampling) to boost performance across tasks ranging from STEM problem solving to long-context reasoning, giving it broad applicability beyond narrow benchmarks. ...
    Downloads: 0 This Week
  • 17
    Sa2VA

    Official Repo For "Sa2VA: Marrying SAM2 with LLaVA

    Sa2VA is a cutting-edge open-source multi-modal large language model (MLLM) developed by ByteDance that unifies dense segmentation, visual understanding, and language-based reasoning across both images and videos. It merges the segmentation power of a state-of-the-art video segmentation model (based on SAM‑2) with the vision-language reasoning capabilities of a strong LLM backbone (derived from models like InternVL2.5 / Qwen-VL series), yielding a system that can answer questions about visual content, perform referring segmentation, and maintain temporal consistency across frames in video. ...
    Downloads: 0 This Week
  • 18
    LatentSync

    Taming Stable Diffusion for Lip Sync

    ...The system leverages a U-Net diffusion backbone, with cross-attention of audio embeddings (via an audio encoder) and reference video frames to guide generation, and applies a set of loss functions (temporal, perceptual, sync-net based) to enforce lip-sync accuracy, visual fidelity, and temporal consistency. Over versions, LatentSync has improved temporal stability and lowered resource requirements — making inference more practical (e.g. 8 GB VRAM for earlier versions, somewhat higher for latest models).
    Downloads: 2 This Week
  • 19
    ART ASCII Library

    ASCII art library for Python

    ASCII art is also known as "computer text art". It involves the smart placement of typed special characters or letters to make a visual shape that is spread over multiple lines of text. ART is a Python library for converting text to fancy ASCII art.
    Downloads: 2 This Week
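
The multi-line placement of characters described above can be sketched in a few lines of plain Python. This is not the ART library's API: the three-row `FONT` table and the `to_ascii_art` function are hypothetical and cover only the letters needed for the example.

```python
# Hypothetical three-row "font": each glyph is a list of equally wide rows.
FONT = {
    "H": ["# #", "###", "# #"],
    "I": ["###", " # ", "###"],
    " ": ["   ", "   ", "   "],
}


def to_ascii_art(text: str) -> str:
    """Render text as three lines of characters using the toy FONT above."""
    rows = ["", "", ""]
    for ch in text.upper():
        glyph = FONT.get(ch)
        if glyph is None:
            raise KeyError(f"no glyph defined for {ch!r}")
        for r in range(3):
            rows[r] += glyph[r] + " "   # one space between adjacent glyphs
    # Strip trailing spaces so each line ends at its last visible character.
    return "\n".join(row.rstrip() for row in rows)


if __name__ == "__main__":
    print(to_ascii_art("hi"))
```

Each character maps to a small grid, and the grids are joined row by row, which is why the resulting shape spans several lines of text.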
  • 20
    Qtile

    A full-featured, hackable tiling window manager written in Python

    A full-featured, hackable tiling window manager written and configured in Python. Optimize your workflow by configuring your environment to fit how you work. Efficiently use screen real-estate by automatically arranging windows with minimal visual cruft. Save your wrists from RSI by ditching the mouse and driving with the keyboard. Qtile is simple, small, and extensible. It's easy to write your own layouts, widgets, and built-in commands. Qtile is written and configured entirely in Python. Leverage the full power and flexibility of the language to make it fit your needs. The Qtile community is active and growing. ...
    Downloads: 0 This Week
  • 21
    Phi-3-MLX

    Phi-3.5 for Mac: Locally-run Vision and Language Models

    Phi-3-Vision-MLX is an Apple MLX (machine learning on Apple silicon) implementation of Phi-3 Vision, a lightweight multi-modal model designed for vision and language tasks. It focuses on running vision-language AI efficiently on Apple hardware like M1 and M2 chips.
    Downloads: 7 This Week
  • 22
    ROMM

    A beautiful, powerful, self-hosted rom manager and player

    ...The launcher includes a powerful universal search that combs through installed apps, contacts, messages, and web results to deliver quick answers without switching contexts. Romm also supports widgets, customization options, and theme choices so users can tailor the visual experience to their preferences while maintaining performance and responsiveness. Privacy is a highlight, with local indexing and search functions that operate without sending data to external servers unless explicitly permitted.
    Downloads: 8 This Week
  • 23
    GLM-4.6V

    GLM-4.6V/4.5V/4.1V-Thinking, towards versatile multimodal reasoning

    GLM-4.6V represents the latest generation of the GLM-V family and marks a major step forward in multimodal AI by combining advanced vision-language understanding with native “tool-call” capabilities, long-context reasoning, and strong generalization across domains. Unlike many vision-language models that treat images and text separately or require intermediate conversions, GLM-4.6V allows inputs such as images, screenshots or document pages directly as part of its reasoning pipeline — and...
    Downloads: 0 This Week
  • 24
    Droidrun

    Powerful framework for controlling Android and iOS devices

    Droidrun is a native mobile agent platform that gives users natural-language control over real Android devices to automate any mobile app workflow, from logins and bookings to purchases and data extraction, including access to mobile-only content behind app logins, rate limits, or platform restrictions. Its cloud offering lets users spin up agents in seconds with preinstalled apps, run tasks in parallel across multiple devices, and compose complex, multi-step conditional workflows using...
    Downloads: 15 This Week
  • 25
    sparkmagic

    Jupyter magics and kernels for working with remote Spark clusters

    ...Sparkmagic interacts with remote Spark clusters through a REST server. Automatic visualization of SQL queries in the PySpark, Spark and SparkR kernels; use an easy visual interface to interactively construct visualizations, no code required. Ability to capture the output of SQL queries as Pandas dataframes to interact with other Python libraries (e.g. matplotlib). Send local files or dataframes to a remote cluster (e.g. sending a pretrained local ML model straight to the Spark cluster). Authenticate to Livy via Basic Access authentication or via Kerberos.
    Downloads: 7 This Week