Search Results for "audio gui interface" - Page 3

Showing 504 open source projects for "audio gui interface"

  • 1
    Sherloq

    An open source digital image forensic toolset

    ...The project emphasizes transparency and community collaboration, contrasting with proprietary forensic tools that often rely on secrecy. Initially developed in C++ in 2015 and later transitioned to a Qt-based GUI in 2017, Sherloq has since been ported to Python with PySide2, Matplotlib, and OpenCV to improve accessibility and ease of development. Its interface allows users to inspect images with real-time zoom, metadata exploration, noise analysis, and specialized algorithms for detecting forgeries and manipulations.
    Downloads: 4 This Week
  • 2
    Toonily Downloader

    A python tool for downloading manga from Toonily

    Toonily Downloader is a Python-based scraping and downloading tool designed specifically for manga and manhwa hosted on Toonily, enabling users to fetch entire series efficiently while preserving original image quality and structure. It provides both a command-line interface and a graphical user interface, making it accessible for both technical and non-technical users. The software supports downloading full series or selected chapters by parsing Toonily URLs and organizing content into...
    Downloads: 4 This Week
  • 3
    Label Studio

    Label Studio is a multi-type data labeling and annotation tool

    ...Configurable label formats let you customize the visual interface to meet your specific labeling needs. Support for multiple data types including images, audio, text, HTML, time-series, and video.
    Downloads: 9 This Week
  • 4
    ChatTTS webUI & API

    A simple native web interface that uses ChatTTS to synthesize text

    ChatTTS-ui is a local web interface and API wrapper around the ChatTTS speech synthesis system, designed to make advanced TTS models easy to use from a browser. It runs a small backend server (Python + Torch + ffmpeg) and exposes a simple webpage where you can type text, adjust parameters, and generate audio. The project supports Chinese, English, and mixed text with digits and control symbols, making it suitable for bilingual content and numerically heavy text like announcements or prompts. ...
    Downloads: 14 This Week
  • 5
    MDCx

    Movie metadata scraper and organizer for media libraries and NFO

    MDCx is an open source media metadata scraping and organization tool designed to automate the process of collecting detailed information for movie files. It retrieves metadata from multiple online sources and applies it to local media collections, helping users maintain structured and well-organized libraries. MDCx can download information such as titles, cast data, artwork, and other metadata, then generate standardized NFO files compatible with media management systems. It also supports...
    Downloads: 5 This Week
  • 6
    labelme Image Polygonal Annotation

    Image polygonal annotation with Python

    Labelme is a graphical image annotation tool. It is written in Python and uses Qt for its graphical interface. Features include image annotation with polygons, rectangles, circles, lines, and points; image flag annotation for classification and cleaning; video annotation; GUI customization (predefined labels/flags, auto-saving, label validation, etc.); and export of VOC-format datasets for semantic and instance segmentation.
    Downloads: 12 This Week
  • 7
    StreamSpeech

    StreamSpeech is a seamless model for offline speech recognition

    StreamSpeech is an “all-in-one” speech model designed to perform offline and simultaneous speech recognition, speech translation, and speech synthesis within a single unified architecture. Developed as part of an ACL 2024 paper, it targets streaming and low-latency scenarios where intermediate results and final translations or synthetic speech must be produced continuously as audio is being received. The model supports eight tasks: offline ASR, speech-to-text translation, speech-to-speech...
    Downloads: 0 This Week
  • 8
    Apprise

    Apprise - Push Notifications that work with just about every platform!

    ...Once you've saved your configuration, you'll be able to use the Notification tab to send your messages to one or more of the services you defined in your configuration. You can use the tag all to notify all of your services regardless of what tag had otherwise been assigned to them. At the end of the day, the GUI simply offers a user-friendly interface to the same API that developers can interface with directly if they wish.
    Downloads: 1 This Week
  • 9
    Pixeltable

    Data Infrastructure providing an approach to multimodal AI workloads

    ...Developers define data transformations and AI operations using computed columns on tables, allowing pipelines to evolve incrementally as new data or models are added. The framework supports multimodal content including images, video, text, and audio, enabling applications such as retrieval-augmented generation systems, semantic search, and multimedia analytics.
    Downloads: 2 This Week
  • 10
    MediaCrate — Video/Audio Downloader

    Download video and audio from over 1,000 websites with one click

    MediaCrate is a lightweight desktop application for downloading video and audio from various websites, including YouTube, Instagram, TikTok, Facebook and many others. It's rather simple to use. Paste a link, select format and quality, and download. MediaCrate is designed with performance and simplicity in mind, maintaining minimal CPU usage while idle and a small memory footprint during downloads.
    Downloads: 3 This Week
  • 11
    bilibili-manga-downloader

    Download and manage Bilibili Manga chapters with GUI downloader

    BiliBili-Manga-Downloader is an open source desktop application designed to download manga chapters from the Bilibili Manga platform for offline reading and local management. It was created to address limitations of the web reading experience, such as intrusive advertisements, inconvenient image zooming, and inconsistent navigation during reading sessions. It provides a graphical user interface that allows users to search for manga titles using keywords, view detailed information about...
    Downloads: 2 This Week
  • 12
    h2oGPT

    Private chat with local GPT with document, images, video, etc.

    h2oGPT is an open-source platform that allows users to interact with local GPT models in a completely private environment. It supports a variety of document types, including PDFs, Word files, images, video frames, and even audio, enabling users to query and analyze their documents or engage in a private chat with AI. The platform is designed to be secure and offline, ensuring that all data remains private and under the user's control. h2oGPT supports several AI models, including oLLaMa and...
    Downloads: 0 This Week
  • 13
    Maestral

    Open-source Dropbox client for macOS and Linux

    ...It provides powerful command line tools, supports gitignore patterns to exclude local files from syncing, and allows syncing multiple Dropbox accounts. The CLI allows configuring an unlimited number of Dropbox accounts. Just pass a new config name when linking a new account. More fine-grained controls in the GUI and command line interface allow excluding individual files with selective sync. Maestral is not an official Dropbox App. It therefore does not count towards the three-device limit for Basic Dropbox accounts. Exclude local items from syncing by placing a .mignore file in the Dropbox root with patterns matching any number of items.
    Downloads: 0 This Week
  • 14
    npcpy

    The AI toolkit for the AI developer

    npcpy is a Python-based agent framework and command-line toolkit (the NPC Shell) for developers to build, test, and integrate AI agents into their workflows, including both command-line and GUI interfaces via NPC Studio. Welcome to npcpy, the core library of the NPC Toolkit that supercharges natural language processing pipelines and agent tooling. npcpy is a flexible framework for building state-of-the-art applications and conducting novel research with LLMs. The structure of npcpy also...
    Downloads: 4 This Week
  • 15
    buku

    Personal mini-web in text

    buku is a powerful bookmark manager and a personal textual mini-web. For those who prefer the GUI, bukuserver exposes a browsable front-end on a local web host server. When I started writing it, I couldn't find a flexible command-line solution with a private, portable, merge-able database along with seamless GUI integration. Hence, buku. buku can import bookmarks from the browser(s) or fetch the title, tags and description of a URL from the web. Use your favorite editor to add, compose and...
    Downloads: 0 This Week
  • 16
    OmAgent

    Build multimodal language agents for fast prototype and production

    OmAgent is an open-source Python framework designed to simplify the development of multimodal language agents that can reason, plan, and interact with different types of data sources. The framework provides abstractions and infrastructure for building AI agents that operate on text, images, video, and audio while maintaining a relatively simple interface for developers. Instead of forcing developers to implement complex orchestration logic manually, the system manages task scheduling, worker coordination, and node optimization behind the scenes. Its architecture uses a graph-based workflow engine where tasks are represented as nodes in a directed workflow, enabling modular composition of complex reasoning pipelines. ...
    Downloads: 0 This Week
  • 17
    UFO³

    Weaving the Digital Agent Galaxy

    UFO is an open-source framework developed by Microsoft for building intelligent agents that automate interactions with graphical user interfaces on the Windows operating system. The system allows users to issue natural language instructions that are translated into automated actions across multiple desktop applications. Using a dual-agent architecture, the framework analyzes both visual interface elements and system control structures in order to understand how applications should be...
    Downloads: 0 This Week
  • 18
    AnyTool

    AnyTool: Universal Tool-Use Layer for AI Agents

    AnyTool is an open-source universal tool-use layer for AI agents that addresses the critical problem of how autonomous agents reliably interact with external tools and environments. Rather than having each agent handle tool invocation logic on its own, AnyTool provides a standardized interface and orchestrator that intelligently selects and manages tools, reduces context overhead, and improves execution reliability across diverse capabilities like web APIs, local commands, and GUI automation. It uses progressive filtering and adaptive orchestration to ensure the right tools are retrieved efficiently and work cohesively with agents of varying complexity, scaling to thousands of tools with self-optimizing behavior. ...
    Downloads: 0 This Week
  • 19
    Matcha-TTS

    A fast TTS architecture with conditional flow matching

    Matcha-TTS is a non-autoregressive neural text-to-speech architecture that uses conditional flow matching to generate speech quickly while maintaining natural quality. It models speech as an ODE-based generative process, and conditional flow matching lets it reach high-quality audio in only a few synthesis steps, which greatly reduces latency compared to score-matching diffusion approaches. The model is fully probabilistic, so it can generate diverse realizations of the same text while still sounding stable and intelligible. The repository provides an end-to-end TTS pipeline: a PyTorch/Lightning training stack, configuration files, pre-trained checkpoints, a command-line interface, and a Gradio app for interactive testing. ...
    Downloads: 0 This Week
  • 20
    MuJoCo

    Multi-Joint dynamics with Contact. A general purpose physics simulator

    MuJoCo, developed and maintained by Google DeepMind, is a high-performance physics engine designed for simulating complex, articulated systems that interact through contact. It is widely used in research fields such as robotics, biomechanics, computer graphics, animation, and machine learning, where fast and accurate physics simulations are essential. The engine provides a robust C API optimized for real-time computation, making it suitable for scientific research and advanced simulation...
    Downloads: 6 This Week
  • 21
    WhatsApp MCP Server

    WhatsApp MCP server enabling AI access to chats and messaging

    ...It supports both sending and receiving messages, including various media types such as images, audio, videos, and documents. It integrates with AI applications like Claude through MCP, enabling conversational automation and contextual message retrieval.
    Downloads: 1 This Week
  • 22
    Agent S

    Agent S: an open agentic framework that uses computers like a human

    Agent S is an open-source agentic framework designed to enable autonomous computer use through an Agent-Computer Interface (ACI). Built to operate graphical user interfaces like a human, it allows AI agents to perceive screens, reason about tasks, and execute actions across macOS, Windows, and Linux systems. The latest version, Agent S3, surpasses human-level performance on the OSWorld benchmark, demonstrating state-of-the-art results in complex multi-step computer tasks. Agent S combines...
    Downloads: 7 This Week
  • 23
    IntelOwl

    Centralized platform for automated threat intelligence analysis

    IntelOwl is an open source platform designed to manage and enrich threat intelligence data at scale. It provides a centralized environment where security analysts can gather information about suspicious files and observables such as IP addresses, domains, URLs, or hashes using a single API request. The platform integrates numerous online intelligence sources and advanced malware analysis tools, enabling users to obtain comprehensive threat intelligence without manually querying multiple...
    Downloads: 1 This Week
  • 24
    sqlit

    A user friendly TUI for SQL databases

    sqlit is a keyboard-first terminal UI that lets you connect to, browse, and query SQL databases quickly without relying on heavyweight GUI clients. It positions itself as a “lazygit-style” experience for databases, aiming for fast startup, intuitive navigation, and developer-friendly workflows directly inside your terminal. The tool supports a wide range of database providers, so you can use one interface across local databases, remote servers, and cloud-hosted instances rather than juggling multiple clients. ...
    Downloads: 0 This Week
  • 25
    git-cola

    git-cola: The highly caffeinated Git GUI

    Git Cola is a sleek and powerful graphical user interface for Git. Git Cola is free software and written in Python (v2 + v3). Git Cola uses QtPy, so you can choose between PyQt6, PyQt5 and PySide2 by setting the QT_API environment variable to pyqt6, pyqt5 or pyside2 as desired. qtpy defaults to pyqt5 and falls back to pyqt6 and pyside2 if pyqt5 is not installed. Git Cola enables additional features when the following Python modules are installed. send2trash enables cross-platform "Send to...
    Downloads: 0 This Week