Showing 15 open source projects for "dvd-audio"

  • 1
    SpeechRecognition

    Speech recognition module for Python

    Library for performing speech recognition, with support for several engines and APIs, online and offline. It can recognize speech input from the microphone, transcribe an audio file, and save audio data to an audio file. It can also show extended recognition results, calibrate the recognizer energy threshold for ambient noise levels (see recognizer_instance.energy_threshold for details), and listen to a microphone in the background, among other recognizer features. The easiest way to install it is with pip install SpeechRecognition. ...
    Downloads: 6 This Week
  • 2
    Google AI Edge Gallery

    A gallery that showcases on-device ML/GenAI use cases

    ...The project bundles runnable samples that show how to run TensorFlow Lite/Edge TPU models (and similar lightweight runtimes) on mobile and embedded platforms, demonstrating common tasks like image classification, object detection, audio recognition, and pose estimation. Each sample is intended to be both a learning aid and a practical starting point: code is organized to show model loading, pre/post-processing, performance measurement, and common optimization knobs (quantization, NNAPI/Delegate usage, and hardware accelerators). The repo also collects small, well-documented models and conversion scripts so developers can reproduce a pipeline from a full-size model down to a device-friendly artifact.
    Downloads: 142 This Week
  • 3
    Omi

    AI that sees your screen and listens to conversations

    ...The platform operates across multiple environments, including wearable devices, mobile apps, and desktop applications, ensuring seamless integration into a user’s daily workflow. At its core, omi uses a pipeline of speech-to-text systems, large language models, and memory storage services to transform raw audio and context into meaningful outputs like tasks and reminders. The architecture is modular and extensible, featuring APIs, SDKs, and plugin-like capabilities that allow developers to build custom applications.
    Downloads: 27 This Week
  • 4
    Kitten TTS

    State-of-the-art TTS model under 25MB

    KittenTTS is an open-source, ultra-lightweight, high-quality text-to-speech model with just 15 million parameters and a binary size under 25 MB. It is designed for real-time CPU-based deployment across diverse platforms. Features: CPU-optimized, running without a GPU on any device; several premium high-quality voice options; fast inference optimized for real-time speech synthesis.
    Downloads: 13 This Week
  • 5
    ChatGPT Telegram Bot

    A Telegram bot that integrates with OpenAI's official ChatGPT APIs

    A Telegram bot that integrates with OpenAI's official ChatGPT, DALL·E and Whisper APIs to provide answers. Ready to use with minimal configuration required.
    Downloads: 0 This Week
  • 6
    Amical

    Open Source AI Dictation App

    Amical is an open source, AI-powered desktop dictation and note-taking application that enables users to dictate hands-free, transcribe meetings, and capture notes effortlessly with unmatched speed, accuracy, and privacy. It leverages both local and cloud-based AI models, letting users seamlessly switch between providers for the ideal balance of speed, precision, and control, and understands the context of each app in use to automatically format text in a tone and style appropriate to the...
    Downloads: 1 This Week
  • 7
    MiniCPM-o

    A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming

    ...With 8 billion parameters, MiniCPM-o 2.6 surpasses its predecessors in versatility and efficiency, making it one of the most robust models available. It supports both text and audio inputs to generate outputs in various forms, including voice cloning, emotion control, and interactive role-playing.
    Downloads: 0 This Week
  • 8
    Qwen Chat

    An AI assistant for everyone, powered by the Qwen series models

    ...It can analyze and summarize large documents, extracting key insights and visualizing data for better decision-making. With multimodal understanding, Qwen Chat processes audio, images, and videos seamlessly within a single conversation. Users can also generate images, videos, and code, including real-time HTML and SVG visualizations. Available across web, mobile, and desktop, Qwen Chat offers a powerful, all-in-one AI experience for diverse user needs.
    Downloads: 14 This Week
  • 9
    Folo

    Folo — AI-powered RSS reader for deep noise-free reading

    This AI RSS reader reads the internet for you, cutting through noise to surface the knowledge you actually care about.
    Downloads: 1 This Week
  • 10
    Ainee

    Ainee - AI Notetaking and Learning Companion

    Ainee is your ultimate AI-powered notetaking and learning companion. Capture lecture notes in real-time and effortlessly transform audio, text, files, and YouTube videos into formatted notes, mindmaps, quizzes, flashcards, podcasts, and more. Explore our AI meeting note taker, AI notes, video transcript generator, PDF to AI converter, and AI flashcard maker. Enhance your learning with our AI voice recorder, article summarizer AI, and AI quiz generator. Additionally, share your knowledge base with others to foster the flow of information and help new users benefit from collective insights. ...
    Downloads: 0 This Week
  • 11

    FastoCloud PRO

    IPTV/NVR/CCTV/Video cloud https://fastocloud.com

    IPTV/Video cloud. Features: cross-platform (Linux, MacOSX, FreeBSD, Raspbian/Armbian); GPU/CPU encode/decode/post-processing; stream statistics; CCTV; adaptive HLS streams; load balancing; temporary URLs; HLS push; EPG scanning; subtitles-to-text conversion; ad insertion; logo overlay; video effects; relays; timeshifts; catchups; playlists; restream/transcode from online streaming services like YouTube, Twitch ...
    Downloads: 2 This Week
  • 12
    NASH OS

    Nash Operating System for Modern Ecommerce

    The all-in-one, automatic, ready-to-go, out-of-the-box, easy-to-use, state-of-the-art NASH OS. More than 25,000 flexible, scalable features and controls. The most powerful solution ever built to instantly deliver new heights of online ecommerce enterprise to you.
    Downloads: 3 This Week
  • 13
    JuliusModels

    Open source speech models for Julius in English and other languages.

    Open source speech models for the Julius speech decoder. The aim is to give a wider community of speech recognition enthusiasts access to quality models that they can use in their own projects on different OS platforms (Unix, Windows, etc.). All of the models are based on HTK modelling software and data sets freely available on the Internet.
    Downloads: 3 This Week
  • 14

    tts-Bridge

    Scanned text to audio translation for written knowledge dissemination

    The objective of ttsBridge is to build open source applications that directly convert text in one language, scanned by a camera, to audio speech in another language, using generic and freely available text-to-speech engines, without needing a TTS voice built specifically for the target language. In other words, a word scanned in French or English can be spoken in any African language, using only tools that are already available on a generic mobile device.
    Downloads: 2 This Week
  • 15
    Scalable Language API (SLAPI)

    The most comprehensive architecture for conversational natural-language applications, including speech recognition/synthesis, semantics, and machine translation. Used on Android and other mobile app platforms.
    Downloads: 0 This Week