Search Results for "open source speech to text software" - Page 12

Showing 542 open source projects for "open source speech to text software"

  • 1
    q - Text as Data

    Run SQL directly on CSV or TSV files

    q is a command-line tool that runs SQL-like queries directly on CSV and TSV files (and any other tabular text files). q treats ordinary files as database tables and supports all SQL constructs, such as WHERE, GROUP BY, and JOINs. It detects column names and column types automatically and fully supports multiple encodings. Use -e data-encoding to set the input data encoding, -Q query-encoding to set the query encoding, and use -E...
    Downloads: 0 This Week
  • 2
    Big Sleep

    A simple command line tool for text to image generation

    A simple command-line tool for text-to-image generation, using OpenAI's CLIP and a BigGAN. Ryan Murdock has done it again, combining OpenAI's CLIP and the generator from a BigGAN! This repository wraps up his work so it is easily accessible to anyone who owns a GPU. You will be able to have the GAN dream up images using natural language with a one-line command in the terminal. A user-made notebook with bug fixes and added features, such as Google Drive integration, is also available. Images will be saved to...
    Downloads: 0 This Week
  • 3
    Mycroft

    Mycroft Core, the Mycroft Artificial Intelligence platform

    Mycroft is the world’s leading open source voice assistant. It is private by default and completely customizable. Our software runs on many platforms: on desktop, our reference hardware, a Raspberry Pi, or your own custom hardware. Our open-source, modular system can be ported to your device or environment, at any price point. Whether you make voice assistants, televisions, or microwaves, and whether you have a 5-room BnB or a 1000-room hotel, your customers will get access to all the...
    Downloads: 26 This Week
  • 4
    Flick App Development

    Have you ever just wanted to code an app more easily?

    Code an app more easily with Flick, a Python parser that simplifies writing console apps. Although it has only one type of snippet, it works best for making stories, text movies, whatever! Here's a line from the creator, mainly for a blog: "I'm going to make variable-based objects soon, because you can't really modify objects."
    Downloads: 0 This Week
  • 5
    VoiceFixer

    General Speech Restoration

    VoiceFixer is a machine-learning framework for “speech restoration”: given a degraded or distorted audio recording — with noise, clipping, low sampling rate, reverberation, or other artifacts — it attempts to recover high-fidelity, clean speech. The architecture works in two stages: first an analysis stage that tries to extract “clean” intermediate features from the noisy audio (e.g. removing noise, denoising, dereverberation, upsampling), and then a neural vocoder-based synthesis stage that...
    Downloads: 11 This Week
  • 6
    Mocking Bird

    Clone a voice in 5 seconds to generate arbitrary speech in real-time

    MockingBird is an open-source voice cloning and real-time speech generation toolkit that lets you clone a speaker’s voice from a short audio sample (reportedly as little as 5 seconds) and then synthesize arbitrary speech in that voice. It builds on deep-learning based TTS / voice-cloning technology (in the lineage of projects such as Real-Time-Voice-Cloning), but extends it with support for Mandarin Chinese and multiple Chinese speech datasets — broadening its applicability beyond English....
    Downloads: 0 This Week
  • 7
    Project Alice

    Main repository of Project Alice, contains main unit source code

    Project Alice is a smart home voice assistant that is completely modular and extensible. It was first built around Snips, so it runs entirely offline and never sends or shares your voice interactions with anyone; Project Alice guarantees your privacy in your home or wherever you use it. As an option, however, because Project Alice is built on top of Snips, it can be configured to use some online alternatives and fallbacks (for example, using Amazon or Google’s...
    Downloads: 0 This Week
  • 8
    SEO Tool

    SEO Tool: AI Autoblogger for ArtikelSchreiber.com and UNAIQUE.net

    SEO Tool: AI Autoblogger for https://www.artikelschreiber.com/ and https://www.unaique.net/. Function: create a mini site on your server based on a config file (https://linktr.ee/textgenerator). Use a shared hosting server. Change the "seo-marketing-tool.conf" config to fit your needs (e.g., to create a mini site about "Cars", change the AI Software API keywords to "car"). Start the tool with "python3 seo-marketing-tool.py". Your mini site will be created on your shared hosting server. HTML5...
    Downloads: 0 This Week
  • 9
    Parakeet

    PAddle PARAllel text-to-speech toolKIT

    PAddle PARAllel text-to-speech toolKIT (supporting Tacotron2, Transformer TTS, FastSpeech2/FastPitch, SpeedySpeech, WaveFlow and Parallel WaveGAN). Parakeet aims to provide a flexible, efficient and state-of-the-art text-to-speech toolkit for the open-source community. It is built on PaddlePaddle dynamic graph and includes many influential TTS models.
    Downloads: 6 This Week
  • 10
    Voided-Git AutoTyper

    A simple autotyper that can be used to send multiple messages

    A simple autotyper that can be used to send multiple messages at the same time or with a slight delay. Run it, enter the text you want typed repeatedly, enter the number of times you want the text repeated (or sent), and enter the delay (can be 0). Please use this software responsibly, as some applications can terminate your account for spamming messages (as per their individual ToS). No responsibility is held should your account be suspended, rate-limited or terminated. Most applications, however...
    Downloads: 0 This Week
  • 11
    TensorFlowTTS

    Real-Time State-of-the-art Speech Synthesis for Tensorflow 2

    TensorFlowTTS is a state-of-the-art, open-source speech synthesis library built on TensorFlow 2. It offers a variety of architectures for text-to-speech, including classic and modern models such as Tacotron‑2, FastSpeech / FastSpeech2, and neural vocoders like MelGAN and Multiband‑MelGAN. Because it’s based on TensorFlow 2, it can leverage optimizations such as fake-quantization aware training and pruning — which allow models to run faster than real time and to be deployable on mobile or embedded platforms. ...
    Downloads: 0 This Week
  • 12
    PersonGen

    A minor project in Python that uses the RandomUser API.

    A small Python program that uses the RandomUser API to generate random person data.
    Downloads: 0 This Week
  • 13
    VITS

    Conditional Variational Autoencoder with Adversarial Learning

    VITS is a foundational research implementation of “VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech,” a well-known neural TTS architecture. Unlike traditional two-stage systems that separately train an acoustic model and a vocoder, VITS trains an end-to-end model that maps text directly to waveform using a conditional variational autoencoder combined with normalizing flows and adversarial training. This architecture enables parallel generation...
    Downloads: 0 This Week
  • 14
    HistogramsApp

    Application that generates KDE-PDP plots from geochronological data

    HistogramsApp is a Python 3.6 application that generates KDE and PDP plots from geochronological data. HistogramsApp lets you interactively set plot parameters such as the bandwidth and the peak detection sensitivity. To cite the application, please refer to: 1) https://www.tandfonline.com/doi/abs/10.1080/00206814.2021.1954556?journalCode=tigr20 Rodriguez-Corcho, A. F., Rojas-Agramonte, Y., Barrera-Gonzalez, J. A., Marroquin-Gomez, M. P., Bonilla-Correa, S., Izquierdo-Camacho, D.,...
    Downloads: 0 This Week
  • 15
    Paperless-ng

    A supercharged version of paperless, scan, index and archive docs

    Paperless is a simple Django application running in two parts, a Consumer (the thing that does the indexing) and a Web server (the part that lets you search & download already-indexed documents). Paper is a nightmare. Environmental issues aside, there’s no excuse for it in the 21st century. It takes up space, collects dust, doesn’t support any form of a search feature, indexing is tedious, it’s heavy and prone to damage & loss. I wrote this to make “going paperless” easier. I do not have to...
    Downloads: 0 This Week
  • 16
    Kashgari

    Kashgari is a production-level NLP Transfer learning framework

    Kashgari is a simple and powerful NLP transfer learning framework that lets you build a state-of-the-art model in 5 minutes for named entity recognition (NER), part-of-speech tagging (PoS), and text classification tasks.
    Downloads: 0 This Week
  • 17
    Transformer TTS

    Implementation of a Transformer based neural network

    TransformerTTS is an implementation of a non-autoregressive Transformer-based neural network for text-to-speech, built with TensorFlow 2. It takes inspiration from architectures like FastSpeech, FastSpeech 2, FastPitch, and Transformer TTS, and extends them with its own aligner and forward models. The system separates alignment learning and acoustic modeling: an autoregressive Transformer is used as an aligner to extract phoneme-to-frame durations, while a non-autoregressive...
    Downloads: 0 This Week
  • 18
    DeepSpeech

    Open source embedded speech-to-text engine

    DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine that can run in real time on devices ranging from a Raspberry Pi 4 to high-power GPU servers. It uses a model trained by machine learning techniques based on Baidu's Deep Speech research paper. Project DeepSpeech uses Google's TensorFlow to make the implementation easier.
    Downloads: 3 This Week
  • 19
    Pythopad

    A free Python source code editor and Notepad replacement for Windows

    Downloads: 2 This Week
  • 20
    Glazier

    A tool for automating the installation of Windows OS

    Glazier is an automation framework developed by Google for deploying and managing Windows operating systems at scale. It streamlines the entire Windows imaging process by booting systems into the Windows Preinstallation Environment (WinPE), retrieving installation instructions from a web server, and automatically applying operating systems, software, and configurations. The tool is fully text-based and code-driven, with configurations written in YAML, allowing teams to leverage source...
    Downloads: 0 This Week
  • 21
    Footcontroller

    Control your Linux PC with a standard foot pedal

    This Python utility allows users on Linux to control their PC using a HID-compatible USB foot pedal. Note: footcontroller does not support the new VEC Infinity in-USB3 pedal, which is not fully Linux compatible. The foot pedal becomes an extra mouse or mini keyboard, but footcontroller also allows you to define multiple pedal sets, which you can activate at the click of a button. It uses xdotool to provide you with the ability to assign commands to each pedal. Many foot pedals come with...
    Downloads: 0 This Week
  • 22
    OpenNum

    OpenNum lets you distribute solvers with a nice graphical interface

    Programming a GUI is typically time-consuming and requires experience with graphics libraries. OpenNum lets you create a graphical interface adapted to your solvers by simply editing an XML configuration file. More specifically, OpenNum lets you collect a hierarchical dataset, call any executable file, and visualize scalar and vector fields, plot graphs, or show simple plain text files. It also has other useful utilities specifically designed for numerical...
    Downloads: 0 This Week
  • 23
    kamiFaka

    kamiFaka

    Applicable to all kinds of e-commerce: coupons, forum invitation codes, recharge cards, activation codes, registration codes, Tencent iQiyi points CDK, etc. It supports manual and automatic delivery and a tiered wholesale model similar to 1688. Stisla UI: the web interface is beautiful. The front end uses VUE3.0, with millisecond-level response. It has integrated Alipay face-to-face payment, WeChat official, Payjs, Hupijiao, YunGouOS, Yipay, Mugglepay, V visa-free and other more than a dozen payment...
    Downloads: 2 This Week
  • 24
    Big List of Naughty Strings

    List of strings which have a high probability of causing issues

    The Big List of Naughty Strings is a community-maintained catalog of “gotcha” inputs that commonly break software, from unusual Unicode to SQL and script injection payloads. It exists so developers and QA engineers can easily test edge cases that normal test data would miss, such as zero-width characters, right-to-left marks, emojis, foreign alphabets, and long or malformed strings. By throwing these strings at forms, APIs, databases, and UIs, teams can discover encoding bugs, sanitizer...
    Downloads: 0 This Week
  • 25
    PaddlePaddle models

    Pre-trained and Reproduced Deep Learning Models

    Pre-trained and reproduced deep learning models (the official "Flying Paddle" model library, including a variety of deep learning models from the academic frontier and models verified in industrial scenarios). Flying Paddle's industrial-level model library includes a large number of mainstream models that have been refined through long-term industrial practice and models that have won championships in international competitions; it provides many scenarios for semantic understanding, image classification,...
    Downloads: 2 This Week