Search Results for "speech to text in java" - Page 3

Showing 146 open source projects for "speech to text in java"

View related business solutions
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • Build Securely on Azure with Proven Frameworks Icon
    Build Securely on Azure with Proven Frameworks

    Lay a foundation for success with Tested Reference Architectures developed by Fortinet’s experts. Learn more in this white paper.

    Moving to the cloud brings new challenges. How can you manage a larger attack surface while ensuring great network performance? Turn to Fortinet’s Tested Reference Architectures, blueprints for designing and securing cloud environments built by cybersecurity experts. Learn more and explore use cases in this white paper.
    Download Now
  • 1
    eCxx

    eCxx

    A C++ library for AVR and NodeMCU

    NOTE: This project is marked with 'Status: Abandoned' on SourceForge because not enough time can be dedicated to this project. However it may still get sporadic commits to the repository. eCxx is a library for AVR and NodeMCU tailored for micro LED displays and lighting effects. eCxx is utilizing Makefile build system. Java and Python based applications/tools are also included to ease the development and debugging process using the host PC. On one side, eCxx supports the original...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 2
    ASRT Speech Recognition

    ASRT Speech Recognition

    A Deep-Learning-Based Chinese Speech Recognition System

    ASRT is an end-to-end deep-learning Chinese ASR system built with TensorFlow/Keras, using convolution + CTC and a Max-Entropy HMM language model. It provides a REST/gRPC server backend and client SDKs in multiple languages (Python, Java, Go, Windows). Notably lightweight, it performs well without needing GPU acceleration and runs across platforms, targeting developers and researchers building Chinese voice interfaces.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 3
    Txt-2-Mp3  6.3 Mark 2 [I.S.A]

    Txt-2-Mp3 6.3 Mark 2 [I.S.A]

    Txt-2-Mp3 6.3 Mark 2 [Improved.Simplified.Alternative]

    'Txt2Mp3' an desktop application developed using python 3.6.8 and other add-on libaries. Can convert texts into audio (.mp3) files using gTTS (Google Text-to-speech) api module library. Compatible only for windows OS.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 4
    Repeat

    Repeat

    Mouse/keyboard record/replay and automation hotkeys/macros creation

    Full-fledged mouse/keyboard record/replay and automation hotkeys/macros creation using modern programming languages, and more advanced automation features. Working across three major OSes: Windows, OSX, and Linux. See more at https://github.com/repeats/Repeat Repeat yourself with some intelligence. This, if used correctly, can improve your productivity greatly.
    Leader badge
    Downloads: 66 This Week
    Last Update:
    See Project
  • MongoDB Atlas | Run databases anywhere Icon
    MongoDB Atlas | Run databases anywhere

    Ensure the availability of your data with coverage across AWS, Azure, and GCP on MongoDB Atlas—the multi-cloud database for every enterprise.

    MongoDB Atlas allows you to build and run modern applications across 125+ cloud regions, spanning AWS, Azure, and Google Cloud. Its multi-cloud clusters enable seamless data distribution and automated failover between cloud providers, ensuring high availability and flexibility without added complexity.
    Learn More
  • 5
    VoiceSmith

    VoiceSmith

    [WIP] VoiceSmith makes training text to speech models easy

    VoiceSmith makes it possible to train and infer on both single and multispeaker models without any coding experience. It fine-tunes a pretty solid text to speech pipeline based on a modified version of DelightfulTTS and UnivNet on your dataset. Both models were pretrained on a proprietary 5000 speaker dataset. It also provides some tools for dataset preprocessing like automatic text normalization. Windows (only CPU supported currently) or any Linux based operating system. If you want to run...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 6
    Tensorflow Transformers

    Tensorflow Transformers

    State of the art faster Transformer with Tensorflow 2.0

    ... speech recognition and audio classification. Faster AutoReggressive Decoding, TFlite support, creating TFRecords is simple. Auto-Batching tf.data.dataset or tf.ragged tensors. Everything is dictionary (inputs and outputs) Multiple mask modes like causal, user-defined, prefix. tensorflow-text tokenizer support. Supports GPU, TPU, multi-GPU trainer with wandb, multiple callbacks, auto tensorboard.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    AugLy

    AugLy

    A data augmentations library for audio, image, text, and video

    ... the robustness gaps of your model! We designed AugLy to include many specific data augmentations that users perform in real life on internet platforms like Facebook's -- for example making an image into a meme, overlaying text/emojis on images/videos, reposting a screenshot from social media. While AugLy contains more generic data augmentations as well, it will be particularly useful to you if you're working on a problem like copy detection, hate speech detection, etc.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 8
    Mycroft

    Mycroft

    Mycroft Core, the Mycroft Artificial Intelligence platform

    Mycroft is the world’s leading open source voice assistant. It is private by default and completely customizable. Our software runs on many platforms, on desktop, our reference hardware, a Raspberry Pi, or your own custom hardware. Our open-source, modular system can be ported to your device or environment, at any price point. Whether you make voice-assistants, televisions, or microwaves. Whether you have a 5-room BnB or a 1000-room hotel. Your customers will get access to all the...
    Downloads: 83 This Week
    Last Update:
    See Project
  • 9
    Project Alice

    Project Alice

    Main repository of Project Alice, contains main unit source code

    ... Text to Speech engines), just like Snips. Since Snips (and the Project Alice team) strongly believe that decisions about your privacy should be made by you and you alone, these options are all disabled by default. The original code base was started at the end 2015 and several rewrites made it what it is today. It was entirely written by me Psycho until recently, where I decided to make the code openly available to the world.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Picsart Enterprise Background Removal API for Stunning eCommerce Visuals Icon
    Picsart Enterprise Background Removal API for Stunning eCommerce Visuals

    Instantly remove the background from your images in just one click.

    With our Remove Background API tool, you can access the transformative capabilities of automation , which will allow you to turn any photo asset into compelling product imagery. With elevated visuals quality on your digital platforms, you can captivate your audience, and therefore achieve higher engagement and sales.
    Learn More
  • 10
    Parakeet

    Parakeet

    PAddle PARAllel text-to-speech toolKIT

    PAddle PARAllel text-to-speech toolKIT (supporting Tacotron2, Transformer TTS, FastSpeech2/FastPitch, SpeedySpeech, WaveFlow and Parallel WaveGAN) Parakeet aims to provide a flexible, efficient and state-of-the-art text-to-speech toolkit for the open-source community. It is built on PaddlePaddle dynamic graph and includes many influential TTS models. In order to facilitate exploiting the existing TTS models directly and developing the new ones, Parakeet selects typical models and provides...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 11
    Kashgari

    Kashgari

    Kashgari is a production-level NLP Transfer learning framework

    Kashgari is a simple and powerful NLP Transfer learning framework, build a state-of-art model in 5 minutes for named entity recognition (NER), part-of-speech tagging (PoS), and text classification tasks.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    DeepSpeech

    DeepSpeech

    Open source embedded speech-to-text engine

    DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers. DeepSpeech is an open-source Speech-To-Text engine, using a model trained by machine learning techniques based on Baidu's Deep Speech research paper. Project DeepSpeech uses Google's TensorFlow to make the implementation easier. A pre-trained English model is available for use and can be downloaded following...
    Downloads: 22 This Week
    Last Update:
    See Project
  • 13
    aseryla

    aseryla

    Aseryla code repositories

    This project describes a model of how the semantic human memory represents the information relevant to the objects of the world in text format. It provides a system and a GUI application capable of extracting and managing concepts and relations from English texts. https://aseryla2.sourceforge.io/
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    Multilingual Speech Synthesis

    Multilingual Speech Synthesis

    An implementation of Tacotron 2 that supports multilingual experiments

    This repository provides synthesized samples, training and evaluation data, source code, and parameters for the paper One Model, Many Languages: Meta-learning for Multilingual Text-to-Speech. It contains an implementation of Tacotron 2 that supports multilingual experiments and that implements different approaches to encoder parameter sharing. It presents a model combining ideas from Learning to speak fluently in a foreign language: Multilingual speech synthesis and cross-language voice cloning...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    PaddlePaddle models

    PaddlePaddle models

    Pre-trained and Reproduced Deep Learning Models

    ... detection, image segmentation, text recognition, speech synthesis, etc. An end-to-end development kit that meets the needs of enterprises for low-cost development and rapid integration. The model library of Flying Paddle is an industrial-level model library tailored around the actual R&D process of domestic enterprises, serving enterprises in many fields such as energy, finance, industry, and agriculture.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    commit-autosuggestions

    commit-autosuggestions

    A tool that AI automatically recommends commit messages

    This is implementation of CommitBERT: Commit Message Generation Using Pre-Trained Programming Language Model. CommitBERT is accepted in ACL workshop : NLP4Prog. Have you ever hesitated to write a commit message? Now get a commit message from Artificial Intelligence! CodeBERT: A Pre-Trained Model for Programming and Natural Languages introduces a pre-trained model in a combination of Program Language and Natural Language(PL-NL). It also introduces the problem of converting code into natural...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    Delta ML

    Delta ML

    Deep learning based natural language and speech processing platform

    ..., develop, and deploy NLP and/or speech models. Use configuration files to easily tune parameters and network structures. What you see in training is what you get in serving: all data processing and features extraction are integrated into a model graph. Text classification, named entity recognition, question and answering, text summarization, etc. Uniform I/O interfaces and no changes for new models.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 18
    Tensor2Tensor

    Tensor2Tensor

    Library of deep learning models and datasets

    Deep Learning (DL) has enabled the rapid advancement of many useful technologies, such as machine translation, speech recognition and object detection. In the research community, one can find code open-sourced by the authors to help in replicating their results and further advancing deep learning. However, most of these DL systems use unique setups that require significant engineering effort and may only work for a specific problem or architecture, making it hard to run new experiments...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    Free Queue Manager

    Free Queue Manager

    Web based python-flask Queue management system

    A web based management system developed for the purpose of easing the process of orgnizing queues and lines. Like many other (QMS)s Queue Management Systems, FQM does provide a basic dashboard to allow the users of the system and customers alike to interact with the system via a basic yet simple user interface . Brief user guide can be found on https://fqms.github.io/images/user_guide.pdf
    Leader badge
    Downloads: 6 This Week
    Last Update:
    See Project
  • 20
    jieba

    jieba

    Stuttering Chinese word segmentation

    "Jaba" Chinese word segmentation, do the best Python Chinese word segmentation component. Four word segmentation modes are supported. Precise mode, which tries to cut the sentence most precisely, suitable for text analysis. Full mode, scans all the words that can be formed into words in the sentence, the speed is very fast, but the ambiguity cannot be resolved. The search engine mode, on the basis of the precise mode, divides the long words again to improve the recall rate, which is suitable...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    Dragonfire

    Dragonfire

    The open-source virtual assistant for Ubuntu based Linux distributions

    .... It will contain various software packages for controlling the helmet. It will be the first of its kind. Dragonfire uses Mozilla DeepSpeech to understand your voice commands and Festival Speech Synthesis System to handle text-to-speech tasks.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    GoodByeCatpcha

    GoodByeCatpcha

    Solver ReCaptcha v2 Free

    An async Python library to automate solving ReCAPTCHA v2 by images/audio using Mozilla's DeepSpeech, PocketSphinx, Microsoft Azure’s, Google Speech and Amazon's Transcribe Speech-to-Text API. Also image recognition to detect the object suggested in the captcha. Built with Pyppeteer for Chrome automation framework and similarities to Puppeteer, PyDub for easily converting MP3 files into WAV, aiohttp for async minimalistic web-server, and Python’s built-in AsyncIO for convenience.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23

    SmartBody

    Character animation system for games and simulations.

    SmartBody is available for download for Windows, Linux and OSX users. SmartBody can also be used on Android and iOS platforms. SmartBody is a character animation platform that provides the following capabilities in real time: * Locomotion (walk, jog, run, turn, strafe, jump, etc.) * Steering - avoiding obstacles and moving objects * Object manipulation - reach, grasp, touch , pick up objects * Lip Syncing - characters can speak with simultaneous lip-sync using text-to-speech...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 24
    Bangla TTS

    Bangla TTS

    Bangla text to speech synthesis in python

    Bangla text to speech Multilingual (Bangla, English) real-time ([almost] in a GPU) speech synthesis library. Installation -------------------------------------- * Install Anaconda * conda create -n new_virtual_env python==3.6.8 * conda activate new_virtual_env * pip install -r requirements.txt * While running for the first time, keep your internet connection on to download the weights of the speech synthesis models (>500 MB) * For fast...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 25
    Defox text to speech and downloader

    Defox text to speech and downloader

    Written or imported text offline read or online download.

    This software design to convert text to speech and download the converted speech. Description : • Installation setup with two languages (English, French) • Two areas called text reading and speech downloading • Many languages supported to download center Note 1: I'm a student yet and I'm not in the software designing industry. Therefore maybe I haven't software making skills. I'm worried about that. ! Note 2 : When you double click on the software maybe it will get some seconds...
    Downloads: 1 This Week
    Last Update:
    See Project
Want the latest updates on software, tech news, and AI?
Get latest updates about software, tech news, and AI from SourceForge directly in your inbox once a month.