Search Results for "speech to text in java" - Page 2

Sort By:

Relevance

Clear All Filters

Linux 122
Windows 121
Mac 97
More...
BSD 44
ChromeOS 35
Mobile Operating Systems 8
Desktop Operating Systems 2
Embedded Operating Systems 2

Showing 146 open source projects for "speech to text in java"

View related business solutions

Python Clear Filters & Widen Search

Simplify IT and security with a single endpoint management platform
Automate the hardest parts of IT

NinjaOne automates the hardest parts of IT, delivering visibility, security, and control over all endpoints for more than 20,000 customers. The NinjaOne automated endpoint management platform is proven to increase productivity, reduce security risk, and lower costs for IT teams and managed service providers. The company seamlessly integrates with a wide range of IT and security technologies. NinjaOne is obsessed with customer success and provides free and unlimited onboarding, training, and support.

Learn More
Powering the best of the internet | Fastly
Fastly's edge cloud platform delivers faster, safer, and more scalable sites and apps to customers.

Ensure your websites, applications and services can effortlessly handle the demands of your users with Fastly. Fastly’s portfolio is designed to be highly performant, personalized and secure while seamlessly scaling to support your growth.

Try for free
1

PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model

PaddleSpeech is an open-source toolkit on PaddlePaddle platform for a variety of critical tasks in speech and audio, with state-of-art and influential models. Via the easy-to-use, efficient, flexible and scalable implementation, our vision is to empower both industrial application and academic research, including training, inference & testing modules, and deployment process. Low barriers to install, CLI, Server, and Streaming Server is available to quick-start your journey. We provide high...

Downloads: 1 This Week

Last Update: 2025-03-04
See Project
2

flair

A very simple framework for state-of-the-art NLP

A very simple framework for state-of-the-art NLP. Developed by Humboldt University of Berlin and friends. A powerful NLP library. Flair allows you to apply our state-of-the-art natural language processing (NLP) models to your text, such as named entity recognition (NER), sentiment analysis, part-of-speech tagging (PoS), special support for biomedical texts, sense disambiguation and classification, with support for a rapidly growing number of languages. A text embedding library. Flair has simple...

Downloads: 1 This Week

Last Update: 2025-02-05
See Project
3

ChatGPT Academic

ChatGPT extension for scientific research work

ChatGPT extension for scientific research work, specially optimized academic paper polishing experience, supports custom shortcut buttons, supports custom function plug-ins, supports markdown table display, double display of Tex formulas, complete code display function, new local Python/C++/Go project tree Analysis function/Project source code self-translation ability, newly added PDF and Word document batch summary function/PDF paper full-text translation function. All buttons are dynamically...

Downloads: 1 This Week

Last Update: 2024-12-19
See Project
4

tika-python

Python binding to the Apache Tika™ REST services

A Python port of the Apache Tika library that makes Tika available using the Tika REST Server. This makes Apache Tika available as a Python library, installable via Setuptools, Pip and easy to install. To use this library, you need to have Java 7+ installed on your system as tika-python starts up the Tika REST server in the background. To get this working in a disconnected environment, download a tika server file (both tika-server.jar and tika-server.jar.md5, which can be found here) and set...

Downloads: 0 This Week

Last Update: 2025-03-22
See Project
MongoDB Atlas | Run databases anywhere
Ensure the availability of your data with coverage across AWS, Azure, and GCP on MongoDB Atlas—the multi-cloud database for every enterprise.

MongoDB Atlas allows you to build and run modern applications across 125+ cloud regions, spanning AWS, Azure, and Google Cloud. Its multi-cloud clusters enable seamless data distribution and automated failover between cloud providers, ensuring high availability and flexibility without added complexity.

Learn More
5

Text To Speech Unlimited

Chuyển đổi văn bản thành giọng nói không giới hạn

Chuyển đổi văn bản thành giọng nói không giới hạn số lượng từ và có thể điều chỉnh tốc độ đọc, giọng đọc

Downloads: 2 This Week

Last Update: 2025-03-19
See Project
6

Prime QA

State-of-the-art Multilingual Question Answering research

PrimeQA is a public open source repository that enables researchers and developers to train state-of-the-art models for question answering (QA). By using PrimeQA, a researcher can replicate the experiments outlined in a paper published in the latest NLP conference while also enjoying the capability to download pre-trained models (from an online repository) and run them on their own custom data. PrimeQA is built on top of the Transformers toolkit and uses datasets and models that are directly...

Downloads: 1 This Week

Last Update: 2023-08-21
See Project
7

CSM (Conversational Speech Model)

A Conversational Speech Generation Model

The CSM (Conversational Speech Model) is a speech generation model developed by Sesame AI that creates RVQ audio codes from text and audio inputs. It uses a Llama backbone and a smaller audio decoder to produce audio codes for realistic speech synthesis. The model has been fine-tuned for interactive voice demos and is hosted on platforms like Hugging Face for testing. CSM offers a flexible setup and is compatible with CUDA-enabled GPUs for efficient execution.

Downloads: 2 This Week

Last Update: 2025-03-19
See Project
8

VALL-E

PyTorch implementation of VALL-E (Zero-Shot Text-To-Speech)

We introduce a language modeling approach for text to speech synthesis (TTS). Specifically, we train a neural codec language model (called VALL-E) using discrete codes derived from an off-the-shelf neural audio codec model, and regard TTS as a conditional language modeling task rather than continuous signal regression as in previous work. During the pre-training stage, we scale up the TTS training data to 60K hours of English speech which is hundreds of times larger than existing systems. VALL...

Downloads: 14 This Week

Last Update: 2023-04-14
See Project
9

sense2vec

Contextually-keyed word vectors

sense2vec (Trask et. al, 2015) is a nice twist on word2vec that lets you learn more interesting and detailed word vectors. This library is a simple Python implementation for loading, querying and training sense2vec models. For more details, check out our blog post. To explore the semantic similarities across all Reddit comments of 2015 and 2019, see the interactive demo.

Downloads: 10 This Week

Last Update: 2024-08-16
See Project
Build Securely on Azure with Proven Frameworks
Lay a foundation for success with Tested Reference Architectures developed by Fortinet’s experts. Learn more in this white paper.

Moving to the cloud brings new challenges. How can you manage a larger attack surface while ensuring great network performance? Turn to Fortinet’s Tested Reference Architectures, blueprints for designing and securing cloud environments built by cybersecurity experts. Learn more and explore use cases in this white paper.

Download Now
10

Mice MX OS speech to text Voice Control

Mice speech to text with MX Cinnamon OS ISO

Note about this image This image contains a system based on Linux MX, which was created to improve accessibility within the Linux environment. The distribution uses the Cinnamon desktop interface, which is configured to be operated using voice commands and outputs. The user interface and the control of your own devices and home automation systems can be customized and extended. The voice control program MiceStTM.py was developed to enable easy adaptation to other languages. However, only...

Downloads: 1 This Week

Last Update: 2025-05-14
See Project
11

textacy

NLP, before and after spaCy

textacy is a Python library for performing a variety of natural language processing (NLP) tasks, built on the high-performance spaCy library. With the fundamentals, tokenization, part-of-speech tagging, dependency parsing, etc., delegated to another library, textacy focuses primarily on the tasks that come before and follow after.

Downloads: 1 This Week

Last Update: 2025-01-22
See Project
12

Tokenized Text Aligner

Aligns tokens in two versions of a text with differing tokenization.

This tool performs token-by-token alignment of two versions of a text with differing tokenization by interpreting the results of a file diff (https://docs.python.org/3/library/difflib.html). It is intended for use in the preparation of annotated linguistic corpora, where differences in tokenization may arise (i) following corrections or modifications to the source text or (ii) through the creation of different layers of annotation (part-of-speech, treebank) requiring different tokenization...

Downloads: 0 This Week

Last Update: 2024-07-31
See Project
13

Amiga Memories

A walk along memory lane

Amiga Memories is a project (started & released in 2013) that aims to make video programmes that can be published on the internet. The images and sound produced by Amiga Memories are 100% automatically generated. The generator itself is implemented in Squirrel, the 3D rendering is done on GameStart 3D. An Amiga Memories video is mostly based on a narrative. The purpose of the script is to define the spoken and written content. The spoken text will be read by a voice synthesizer (Text To Speech...

Downloads: 1 This Week

Last Update: 2023-03-22
See Project
14

KoboldCpp

Run GGUF models easily with a UI or API. One File. Zero Install.

KoboldCpp is an easy-to-use AI text-generation software for GGML and GGUF models, inspired by the original KoboldAI. It's a single self-contained distributable that builds off llama.cpp and adds many additional powerful features.

Downloads: 225 This Week

Last Update: 15 hours ago
See Project
15

VATSG

Video automatic transcribe and translated subtitle generator

It generates srt format subtitle from videofile which can be any source language that whisper support , and then make translated subtitle file of your target language which deepl support. This is the subtitle generator(VATSG) which use [moviepy](https://github.com/Zulko/moviepy) to generate mp3 and then use [faster-whisper](https://github.com/guillaumekln/faster-whisper) to get text recognition and then use deepl-api to generate your target language subtitle file(srt format) If you...

Downloads: 30 This Week

Last Update: 2023-09-19
See Project
16

SoundTranscriber

SoundTranscriber can be used to generate automatic transcription / aut

SoundTranscriber can be used to generate automatic transcription / aut

1 Review

Downloads: 9 This Week

Last Update: 2025-07-10
See Project
17

maXbox5

maXbox is a script tool engine, compiler and source lib all in one exe

The Art of Coding: maXbox is a script tool engine, compiler and source lib all in one exe to design and code your scripts in a shellbook! It supports Pascal, Delphi (VCL), Python (P4D), PowerShell and Java Script (Edge WebView).

Downloads: 28 This Week

Last Update: 19 hours ago
See Project
18

Whisper Batch Transcriber

Unlimited, private and free Speech-To-Text program

## About: Automatically transcribe all of your voice recordings into clean, organized, neat text files. It's free, fully automated, unlimited, using state-of-the-art speech-to-text technology. Works 100% offline on your computer, privately and locally. ## Usecases: Convert speeches, podcasts, webinars, monologues, storytellings and other audio speech into a formatted .txt file. One sentence per new line. ## Notes: - Its 2GB in size and requires 2-6GB of GPU VRAM too. (basically...

Downloads: 8 This Week

Last Update: 6 days ago
See Project
19

Advanced Trigonometry Calculator

Precision Trigonometry: Advanced Calculator for Complex Math

Advanced Trigonometry Calculator is equipped with a user-friendly interface that allows for easy input of problems and instant computation. Professionals such as engineers who need to perform advanced trigonometric calculations in their work will find this tool extremely useful. More info by clicking below: https://advantrigoncalc.sourceforge.io/ Advanced Trigonometry Calculator was only and always only developed by the Portuguese Renato Alexandre dos Santos Freitas. Also author of...

2 Reviews

Downloads: 22 This Week

Last Update: 1 day ago
See Project
20

EasyTTS

Text to Speech Utility

EasyTTS is a text to speech app for 64 bit Windows that offers online and offline text-to-speech, with settings for how fast the voice is. It supports languages other than English but only if you are connected to the Internet. These are Spanish, Portuguese, Russian, French, and Mandarin (?) Chinese.

1 Review

Downloads: 1 This Week

Last Update: 2024-05-01
See Project
21

myScite

The allRound pocket sized CodeEditor.

... source code: - [ Syntax highlighted ] - go, vala, pike, swift, flash, ch, rust - [ Calltip assisted ] - c/cpp11, js&jQuery, python, php, ruby, lua, c#, java, perl --Others-- - Restructured config files with inline docs - Scriptable via lua Extension. - Provides theming, a customizeable toolbar, a sidebar with a File Browser and a cleaned up options Menu. --- Summary --- Enjoy your vlc for sourceCode lecture ;) (30Langs)

Downloads: 7 This Week

Last Update: 2025-05-30
See Project
22

Maia

MAIA (MyApp Intelligence Artificial) is designed to provide a foundation for building your own voice-controlled assistant with Python. It uses various libraries and modules for speech recognition, text-to-speech synthesis, and custom functionality.

Downloads: 0 This Week

Last Update: 2024-04-21
See Project
23

Mice TTM

mice stt tts

Dieses Tool wird speziell für die Barrierefreiheit unter Linux entwickelt. Es ermöglicht das umwandeln/konvertieren/parsen von Texten die aus einer Spracherkennung stammen, in Diktate sowie das Ausführen von Makros. Dies funktioniert ohne Internet, da die Spracherkennung auf dem PC selbst erfolgt. Mausbewegungen auf benannte Wörter und dann entsprechend auswählen oder per Sprachbefehl klicken. Außerdem können Textpassagen z.B. unter Libreoffice Wirter per Sprachbefehl entsprechend...

Downloads: 0 This Week

Last Update: 2024-08-07
See Project
24

translate-gui

GUI for translate-shell, the cli tool for quick translation

GUI for the translate-shell, aims to be easy to use translator and a helpful tool for learning new languages. Most tools do a one way translation from source to target language, do to the reverse involves choosing the source and target languages again. This tool can do a 2 way translation accompanied by speech output of the target language text. Hence it can prove to be an indispensable aid when learning new languages

Downloads: 0 This Week

Last Update: 2025-06-21
See Project
25

SpeakLogPSU

SpeakLogPSU can speak chat messages with an individual voice if the NPC or player was configured or with a default one. You will never miss if someone talks to you. Voice cloning can be accomplished with cogui in less than five minutes without GPU. The result is archived and can be used the next time in game. Some TTS projects already started to add tag support to speak text with emotions or sing it. If a game designer has that in mind with a good chat log she can voiced her game...

Downloads: 0 This Week

Last Update: 2024-10-14
See Project