Showing 25 open source projects for "speech text"

View related business solutions
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • Earn up to 16% annual interest with Nexo. Icon
    Earn up to 16% annual interest with Nexo.

    More flexibility. More control.

    Generate interest, access liquidity without selling, and execute trades seamlessly. All in one platform. Geographic restrictions, eligibility, and terms apply.
    Get started with Nexo.
  • 1
    gse

    gse

    Go efficient multilingual NLP and text segmentation

    Go efficient multilingual NLP and text segmentation; support English, Chinese, Japanese and others. Gse is implements jieba by golang, and try add NLP support and more feature. Support common, search engine, full mode, precise mode and HMM mode multiple word segmentation modes. Support user and embed dictionary, Part-of-speech/POS tagging, analyze segment info, stop and trim words.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 2
    AnySoftKeyboard

    AnySoftKeyboard

    Android (f/w 2.1+) on screen keyboard for multiple languages

    The only Android keyboard you'll ever need. Free as in speech and Free as in beer. Android (f/w 4.0.3+, API level 15+) on screen keyboard for multiple languages.
    Downloads: 7 This Week
    Last Update:
    See Project
  • 3
    Cookbook (Google Gemini)

    Cookbook (Google Gemini)

    Examples and guides for using the Gemini API

    ...It provides a structured learning path with quick-start tutorials for beginners and practical examples for advanced users. The repository covers a wide range of Gemini capabilities, including text, images, video, speech, robotics, and multimodal interactions. It highlights newly introduced features such as Gemini 2.5 models (Flash and Pro), Gemini’s native image generation, Veo for video generation, robotics-focused reasoning models, and Lyria for TTS and music generation. The Cookbook also includes tutorials on advanced API workflows such as grounding answers with external tools, batch-mode request handling, and live multimodal interactivity with LiveAPI. ...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 4
    Stanza

    Stanza

    Stanford NLP Python library for many human languages

    Stanza is a collection of accurate and efficient tools for the linguistic analysis of many human languages. Starting from raw text to syntactic analysis and entity recognition, Stanza brings state-of-the-art NLP models to languages of your choosing. Stanza is a Python natural language analysis package. It contains tools, which can be used in a pipeline, to convert a string containing human language text into lists of sentences and words, to generate base forms of those words, their parts of speech and morphological features, to give a syntactic structure dependency parse, and to recognize named entities. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • $300 Free Credits for Your Google Cloud Projects Icon
    $300 Free Credits for Your Google Cloud Projects

    Start building on Google Cloud with $300 in free credits. No commitment, no credit card required until you're ready to scale.

    Launch your next project with $300 in free Google Cloud credits—no strings attached. Test, build, and deploy without risk. Use your credits across the entire Google Cloud platform to find what works best for your needs. After your credits are used, continue with always-free tier services. Only pay when you're ready to scale. Sign up in minutes and start exploring.
    Start Free Trial
  • 5
    react-use

    react-use

    Component for React

    ...Tracks mouse hover state of some element. Display an element or video full-screen. Tracks location hash value. Tracks whether user is being inactive. Tracks an HTML element's intersection. Synthesizes speech from a text string. Tracks page navigation bar location state. Re-renders component, while tweening a number from 0 to 1. Tracks long press gesture of some element. Tracks state of a CSS media query. Tracks state of connected hardware devices. Returns a callback, which re-renders component when called. Tracks state of device's motion sensor. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 6
    JSpeech

    JSpeech

    Java library designed to integrate Speech-to-Text

    jSpeech is a Java library designed to integrate Speech-to-Text (STT) capabilities, command control, and diarization (speaker identification) into applications in a simple, modular, and decoupled way.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    RemoteTTS

    RemoteTTS

    Tool to remotely activate Text-To-Speech (TTS) on a server

    The tool provides a simple TCP/UDP interface to let a remote machine perform TTS outputs.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    linViex is a graphical programming environment for home automation tasks and other applications. It supports remote controls, sensor devices, power switches/dimmers, e-mail, text-to-speech conversion, media players and many more. Graphic symbols of functional objects can be interconnected to exchange and process data.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    VideoSrt

    VideoSrt

    Windows-GUI

    ...Recognize video/audio speech to generate subtitle files (support Chinese-English translation, bilingual subtitles) Extract speech text from video/audio. Batch translation, filter processing/encoding SRT subtitle files. Using the Alibaba Cloud speech recognition interface, the accuracy is high, and the standard Mandarin/English recognition rate is over 95%. Video recognition does not need to upload the original video, which is convenient, fast and time-saving.
    Downloads: 12 This Week
    Last Update:
    See Project
  • Fully Managed MySQL, PostgreSQL, and SQL Server Icon
    Fully Managed MySQL, PostgreSQL, and SQL Server

    Automatic backups, patching, replication, and failover. Focus on your app, not your database.

    Cloud SQL handles your database ops end to end, so you can focus on your app.
    Try Free
  • 10
    AugLy

    AugLy

    A data augmentations library for audio, image, text, and video

    ...We designed AugLy to include many specific data augmentations that users perform in real life on internet platforms like Facebook's -- for example making an image into a meme, overlaying text/emojis on images/videos, reposting a screenshot from social media. While AugLy contains more generic data augmentations as well, it will be particularly useful to you if you're working on a problem like copy detection, hate speech detection, etc.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    Tensor2Tensor

    Tensor2Tensor

    Library of deep learning models and datasets

    Deep Learning (DL) has enabled the rapid advancement of many useful technologies, such as machine translation, speech recognition and object detection. In the research community, one can find code open-sourced by the authors to help in replicating their results and further advancing deep learning. However, most of these DL systems use unique setups that require significant engineering effort and may only work for a specific problem or architecture, making it hard to run new experiments and...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    Euler

    Euler

    A distributed graph deep learning framework.

    As a general data structure with strong expressive ability, graphs can be used to describe many problems in the real world, such as user networks in social scenarios, user and commodity networks in e-commerce scenarios, communication networks in telecom scenarios, and transaction networks in financial scenarios. and drug molecule networks in medical scenarios, etc. Data in the fields of text, speech, and images is easier to process into a grid-like type of Euclidean space, which is suitable for processing by existing deep learning models. Graph is a data type in non-Euclidean space and cannot be directly applied to existing methods, requiring a specially designed graph neural network system. Graph-based learning methods such as graph neural networks combine end-to-end learning with inductive reasoning, and are expected to solve a series of problems such as relational reasoning and interpretability that deep learning cannot handle.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 13
    GoodByeCatpcha

    GoodByeCatpcha

    Solver ReCaptcha v2 Free

    An async Python library to automate solving ReCAPTCHA v2 by images/audio using Mozilla's DeepSpeech, PocketSphinx, Microsoft Azure’s, Google Speech and Amazon's Transcribe Speech-to-Text API. Also image recognition to detect the object suggested in the captcha. Built with Pyppeteer for Chrome automation framework and similarities to Puppeteer, PyDub for easily converting MP3 files into WAV, aiohttp for async minimalistic web-server, and Python’s built-in AsyncIO for convenience.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 14
    FolioReaderKit

    FolioReaderKit

    A Swift ePub reader and parser framework for iOS

    ...Go to your storyboard file, choose or create the view controller that should present the epub reader. In the identity, the inspector set StoryboardFolioReaderContrainer as a class. Media Overlays (Sync text rendering with audio playback). TTS - Text to Speech Support, parse epub cover image, RTL Support. Vertical or/and Horizontal scrolling, share Custom Image Quotes NEW, supports multiple instances at same time, like parallel reading.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    MARF is a general cross-platform framework with a collection of algorithms for audio (voice, speech, and sound) and natural language text analysis and recognition along with sample applications (identification, NLP, etc.) of its use, implemented in Java.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    TextBlob

    TextBlob

    TextBlob is a Python library for processing textual data

    Simple, Pythonic, text processing, Sentiment analysis, part-of-speech tagging, noun phrase extraction, translation, and more. It provides a simple API for diving into common natural language processing (NLP) tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, classification, translation, and more. TextBlob stands on the giant shoulders of NLTK and pattern, and plays nicely with both.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    DPRK pull is a script that pulls the English language North Korean news articles from the KCNA website and puts them into one file for reading by a Text to Speech program.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    YABT is a context sensative string manipulation library. YABT currently can be used for Braille translation, but it is possible to put it to other uses such as preparation of text to be spoken by speech synthesisers.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    Interactive4J
    Project aim to provide simple easy APIs for Java developers to use interactive abilities in their Java Applications like speech recognition, handwriting recognition, use of web cam , sound record/play, decision trees , text to speech and many others.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    A phrase to phoneme code converter for the SpeakJet chip by Magnevation. Speakalator runs on Unix type operating systems.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    WOSH Framework - Wide Open Smart Home
    WOSH is a multi-platform message-oriented middleware written in ANSI C++. Service oriented architecture, designed for network distributed computing. Already working: Audio multimedia, X10, remote control (WinMobile, GTalk) and much more..
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    Speech based User Interface Components Library for Java is a project to create Java controls and applications that can be used not only by literate people but also by non-literates. Speech and visual element with minimal text is used to create components
    Downloads: 5 This Week
    Last Update:
    See Project
  • 23
    The jTTS project provides a refactored implementation of the FreeTTS software speech synthesizer. Rather than building upon the flite implementation of a TTS engine, it implements engine capabilities from Festival. It also supports British English.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    Virtual News Reader is a computer desktop application that can convert text (idealy for online news) into Virtual Voice and can be saved on computer. The project is using Java, (JSAPI), FreeTTS (Text-To-Speech synthesis), DJProject, Substance, & other.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    Libmluv is a C/C++ programmers library to provide the Czech text-to-speech synthesis and should be able to do transcription of Czech text to string of phonemes.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
Auth0 Logo