Showing 413 open source projects for "text to speech -gpt"

View related business solutions
  • Bright Data - All in One Platform for Proxies and Web Scraping Icon
    Bright Data - All in One Platform for Proxies and Web Scraping

    Say goodbye to blocks, restrictions, and CAPTCHAs

    Bright Data offers the highest quality proxies with automated session management, IP rotation, and advanced web unlocking technology. Enjoy reliable, fast performance with easy integration, a user-friendly dashboard, and enterprise-grade scaling. Powered by ethically-sourced residential IPs for seamless web scraping.
    Get Started
  • The #1 Embedded Analytics Solution for SaaS Teams. Icon
    The #1 Embedded Analytics Solution for SaaS Teams.

    Qrvey saves engineering teams time and money with a turnkey multi-tenant solution connecting your data warehouse to your SaaS application.

    Qrvey’s comprehensive embedded analytics software enables you to design more customizable analytics experiences for your end users.
    Try Developer Playground
  • 1

    Omilo - a text to speech application

    Omilo is a simple text to speech application

    Omilo is a simple text to speech application for Windows and Linux using Festival, Flite, Marytts and Piper voices.
    Leader badge
    Downloads: 7 This Week
    Last Update:
    See Project
  • 2

    Russian Text-to-speech programs

    читание, чтение, говорение

    For Windows (on Linux trought Wine can work) 3 russian text-to-speech programs (Chitanie, Chtenie and Govorenie). If you want donate. paypal.me/alkbab Читание, Чтение, Говорение есть программы пробующие преобразовать русский текст в русскую речь . Для Windows. На Linux через Wine... Кто хочет может пожертвовать paypal.me/alkbab
    Downloads: 0 This Week
    Last Update:
    See Project
  • 3
    Whisper

    Whisper

    Robust Speech Recognition via Large-Scale Weak Supervision

    Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification. A Transformer sequence-to-sequence model is trained on various speech processing tasks, including multilingual speech recognition, speech translation, spoken language identification, and voice activity detection. These tasks are jointly represented...
    Downloads: 83 This Week
    Last Update:
    See Project
  • 4
    TTS Voice Wizard

    TTS Voice Wizard

    Speech to Text to Speech, sends text as OSC messages

    Speech to Text to Speech. Song now playing. Sends text as OSC messages to VRChat to display on avatar. (STTTS) (Speech to TTS) (VRC STT System) Use TTS Voice Wizard's accessibility features to improve your VRChat experience (it works outside of VRChat too!) You can convert your Speech-to-Text and back to Speech through various Speech Recognition and Text-to-Speech methods. You can send what you say as OSC messages to VRChat to be displayed on your avatar using KillFrenzyAvatarText or VRChats...
    Downloads: 25 This Week
    Last Update:
    See Project
  • Free CRM Software With Something for Everyone Icon
    Free CRM Software With Something for Everyone

    216,000+ customers in over 135 countries grow their businesses with HubSpot

    Think CRM software is just about contact management? Think again. HubSpot CRM has free tools for everyone on your team, and it’s 100% free. Here’s how our free CRM solution makes your job easier.
    Get free CRM
  • 5
    ChatGPT Desktop Application

    ChatGPT Desktop Application

    🔮 ChatGPT Desktop Application (Mac, Windows and Linux)

    ChatGPT Desktop Application (Mac, Windows and Linux)
    Downloads: 95 This Week
    Last Update:
    See Project
  • 6
    SpeechRecognition

    SpeechRecognition

    Speech recognition module for Python

    Library for performing speech recognition, with support for several engines and APIs, online and offline. Recognize speech input from the microphone, transcribe an audio file, save audio data to an audio file. Show extended recognition results, calibrate the recognizer energy threshold for ambient noise levels (see recognizer_instance.energy_threshold for details). Listening to a microphone in the background, various other useful recognizer features. The easiest way to install this is using pip...
    Downloads: 14 This Week
    Last Update:
    See Project
  • 7
    sherpa-onnx

    sherpa-onnx

    Speech-to-text, text-to-speech, and speaker recognition

    Speech-to-text, text-to-speech, and speaker recognition using next-gen Kaldi with onnxruntime without an Internet connection. Support embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, Flutter.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 8
    Whishper

    Whishper

    Transcribe any audio to text, translate and edit subtitles 100% locall

    Open-source, local-first audio transcription and subtitling suite with a simple web UI. Thanks to open-source technologies, Whishper can run 100% offline. Your data never leaves your computer. Whishper allows you to translate your transcriptions to and from more than 60 languages thanks to Argos Translate and LibreTranslate. Download the transcriptions in many formats (json, txt, vtt, srt). Easily edit your subtitles right in the Web-UI.
    Downloads: 24 This Week
    Last Update:
    See Project
  • 9
    Coqui TTS

    Coqui TTS

    A deep learning toolkit for Text-to-Speech, battle-tested in research

    TTS is a library for advanced Text-to-Speech generation. It's built on the latest research, was designed to achieve the best trade-off among ease-of-training, speed and quality. TTS comes with pre-trained models, tools for measuring dataset quality and is already used in 20+ languages for products and research projects. High-performance Deep Learning models for Text2Speech tasks. Text2Spec models (Tacotron, Tacotron2, Glow-TTS, SpeedySpeech). Speaker Encoder to compute speaker embeddings...
    Downloads: 33 This Week
    Last Update:
    See Project
  • Shift, the browser that merges all of your web apps into one powerful window. Icon
    Shift, the browser that merges all of your web apps into one powerful window.

    Your power browser.

    Streamline everything you do online when you install Shift and access thousands of apps without leaving your browser. Connect all of your Gmail, Outlook, and Office 365 accounts and manage everything from one centralized window. Build out your Shift browser with apps that integrate seamlessly so you have ultra-fast access to all the tools you use to stream, shop, work, browse, and stay connected. Shift brings it all together.
    Try for Free
  • 10
    Koodo Reader

    Koodo Reader

    A modern ebook manager and reader with sync and backup

    Koodo Reader is an all-in-one ebook reader that can help you better manage and study your ebooks. It's free and open-source. Save your data to Dropbox or Webdav. Customize the source folder and synchronize among multiple devices using OneDrive, iCloud, Dropbox, etc. Single-column, two-column, or continuous scrolling layouts. Text-to-speech, translation, progress slider, touch screen support, batch import. Add bookmarks, notes, highlights to your books. Adjust font size, font family, line...
    Downloads: 23 This Week
    Last Update:
    See Project
  • 11
    Intelligent Java

    Intelligent Java

    Integrate with the latest language models, image generation and speech

    Intelligent java (IntelliJava) is the ultimate tool to integrate with the latest language models and deep learning frameworks using java. The library provides an intuitive functions for sending input to models like ChatGPT and DALL·E, and receiving generated text, speech or images. With just a few lines of code, you can easily access the power of cutting-edge AI models to enhance your projects. Access ChatGPT, GPT3 to generate text and DALL·E to generate images. OpenAI is preferred for quality...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 12
    Editor.js

    Editor.js

    A block-style editor with clean JSON output

    Editor.js is an open-source text editor offering a variety of features to help users create and format content efficiently. It has a modern, block-style interface that allows users to easily add and arrange different types of content, such as text, images, lists, quotes, etc. Each Block is provided via a separate plugin making Editor.js extremely flexible. Editor.js outputs clean JSON data instead of heavy HTML markup. Use it in the Web, iOS, Android, AMP, Instant Articles, speech readers...
    Downloads: 13 This Week
    Last Update:
    See Project
  • 13
    OpenAI Translator

    OpenAI Translator

    Browser extension and cross-platform desktop app based on ChatGPT API

    .... You must press the shortcut key to trigger the translation after selecting a word. It offers three modes: translation, polishing and summarization. Our tool allows for mutual translation, polishing and summarization across 55 different languages. Streaming mode is supported! It allows users to customize their translation text. One-click copying, Text-to-Speech (TTS). Available on all platforms (Windows, macOS, and Linux) for both browsers and Desktop.
    Downloads: 9 This Week
    Last Update:
    See Project
  • 14
    HanLP

    HanLP

    Han Language Processing

    HanLP is a multilingual Natural Language Processing (NLP) library composed of a series of models and algorithms. Built on TensorFlow 2.0, it was designed to advance state-of-the-art deep learning techniques and popularize the application of natural language processing in both academia and industry. HanLP is capable of lexical analysis (Chinese word segmentation, part-of-speech tagging, named entity recognition), syntax analysis, text classification, and sentiment analysis. It comes...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 15
    NVIDIA NeMo

    NVIDIA NeMo

    Toolkit for conversational AI

    NVIDIA NeMo, part of the NVIDIA AI platform, is a toolkit for building new state-of-the-art conversational AI models. NeMo has separate collections for Automatic Speech Recognition (ASR), Natural Language Processing (NLP), and Text-to-Speech (TTS) models. Each collection consists of prebuilt modules that include everything needed to train on your data. Every module can easily be customized, extended, and composed to create new conversational AI model architectures. Conversational AI...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 16
    Podcastfy.ai

    Podcastfy.ai

    Transforming Multimodal Content into Captivating Multilingual Audio

    Podcastfy is an open-source Python package that transforms multi-modal content (text, images) into engaging, multi-lingual audio conversations using GenAI. Input content includes websites, PDFs, youtube videos as well as images. Unlike UI-based tools focused primarily on note-taking or research synthesis (e.g. NotebookLM), Podcastfy focuses on the programmatic and bespoke generation of engaging, conversational transcripts and audio from a multitude of multi-modal sources enabling customization...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 17
    PaddleSpeech

    PaddleSpeech

    Easy-to-use Speech Toolkit including Self-Supervised Learning model

    PaddleSpeech is an open-source toolkit on PaddlePaddle platform for a variety of critical tasks in speech and audio, with state-of-art and influential models. Via the easy-to-use, efficient, flexible and scalable implementation, our vision is to empower both industrial application and academic research, including training, inference & testing modules, and deployment process. Low barriers to install, CLI, Server, and Streaming Server is available to quick-start your journey. We provide high...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 18
    AnkiDroid

    AnkiDroid

    Anki flashcards on Android

    Anki flashcards on Android. Your secret trick to achieve superhuman information retention. A semi-official port of the open source Anki spaced repetition flashcard system to Android. Memorize anything with AnkiDroid.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 19
    Moshi

    Moshi

    A speech-text foundation model for real time dialogue

    Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec. Mimi processes 24 kHz audio, down to a 12.5 Hz representation with a bandwidth of 1.1 kbps, in a fully streaming manner (latency of 80ms, the frame size), yet performs better than existing, non-streaming, codecs like SpeechTokenizer (50 Hz, 4kbps), or SemantiCodec (50 Hz, 1.3kbps). Moshi models two streams of audio: one corresponds to Moshi...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    Stanford CoreNLP

    Stanford CoreNLP

    Stanford CoreNLP, a Java suite of core NLP tools

    CoreNLP is your one stop shop for natural language processing in Java! CoreNLP enables users to derive linguistic annotations for text, including token and sentence boundaries, parts of speech, named entities, numeric and time values, dependency and constituency parses, coreference, sentiment, quote attributions, and relations. CoreNLP currently supports 6 languages, Arabic, Chinese, English, French, German, and Spanish. The centerpiece of CoreNLP is the pipeline. Pipelines take in raw text...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    sense2vec

    sense2vec

    Contextually-keyed word vectors

    sense2vec (Trask et. al, 2015) is a nice twist on word2vec that lets you learn more interesting and detailed word vectors. This library is a simple Python implementation for loading, querying and training sense2vec models. For more details, check out our blog post. To explore the semantic similarities across all Reddit comments of 2015 and 2019, see the interactive demo.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 22
    PHP Client For NLP Cloud

    PHP Client For NLP Cloud

    NLP Cloud serves high performance pre-trained or custom models for NER

    NLP Cloud serves high performance pre-trained or custom models for NER, sentiment-analysis, classification, summarization, dialogue summarization, paraphrasing, intent classification, product description and ad generation, chatbot, grammar and spelling correction, keywords and keyphrases extraction, text generation, image generation, blog post generation, code generation, question answering, automatic speech recognition, machine translation, language detection, semantic search, semantic...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 23
    Nexa SDK

    Nexa SDK

    Nexa SDK is a comprehensive toolkit for supporting ONNX and GGML

    Nexa SDK is a comprehensive toolkit for supporting ONNX and GGML models. It supports text generation, image generation, vision-language models (VLM), and speech-to-text (ASR), and text-to-speech (TTS) capabilities. Additionally, it offers an OpenAI-compatible API server with JSON schema mode for function calling and streaming support, and a user-friendly Streamlit UI. Users can run Nexa SDK in any device with Python environment, and GPU acceleration is supported, including CUDA, Metal, and ROCm...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    Node.js Client For NLP Cloud

    Node.js Client For NLP Cloud

    NLP Cloud serves high performance pre-trained or custom models

    This is the Node.js client (with Typescript types) for the NLP Cloud API. NLP Cloud serves high-performance pre-trained or custom models for NER, sentiment analysis, classification, summarization, dialogue summarization, paraphrasing, intent classification, product description and ad generation, chatbot, grammar and spelling correction, keywords and keyphrases extraction, text generation, image generation, blog post generation, text generation, question answering, automatic speech recognition...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    Python Client For NLP Cloud

    Python Client For NLP Cloud

    NLP Cloud serves high performance pre-trained or custom models for NER

    NLP Cloud serves high performance pre-trained or custom models for NER, sentiment-analysis, classification, summarization, dialogue summarization, paraphrasing, intent classification, product description and ad generation, chatbot, grammar and spelling correction, keywords and keyphrases extraction, text generation, image generation, blog post generation, source code generation, question answering, automatic speech recognition, machine translation, language detection, semantic search, semantic...
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • 4
  • 5
  • Next