Showing 112 open source projects for "speech processing"

  • 1
    Whisper

    Robust Speech Recognition via Large-Scale Weak Supervision

    Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification. A Transformer sequence-to-sequence model is trained on various speech processing tasks, including multilingual speech recognition, speech translation, spoken language identification, and voice activity detection. These tasks are jointly represented...
    Downloads: 29 This Week
  • 2
    OpenVINO

    OpenVINO™ Toolkit repository

    OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference. Boost deep learning performance in computer vision, automatic speech recognition, natural language processing and other common tasks. Use models trained with popular frameworks like TensorFlow, PyTorch and more. Reduce resource demands and efficiently deploy on a range of Intel® platforms from edge to cloud. This open-source version includes several components: namely Model Optimizer, OpenVINO™ Runtime, Post-Training...
    Downloads: 39 This Week
  • 3
    SPTK is a suite of speech signal processing tools for UNIX environments, e.g., LPC analysis, PARCOR analysis, LSP analysis, PARCOR synthesis filter, LSP synthesis filter, vector quantization techniques, and other extended versions of them.
    Downloads: 12 This Week
  • 4
    Alan AI

    In-App assistant SDK to build a multimodal conversational UX for websites

    ...-backend powered by the industry’s best Automatic Speech Recognition (ASR), Natural Language Understanding (NLU) and Speech Synthesis. The Alan Cloud provisions and handles the infrastructure required to maintain your voice deployments and perform all the voice processing tasks. To voice-enable your app, you only need to get the Alan Client SDK and drop it into your app. There is no need to plan for, deploy, or maintain any infrastructure or speech components: the Alan Platform does the bulk of the work.
    Downloads: 9 This Week
  • 5
    spaCy

    Industrial-strength Natural Language Processing (NLP)

    spaCy is a library built on the very latest research for advanced Natural Language Processing (NLP) in Python and Cython. From its inception it has been designed for real-world applications: building real products and gathering real insights. It comes with pretrained statistical models and word vectors, convolutional neural network models, easy deep learning integration and so much more. spaCy is the fastest syntactic parser in the world according to independent benchmarks...
    Downloads: 3 This Week
  • 6
    NVIDIA NeMo

    Toolkit for conversational AI

    NVIDIA NeMo, part of the NVIDIA AI platform, is a toolkit for building new state-of-the-art conversational AI models. NeMo has separate collections for Automatic Speech Recognition (ASR), Natural Language Processing (NLP), and Text-to-Speech (TTS) models. Each collection consists of prebuilt modules that include everything needed to train on your data. Every module can easily be customized, extended, and composed to create new conversational AI model architectures. Conversational AI...
    Downloads: 1 This Week
  • 7
    HanLP

    Han Language Processing

    HanLP is a multilingual Natural Language Processing (NLP) library composed of a series of models and algorithms. Built on TensorFlow 2.0, it was designed to advance state-of-the-art deep learning techniques and popularize the application of natural language processing in both academia and industry. HanLP is capable of lexical analysis (Chinese word segmentation, part-of-speech tagging, named entity recognition), syntax analysis, text classification, and sentiment analysis. It comes...
    Downloads: 1 This Week
  • 8
    Stanford CoreNLP

    Stanford CoreNLP, a Java suite of core NLP tools

    CoreNLP is your one-stop shop for natural language processing in Java! CoreNLP enables users to derive linguistic annotations for text, including token and sentence boundaries, parts of speech, named entities, numeric and time values, dependency and constituency parses, coreference, sentiment, quote attributions, and relations. CoreNLP currently supports 6 languages: Arabic, Chinese, English, French, German, and Spanish. The centerpiece of CoreNLP is the pipeline. Pipelines take in raw text...
    Downloads: 0 This Week
  • 9
    Diffgram

    Training data (data labeling, annotation, workflow) for all data types

    ... the activities of annotation, which produces structured data ready to be consumed by a machine learning model. Annotation is required because raw media is considered to be unstructured and not usable without it. That’s why training data is required for many modern machine learning use cases, including computer vision, natural language processing and speech recognition.
    Downloads: 0 This Week
  • 10

    speech intonator

    The purpose of the project is to develop audio processing algorithms

    The initial version of the main branch of the project has been completed. The main name of the project is "Java audio mixer Summaha". The second name of the project is "Sound Arithmometer". Its main purpose is the production of musical sound remixes from a set of samples. The name "Summaha" rhymes well with 'Yamaha' and creates motivation and inspiration to achieve a sound quality comparable to that of a well-known brand. Detailed documentation in 'read' signature files. Anyone who is interested in this...
    Downloads: 0 This Week
  • 11
    Alan AI for Android

    Assistant SDK to build a multimodal conversational UX for Android

    ...-backend powered by the industry’s best Automatic Speech Recognition (ASR), Natural Language Understanding (NLU) and Speech Synthesis. The Alan Cloud provisions and handles the infrastructure required to maintain your voice deployments and perform all the voice processing tasks. To voice-enable your app, you only need to get the Alan Client SDK and drop it into your app. There is no need to plan for, deploy, or maintain any infrastructure or speech components: the Alan Platform does the bulk of the work.
    Downloads: 0 This Week
  • 12
    Alan AI for iOS

    In-App assistant SDK to build a multimodal conversational UX for iOS

    ...-backend powered by the industry’s best Automatic Speech Recognition (ASR), Natural Language Understanding (NLU) and Speech Synthesis. The Alan Cloud provisions and handles the infrastructure required to maintain your voice deployments and perform all the voice processing tasks. To voice-enable your app, you only need to get the Alan Client SDK and drop it into your app. There is no need to plan for, deploy, or maintain any infrastructure or speech components: the Alan Platform does the bulk of the work.
    Downloads: 0 This Week
  • 13
    Recorder

    HTML5 JavaScript audio recording in mp3, wav, ogg, webm, and amr formats

    ... of browser (including PWA, WebClip, any App) on low-version iOS (11.0-14.2) except Safari inside page). Supports multiple plug-in functions: rich audio visualization, variable speed and pitch processing, speech recognition, audio stream playback, and more. With powerful real-time processing support, it can be used in a variety of web applications, from simple recording to complex real-time speech recognition (ASR) and even audio-related games.
    Downloads: 1 This Week
  • 14
    TorchAudio

    Data manipulation and transformation for audio signal processing

    The aim of torchaudio is to apply PyTorch to the audio domain. By supporting PyTorch, torchaudio follows the same philosophy of providing strong GPU acceleration, having a focus on trainable features through the autograd system, and having consistent style (tensor names and dimension names). Therefore, it is primarily a machine learning library and not a general signal processing library. The benefits of PyTorch can be seen in torchaudio through having all the computations be through PyTorch...
    Downloads: 0 This Week
  • 15
    flair

    A very simple framework for state-of-the-art NLP

    A very simple framework for state-of-the-art NLP. Developed by Humboldt University of Berlin and friends. A powerful NLP library. Flair allows you to apply our state-of-the-art natural language processing (NLP) models to your text, such as named entity recognition (NER), sentiment analysis, part-of-speech tagging (PoS), special support for biomedical texts, sense disambiguation and classification, with support for a rapidly growing number of languages. A text embedding library. Flair has simple...
    Downloads: 0 This Week
  • 16
    Stanza

    Stanford NLP Python library for many human languages

    ... of speech and morphological features, to give a syntactic structure dependency parse, and to recognize named entities. The toolkit is designed to be parallel among more than 70 languages, using the Universal Dependencies formalism. Stanza is built with highly accurate neural network components that also enable efficient training and evaluation with your own annotated data.
    Downloads: 1 This Week
  • 17
    amazon-connect-wisdomjs

    Gives you the power to build your own Wisdom widget

    ... Salesforce and ServiceNow, as well as internal wikis, FAQ stores, and file shares. With Wisdom, agents can search across connected repositories to find answers and quickly resolve customer issues. In addition, Wisdom uses real-time speech analytics and natural language processing (NLP) from Contact Lens for Amazon Connect to detect customer issues during calls, and then provide agents with recommendations and answers. Wisdom provides faster issue resolution and improved customer satisfaction.
    Downloads: 0 This Week
  • 18
    ModelScope

    Bring the notion of Model-as-a-Service to life

    ... unified experience to explore state-of-the-art models spanning across domains such as CV, NLP, Speech, Multi-Modality, and Scientific-computation. Model contributors of different areas can integrate models into the ModelScope ecosystem through the layered APIs, allowing easy and unified access to their models. Once integrated, model inference, fine-tuning, and evaluations can be done with only a few lines of code.
    Downloads: 0 This Week
  • 19
    gse

    Go efficient multilingual NLP and text segmentation

    Efficient multilingual NLP and text segmentation in Go; supports English, Chinese, Japanese and others. Gse is a Go implementation of jieba that also aims to add NLP support and more features. It supports multiple word segmentation modes: common, search engine, full, precise, and HMM. It supports user and embedded dictionaries, part-of-speech (POS) tagging, segment info analysis, and stop-word and trim-word handling. Supports Traditional Chinese. Supports HMM text cutting using...
    Downloads: 0 This Week
  • 20
    spaCy models

    Models for the spaCy Natural Language Processing (NLP) library

    spaCy is designed to help you do real work, to build real products, or gather real insights. The library respects your time, and tries to avoid wasting it. It's easy to install, and its API is simple and productive. spaCy excels at large-scale information extraction tasks. It's written from the ground up in carefully memory-managed Cython. If your application needs to process entire web dumps, spaCy is the library you want to be using. Since its release in 2015, spaCy has become an industry...
    Downloads: 0 This Week
  • 21
    MaryTTS

    An open-source, multilingual text-to-speech synthesis system

    MaryTTS is an open-source, multilingual Text-to-Speech Synthesis platform written in Java. It was originally developed as a collaborative project of DFKI’s Language Technology Lab and the Institute of Phonetics at Saarland University. It is now maintained by the Multimodal Speech Processing Group in the Cluster of Excellence MMCI and DFKI. As of version 5.2, MaryTTS supports German, British and American English, French, Italian, Luxembourgish, Russian, Swedish, Telugu, and Turkish; more...
    Downloads: 7 This Week
  • 22
    VideoSrt

    Windows-GUI

    ... to generate subtitle files (supports Chinese-English translation and bilingual subtitles). Extracts speech text from video/audio. Supports batch translation, filtering, and encoding of SRT subtitle files. It uses the Alibaba Cloud speech recognition interface, so accuracy is high: the recognition rate for standard Mandarin/English is over 95%. Video recognition does not require uploading the original video, which is convenient, fast and time-saving.
    Downloads: 17 This Week
  • 23
    Alan AI for Flutter

    SDK to build a multimodal conversational UX for Flutter apps

    ...-backend powered by the industry’s best Automatic Speech Recognition (ASR), Natural Language Understanding (NLU) and Speech Synthesis. The Alan Cloud provisions and handles the infrastructure required to maintain your voice deployments and perform all the voice processing tasks. To voice-enable your app, you only need to get the Alan Client SDK and drop it into your app. There is no need to plan for, deploy, or maintain any infrastructure or speech components: the Alan Platform does the bulk of the work.
    Downloads: 0 This Week
  • 24
    Alan AI for Cordova

    Assistant SDK to build a multimodal conversational UX for Apache

    ...-backend powered by the industry’s best Automatic Speech Recognition (ASR), Natural Language Understanding (NLU) and Speech Synthesis. The Alan Cloud provisions and handles the infrastructure required to maintain your voice deployments and perform all the voice processing tasks. To voice-enable your app, you only need to get the Alan Client SDK and drop it into your app. There is no need to plan for, deploy, or maintain any infrastructure or speech components: the Alan Platform does the bulk of the work.
    Downloads: 0 This Week
  • 25
    Alan AI for React Native

    Build a multimodal conversational UX for apps created with React

    ...-backend powered by the industry’s best Automatic Speech Recognition (ASR), Natural Language Understanding (NLU) and Speech Synthesis. The Alan Cloud provisions and handles the infrastructure required to maintain your voice deployments and perform all the voice processing tasks. To voice-enable your app, you only need to get the Alan Client SDK and drop it into your app. There is no need to plan for, deploy, or maintain any infrastructure or speech components: the Alan Platform does the bulk of the work.
    Downloads: 0 This Week