Showing 19 open source projects for "processing"

  • 1
    Whisper

    Robust Speech Recognition via Large-Scale Weak Supervision

    ...It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification. A Transformer sequence-to-sequence model is trained on various speech processing tasks, including multilingual speech recognition, speech translation, spoken language identification, and voice activity detection. These tasks are jointly represented as a sequence of tokens to be predicted by the decoder, allowing a single model to replace many stages of a traditional speech-processing pipeline. The multitask training format uses a set of special tokens that serve as task specifiers or classification targets.
    Downloads: 71 This Week
  • 2
    OpenVINO

    OpenVINO™ Toolkit repository

    OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference. Boost deep learning performance in computer vision, automatic speech recognition, natural language processing and other common tasks. Use models trained with popular frameworks like TensorFlow, PyTorch and more. Reduce resource demands and efficiently deploy on a range of Intel® platforms from edge to cloud. This open-source version includes several components: namely Model Optimizer, OpenVINO™ Runtime, Post-Training Optimization Tool, as well as CPU, GPU, MYRIAD, multi device and heterogeneous plugins to accelerate deep learning inferencing on Intel® CPUs and Intel® Processor Graphics. ...
    Downloads: 35 This Week
  • 3
    NVIDIA NeMo

    Toolkit for conversational AI

    ...Supported models: Jasper, QuartzNet, CitriNet, Conformer-CTC, Conformer-Transducer, Squeezeformer-CTC, Squeezeformer-Transducer, ContextNet, LSTM-Transducer (RNNT), LSTM-CTC. NGC collection of pre-trained speech processing models.
    Downloads: 2 This Week
  • 4
    Diffgram

    Training data (data labeling, annotation, workflow) for all data types

    ...Annotation is required because raw media is considered to be unstructured and not usable without it. That’s why training data is required for many modern machine learning use cases including computer vision, natural language processing and speech recognition.
    Downloads: 3 This Week
  • 5
    Voxal voice changer

    Transform your voice in real time with Voxal Voice Changer

    Voxal Voice Changer is a program that allows you to modify your voice by applying various effects (e.g. pitch change, echo, etc.) in real-time. Effects can be added in any sequence and in any combination, allowing you to distort your voice beyond recognition. Take your audio to the next level! Our powerful Voice Changer software lets you morph your voice in real-time with stunning AI-powered quality. Whether you're looking to have fun, protect your privacy, or create engaging content,...
    Downloads: 10 This Week
  • 6
    VideoSrt

    Windows-GUI

    ...Recognize speech in video/audio to generate subtitle files (supports Chinese-English translation and bilingual subtitles). Extract speech text from video/audio. Batch translation, filtering, and encoding of SRT subtitle files. It uses the Alibaba Cloud speech recognition interface, which gives high accuracy: the recognition rate for standard Mandarin/English is over 95%. Video recognition does not require uploading the original video, which makes it convenient and fast.
    Downloads: 17 This Week
  • 7
    DeepLearning

    Deep Learning (Flower Book) mathematical derivation

    ...At the same time, it also introduces deep learning techniques used by practitioners in the industry, including deep feedforward networks, regularization, optimization algorithms, convolutional networks, sequence modeling, and practical methods, and surveys applications such as natural language processing, speech recognition, computer vision, online recommender systems, bioinformatics, and video games. Finally, the Deep Learning book provides research directions covering theoretical topics including linear factor models, autoencoders, representation learning, structured probabilistic models, etc.
    Downloads: 0 This Week
  • 8
    Deep Learning Drizzle

    Drench yourself in Deep Learning, Reinforcement Learning

    Drench yourself in Deep Learning, Reinforcement Learning, Machine Learning, Computer Vision, and NLP by learning from these exciting lectures! Optimization courses which form the foundation for ML, DL, RL. Computer Vision courses which are DL & ML heavy. Speech recognition courses which are DL heavy. Structured Courses on Geometric, Graph Neural Networks. Section on Autonomous Vehicles. Section on Computer Graphics with ML/DL focus.
    Downloads: 0 This Week
  • 9
    Tensorpack

    A Neural Net Training Interface on TensorFlow, with focus on speed

    ...Scalable data-parallel multi-GPU / distributed training strategies are available off the shelf. Squeeze the best data loading performance out of Python with tensorpack.dataflow. Symbolic programming (e.g. tf.data) does not offer the data processing flexibility needed in research. Tensorpack squeezes the most performance out of pure Python with various auto parallelization strategies. There are too many symbolic function wrappers already. Tensorpack includes only a few common layers. You can use any TF symbolic functions inside Tensorpack.
    Downloads: 0 This Week
  • 10

    Distant Speech Recognition

    Beamforming and Speech Recognition Toolkit

    BTK contains C++ and Python libraries that implement speech processing and microphone array techniques such as speech feature extraction, speech enhancement, speaker tracking, beamforming, dereverberation, and echo cancellation algorithms. The Millennium ASR provides C++ and Python libraries for automatic speech recognition. It implements a weighted finite state transducer (WFST) decoder, as well as training and adaptation methods.
    Downloads: 0 This Week
  • 11
    Speechalyzer

    Process large speech datasets for transcription, labeling, and annotation

    ...It is implemented as a client-server framework in Java and interfaces with software for speech recognition, synthesis, speech classification, and quality evaluation. Its main applications are processing training data for speech recognition and classification models and running benchmark tests on speech-to-text, text-to-speech, and speech classification software systems.
    Downloads: 0 This Week
  • 12
    Awesome Recurrent Neural Networks

    A curated list of resources dedicated to RNN

    ...Provides a wide range of works and resources such as a Recurrent Neural Network Tutorial, a Sequence-to-Sequence Model Tutorial, Tutorials by nlintz, Notebook examples by aymericdamien, Scikit Flow (skflow) - Simplified Scikit-learn like Interface for TensorFlow, Keras (Tensorflow / Theano)-based modular deep learning library similar to Torch, char-rnn-tensorflow by sherjilozair, char-rnn in tensorflow, and much more. Codes, theory, applications, and datasets about natural language processing, robotics, computer vision, and much more.
    Downloads: 1 This Week
  • 13
    jaivox

    Speech recognition application builder and library

    Java library and tools to create open source speech recognition applications. Generates dialogs for conversational interfaces. Works with a popular open source speech recognition library.
    Downloads: 0 This Week
  • 14
    InproTK

    An Incremental Spoken Dialogue Processing Toolkit

    InproTK is an Incremental Spoken Dialogue Processing Toolkit, that is, a toolkit to help you build dialogue systems that listen and talk incrementally, allowing for advanced interactional behaviour. Please see our Wiki for more information: http://sourceforge.net/p/inprotk/wiki/
    Downloads: 0 This Week
  • 15
    Open Pandora's Box

    Pandora is an artificially intelligent web-based bot

    Pandora is an artificially intelligent web-based bot written in Java. Pandora is a component-based AI architecture including database memory, XML, voice, voice recognition, chat, IRC, HTTP, Wiktionary, Freebase, consciousness, language, GUI, applet, web, JSP, and Android components.
    Downloads: 0 This Week
  • 16
    Transcription Aid

    Transcription Aid helps you type text from recordings.

    This software helps you type text from speech recordings. It has several functions proven to help with this type of work. However, it is fully manual (aside from auto-completion), so it does not perform speech recognition, but it is a great tool for the job.
    Downloads: 0 This Week
  • 17
    This project is intended to be the core engine for many voice-based platforms. It can be integrated into your projects, websites, etc. to provide an Arabic speech service in which your servers interact with clients through Arabic speech recognition.
    Downloads: 0 This Week
  • 18
    The MRCPv2 protocol is designed to allow client devices to control media processing resources, such as speech recognition engines. MRCP4J provides a Java API that encapsulates the MRCPv2 protocol and can be used to implement MRCP clients and/or servers.
    Downloads: 0 This Week
  • 19
    The Open Mind Speech project is part of the Open Mind Initiative and aims to develop free (GPL) speech recognition and digital signal processing (DSP) tools and applications, as well as to collect speech data from "e-citizens" over the Internet.
    Downloads: 0 This Week