Join/Login
Open Source Software
Business Software
For Vendors
Blog
About
More

For Vendors Help Create Join Login

Open Source Software

Business Software

SourceForge Podcast

Resources

Articles
Case Studies
Blog

Menu

Help
Create
Join
Login

Home
Browse Open Source
Artificial Intelligence
Speech Recognition Software
Search Results

Search Results for "speech processing"

x

Sort By:

Relevance

Clear All Filters

OS

Linux 17
Windows 17
Mac 14
More...
BSD 5
ChromeOS 3
Desktop Operating Systems 1

Category

Artificial Intelligence 18
Software Development 5
Multimedia 4
Communications 2
Business 1
Education 1
Scientific/Engineering 1
Text Editors 1

License

OSI-Approved Open Source 10
Public Domain 1

Translations

English 1

Programming Language

Python 8
Java 4
C++ 3
C# 1
More...
Go 1
JSP 1
Perl 1

Status

Beta 2
Production/Stable 2
Alpha 1

Showing 18 open source projects for "speech processing"

View related business solutions

Speech Recognition Clear Filters & Widen Search

Bright Data - All in One Platform for Proxies and Web Scraping
Say goodbye to blocks, restrictions, and CAPTCHAs

Bright Data offers the highest quality proxies with automated session management, IP rotation, and advanced web unlocking technology. Enjoy reliable, fast performance with easy integration, a user-friendly dashboard, and enterprise-grade scaling. Powered by ethically-sourced residential IPs for seamless web scraping.

Get Started
Free CRM Software With Something for Everyone
216,000+ customers in over 135 countries grow their businesses with HubSpot

Think CRM software is just about contact management? Think again. HubSpot CRM has free tools for everyone on your team, and it’s 100% free. Here’s how our free CRM solution makes your job easier.

Get free CRM
1

Whisper

Robust Speech Recognition via Large-Scale Weak Supervision

Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification. A Transformer sequence-to-sequence model is trained on various speech processing tasks, including multilingual speech recognition, speech translation, spoken language identification, and voice activity detection. These tasks are jointly represented...

Downloads: 88 This Week

Last Update: 2024-09-30
See Project
2

OpenVINO

OpenVINO™ Toolkit repository

OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference. Boost deep learning performance in computer vision, automatic speech recognition, natural language processing and other common tasks. Use models trained with popular frameworks like TensorFlow, PyTorch and more. Reduce resource demands and efficiently deploy on a range of Intel® platforms from edge to cloud. This open-source version includes several components: namely Model Optimizer, OpenVINO™ Runtime, Post-Training...

Downloads: 22 This Week

Last Update: 2024-09-05
See Project
3

NVIDIA NeMo

Toolkit for conversational AI

NVIDIA NeMo, part of the NVIDIA AI platform, is a toolkit for building new state-of-the-art conversational AI models. NeMo has separate collections for Automatic Speech Recognition (ASR), Natural Language Processing (NLP), and Text-to-Speech (TTS) models. Each collection consists of prebuilt modules that include everything needed to train on your data. Every module can easily be customized, extended, and composed to create new conversational AI model architectures. Conversational AI...

Downloads: 1 This Week

Last Update: 2024-10-29
See Project
4

Diffgram

Training data (data labeling, annotation, workflow) for all data types

... the activities of annotation, which produces structured data; ready to be consumed by a machine learning model. Annotation is required because raw media is considered to be unstructured and not usable without it. That’s why training data is required for many modern machine learning use cases including computer vision, natural language processing and speech recognition.

Downloads: 0 This Week

Last Update: 2024-10-14
See Project
Save hundreds of developer hours with components built for SaaS applications.
The #1 Embedded Analytics Solution for SaaS Teams.

Whether you want full self-service analytics or simpler multi-tenant security, Qrvey’s embeddable components and scalable data management remove the guess work.

Try Developer Playground
5

VideoSrt

Windows-GUI

... to generate subtitle files (support Chinese-English translation, bilingual subtitles) Extract speech text from video/audio. Batch translation, filter processing/encoding SRT subtitle files. Using the Alibaba Cloud speech recognition interface, the accuracy is high, and the standard Mandarin/English recognition rate is over 95%. Video recognition does not need to upload the original video, which is convenient, fast and time-saving.

Downloads: 10 This Week

Last Update: 2023-01-13
See Project
6

DeepLearning

Deep Learning (Flower Book) mathematical derivation

... feedforward networks, regularization, optimization algorithms, convolutional networks, sequence modeling and practical methods, and investigates topics such as natural language processing, Applications in speech recognition, computer vision, online recommender systems, bioinformatics, and video games. Finally, the Deep Learning book provides research directions covering theoretical topics including linear factor models, autoencoders, representation learning, structured probabilistic models, etc.

Downloads: 1 This Week

Last Update: 2022-08-02
See Project
7

Deep Learning Drizzle

Drench yourself in Deep Learning, Reinforcement Learning

Drench yourself in Deep Learning, Reinforcement Learning, Machine Learning, Computer Vision, and NLP by learning from these exciting lectures! Optimization courses which form the foundation for ML, DL, RL. Computer Vision courses which are DL & ML heavy. Speech recognition courses which are DL heavy. Structured Courses on Geometric, Graph Neural Networks. Section on Autonomous Vehicles. Section on Computer Graphics with ML/DL focus.

Downloads: 0 This Week

Last Update: 2022-07-29
See Project
8

Tensorpack

A Neural Net Training Interface on TensorFlow, with focus on speed

... the data processing flexibility needed in research. Tensorpack squeezes the most performance out of pure Python with various auto parallelization strategies. There are too many symbolic function wrappers already. Tensorpack includes only a few common layers. You can use any TF symbolic functions inside Tensorpack.

Downloads: 0 This Week

Last Update: 2022-08-01
See Project
9

Distant Speech Recognition

Beamforming and Speech Recognition Toolkit

BTK contains C++ and Python libraries that implement speech processing and microphone array techniques such as speech feature extraction, speech enhancement, speaker tracking, beamforming, dereverberation and echo cancellation algorithms. The Millennium ASR provides C++ and python libraries for automatic speech recognition. The Millennium ASR implements a weighted finite state transducer (WFST) decoder, training and adaptation methods. These toolkits are meant for facilitating research...

Downloads: 0 This Week

Last Update: 2019-08-21
See Project
Red Hat Ansible Automation Platform on Microsoft Azure
Red Hat Ansible Automation Platform on Azure allows you to quickly deploy, automate, and manage resources securely and at scale.

Deploy Red Hat Ansible Automation Platform on Microsoft Azure for a strategic automation solution that allows you to orchestrate, govern and operationalize your Azure environment.

Learn More
10

Speechalyzer

Process large speech data wrt transcription, labeling and annotation

Speechalyzer: a tool for the daily work of a 'speech worker' It is optimized to process large speech data sets with respect to transcription, labeling and annotation. It is implemented as a client server based framework in Java and interfaces software for speech recognition, synthesis, speech classification and quality evaluation. The application is mainly the processing of training data for speech recognition and classification models and performing benchmarking tests on speech-to-text...

Downloads: 0 This Week

Last Update: 2016-04-27
See Project
11

Awesome Recurrent Neural Networks

A curated list of resources dedicated to RNN

... in tensorflow, and much more. Codes, theory, applications, and datasets about natural language processing, robotics, computer vision, and much more.

Downloads: 0 This Week

Last Update: 2021-09-22
See Project
12

jaivox

Speech recognition application builder and library

Java library and tools to create open source speech recognition applications. Generates dialogs for conversational interfaces. Works with a popular open source speech recognition library.

Downloads: 0 This Week

Last Update: 2015-03-26
See Project
13

InproTK

An Incremental Spoken Dialogue Processing Toolkit

InproTK is an Incremental Spoken Dialogue Processing Toolkit, that is, a toolkit to help you build dialogue systems that listen and talk incrementally, allowing for advanced interactional behaviour. Please see our Wiki for more information: http://sourceforge.net/p/inprotk/wiki/

Downloads: 0 This Week

Last Update: 2015-06-16
See Project
14

Open Pandora's Box

Pandora is an artificial intelligent web based bot

Pandora is an artificial intelligent web based bot written in Java. Pandora is a component based AI architecture including, database memory, XML, voice, voice rec, chat, IRC, HTTP, Wiktionary, Freebase, consciousness, language, GUI, applet, web, jsp, Android

1 Review

Downloads: 0 This Week

Last Update: 2013-11-20
See Project
15

Transcription Aid

Transcription Aid helps you type text from recordings.

This software is to help type in text from speech recordings. It has several functions proven to help this type of work. However it is fully manual (aside from auto-completion), so no speech recognition if you are looking for that, but it is a great tool to do the job.

1 Review

Downloads: 1 This Week

Last Update: 2015-07-30
See Project
16

Arabic Phonetic Platform using VoiceXML

This project'll be the core engine of many voice based platforms,which can be implemented into your projects,websites...etc to provide an Arabic speech service, where your servers can interact with the clients through Arabic Speech Recognition.

Downloads: 0 This Week

Last Update: 2013-04-01
See Project
17

MRCP4J

The MRCPv2 protocol is designed to allow client devices to control media processing resources, such as speech recognition engines. MRCP4J provides a Java API that encapsulates the MRCPv2 protocol and can be used to implement MRCP clients and/or servers.

Downloads: 0 This Week

Last Update: 2013-04-25
See Project
18

Open Mind Speech and Overflow

The Open Mind Speech project is part of the Open Mind Initiative and aims to develop free(GPL) speech recognition and signal processing (DSP) tools and applications, as well as collect speech data from "e-citizens" using the Internet.

1 Review

Downloads: 2 This Week

Last Update: 2017-03-22
See Project

Previous
You're on page 1
Next

Related Searches

transcribe audio to srt

convert audio to srt

video to srt

whisper

speech recognition project

text-to-speech

text to speech

hindi text to speech

arabic text to speech

arabic speech recognition

Related Categories

Artificial Intelligence

Software Development

Multimedia

Communications

Business

SourceForge

Create a Project
Open Source Software
Business Software
Top Downloaded Projects

Company

About
Team
SourceForge Headquarters
225 Broadway Suite 1600
San Diego, CA 92101
+1 (858) 454-5900

Resources

Support
Site Documentation
Site Status
SourceForge Reviews

© 2024 Slashdot Media. All Rights Reserved.

Terms Privacy Opt Out Advertise

Thanks for helping keep SourceForge clean.

X

You seem to have CSS turned off. Please don't fill out this field.

You seem to have CSS turned off. Please don't fill out this field.

Briefly describe the problem (required):

Upload screenshot of ad (required):

Select a file, or drag & drop file here.

✔

✘

Screenshot instructions:

Click URL instructions:
Right-click on the ad, choose "Copy Link", then paste here →
(This may not be possible with some types of ads)

More information about our ad policies

Ad destination/click URL: