deep learning toolbox free download

Audiomentations

A Python library for audio data augmentation

A Python library for audio data augmentation, inspired by albumentations and useful for deep learning. It runs on CPU, supports mono and multichannel audio, and can be integrated into training pipelines in, e.g., TensorFlow/Keras or PyTorch. It has helped people get world-class results in Kaggle competitions and is used by companies making next-generation audio products. One example transform mixes in another sound, such as background noise, which is useful if your original sound is clean and you want to simulate an environment where background noise is present. ...

Downloads: 0 This Week

Last Update: 2025-09-13
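
The background-noise mixing described in the Audiomentations entry can be sketched in plain Python as follows. This is a minimal illustration of the general technique (scaling noise to a target signal-to-noise ratio before adding it), not the library's own implementation or API; the function names `rms` and `mix_at_snr`, the test tone, and the 10 dB target are all assumptions made for the example.

```python
import math
import random

def rms(samples):
    """Root-mean-square level of a list of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def mix_at_snr(signal, noise, snr_db):
    """Add noise to signal, scaled so the signal-to-noise ratio equals snr_db."""
    # Loop (or trim) the noise so it covers the whole signal.
    noise = [noise[i % len(noise)] for i in range(len(signal))]
    # Noise RMS that yields the requested SNR relative to the signal RMS.
    target_noise_rms = rms(signal) / (10 ** (snr_db / 20))
    gain = target_noise_rms / rms(noise)
    return [s + gain * n for s, n in zip(signal, noise)]

# Hypothetical inputs: one second of a 440 Hz tone and half a second of
# white noise, both at an assumed 16 kHz sample rate.
rate = 16000
tone = [0.5 * math.sin(2 * math.pi * 440 * t / rate) for t in range(rate)]
rng = random.Random(0)
hiss = [rng.uniform(-1.0, 1.0) for _ in range(rate // 2)]

noisy = mix_at_snr(tone, hiss, snr_db=10.0)
```

A lower `snr_db` gives a noisier result. In an augmentation pipeline the SNR would typically be drawn at random per training example.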

AudioCraft

AudioCraft is a library for audio processing and generation

AudioCraft is a PyTorch library for text-to-audio and text-to-music generation, packaging research models and tooling for training and inference. It includes MusicGen for music generation conditioned on text (and optionally melody) and AudioGen for text-conditioned sound effects and environmental audio. Both models operate over discrete audio tokens produced by a neural codec (EnCodec), which acts like a tokenizer for waveforms and enables efficient sequence modeling. The repo provides...

Downloads: 4 This Week

Last Update: 2025-10-13

audioFlux

A library for audio and music analysis and feature extraction.

audioFlux is a deep learning tool library for audio and music analysis and feature extraction. It supports dozens of time-frequency analysis transforms and hundreds of corresponding time-domain and frequency-domain feature combinations. The extracted features can be fed to deep learning networks for training, and the library is used to study audio tasks such as classification, separation, music information retrieval (MIR), and ASR.

Downloads: 0 This Week

Last Update: 2023-03-22

EnCodec

State-of-the-art deep learning based audio codec

EnCodec is a neural audio codec developed by Meta for high-fidelity, low-bitrate audio compression using end-to-end deep learning. Unlike traditional codecs (such as MP3 or Opus), EnCodec uses a learned quantizer and decoder to reconstruct complex waveforms with remarkable accuracy at bitrates as low as 1.5 kbps. It employs a convolutional encoder–decoder architecture trained with perceptual loss functions that optimize for human auditory quality rather than raw waveform distance. ...

Downloads: 1 This Week

Last Update: 2025-10-12
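
As a rough worked example of where a figure like 1.5 kbps can come from, the sketch below computes the bitrate of a token-based codec from its frame rate, codebook size, and number of codebooks. The specific numbers (75 frames per second, 1024-entry codebooks, several stacked codebooks) are assumptions taken from what is reported for the 24 kHz EnCodec model, not from the listing above, and `codec_bitrate_kbps` is a hypothetical helper, not part of the EnCodec package.

```python
import math

def codec_bitrate_kbps(frame_rate_hz, codebook_size, num_codebooks):
    """Bitrate of a codec that emits num_codebooks indices per frame."""
    # Each index into a codebook of this size costs log2(size) bits.
    bits_per_index = math.log2(codebook_size)
    return frame_rate_hz * bits_per_index * num_codebooks / 1000.0

# Assumed configuration: 75 frames/s and 1024-entry codebooks (10 bits each).
# Using more codebooks per frame raises both the bitrate and the fidelity.
rates = {n: codec_bitrate_kbps(75, 1024, n) for n in (2, 4, 8)}
print(rates)
```

Under these assumptions, two codebooks per frame correspond to the lowest bitrate quoted in the description.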

Coqui STT

The deep learning toolkit for speech-to-text

Coqui STT is a fast, open-source, multi-platform, deep-learning toolkit for training and deploying speech-to-text models. Coqui STT is battle-tested in both production and research. It can return multiple possible transcripts, each with an associated confidence score. Experience the immediacy of script-to-performance. With Coqui text-to-speech, production times go from months to minutes. With Coqui, the post is a pleasure.

Downloads: 5 This Week

Last Update: 2022-09-03

DeepSpeech

Open source embedded speech-to-text engine

DeepSpeech is an open-source embedded (offline, on-device) speech-to-text engine that can run in real time on devices ranging from a Raspberry Pi 4 to high-power GPU servers. It uses a model trained by machine learning techniques based on Baidu's Deep Speech research paper, and the project uses Google's TensorFlow to make the implementation easier. A pre-trained English model is available; to use it for speech-to-text, download it (along with other important inference material) from the DeepSpeech releases page, following the instructions in the usage docs.

Downloads: 18 This Week

Last Update: 2021-04-08

XZVoice

Free and open source text-to-speech software

...Technically, multi-level rhythmic pauses are taken into account to give the synthesized speech a natural rhythm, and acoustic and linguistic parameters are used together to build multiple automatic prediction models based on deep learning. The pronunciation model is trained on massive audio data, so the synthesized voice sounds real, full, cadenced, and expressive, and its MOS score has reached the professional level in the industry.

Downloads: 0 This Week

Last Update: 2022-10-04

TTS

Deep learning for text to speech

TTS is a library for advanced Text-to-Speech generation. Built on the latest research, it was designed to achieve the best trade-off among ease of training, speed, and quality. TTS comes with pre-trained models and tools for measuring dataset quality, and is already used in 20+ languages for products and research projects. Released models in PyTorch, TensorFlow, and TFLite. Tools to curate Text2Speech datasets under dataset_analysis. Demo server for model testing. Notebooks for extensive model...

Downloads: 0 This Week

Last Update: 2021-10-18

FastoCloud PRO

IPTV/NVR/CCTV/Video cloud https://fastocloud.com

IPTV/Video cloud. Features: Cross-platform (Linux, MacOSX, FreeBSD, Raspbian/Armbian); GPU/CPU Encode/Decode/Post Processing; Stream statistics; CCTV; Adaptive HLS streams; Load balancing; Temporary URLs; HLS push; EPG scanning; Subtitles to text conversions; AD insertion; Logo overlay; Video effects; Relays; Timeshifts; Catchups; Playlists; Restream/Transcode from online streaming services like Youtube, Twitch; Mozaic; Many Outputs; Physical Inputs; Streaming Protocols; File Formats; Presets; Vods/Series server-side support; Pay per view channels; Channels on demand; HTTP Live Streaming (HLS) server-side support; Public API, client server communication via JSON RPC Protocol gzip compression; Deep learning video analysis. Supported deep learning frameworks: Tensorflow, NCSDK, Caffe. ML Hardware:

Downloads: 1 This Week

Last Update: 2020-06-20

RBM-provisor

An experimental unsupervised learning method for improvising jazz melodies, based on restricted Boltzmann machines (RBMs) layered into Deep Belief Networks (forms of neural networks).

Downloads: 0 This Week

Last Update: 2014-08-03

Search Results for "deep learning toolbox"

Showing 10 open source projects for "deep learning toolbox"
