Showing 48 open source projects for "learning"

  • 1
    Audiomentations

    A Python library for audio data augmentation

    A Python library for audio data augmentation. Inspired by albumentations. Useful for deep learning. Runs on CPU. Supports mono audio and multichannel audio. Can be integrated in training pipelines in e.g. TensorFlow/Keras or PyTorch. Has helped people get world-class results in Kaggle competitions. Is used by companies making next-generation audio products. Mix in another sound, e.g. a background noise. Useful if your original sound is clean and you want to simulate an environment where background noise is present. ...
    Downloads: 0 This Week
  • 2
    AudioCraft

    AudioCraft is a library for audio processing and generation

    AudioCraft is a PyTorch library for text-to-audio and text-to-music generation, packaging research models and tooling for training and inference. It includes MusicGen for music generation conditioned on text (and optionally melody) and AudioGen for text-conditioned sound effects and environmental audio. Both models operate over discrete audio tokens produced by a neural codec (EnCodec), which acts like a tokenizer for waveforms and enables efficient sequence modeling. The repo provides...
    Downloads: 14 This Week
  • 3
    Librosa

    Python library for audio and music analysis

    ...Built on top of NumPy, SciPy, and matplotlib, it provides a wide range of tools for feature extraction, time-series manipulation, audio display, and music information retrieval. Whether you're building machine learning models for audio classification or visualizing spectrograms, Librosa is a go-to library for researchers and developers working in audio signal processing.
    Downloads: 2 This Week
  • 4
    Speakr

    Speakr is a personal, self-hosted web application

    ...It provides a clean, user-friendly interface where users can input text, choose a voice style or language, and immediately hear the output, making it ideal for accessibility, content creation, and learning applications. Behind the scenes, Speakr leverages modern TTS engines and streaming audio technologies to deliver smooth and responsive speech generation without noticeable delay. The project is built with extensibility in mind, enabling developers to add custom voices, integrate additional languages, and tailor the backend for different hardware or cloud environments. ...
    Downloads: 0 This Week
  • 5
    VCClient

    Software that uses AI to perform real-time voice conversion

    VCClient is a real-time voice conversion system that uses machine learning models to transform a speaker’s voice into another voice with minimal latency. It is designed for live applications such as streaming, gaming, and virtual communication, where immediate feedback is essential. The system supports multiple voice conversion models, including RVC and other neural network-based approaches, allowing users to switch between different voices or customize their output.
    Downloads: 15 This Week
  • 6
    EarQuiz Frequencies

    Software for technical ear training on equalization

    ...This application is based on (and deeply inspired by) the world-renowned Golden Ears method of David Moulton, whose course is half dedicated to building this essential critical listening skill. The overall training process involves ongoing learning and self-testing. In Learn mode, you listen to excerpts of pink noise or music (or other external audio) while a 1-octave or 1/3-octave graphic EQ is switched off and on, boosting or cutting frequency bands within certain spectral ranges. Then, in Test mode, you are given a sequence of 10 similar examples, try to guess which frequencies are boosted or cut, and get scored. ...
    Downloads: 4 This Week
  • 7
    LAME (Lame Aint an MP3 Encoder)

    A high quality MP3 encoder

    LAME is an educational tool to be used for learning about MP3 encoding. The goal of the LAME project is to improve the psychoacoustics, quality, and speed of MP3 encoding. Note: we provide source code only!
    Downloads: 21,627 This Week
  • 8

    NAM-Runner

    Batch file to install and run NAM (neural-amp-modeler) easily.

    A Windows 10 batch file that installs the NAM model trainer (neural-amp-modeler) by Steven Atkinson and launches it straight into the GUI application. Fully automated. Custom one-time installation of everything you need to train neural network models of guitar amps and more for the NAM VST plugin, no Conda required. Runs as a launcher afterwards. Portable installation. The new PyTorch includes the CUDA runtime for fast Nvidia GPU support. No command line, Python, or Conda knowledge needed! Just double-click.
    Downloads: 2 This Week
  • 9

    audioFlux

    A library for audio and music analysis, feature extraction.

    audioFlux is a deep learning tool library for audio and music analysis and feature extraction. It supports dozens of time-frequency analysis transformation methods and hundreds of corresponding time-domain and frequency-domain feature combinations. Its output can be fed to deep learning networks for training, and it is used to study various tasks in the audio field, such as classification, separation, music information retrieval (MIR), and ASR.
    Downloads: 0 This Week
  • 10
    EnCodec

    State-of-the-art deep learning based audio codec

    EnCodec is a neural audio codec developed by Meta for high-fidelity, low-bitrate audio compression using end-to-end deep learning. Unlike traditional codecs (like MP3 or Opus), EnCodec uses a learned quantizer and decoder to reconstruct complex waveforms with remarkable accuracy at bitrates as low as 1.5 kbps. It employs a convolutional encoder–decoder architecture trained with perceptual loss functions that optimize for human auditory quality rather than raw waveform distance. ...
    Downloads: 0 This Week
  • 11
    Coqui STT

    The deep learning toolkit for speech-to-text

    Coqui STT is a fast, open-source, multi-platform, deep-learning toolkit for training and deploying speech-to-text models. Coqui STT is battle-tested in both production and research. It can return multiple possible transcripts, each with an associated confidence score. Experience the immediacy of script-to-performance. With Coqui text-to-speech, production times go from months to minutes. With Coqui, the post is a pleasure.
    Downloads: 1 This Week
  • 12

    DuranDuranbot

    Teachable/trainable artificially intelligent music bot

    A teachable/trainable artificially intelligent music bot fundamentally inspired by how the new wave band Duran Duran composes music. This program uses many algorithmic/AI techniques and processes, including machine learning, which allow you to teach/train it to compose music that you prefer. The technique that forms the foundation of DuranDuranbot's design, directly inspired by how Duran Duran writes music, is called "bit by bit circular composition"; its explanation can be found here - https://scsynth.org/t/bit-by-bit-circular-composition/1107 This program is written in the SuperCollider programming language - https://en.wikipedia.org/wiki/SuperCollider Contact - ken_brant@ymail.com
    Downloads: 0 This Week
  • 13
    Winds

    A Beautiful Open Source RSS & Podcast App Powered by Getstream.io

    ...For Winds, the follow suggestions and the list of articles from the feeds you follow are powered by Stream. Stream accounts are free for up to 3 million feed updates and handle personalization (machine learning) for up to 100 users.
    Downloads: 2 This Week
  • 14
    DeepSpeech

    Open source embedded speech-to-text engine

    DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers. DeepSpeech is an open-source Speech-To-Text engine, using a model trained by machine learning techniques based on Baidu's Deep Speech research paper. Project DeepSpeech uses Google's TensorFlow to make the implementation easier. A pre-trained English model is available for use and can be downloaded following the instructions in the usage docs. If you want to use the pre-trained English model for performing speech-to-text, you can download it (along with other important inference material) from the DeepSpeech releases page.
    Downloads: 2 This Week
  • 15
    XZVoice

    Free and open source text-to-speech software

    ...Technically, multi-level rhythmic pauses are taken into account to achieve a natural synthesis rhythm, and acoustic and linguistic parameters are used together to build multiple automatic prediction models based on deep learning. Because the pronunciation model is trained on massive audio data, the synthetic voice is realistic, full, cadenced, and expressive, and its MOS score has reached the professional level in the industry.
    Downloads: 0 This Week
  • 16
    TTS

    Deep learning for text to speech

    TTS is a library for advanced Text-to-Speech generation. Built on the latest research, it was designed to achieve the best trade-off among ease of training, speed, and quality. TTS comes with pre-trained models, tools for measuring dataset quality, and is already used in 20+ languages for products and research projects. Released models in PyTorch, TensorFlow and TFLite. Tools to curate Text2Speech datasets under dataset_analysis. Demo server for model testing. Notebooks for extensive model...
    Downloads: 5 This Week
  • 17
    X32 Scene Parser

    An X32 scene management tool

    This parsing tool can be used to extract sections of a Behringer X32 or Midas M32 scene file in order to create specialized snippets.
    Downloads: 5 This Week
  • 18
    X-Air Scene Parser

    An X-Air scene management tool

    The X-Air/M-Air does not include snippets (as the X32 does). This parsing tool is a port of the X32 Scene Parser that can be used to create modified scene files that function like snippets.
    Downloads: 0 This Week
  • 19
    Piano Booster

    Boost your Piano playing skills

    A MIDI file player that teaches you how to play the piano. PianoBooster is a fun way of playing along with a musical accompaniment and at the same time learning the basics of reading musical notation. See: https://www.pianobooster.org/
    Downloads: 112 This Week
  • 20
    OpenOffice.org Export As DAISY
    odt2daisy is an OpenOffice.org Writer extension that enables export to DAISY XML, Full DAISY (XML + audio), and Audiobook formats. DAISY is a NISO Z39.86 standard for blind, visually impaired, print-disabled, and learning-disabled people.
    Downloads: 3 This Week
  • 21

    FastoCloud PRO

    IPTV/NVR/CCTV/Video cloud https://fastocloud.com

    IPTV/Video cloud. Features: Cross-platform (Linux, MacOSX, FreeBSD, Raspbian/Armbian); GPU/CPU Encode/Decode/Post Processing; Stream statistics; CCTV; Adaptive HLS streams; Load balancing; Temporary URLs; HLS push; EPG scanning; Subtitles to text conversions; Ad insertion; Logo overlay; Video effects; Relays; Timeshifts; Catchups; Playlists; Restream/Transcode from online streaming services like YouTube, Twitch; Mosaic; Many Outputs; Physical Inputs; Streaming Protocols; File Formats; Presets; VODs/Series server-side support; Pay per view channels; Channels on demand; HTTP Live Streaming (HLS) server-side support; Public API, client server communication via JSON RPC; Protocol gzip compression; Deep learning video analysis. Supported deep learning frameworks: TensorFlow, NCSDK, Caffe. ML Hardware:
    Downloads: 0 This Week
  • 22
    jMIR

    Music research software

    ...It also includes tools for managing and profiling large music collections and for checking audio for production errors. jMIR includes software for extracting features, applying machine learning algorithms, applying heuristic error checkers, mining metadata and analyzing metadata.
    Downloads: 3 This Week
  • 23
    ILA - teachable voice assistant

    ILA is a fully customizable and teachable voice assistant for Java

    ILA stands for (kind of) intelligent, learning assistant and is a speech recognition system, aka voice assistant, very similar to Siri, Google Now and Cortana. ILA is fully customizable, and you can teach her/him/it new things yourself, like executing system commands, opening web pages, programs and apps, or just some basic conversation :-) ILA runs on Java and thus is compatible with Windows, Mac and Linux.
    Downloads: 0 This Week
  • 24
    ScoreDate

    ScoreDate is software for learning music reading and ear training

    ScoreDate is your date with music! It is open source software written in Java that helps musicians learn music reading. It also helps you with ear training. It is suitable for any skill level, from beginners to professionals, and from slow training to sight reading.
    Downloads: 1 This Week
  • 25

    Accelerated Feature Extraction Tool

    A fast GPU accelerated feature extraction software for speech analysis

    A fast feature extraction software tool for speech analysis and processing. It incorporates standard MFCC, PLP, and TRAPS features. The tool is specially designed to process very large audio data sets. It uses GPU acceleration if a compatible GPU is available (CUDA as well as OpenCL; NVIDIA, AMD, and Intel GPUs are supported). The CPU SSE intrinsic instruction set is used when no compatible GPU is present. The output files are stored in HTK format. The software is developed at Department of...
    Downloads: 0 This Week