Showing 24 open source projects for "spectrogram"

  • 1

    Bert-VITS2

    VITS2 backbone with multilingual-bert

    ...The core idea is to use BERT-style contextual embeddings for text encoding while relying on a refined VITS2 architecture for acoustic generation and vocoding. The repository includes everything needed to train, fine-tune, and run the model, from configuration files to preprocessing scripts, spectrogram utilities, and training entrypoints for multi-GPU and multi-node setups. It provides emotional modeling through “emo embeddings,” allowing voices to be conditioned on different affective states during synthesis. Releases include optimizations for Japanese and English alignment, expanded training data, spec caching and pre-generation tools, as well as ONNX export for more lightweight inference deployments.
    Downloads: 0 This Week
  • 2

    pysoundanalyser

    a python program to generate, visualize, and manipulate short sounds

    pysoundanalyser is a Python application for generating, visualizing, and manipulating short sounds through a graphical user interface. Visualization functions cover the power spectrum, the spectrogram, the autocorrelation, and the autocorrelogram of a sound. Manipulation functions include filtering, concatenating, cutting, and scaling the level of a sound. Several types of sounds can also be generated, including pure tones, harmonic complex tones, noise of different colours, and frequency-modulated and amplitude-modulated tones.
    Downloads: 1 This Week
  • 3

    Demucs

    Code for the paper Hybrid Spectrogram and Waveform Source Separation

    Demucs (Deep Extractor for Music Sources) is a deep-learning framework for music source separation—extracting individual instrument or vocal tracks from a mixed audio file. The system is based on a U-Net-like convolutional architecture combined with recurrent and transformer elements to capture both short-term and long-term temporal structure. It processes raw waveforms directly rather than spectrograms, allowing for higher-quality reconstruction and fewer artifacts in separated tracks. The...
    Downloads: 60 This Week
  • 4

    DiffSinger

    Singing Voice Synthesis via Shallow Diffusion Mechanism

    ...The method introduces a “shallow diffusion” mechanism: instead of diffusing over many steps, generation begins at a shallow step determined adaptively, which leverages prior knowledge learned by a simple mel-spectrogram decoder and speeds up inference.
    Downloads: 80 This Week
  • 5

    WaveRNN

    WaveRNN Vocoder + TTS

    ...A quick_start.py script allows users to immediately synthesize example sentences from a pretrained model and inspect both generated audio and attention plots. For custom TTS, the project guides you through training Tacotron, forcing GTA spectrogram export when desired, training WaveRNN with or without GTA, and then running joint generation.
    Downloads: 0 This Week
  • 6

    TensorFlowTTS

    Real-Time State-of-the-art Speech Synthesis for Tensorflow 2

    ...The library supports multiple languages (English, French, Korean, Chinese, German, etc.) and is relatively easy to adapt to new languages. With integrated vocoder and mel-spectrogram generation pipelines, pre-trained models, and a fairly flexible architecture, TensorFlowTTS is an off-the-shelf, extensible TTS engine for applications ranging from voice assistants to content generation and accessibility tools.
    Downloads: 0 This Week
  • 7

    U-Net Fusion RFI

    U-Net for RFI Detection based on @jakeret's implementation

    See the original code here: https://github.com/jakeret/tf_unet This project is currently based on the TensorFlow 1.13 code base, and there are no plans to port it to TensorFlow 2. The primary improvements to this code base include a training and evaluation framework, along with a fusion-based approach to detection that combines a number of models (currently hard-coded to two trained models) with Sum Threshold as an additional "expert." Additional work is being done to add custom layers to...
    Downloads: 0 This Week
  • 8

    Transformer TTS

    Implementation of a Transformer based neural network

    TransformerTTS is an implementation of a non-autoregressive Transformer-based neural network for text-to-speech, built with TensorFlow 2. It takes inspiration from architectures like FastSpeech, FastSpeech 2, FastPitch, and Transformer TTS, and extends them with its own aligner and forward models. The system separates alignment learning and acoustic modeling: an autoregressive Transformer is used as an aligner to extract phoneme-to-frame durations, while a non-autoregressive...
    Downloads: 0 This Week
  • 9

    SigPack

    SigPack - A signal processing library using Armadillo

    SigPack is a C++ signal processing library built on the Armadillo library. The API will be familiar to those who have used IT++ and Octave/Matlab.
    Downloads: 1 This Week
  • 10

    fmplot

    Plots rtl_power output as a spectrogram

    Designed for finding HD FM stations in the US, which are about twice the normal width; it can be customized for other frequency ranges. Uses Gnuplot to plot to a .png file.
    Downloads: 0 This Week
  • 11

    Mod Direct Panoramic Spectrum Analyzer

    Mod Direct Panoramic Spectrum Analyzer

    ...Cyclic recording from the real-time input to a file, with subsequent playback from that file, has been added (double-click the left mouse button anywhere in the top spectrogram). The file size in MB is specified in the settings file (Cyclic file size=100).
    Downloads: 3 This Week
  • 12

    DC-TTS

    TensorFlow Implementation of DC-TTS: yet another text-to-speech model

    ...It follows the “Efficiently Trainable Text-to-Speech System Based on Deep Convolutional Networks with Guided Attention” paper, but the author adapts and extends the design to make it practical for real experiments. The model is split into two networks: Text2Mel, which maps text to mel-spectrograms, and SSRN (spectrogram super-resolution network), which converts low-resolution mel-spectrograms into high-resolution magnitude spectrograms suitable for waveform synthesis. Training scripts, data loaders, and hyperparameter configurations are provided to reproduce results on several datasets, including LJ Speech for English, a Korean single-speaker dataset, and audiobook data from Nick Offerman and Kate Winslet.
    Downloads: 1 This Week
  • 13
    SonasoundP is a follow-up to Niklas Werner's sonasound and aims at helping foreign language students with their speech drills. The program shows a real-time spectrogram. Our aim is to add means to record and compare speech from different speakers.
    Downloads: 2 This Week
  • 14

    GPS Interactive Time Series Analysis

    Software for processing and analyzing time series in Earth Science

    ...Bivariate statistical analysis (including correlation coefficient and linear regression) and time series analysis (including auto and cross-spectral analysis, wavelet power spectrum, spectrogram and periodicities) form the main analysis features of the software.
    Downloads: 0 This Week
  • 15

    Xtreme Media Player

    Xtreme Media Player is a free cross-platform media player.

    ...A key feature of XtremeMP is the capability to view visualizations (on-screen graphics controlled by the music’s audio). These can have scientific/technical purposes such as depicting some properties of the audio (such as the Oscilloscope, Spectrum, Stereogram, and Spectrogram visualizations).
    Downloads: 9 This Week
  • 16
    Luscinia is a program for archiving and analyzing field sound recordings (especially of animals). It incorporates an interface to a database, spectrogram measurement algorithms, sound comparison algorithms, and statistical analysis.
    Downloads: 2 This Week
  • 17
    Pumilio
    Pumilio is a web-based sound analysis and archive system for almost any kind of sound file with tools to see the spectrogram of the sound, select regions for further analysis and insertion in a database, filtering, and many other manipulations.
    Downloads: 0 This Week
  • 18
    This Python script uses the numpy and audiolab modules to generate waveform and spectrogram PNG images from a WAV file. It is based on a script by Freesound.org.
    Downloads: 0 This Week
  • 19
    This application shows a spectrogram (STFT) of sound. The spectrogram is updated in real time. The application runs only on Linux.
    Downloads: 0 This Week
  • 20
    The Analysis & Resynthesis Sound Spectrograph analyses a sound file into a spectrogram and is able to synthesise this spectrogram, or any other user-created image, back into a sound.
    Downloads: 24 This Week
  • 21
    A Flash 9 MP3 player that allows a user to play an MP3 file while viewing the spectrogram of the sound file.
    Downloads: 0 This Week
  • 22
    The Analysis & Reconstruction Sound Engine analyses a sound file into a spectrogram and is able to synthesise this spectrogram, or any other user-created image, back into a sound.
    Downloads: 3 This Week
  • 23
    SAA (SSPLab Audio Analyzer) will be able to separate sources, recognize speech, and analyze the auditory scene. It can also synthesize spatialised sounds from mono recordings, and edit, analyze via spectrogram, filter, and re-sample signals.
    Downloads: 0 This Week
  • 24
    This project aims to develop an audio analyser that displays several kinds of information: phase in two ways (one-dimensional and two-dimensional), wave shapes, a spectrogram in full range and by 1/3 octave with the value of the current peak frequency, and a meter.
    Downloads: 0 This Week
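
Several of the projects listed above describe displaying a spectrogram computed with a short-time Fourier transform (STFT). As a rough illustration of that idea only, and not the code of any listed project, the sketch below slices a signal into overlapping frames, applies a Hann window, and takes the magnitude of a naive DFT per frame. The function name `stft_spectrogram`, the frame and hop sizes, and the 1 kHz test tone are all assumptions chosen for the example; a real tool would use an FFT library rather than this O(N^2) loop.

```python
import cmath
import math


def stft_spectrogram(samples, frame_size=64, hop=32):
    """Return one list of bin magnitudes (bins 0..frame_size//2) per frame."""
    # Periodic Hann window to reduce spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_size)
              for n in range(frame_size)]
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        # Naive DFT of the windowed frame; only non-negative frequencies kept.
        magnitudes = []
        for k in range(frame_size // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n in range(frame_size))
            magnitudes.append(abs(acc))
        frames.append(magnitudes)
    return frames


# Hypothetical test signal: a 1 kHz sine sampled at 8 kHz.
SAMPLE_RATE = 8000
FRAME_SIZE = 64
signal = [math.sin(2 * math.pi * 1000 * t / SAMPLE_RATE) for t in range(512)]

spec = stft_spectrogram(signal, frame_size=FRAME_SIZE, hop=32)

# Strongest bin in the first frame, converted to a frequency in Hz.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * SAMPLE_RATE / FRAME_SIZE
print(len(spec), "frames; peak near", peak_hz, "Hz")
```

Each row of `spec` is one time slice and each column one frequency bin, so plotting the rows side by side as a colour image gives the familiar time-frequency picture. Shorter frames improve time resolution at the cost of frequency resolution, which is the main tuning decision in a display of this kind.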