Search Results for "matlab audio classification"

Showing 22 open source projects for "matlab audio classification"

  • 1
    Qwen2-Audio

    Repo of Qwen2-Audio chat & pretrained large audio language model

    Qwen2-Audio is a large audio-language model by Alibaba Cloud, part of the Qwen series. It is trained to accept various audio signal inputs (including speech, sounds, etc.) and perform both voice chat and audio analysis, producing textual responses. It supports two major modes: Voice Chat (interactive voice only input) and Audio Analysis (audio + text instructions), with both base and instruction-tuned models.
    Downloads: 0 This Week
  • 2
    Kimi-Audio

    Audio foundation model excelling in audio understanding

    Kimi-Audio is an ambitious open-source audio foundation model designed to unify a wide array of audio processing tasks — from speech recognition and audio understanding to generative conversation and sound event classification — within a single cohesive architecture. Instead of fragmenting work across specialized models, Kimi-Audio handles automatic speech recognition (ASR), audio question answering, automatic audio captioning, speech emotion recognition, and audio-to-text chat in one system, enabling developers to build rich, multimodal audio applications without stitching together disparate components. ...
    Downloads: 0 This Week
  • 3
    Transformers

    State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX

    ...Using pre-trained models can reduce your compute costs, carbon footprint, and save you the time and resources required to train a model from scratch. These models support common tasks in different modalities. Text, for tasks like text classification, information extraction, question answering, summarization, translation, text generation, in over 100 languages. Images, for tasks like image classification, object detection, and segmentation. Audio, for tasks like speech recognition and audio classification. Transformers provides APIs to quickly download and use those pretrained models on a given text, fine-tune them on your own datasets and then share them with the community on our model hub. ...
    Downloads: 4 This Week
  • 4
    Librosa

    Python library for audio and music analysis

    Librosa is a powerful Python library for analyzing and processing audio and music signals. Built on top of NumPy, SciPy, and matplotlib, it provides a wide range of tools for feature extraction, time-series manipulation, audio display, and music information retrieval. Whether you're building machine learning models for audio classification or visualizing spectrograms, Librosa is a go-to library for researchers and developers working in audio signal processing.
    Downloads: 6 This Week
  • 5
    ImageBind

    ImageBind: One Embedding Space to Bind Them All

    ...The model is trained using large-scale contrastive learning, leveraging diverse datasets from natural images, videos, audio clips, and sensor data. Once trained, it can perform cross-modal retrieval, zero-shot classification, and multimodal composition without additional fine-tuning.
    Downloads: 0 This Week
  • 6
    Whisper

    Robust Speech Recognition via Large-Scale Weak Supervision

    ...These tasks are jointly represented as a sequence of tokens to be predicted by the decoder, allowing a single model to replace many stages of a traditional speech-processing pipeline. The multitask training format uses a set of special tokens that serve as task specifiers or classification targets.
    Downloads: 59 This Week
  • 7
    pycm

    Multi-class confusion matrix library in Python

    PyCM is a multi-class confusion matrix library written in Python that accepts both input data vectors and a direct matrix. It is a tool for post-classification model evaluation that supports most class and overall statistics parameters. PyCM is the Swiss-army knife of confusion matrices, targeted mainly at data scientists who need a broad array of metrics for predictive models and an accurate evaluation of a large variety of classifiers.
    Downloads: 0 This Week
  • 8
    Label Studio

    Label Studio is a multi-type data labeling and annotation tool

    The most flexible data annotation tool. Quickly installable. Build custom UIs or use pre-built labeling templates. Detect objects in images, with bounding boxes, polygons, circles, and keypoints supported. Partition images into multiple segments. Use ML models to pre-label and optimize the process. Label Studio is an open-source data labeling tool. It lets you label data types like audio, text, images, videos, and time series with a simple and straightforward UI and export to various model formats. It can...
    Downloads: 13 This Week
  • 9
    Adversarial Robustness Toolbox

    Adversarial Robustness Toolbox (ART) - Python Library for ML security

    ...ART supports all popular machine learning frameworks (TensorFlow, Keras, PyTorch, MXNet, scikit-learn, XGBoost, LightGBM, CatBoost, GPy, etc.), all data types (images, tables, audio, video, etc.) and machine learning tasks (classification, object detection, generation, certification, etc.).
    Downloads: 0 This Week
  • 10
    The Sound Description Interchange Format (SDIF) is an established standard for the interchange of sound descriptions and analysis data. This project provides libraries, SDIF (in C) and Easdif (in C++), tools, and wrappers to read and write SDIF files.
    Downloads: 9 This Week
  • 11
    Tensorflow Transformers

    State of the art faster Transformer with Tensorflow 2.0

    Imagine auto-regressive generation being 90x faster. tf-transformers (Tensorflow Transformers) is designed to harness the full power of TensorFlow 2 and is built specifically for Transformer-based architectures. These models can be applied to text, for tasks like text classification, information extraction, question answering, summarization, translation, and text generation, in over 100 languages; to images, for tasks like image classification, object detection, and segmentation; and to audio, for tasks like speech recognition and audio classification. It offers faster autoregressive decoding, TFLite support, and simple creation of TFRecords. ...
    Downloads: 0 This Week
  • 12
    OpenDAFF

    Directional Audio File Format

    OpenDAFF is a free, open-source software package for directional audio data, such as the directivity of microphones and speakers, as well as head-related transfer functions (HRTFs).
    Downloads: 4 This Week
  • 13
    Yet Another Audio Feature Extractor is a toolbox for audio analysis. Easy to use and efficient at extracting a large number of audio features simultaneously. WAV and MP3 files supported, or embedding in C++, Python or Matlab applications.
    Downloads: 2 This Week
  • 14

    avimmir

    (audio, video, image) Multimedia Multimodal Information Retrieval

    audio classification; speaker segmentation; speaker clustering; speaker recognition; spoken document retrieval; image retrieval; video retrieval; etc.
    Downloads: 0 This Week
  • 15
    RNNLIB is a recurrent neural network library for sequence learning problems. Applicable to most types of spatiotemporal data, it has proven particularly effective for speech and handwriting recognition. Full installation and usage instructions are given at http://sourceforge.net/p/rnnl/wiki/Home/
    Downloads: 1 This Week
  • 16

    CogPy

    Cognitive Python

    ...A few of its main features are:
    - A full-featured 2D display library for rapid development
    - Full control of computer I/O (display, mouse, keyboard, gamepad, joystick, audio)
    - Advanced library of data collection techniques
    - Data export to NumPy/SciPy, R, MATLAB, and Microsoft Excel
    - Compatibility with PyACT-R for cognitive modeling
    If you are interested in contributing to the CogPy project, contact the lead developer, Jasper Danielson, at jrd4@rice.edu.
    Downloads: 0 This Week
  • 17
    Stanford Machine Learning Course

    machine learning course programming exercise

    The Stanford Machine Learning Course Exercises repository contains programming assignments from the well-known Stanford Machine Learning online course. It includes implementations of a variety of fundamental algorithms using Python and MATLAB/Octave. The repository covers a broad set of topics such as linear regression, logistic regression, neural networks, clustering, support vector machines, and recommender systems. Each folder corresponds to a specific algorithm or concept, making it easy...
    Downloads: 2 This Week
  • 18
    This is a C library that provides tools for advanced analysis of electrophysiological data. It features denoising, unsupervised classification, time-frequency analysis, phase-space analysis, neural networks, time-warping, and more.
    Downloads: 0 This Week
  • 19
    This is a fast C implementation of Arturo Camacho's SWIPE' pitch extraction algorithm. See the project homepage for more about the advantages of the SWIPE' algorithm. swipe-1.0.tar.gz contains the current source, which should compile quite neatly.
    Downloads: 0 This Week
  • 20
    Open-source content and evaluation framework for music transcription systems. Can be used as monophonic or polyphonic database, through software mixing.
    Downloads: 0 This Week
  • 21
    DanceBox is a fork of the pyTone jukebox highly customized for a dance studio. It features an advanced SQLite classification database that should rival commercial offerings.
    Downloads: 0 This Week
  • 22
    nBoost is a suite of boosting algorithms designed to solve binary classification problems on data that is not linearly separable by a convex combination of base hypotheses, i.e. noisy data. WARNING: Active development. Underlying algorithm is unstable.
    Downloads: 0 This Week