Showing 108 open source projects for "audio recognition"

View related business solutions
  • SKUDONET Open Source Load Balancer Icon
    SKUDONET Open Source Load Balancer

    Take advantage of Open Source Load Balancer to elevate your business security and IT infrastructure with a custom ADC Solution.

    SKUDONET ADC, operates at the application layer, efficiently distributing network load and application load across multiple servers. This not only enhances the performance of your application but also ensures that your web servers can handle more traffic seamlessly.
  • Email and SMS Marketing Software Icon
    Email and SMS Marketing Software

    Boost Sales. Grow Audiences. Reduce Workloads.

    Our intuitive email marketing software to help you save time and build lasting relationships with your subscribers.
  • 1
    Whisper

    Whisper

    Robust Speech Recognition via Large-Scale Weak Supervision

    Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification. A Transformer sequence-to-sequence model is trained on various speech processing tasks, including multilingual speech recognition, speech translation, spoken language identification, and voice activity detection. These tasks are jointly represented...
    Downloads: 45 This Week
    Last Update:
    See Project
  • 2
    SpeechRecognition

    SpeechRecognition

    Speech recognition module for Python

    Library for performing speech recognition, with support for several engines and APIs, online and offline. Recognize speech input from the microphone, transcribe an audio file, save audio data to an audio file. Show extended recognition results, calibrate the recognizer energy threshold for ambient noise levels (see recognizer_instance.energy_threshold for details). Listening to a microphone in the background, various other useful recognizer features. The easiest way to install this is using pip...
    Downloads: 21 This Week
    Last Update:
    See Project
  • 3
    Buster

    Buster

    Captcha solver extension for humans

    Save time by asking Buster to solve captchas for you. Buster is a Firefox extension which helps you to solve difficult captchas by completing reCAPTCHA audio challenges using speech recognition. Challenges are solved by clicking on the extension button at the bottom of the reCAPTCHA widget. It is not guaranteed that challenges are always solved, the limitations of the technology need to be considered. The continued development of Buster is made possible thanks to the support of awesome backers...
    Downloads: 27 This Week
    Last Update:
    See Project
  • 4
    sherpa-onnx

    sherpa-onnx

    Speech-to-text, text-to-speech, and speaker recognition

    Speech-to-text, text-to-speech, and speaker recognition using next-gen Kaldi with onnxruntime without an Internet connection. Support embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, Flutter.
    Downloads: 7 This Week
    Last Update:
    See Project
  • RMM Software | Remote Monitoring Platform and Tools Icon
    RMM Software | Remote Monitoring Platform and Tools

    Best-in-class automation, scalability, and single-pane IT management.

    Don’t settle when it comes to managing your clients’ IT infrastructure. Exceed their expectations with ConnectWise RMM, our MSP RMM software that provides proactive tools and NOC services—regardless of device environment. With the number of new vulnerabilities rising each year, smart patching procedures have never been more important. We automatically test and deploy patches when they are viable and restrict patches that are harmful. Get better protection for clients while you spend less time managing endpoints and more time growing your business. It’s tough to locate, afford, and retain quality talent. In fact, 81% of IT leaders say it’s hard to find the recruits they need. Add ConnectWise RMM, NOC services and get the expertise and problem resolution you need to become the advisor your clients demand—without adding headcount.
  • 5
    audioFlux

    audioFlux

    A library for audio and music analysis, feature extraction

    A library for audio and music analysis, and feature extraction. Can be used for deep learning, pattern recognition, signal processing, bioinformatics, statistics, finance, etc. audioflux is a deep learning tool library for audio and music analysis, feature extraction. It supports dozens of time-frequency analysis transformation methods and hundreds of corresponding time-domain and frequency-domain feature combinations. It can be provided to deep learning networks for training and is used...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 6
    Recorder

    Recorder

    HTML5 js recording mp3 wav ogg webm amr format

    ... of browser (including PWA, WebClip, any App) on low-version iOS (11.0-14.2) except Safari inside page). Provides multiple plug-in function support. Rich audio visualization, variable speed and pitch processing, speech recognition, audio stream playback, etc.; with powerful real-time processing support, it can be used in various web applications: from simple recording to complex real-time voice Recognition (ASR), and even audio-related games, are handled with ease.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    Transformers

    Transformers

    State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX

    ... classification, object detection, and segmentation. Audio, for tasks like speech recognition and audio classification. Transformers provides APIs to quickly download and use those pretrained models on a given text, fine-tune them on your own datasets and then share them with the community on our model hub. At the same time, each python module defining an architecture is fully standalone and can be modified to enable quick research experiments.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 8
    TorchAudio

    TorchAudio

    Data manipulation and transformation for audio signal processing

    The aim of torchaudio is to apply PyTorch to the audio domain. By supporting PyTorch, torchaudio follows the same philosophy of providing strong GPU acceleration, having a focus on trainable features through the autograd system, and having consistent style (tensor names and dimension names). Therefore, it is primarily a machine learning library and not a general signal processing library. The benefits of PyTorch can be seen in torchaudio through having all the computations be through PyTorch...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 9
    hfapigo

    hfapigo

    Unofficial (Golang) Go bindings for the Hugging Face Inference API

    (Golang) Go bindings for the Hugging Face Inference API. Directly call any model available in the Model Hub. An API key is required for authorized access. To get one, create a Hugging Face profile.
    Downloads: 0 This Week
    Last Update:
    See Project
  • JobNimbus Construction Software Icon
    JobNimbus Construction Software

    For Roofers, Remodelers, Contractors, Home Service Industry

    Track leads, jobs, and tasks from one easy to use software. You can access your information wherever you are, get everyone on the same page, and grow your business.
  • 10
    WPPConnect

    WPPConnect

    WPPConnect is an open source project

    WPPConnect is an open-source project developed by the JavaScript community with the aim of exporting functions from WhatsApp Web to the node, which can be used to support the creation of any interaction, such as customer service, media sending, intelligence recognition based on phrases artificial and many other things, use your imagination. We are the best WhatsApp automation solution you have been looking for. We are a team that started an OpenSource project that performs automation...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    Venom

    Venom

    Venom is the most complete javascript library for Whatsapp

    Venom is a high-performance system developed with JavaScript to create a bot for WhatsApp, support for creating any interaction, such as customer service, media sending, sentence recognition based on artificial intelligence and all types of design architecture for WhatsApp. It's a high-performance alternative API to whatzapp, you can send, text messages, files, images, videos and more. Remember, the API was developed on a platform called RESTful Web services, providing interoperability between...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 12
    VideoSrt

    VideoSrt

    Windows-GUI

    ... to generate subtitle files (support Chinese-English translation, bilingual subtitles) Extract speech text from video/audio. Batch translation, filter processing/encoding SRT subtitle files. Using the Alibaba Cloud speech recognition interface, the accuracy is high, and the standard Mandarin/English recognition rate is over 95%. Video recognition does not need to upload the original video, which is convenient, fast and time-saving.
    Downloads: 22 This Week
    Last Update:
    See Project
  • 13
    ml5.js

    ml5.js

    Friendly machine learning for the web

    A neighborly approach to creating and exploring artificial intelligence in the browser. ml5.js aims to make machine learning approachable for a broad audience of artists, creative coders, and students. The library provides access to machine learning algorithms and models in the browser, building on top of TensorFlow.js with no other external dependencies.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    VoodooHDA
    VoodooHDA is an open source audio driver for devices compliant with the Intel High Definition Audio specification. It is intended as a replacement for AppleHDA on Mac OS X with support for a wide range of audio controllers and codecs.
    Leader badge
    Downloads: 441 This Week
    Last Update:
    See Project
  • 15
    Tensorflow Transformers

    Tensorflow Transformers

    State of the art faster Transformer with Tensorflow 2.0

    ... speech recognition and audio classification. Faster AutoReggressive Decoding, TFlite support, creating TFRecords is simple. Auto-Batching tf.data.dataset or tf.ragged tensors. Everything is dictionary (inputs and outputs) Multiple mask modes like causal, user-defined, prefix. tensorflow-text tokenizer support. Supports GPU, TPU, multi-GPU trainer with wandb, multiple callbacks, auto tensorboard.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    OpenClinic GA

    OpenClinic GA

    Open Source Integrated Hospital Information Management System

    OpenClinic GA is an open source integrated hospital information management system covering management of administrative, financial, clinical, lab, x-ray, pharmacy, meals distribution and other data. Extensive statistical and reporting capabilities.
    Leader badge
    Downloads: 207 This Week
    Last Update:
    See Project
  • 17
    cerberuscms2

    cerberuscms2

    Cerberus Content Management System

    Cerberus Content Management System is a dynamic, secure and infinitely expandable CMS designed after a Unix-Like model. It is a custom written Web Application Framework ( W.A.F. ) with a consistent and custom written Pre-Hyper-Text-Post-Processor Programming Code Framework ( P.C.F. ). This Web Application Software Project' aim is to be the fastest and most secure Web Application Framework, Web Application Programming Code Framework, Text, Voice and Video Communications Platform and Content...
    Downloads: 22 This Week
    Last Update:
    See Project
  • 18
    Vosk Desktop

    Vosk Desktop

    Desktop software for controlling the Vosk Speech Recognition Toolkit

    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    PlateCatcher

    PlateCatcher

    Free Automatic Number Plate Recognition (ANPR) for Windows

    "Plate Catcher" is designed to offer Windows users robust ANPR functionality with a user-friendly interface. Below are the key features of the application: "Plate Catcher" offers comprehensive alert options when number plates in it's database are detected: Visual Alerts: Pop-up windows with time-stamped vehicle registrations. Email Alerts: Email notifications with detected number plates and optional vehicle snapshots. Audio Alerts: Customizable audio alerts for individual number plates...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    Common Resource Grep - crgrep

    Common Resource Grep - crgrep

    Common Resource Grep

    CRGREP searches for matching text in databases, various document formats, archives and other difficult to access resources. A command line tool for name and content text matching in database tables, plain files, MS Office documents, PDF, archives, MP3 audio, image meta-data, scanned documents, maven dependencies and web resources. CRGREP will search resources within resources of any arbitrary combination or depth, so text within a document within a zip archive, and so on. Here you will find...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 21
    Meihu-FaceBeauty-Live

    Meihu-FaceBeauty-Live

    Beauty can be applied to live broadcasts, short videos, and selfies

    Meihu beauty sdk is a mobile sdk with face recognition technology as the core, providing professional-grade real-time beauty, big eyes and face reduction, beauty filters, dynamic stickers and other filters, to create a multi-functional video beauty software The goal is to fully meet the beautification needs of customers in many audio and video software business scenarios such as live beauty and short video beauty. The open source version is now available for iOS, and the Android open source...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    VoodooHDA Installer
    VoodooHDA Installer is an open source audio driver Installer for devices compliant with the Intel High Definition Audio specification. It is intended as a replacement for AppleHDA on Mac OS X with support for a wide range of audio controllers and codecs. Taken from projects https://sourceforge.net/projects/voodoohda/ https://github.com/MuntashirAkon/DPCIManager https://github.com/sveinbjornt/Platypus
    Downloads: 7 This Week
    Last Update:
    See Project
  • 23
    wav2letter++

    wav2letter++

    Facebook AI research's automatic speech recognition toolkit

    ... export KENLM_ROOT_DIR=... so that wav2letter++ can find it. This is needed because KenLM doesn't support a make install step.wav2letter++ expects audio and transcription data to be prepared in a specific format so that they can be read from the pipelines. Each dataset (test/valid/train) needs to be in a separate file with one sample per line. A sample is specified using 4 columns separated by space (or tabs).
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    Tensor2Tensor

    Tensor2Tensor

    Library of deep learning models and datasets

    Deep Learning (DL) has enabled the rapid advancement of many useful technologies, such as machine translation, speech recognition and object detection. In the research community, one can find code open-sourced by the authors to help in replicating their results and further advancing deep learning. However, most of these DL systems use unique setups that require significant engineering effort and may only work for a specific problem or architecture, making it hard to run new experiments...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    General C++ Library, with modules for Computer Vision, Pattern Recognition and much more.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • 4
  • 5
  • Next