Showing 76 open source projects for "artificial intelligence python"

View related business solutions
  • Gen AI apps are built with MongoDB Atlas Icon
    Gen AI apps are built with MongoDB Atlas

    The database for AI-powered applications.

    MongoDB Atlas is the developer-friendly database used to build, scale, and run gen AI and LLM-powered apps—without needing a separate vector database. Atlas offers built-in vector search, global availability across 115+ regions, and flexible document modeling. Start building AI apps faster, all in one place.
    Start Free
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • 1
    Whisper

    Whisper

    Robust Speech Recognition via Large-Scale Weak Supervision

    OpenAI Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification. A Transformer sequence-to-sequence model is trained on various speech processing tasks, including multilingual speech recognition, speech translation, spoken language identification, and voice activity detection. These tasks are jointly represented...
    Downloads: 89 This Week
    Last Update:
    See Project
  • 2
    NVIDIA NeMo

    NVIDIA NeMo

    Toolkit for conversational AI

    NVIDIA NeMo, part of the NVIDIA AI platform, is a toolkit for building new state-of-the-art conversational AI models. NeMo has separate collections for Automatic Speech Recognition (ASR), Natural Language Processing (NLP), and Text-to-Speech (TTS) models. Each collection consists of prebuilt modules that include everything needed to train on your data. Every module can easily be customized, extended, and composed to create new conversational AI model architectures. Conversational AI...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 3
    SpeechRecognition

    SpeechRecognition

    Speech recognition module for Python

    Library for performing speech recognition, with support for several engines and APIs, online and offline. Recognize speech input from the microphone, transcribe an audio file, save audio data to an audio file. Show extended recognition results, calibrate the recognizer energy threshold for ambient noise levels (see recognizer_instance.energy_threshold for details). Listening to a microphone in the background, various other useful recognizer features. The easiest way to install this is using...
    Downloads: 14 This Week
    Last Update:
    See Project
  • 4
    Diffgram

    Diffgram

    Training data (data labeling, annotation, workflow) for all data types

    From ingesting data to exploring it, annotating it, and managing workflows. Diffgram is a single application that will improve your data labeling and bring all aspects of training data under a single roof. Diffgram is world’s first truly open source training data platform that focuses on giving its users an unlimited experience. This is aimed to reduce your data labeling bills and increase your Training Data Quality. Training Data is the art of supervising machines through data. This...
    Downloads: 5 This Week
    Last Update:
    See Project
  • eProcurement Software Icon
    eProcurement Software

    Enterprises and companies seeking a solution to manage all their procurement operations and processes

    eBuyerAssist by Eyvo is a cloud-based procurement solution designed for businesses of all sizes and industries. Fully modular and scalable, it streamlines the entire procurement lifecycle—from requisition to fulfillment. The platform includes powerful tools for strategic sourcing, supplier management, warehouse operations, and contract oversight. Additional modules cover purchase orders, approval workflows, inventory and asset management, customer orders, budget control, cost accounting, invoice matching, vendor credit checks, and risk analysis. eBuyerAssist centralizes all procurement functions into a single, easy-to-use system—improving visibility, control, and efficiency across your organization. Whether you're aiming to reduce costs, enhance compliance, or align procurement with broader business goals, eBuyerAssist helps you get there faster, smarter, and with measurable results.
    Learn More
  • 5
    OpenVINO

    OpenVINO

    OpenVINO™ Toolkit repository

    OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference. Boost deep learning performance in computer vision, automatic speech recognition, natural language processing and other common tasks. Use models trained with popular frameworks like TensorFlow, PyTorch and more. Reduce resource demands and efficiently deploy on a range of Intel® platforms from edge to cloud. This open-source version includes several components: namely Model Optimizer, OpenVINO™ Runtime,...
    Downloads: 42 This Week
    Last Update:
    See Project
  • 6
    whisper.cpp

    whisper.cpp

    Port of OpenAI's Whisper model in C/C++

    whisper.cpp is a lightweight, C/C++ reimplementation of OpenAI’s Whisper automatic speech recognition (ASR) model—designed for efficient, standalone transcription without external dependencies. The entire high-level implementation of the model is contained in whisper.h and whisper.cpp. The rest of the code is part of the ggml machine learning library. The command downloads the base.en model converted to custom ggml format and runs the inference on all .wav samples in the folder samples....
    Downloads: 525 This Week
    Last Update:
    See Project
  • 7
    The SpeechBrain Toolkit

    The SpeechBrain Toolkit

    A PyTorch-based Speech Toolkit

    SpeechBrain is an open-source and all-in-one conversational AI toolkit. It is designed to be simple, extremely flexible, and user-friendly. Competitive or state-of-the-art performance is obtained in various domains. SpeechBrain supports state-of-the-art methods for end-to-end speech recognition, including models based on CTC, CTC+attention, transducers, transformers, and neural language models relying on recurrent neural networks and transformers. Speaker recognition is already deployed in a...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 8
    Omnilingual ASR

    Omnilingual ASR

    Omnilingual ASR Open-Source Multilingual SpeechRecognition

    Omnilingual-ASR is a research codebase exploring automatic speech recognition that generalizes across a very large number of languages using shared modeling and training recipes. It focuses on leveraging self-supervised audio pretraining and scalable fine-tuning so low-resource languages can benefit from high-resource data. The project provides data preparation pipelines, training scripts, decoding utilities, and evaluation tools so researchers can reproduce results and extend to new...
    Downloads: 6 This Week
    Last Update:
    See Project
  • 9
    Kaldi

    Kaldi

    kaldi-asr/kaldi is the official location of the Kaldi project

    Kaldi is an open source toolkit for speech recognition research. It provides a powerful framework for building state-of-the-art automatic speech recognition (ASR) systems, with support for deep neural networks, Gaussian mixture models, hidden Markov models, and other advanced techniques. The toolkit is widely used in both academia and industry due to its flexibility, extensibility, and strong community support. Kaldi is designed for researchers who need a highly customizable environment to...
    Downloads: 3 This Week
    Last Update:
    See Project
  • AestheticsPro Medical Spa Software Icon
    AestheticsPro Medical Spa Software

    Our new software release will dramatically improve your medspa business performance while enhancing the customer experience

    AestheticsPro is the most complete Aesthetics Software on the market today. HIPAA Cloud Compliant with electronic charting, integrated POS, targeted marketing and results driven reporting; AestheticsPro delivers the tools you need to manage your medical spa business. It is our mission To Provide an All-in-One Cutting Edge Software to the Aesthetics Industry.
    Learn More
  • 10
    WhisperKit

    WhisperKit

    On-device Speech Recognition for Apple Silicon

    WhisperKit is a Swift package that integrates OpenAI's popular Whisper speech recognition model with Apple's CoreML framework for efficient, local inference on Apple devices. Whisper has pulled the future forward when fast, free and virtually error-free translation and transcription will be ubiquitous. It inspired numerous developers to improve and deploy it with minimal friction and maximum performance. We founded Argmax in November 2023 to empower developers and enterprises everywhere to...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 11
    Buster

    Buster

    Captcha solver extension for humans

    Save time by asking Buster to solve captchas for you. Buster is a Firefox extension which helps you to solve difficult captchas by completing reCAPTCHA audio challenges using speech recognition. Challenges are solved by clicking on the extension button at the bottom of the reCAPTCHA widget. It is not guaranteed that challenges are always solved, the limitations of the technology need to be considered. The continued development of Buster is made possible thanks to the support of awesome...
    Downloads: 54 This Week
    Last Update:
    See Project
  • 12
    Google2SRT

    Google2SRT

    Download, save and convert multiple subtitles from YouTube videos

    Google2SRT allows you to download, save and convert multiple subtitles and translations from YouTube and Google Video to SubRip (.srt) format, which is recognized by most video players. You can download XML subtitles or simply type video's URL, Google2SRT will do the rest.
    Downloads: 81 This Week
    Last Update:
    See Project
  • 13
    Voxal voice changer

    Voxal voice changer

    Transform your voice in real-time voxal voice changer

    Voxal Voice Changer is a program that allows you to modify your voice by applying various effects (e.g. pitch change, echo, etc.) in real-time. Effects can be added in any sequence and in any combination, allowing you to distort your voice beyond recognition. Take your audio to the next level! Our powerful Voice Changer software lets you morph your voice in real-time with stunning AI-powered quality. Whether you're looking to have fun, protect your privacy, or create engaging content,...
    Downloads: 24 This Week
    Last Update:
    See Project
  • 14
    ASRT Speech Recognition

    ASRT Speech Recognition

    A Deep-Learning-Based Chinese Speech Recognition System

    ASRT is an end-to-end deep-learning Chinese ASR system built with TensorFlow/Keras, using convolution + CTC and a Max-Entropy HMM language model. It provides a REST/gRPC server backend and client SDKs in multiple languages (Python, Java, Go, Windows). Notably lightweight, it performs well without needing GPU acceleration and runs across platforms, targeting developers and researchers building Chinese voice interfaces.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    VideoSrt

    VideoSrt

    Windows-GUI

    This is an open source Windows-GUI software tool that can recognize video speech and automatically generate subtitle SRT files. VideoSrtIt is written in Golanglanguage and developed based on lxn/walk Windows-GUI toolkit. Open source software tool that can recognize video speech and automatically generate subtitle SRT files. It is suitable for business scenarios that quickly and batch generate Chinese/English subtitles and text files for media (video/audio). Recognize video/audio speech to...
    Downloads: 32 This Week
    Last Update:
    See Project
  • 16
    wav2letter++

    wav2letter++

    Facebook AI research's automatic speech recognition toolkit

    First, install Flashlight (using the 0.3 branch is required) with the ASR application. This repository includes recipes to reproduce the following research papers as well as pre-trained models. All results reproduction must use Flashlight <= 0.3.2 for exact reproducibility. At least one of LZMA, BZip2, or Z is required for LM compression with KenLM. It is highly recommended to build KenLM with position-independent code (-fPIC) enabled, to enable python compatibility. After installing, run...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    Tensor2Tensor

    Tensor2Tensor

    Library of deep learning models and datasets

    Deep Learning (DL) has enabled the rapid advancement of many useful technologies, such as machine translation, speech recognition and object detection. In the research community, one can find code open-sourced by the authors to help in replicating their results and further advancing deep learning. However, most of these DL systems use unique setups that require significant engineering effort and may only work for a specific problem or architecture, making it hard to run new experiments and...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    DeepLearning

    DeepLearning

    Deep Learning (Flower Book) mathematical derivation

    " Deep Learning " is the only comprehensive book in the field of deep learning. The full name is also called the Deep Learning AI Bible (Deep Learning) . It is edited by three world-renowned experts, Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Includes linear algebra, probability theory, information theory, numerical optimization, and related content in machine learning. At the same time, it also introduces deep learning techniques used by practitioners in the industry, including...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 19
    Deep Learning Drizzle

    Deep Learning Drizzle

    Drench yourself in Deep Learning, Reinforcement Learning

    Drench yourself in Deep Learning, Reinforcement Learning, Machine Learning, Computer Vision, and NLP by learning from these exciting lectures! Optimization courses which form the foundation for ML, DL, RL. Computer Vision courses which are DL & ML heavy. Speech recognition courses which are DL heavy. Structured Courses on Geometric, Graph Neural Networks. Section on Autonomous Vehicles. Section on Computer Graphics with ML/DL focus.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    Deep Learning with PyTorch

    Deep Learning with PyTorch

    Latest techniques in deep learning and representation learning

    This course concerns the latest techniques in deep learning and representation learning, focusing on supervised and unsupervised deep learning, embedding methods, metric learning, convolutional and recurrent nets, with applications to computer vision, natural language understanding, and speech recognition. The prerequisites include DS-GA 1001 Intro to Data Science or a graduate-level machine learning course. To be able to follow the exercises, you are going to need a laptop with Miniconda (a...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21
    CMU Sphinx

    CMU Sphinx

    Speech Recognition Toolkit

    Thank you for visiting! ----> Maintenance and improvement work has MOVED to https://cmusphinx.github.io/ Please go there for the most recent software and documentation. <---- CMUSphinx is a speaker-independent large vocabulary continuous speech recognizer released under BSD style license. It is also a collection of open source tools and resources that allows researchers and developers to build speech recognition systems.
    Leader badge
    Downloads: 613 This Week
    Last Update:
    See Project
  • 22

    NFZ-core

    Neural Network Multiplayer SImulation

    Multiplayer Shared Problem solving Neural Network AI Simulation, Chat / Speech Recognition / Customizable Identities / GUI Chat in real time with users and their developed AI Real time Simulation. NNET FANN Blockchain integration
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    Tensorpack

    Tensorpack

    A Neural Net Training Interface on TensorFlow, with focus on speed

    Tensorpack is a neural network training interface based on TensorFlow v1. Uses TensorFlow in the efficient way with no extra overhead. On common CNNs, it runs training 1.2~5x faster than the equivalent Keras code. Your training can probably gets faster if written with Tensorpack. Scalable data-parallel multi-GPU / distributed training strategy is off-the-shelf to use. Squeeze the best data loading performance of Python with tensorpack.dataflow. Symbolic programming (e.g. tf.data) does not...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    NASH OS

    NASH OS

    Nash Operating System for Modern Ecommerce

    The all-built-in-one, automatic, ready-to-go out-of-box, easy-to-use state-of-the-art, and really awesome NASH OS! Over 25,000+ flexible features and controls and all scalable!! The most powerful solution ever built to instantly deliver new heights of online ecommerce enterprise to you.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 25

    KALDI IVR ASTERISK SPEECH

    Working template to create an Asterisk IVR system using kaldi

    Working template to create an Asterisk IVR system using kaldi for speech recognition. IVR based speech recognition.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • 4
  • Next