Showing 74 open source projects for "artificial intelligence algorithm"

  • 1
    whisper.cpp

    Port of OpenAI's Whisper model in C/C++

    whisper.cpp is a lightweight, C/C++ reimplementation of OpenAI’s Whisper automatic speech recognition (ASR) model—designed for efficient, standalone transcription without external dependencies. The entire high-level implementation of the model is contained in whisper.h and whisper.cpp. The rest of the code is part of the ggml machine learning library. The command downloads the base.en model converted to custom ggml format and runs the inference on all .wav samples in the folder samples....
    Downloads: 470 This Week
  • 2
    Whisper

    Robust Speech Recognition via Large-Scale Weak Supervision

    OpenAI Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification. A Transformer sequence-to-sequence model is trained on various speech processing tasks, including multilingual speech recognition, speech translation, spoken language identification, and voice activity detection. These tasks are jointly represented...
    Downloads: 98 This Week
  • 3
    OpenVINO

    OpenVINO™ Toolkit repository

    OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference. Boost deep learning performance in computer vision, automatic speech recognition, natural language processing and other common tasks. Use models trained with popular frameworks like TensorFlow, PyTorch and more. Reduce resource demands and efficiently deploy on a range of Intel® platforms from edge to cloud. This open-source version includes several components: namely Model Optimizer, OpenVINO™ Runtime,...
    Downloads: 21 This Week
  • 4
    Buster

    Captcha solver extension for humans

    Save time by asking Buster to solve captchas for you. Buster is a Firefox extension which helps you to solve difficult captchas by completing reCAPTCHA audio challenges using speech recognition. Challenges are solved by clicking on the extension button at the bottom of the reCAPTCHA widget. It is not guaranteed that challenges will always be solved; the limitations of the technology need to be considered. The continued development of Buster is made possible thanks to the support of awesome...
    Downloads: 31 This Week
  • 5
    SpeechRecognition

    Speech recognition module for Python

    Library for performing speech recognition, with support for several engines and APIs, online and offline. Recognize speech input from the microphone, transcribe an audio file, save audio data to an audio file. Show extended recognition results, calibrate the recognizer energy threshold for ambient noise levels (see recognizer_instance.energy_threshold for details). Listening to a microphone in the background, various other useful recognizer features. The easiest way to install this is using...
    Downloads: 5 This Week
  • 6
    WhisperKit

    On-device Speech Recognition for Apple Silicon

    WhisperKit is a Swift package that integrates OpenAI's popular Whisper speech recognition model with Apple's CoreML framework for efficient, local inference on Apple devices. Whisper has pulled forward a future in which fast, free and virtually error-free translation and transcription will be ubiquitous. It inspired numerous developers to improve and deploy it with minimal friction and maximum performance. We founded Argmax in November 2023 to empower developers and enterprises everywhere to...
    Downloads: 3 This Week
  • 7
    NVIDIA NeMo

    Toolkit for conversational AI

    NVIDIA NeMo, part of the NVIDIA AI platform, is a toolkit for building new state-of-the-art conversational AI models. NeMo has separate collections for Automatic Speech Recognition (ASR), Natural Language Processing (NLP), and Text-to-Speech (TTS) models. Each collection consists of prebuilt modules that include everything needed to train on your data. Every module can easily be customized, extended, and composed to create new conversational AI model architectures. Conversational AI...
    Downloads: 2 This Week
  • 8
    Diffgram

    Training data (data labeling, annotation, workflow) for all data types

    From ingesting data to exploring it, annotating it, and managing workflows, Diffgram is a single application that will improve your data labeling and bring all aspects of training data under a single roof. Diffgram is the world's first truly open source training data platform that focuses on giving its users an unlimited experience. It aims to reduce your data labeling bills and increase your training data quality. Training Data is the art of supervising machines through data. This...
    Downloads: 2 This Week
  • 9
    The SpeechBrain Toolkit

    A PyTorch-based Speech Toolkit

    SpeechBrain is an open-source and all-in-one conversational AI toolkit. It is designed to be simple, extremely flexible, and user-friendly. Competitive or state-of-the-art performance is obtained in various domains. SpeechBrain supports state-of-the-art methods for end-to-end speech recognition, including models based on CTC, CTC+attention, transducers, transformers, and neural language models relying on recurrent neural networks and transformers. Speaker recognition is already deployed in a...
    Downloads: 2 This Week
  • 10
    Kaldi

    kaldi-asr/kaldi is the official location of the Kaldi project

    Kaldi is an open source toolkit for speech recognition research. It provides a powerful framework for building state-of-the-art automatic speech recognition (ASR) systems, with support for deep neural networks, Gaussian mixture models, hidden Markov models, and other advanced techniques. The toolkit is widely used in both academia and industry due to its flexibility, extensibility, and strong community support. Kaldi is designed for researchers who need a highly customizable environment to...
    Downloads: 2 This Week
  • 11
    Google2SRT

    Download, save and convert multiple subtitles from YouTube videos

    Google2SRT allows you to download, save and convert multiple subtitles and translations from YouTube and Google Video to SubRip (.srt) format, which is recognized by most video players. You can download XML subtitles or simply type the video's URL, and Google2SRT will do the rest.
    Downloads: 56 This Week
  • 12
    ASRT Speech Recognition

    A Deep-Learning-Based Chinese Speech Recognition System

    ASRT is an end-to-end deep-learning Chinese ASR system built with TensorFlow/Keras, using convolution + CTC and a Max-Entropy HMM language model. It provides a REST/gRPC server backend and client SDKs in multiple languages (Python, Java, Go, Windows). Notably lightweight, it performs well without needing GPU acceleration and runs across platforms, targeting developers and researchers building Chinese voice interfaces.
    Downloads: 0 This Week
  • 13
    VideoSrt

    Windows-GUI

    VideoSrt is an open source Windows GUI tool that recognizes speech in video and automatically generates SRT subtitle files. It is written in Go and built on the lxn/walk Windows GUI toolkit. It is suited to business scenarios that need to quickly generate Chinese/English subtitles and text files for media (video/audio) in batches. Recognize video/audio speech to...
    Downloads: 22 This Week
  • 14
    wav2letter++

    Facebook AI Research's automatic speech recognition toolkit

    First, install Flashlight (using the 0.3 branch is required) with the ASR application. This repository includes recipes to reproduce the following research papers as well as pre-trained models. All results reproduction must use Flashlight <= 0.3.2 for exact reproducibility. At least one of LZMA, BZip2, or Z is required for LM compression with KenLM. It is highly recommended to build KenLM with position-independent code (-fPIC) enabled for Python compatibility. After installing, run...
    Downloads: 1 This Week
  • 15
    Tensor2Tensor

    Library of deep learning models and datasets

    Deep Learning (DL) has enabled the rapid advancement of many useful technologies, such as machine translation, speech recognition and object detection. In the research community, one can find code open-sourced by the authors to help in replicating their results and further advancing deep learning. However, most of these DL systems use unique setups that require significant engineering effort and may only work for a specific problem or architecture, making it hard to run new experiments and...
    Downloads: 1 This Week
  • 16
    DeepLearning

    Deep Learning (Flower Book) mathematical derivation

    "Deep Learning" is the only comprehensive book in the field of deep learning. It is also known as the Deep Learning AI Bible. It is written by three world-renowned experts, Ian Goodfellow, Yoshua Bengio, and Aaron Courville. It covers linear algebra, probability theory, information theory, numerical optimization, and related content in machine learning. At the same time, it also introduces deep learning techniques used by practitioners in the industry, including...
    Downloads: 3 This Week
  • 17
    Deep Learning Drizzle

    Drench yourself in Deep Learning, Reinforcement Learning

    Drench yourself in Deep Learning, Reinforcement Learning, Machine Learning, Computer Vision, and NLP by learning from these exciting lectures! Optimization courses which form the foundation for ML, DL, RL. Computer Vision courses which are DL & ML heavy. Speech recognition courses which are DL heavy. Structured Courses on Geometric, Graph Neural Networks. Section on Autonomous Vehicles. Section on Computer Graphics with ML/DL focus.
    Downloads: 0 This Week
  • 18
    Deep Learning with PyTorch

    Latest techniques in deep learning and representation learning

    This course concerns the latest techniques in deep learning and representation learning, focusing on supervised and unsupervised deep learning, embedding methods, metric learning, convolutional and recurrent nets, with applications to computer vision, natural language understanding, and speech recognition. The prerequisites include DS-GA 1001 Intro to Data Science or a graduate-level machine learning course. To be able to follow the exercises, you are going to need a laptop with Miniconda (a...
    Downloads: 1 This Week
  • 19
    CMU Sphinx

    Speech Recognition Toolkit

    Thank you for visiting! Maintenance and improvement work has moved to https://cmusphinx.github.io/. Please go there for the most recent software and documentation. CMUSphinx is a speaker-independent, large-vocabulary continuous speech recognizer released under a BSD-style license. It is also a collection of open source tools and resources that allows researchers and developers to build speech recognition systems.
    Downloads: 435 This Week
  • 20
    Tensorpack

    A Neural Net Training Interface on TensorFlow, with focus on speed

    Tensorpack is a neural network training interface based on TensorFlow v1. It uses TensorFlow efficiently, with no extra overhead. On common CNNs, it runs training 1.2~5x faster than the equivalent Keras code. Your training can probably get faster if written with Tensorpack. A scalable data-parallel multi-GPU / distributed training strategy is available off the shelf. Squeeze the best data loading performance out of Python with tensorpack.dataflow. Symbolic programming (e.g. tf.data) does not...
    Downloads: 0 This Week
  • 21

    NFZ-core

    Neural Network Multiplayer Simulation

    Multiplayer shared problem-solving neural network AI simulation with chat, speech recognition, customizable identities, and a GUI. Chat in real time with users and the AI they have developed. Real-time simulation. NNET, FANN, and blockchain integration.
    Downloads: 0 This Week
  • 22
    NASH OS

    Nash Operating System for Modern Ecommerce

    The all-in-one, automatic, ready-to-go, out-of-the-box, easy-to-use, state-of-the-art NASH OS. Over 25,000 flexible, scalable features and controls. The most powerful solution ever built to instantly deliver new heights of online ecommerce enterprise to you.
    Downloads: 0 This Week
  • 23
    annyang!

    Speech recognition for your site

    annyang is a tiny JavaScript library that lets your visitors control your site with voice commands. annyang supports multiple languages, has no dependencies, weighs just 2 KB and is free to use. annyang understands commands with named variables, splats, and optional words. Use named variables for one-word arguments in your command. Use splats to capture multi-word text at the end of your command (greedy). Use optional words or phrases to define a part of the command as optional. annyang plays...
    Downloads: 2 This Week
  • 24

    KALDI IVR ASTERISK SPEECH

    Working template to create an Asterisk IVR system using Kaldi

    Working template for creating an Asterisk IVR system that uses Kaldi for speech recognition.
    Downloads: 0 This Week
  • 25
    ILA - teachable voice assistant

    ILA is a fully customizable and teachable voice assistant for Java

    ILA stands for (kind of) intelligent, learning assistant and is a speech recognition system, aka voice assistant, very similar to Siri, Google Now and Cortana. ILA is fully customizable, and you can teach her/him/it new things yourself, like executing system commands, opening web pages, programs and apps, or just some basic conversation :-) ILA runs on Java and thus is compatible with Windows, Mac and Linux. It is designed to integrate with your home environment and for example build up your own,...
    Downloads: 3 This Week