recognition free download

52 projects for "recognition" with 2 filters applied:

Filters: Multimedia, ChromeOS

1

Textream

Textream is a free macOS teleprompter app for streamers, interviewers

Textream is an open-source, free macOS teleprompter application designed for streamers, podcasters, presenters, and interviewers who want a smooth, distraction-free way to stay on script. It runs natively on macOS and leverages on-device speech recognition to highlight each word in real time as you speak, keeping your focus where it belongs — on delivery rather than memorization. The interface supports multiple modes of use, such as classic constant-scroll auto-scrolling, voice-activated scrolling that pauses when you’re silent, and direct word tracking that syncs the displayed script to your spoken pace. ...

Downloads: 30 This Week

Last Update: 2026-05-08
See Project
2

SCAIL

Towards Studio-Grade Character Animation via In-Context Learning of 3D

...While specific documentation about SCAIL’s exact goals and implementation is limited from the repository context alone, the project appears to be part of a collection of machine learning and AI research tools that facilitate scalable model development, evaluation, or application workflows. Given its listing alongside other ZAI projects like speech recognition and text-to-speech systems, SCAIL likely emphasizes scalable, composable AI learning frameworks that support researchers and practitioners in experimenting with learning algorithms, datasets, and model components. The repository structure suggests a focus on flexibility and extensibility, with potential integration into other ZAI tooling for training or analysis.

Downloads: 0 This Week

Last Update: 2026-05-06
See Project
3

Google2SRT

Download, save and convert multiple subtitles from YouTube videos

Google2SRT allows you to download, save and convert multiple subtitles and translations from YouTube and Google Video to SubRip (.srt) format, which is recognized by most video players. You can download XML subtitles or simply type the video's URL; Google2SRT will do the rest.

33 Reviews

Downloads: 34 This Week

Last Update: 2025-01-11
See Project
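The SubRip (.srt) layout that Google2SRT targets is simple enough to sketch. The snippet below is only an illustration of the cue format (index, `HH:MM:SS,mmm --> HH:MM:SS,mmm`, text, blank line); it is not Google2SRT's code, and the function names `srt_timestamp` and `to_srt` are hypothetical.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as HH:MM:SS,mmm (SubRip puts a comma before the milliseconds)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues):
    """Build SubRip text from an iterable of (start_seconds, end_seconds, text) tuples."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        # Each cue: sequence number, time range, then the subtitle text.
        blocks.append(f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n")
    # A blank line separates consecutive cues.
    return "\n".join(blocks)


if __name__ == "__main__":
    print(to_srt([(0, 1.5, "Hello"), (2.0, 3.25, "World")]))
```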
4

VietOCR

Provides optical character recognition (OCR) solutions for Vietnamese language.

24 Reviews

Downloads: 163 This Week

Last Update: 2026-05-27
See Project
5

Jvedio

Jvedio is a local video management software

...The software supports tagging, filtering, and advanced search, enabling users to manage large collections efficiently. It integrates AI-based features such as actor recognition and translation of metadata, improving the usability and accessibility of stored content. Jvedio also includes media processing tools powered by FFmpeg, allowing users to generate screenshots and GIF previews directly from videos. Its plugin system enables customization through themes and synchronization tools, while its modern interface provides a smooth user experience. ...

Downloads: 2 This Week

Last Update: 2026-04-24
See Project
6

AutoSub

A CLI script to generate subtitle files (SRT/VTT/TXT) for any video

...AutoSub leverages FFmpeg for media handling and integrates with speech recognition engines for transcription. It is particularly useful for content creators who want to quickly produce subtitles without manual effort. Overall, it simplifies the process of making media content accessible and searchable.

Downloads: 10 This Week

Last Update: 2026-04-28
See Project
7

gImageReader

A graphical frontend to tesseract-ocr

...Features include:
- Import PDF documents and images from disk, scanning devices, clipboard and screenshots
- Process multiple images and documents in one go
- Manual or automatic recognition area definition
- Recognize to plain text or to hOCR documents
- Recognized text displayed directly next to the image
- Post-process the recognized text, including spellchecking
- Generate PDF documents from hOCR documents

Note: This page is only a mirror for the downloads. Development is happening on GitHub at https://github.com/manisandro/gImageReader, release binaries are also posted there.

27 Reviews

Downloads: 134 This Week

Last Update: 2022-01-28
See Project
8

TimeSformer

The official pytorch implementation of our paper

TimeSformer is a vision transformer architecture for video that extends the standard attention mechanism into spatiotemporal attention. The model alternates attention along spatial and temporal dimensions (or designs variants like divided attention) so that it can capture both appearance and motion cues in video. Because the attention is global across frames, TimeSformer can reason about dependencies across long time spans, not just local neighborhoods. The official implementation in PyTorch...

Downloads: 0 This Week

Last Update: 2025-10-07
See Project
9

chatbot_chung

chatbot chung is a simple entertainment chatbot based on a keyword-probability algorithm, with 3D talking OpenGL avatars, written in FreeBASIC. It can import AIML data (simple question/answer, question/random answers, or single-star/multi-srai) saved from the open source "AIML_chung" application. An online HTML5/JavaScript version with automatic detection of 44 languages is available on the website (source included in the zip file). A SORT gentext text generation algorithm option has been added to the desktop version.

Downloads: 2 This Week

Last Update: 2020-06-27
See Project
10

JNIZ music notation audio to midi

music composition and notation software, audio to midi converter

The Jniz project has been stopped. The new Web version is now JnizWeb, hosted on GitLab (under construction): https://gitlab.com/jniz70/jnizweb/ Demo: https://jniz70.gitlab.io/jnizweb/ Jniz is a piece of software designed for musicians as a support tool for musical composition. It lets you build and harmonize several voices according to the rules of classical harmony. Sound/audio-to-MIDI converter: real-time conversion of any monophonic sound (voice, instrument etc.) into...

2 Reviews

Downloads: 2 This Week

Last Update: 2026-01-14
See Project
11

LaueTools

open source python packages for X-ray MicroLaue Diffraction analysis

LaueTools is an open-source project for white beam Laue x-ray microdiffraction data analysis including tools in image processing, peaks searching & indexing, crystal structure solving (orientation & strain) and data & grain mapping visualisation. Python 3 Code and new features are now at: https://gitlab.esrf.fr/micha/lauetools

2 Reviews

Downloads: 1 This Week

Last Update: 2019-09-12
See Project
12

Video Nonlocal Net

Non-local Neural Networks for Video Classification

...Non-local blocks compute attention-like responses across all positions in space-time, allowing a feature at one frame and location to aggregate information from distant frames and regions. This formulation improves action recognition and spatiotemporal reasoning, especially for classes requiring context beyond short temporal windows. The repo provides training recipes and models for standard datasets, as well as ablations that show how many non-local blocks to insert and at which stages. Efficient implementations keep memory and compute manageable so the blocks can be added without rewriting the entire backbone. ...

Downloads: 0 This Week

Last Update: 2025-10-07
See Project
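The non-local operation described for Video Nonlocal Net can be sketched in a few lines. The version below is a deliberately simplified assumption: it uses a Gaussian (dot-product, softmax-normalised) similarity and replaces the learned projections of a real non-local block with the identity, so it only shows how every space-time position aggregates information from all others before a residual add. It is not the repository's implementation, and `non_local` is a hypothetical name.

```python
import math


def non_local(features):
    """Apply a simplified non-local operation.

    features: list of N feature vectors (lists of floats), one per space-time
    position, i.e. a T*H*W grid flattened to N positions.
    Returns a new list where each vector is x_i + sum_j softmax_j(x_i . x_j) * x_j.
    Learned projections are omitted (treated as identity) for brevity.
    """
    out = []
    for xi in features:
        # Similarity of this position to every position, including distant frames.
        scores = [sum(a * b for a, b in zip(xi, xj)) for xj in features]
        top = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        # Weighted average of all positions' features.
        agg = [
            sum(w * xj[k] for w, xj in zip(weights, features)) / total
            for k in range(len(xi))
        ]
        # Residual connection: the block adds context to the original feature.
        out.append([a + b for a, b in zip(xi, agg)])
    return out
```

Because the output has the same shape as the input, such a block can be dropped between existing layers, which matches the listing's point that the backbone does not need rewriting.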
13

ILA - teachable voice assistant

ILA is a fully customizable and teachable voice assistant for Java

ILA stands for (kind of) intelligent, learning assistant and is a speech recognition system, aka voice assistant, very similar to Siri, Google Now and Cortana. ILA is fully customizable, and you can teach her/him/it new things yourself, like executing system commands, opening web pages, programs and apps, or just some basic conversation :-) ILA runs on Java and is thus compatible with Windows, Mac and Linux. It is designed to integrate with your home environment and, for example, to build your own free and open Amazon Echo replacement ;-) Right now the key components of ILA are the open source speech recognition CMU Sphinx-4, Google (Speech Recognition/Text-To-Speech) and MaryTTS (Text-To-Speech). ...

4 Reviews

Downloads: 1 This Week

Last Update: 2018-07-23
See Project
14

OpenPR

OpenPR stands for Open Pattern Recognition project and is intended to be an open source library for algorithms of image processing, computer vision, natural language processing, pattern recognition, machine learning and the related fields.

Downloads: 3 This Week

Last Update: 2018-05-15
See Project
15

libcrn

libcrn is a document image processing library written in C++11 for Linux, Windows, Mac OS X and Google Android. It is a toolbox for easily creating software such as OCR engines and layout analysis tools.

Downloads: 1 This Week

Last Update: 2016-10-23
See Project
16

Modular Audio Recognition Framework

MARF is a general cross-platform framework with a collection of algorithms for audio (voice, speech, and sound) and natural language text analysis and recognition along with sample applications (identification, NLP, etc.) of its use, implemented in Java.

3 Reviews

Downloads: 0 This Week

Last Update: 2015-10-06
See Project
17

Tygamusic

A pygame music lib.

...With this lib I want to create a layer that allows you to interact with the music the way you would expect. Currently featuring:
- Playlist
- Normal pausing and resuming (played time isn’t lost when a new song is loaded)
- Automatic recognition of songs and adding them to a separate list

Downloads: 0 This Week

Last Update: 2015-04-10
See Project
18

Extract Objects from Image

Connected Component Labeling Algorithm - Extracting Objects From Image

Fast Connected Component Labeling Algorithm - Java application - Extracting Objects From Image

Downloads: 0 This Week

Last Update: 2015-07-07
See Project
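Connected component labeling, the technique named in the Extract Objects from Image entry, assigns one label to each group of touching foreground pixels. The sketch below is a generic breadth-first flood-fill variant with 4-connectivity, written in Python for illustration; the project itself is a Java application and may use a different (for example two-pass) algorithm, and `label_components` is a hypothetical name.

```python
from collections import deque


def label_components(grid):
    """Label 4-connected foreground regions in a binary image.

    grid: list of rows, each a list of 0/1 values (1 = foreground).
    Returns (labels, count), where labels has the same shape as grid,
    background stays 0, and each object gets an id from 1 to count.
    """
    rows = len(grid)
    cols = len(grid[0]) if rows else 0
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] and labels[r][c] == 0:
                # Found an unlabeled foreground pixel: start a new object.
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    # Visit the up, down, left and right neighbours.
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (
                            0 <= ny < rows
                            and 0 <= nx < cols
                            and grid[ny][nx]
                            and labels[ny][nx] == 0
                        ):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count
```

Each labeled region can then be cropped by its bounding box to extract the object.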
19

LeapInto

Simplified interface to Leap Motion designed for art and music apps

LeapInto provides a simplified interface to the Leap Motion hand sensor input device. Multiple hand recognition is simplified to several stable categories and coordinates are normalised. The interface comes in two flavours at present: an open broadcast system using the OSC protocol and a plugin for the Csound audio/music programming language.

Downloads: 0 This Week

Last Update: 2016-05-05
See Project
20

InproTK

An Incremental Spoken Dialogue Processing Toolkit

InproTK is an Incremental Spoken Dialogue Processing Toolkit, that is, a toolkit to help you build dialogue systems that listen and talk incrementally, allowing for advanced interactional behaviour. Please see our Wiki for more information: http://sourceforge.net/p/inprotk/wiki/

Downloads: 0 This Week

Last Update: 2015-06-16
See Project
21

Image Cerberus

Image Cerberus is an image spam detector based on pattern recognition and image processing techniques. It can be used as a SpamAssassin plug-in or integrated into any other anti-spam filter. It has been widely tested, achieving high performance.

Downloads: 0 This Week

Last Update: 2014-01-14
See Project
22

Optical Mark Recognition MySQL and PHP

Optical Mark recognition with MySQL database and PHP scripts

http://omr-ai.sourceforge.net/ described how to use MySQL and PHP for OMR. The PHP scripts there were very dependent on scanned page size and resolution. I have written an OMR page format and PHP scripts that are more flexible. The script uses horizontal and vertical guides to identify the area to be analysed for MCQ answer checking and also to identify and correct tilt. It is tailor-made for A4 page size, but the resolution can be varied with little impact on results. The answer page and...

Downloads: 1 This Week

Last Update: 2014-08-20
See Project
23

HMM Speech Recognition in Matlab

A speech recognition system using Matlab/Simulink/Stateflow.

This project provides a hidden Markov model speech recognition system using Matlab/Simulink/Stateflow.

4 Reviews

Downloads: 0 This Week

Last Update: 2016-07-25
See Project
24

HMM Speech Recognition in Java

Downloads: 0 This Week

Last Update: 2013-09-21
See Project
25

Voce

A speech synthesis and recognition library that is cross-platform, accessible from Java and C++, and has a very small API. Uses CMU Sphinx4 and FreeTTS internally.

3 Reviews

Downloads: 0 This Week

Last Update: 2013-10-03
See Project