86 projects for "recognition" with 2 filters applied:

  • 1
    Textream

    Textream is a free macOS teleprompter app for streamers, interviewers

    Textream is an open-source, free macOS teleprompter application designed for streamers, podcasters, presenters, and interviewers who want a smooth, distraction-free way to stay on script. It runs natively on macOS and leverages on-device speech recognition to highlight each word in real time as you speak, keeping your focus where it belongs — on delivery rather than memorization. The interface supports multiple modes of use, such as classic constant-scroll auto-scrolling, voice-activated scrolling that pauses when you’re silent, and direct word tracking that syncs the displayed script to your spoken pace. ...
    Downloads: 37 This Week
  • 2
    SCAIL

    Towards Studio-Grade Character Animation via In-Context Learning of 3D

    ...While specific documentation about SCAIL’s exact goals and implementation is limited from the repository context alone, the project appears to be part of a collection of machine learning and AI research tools that facilitate scalable model development, evaluation, or application workflows. Given its listing alongside other ZAI projects like speech recognition and text-to-speech systems, SCAIL likely emphasizes scalable, composable AI learning frameworks that support researchers and practitioners in experimenting with learning algorithms, datasets, and model components. The repository structure suggests a focus on flexibility and extensibility, with potential integration into other ZAI tooling for training or analysis.
    Downloads: 0 This Week
  • 3
    Google2SRT

    Download, save and convert multiple subtitles from YouTube videos

    Google2SRT allows you to download, save and convert multiple subtitles and translations from YouTube and Google Video to SubRip (.srt) format, which is recognized by most video players. You can download XML subtitles or simply type the video's URL, and Google2SRT will do the rest.
    Downloads: 26 This Week
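The SubRip (.srt) target format mentioned in the Google2SRT entry is simple enough to sketch. The snippet below is a minimal illustration, not Google2SRT's own code: the function names `srt_timestamp` and `to_srt` and the `(start, end, text)` cue tuples are assumptions made for this example. It shows the numbered cue blocks and the `HH:MM:SS,mmm` timestamps that the format uses.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues):
    """Render hypothetical (start, end, text) cues, times in seconds, as SubRip text."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        # Each cue block: running number, time range, then the subtitle text.
        time_range = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        blocks.append(f"{index}\n{time_range}\n{text}\n")
    # Cue blocks are separated from each other by a blank line.
    return "\n".join(blocks)


if __name__ == "__main__":
    print(to_srt([(0, 1.5, "Hello"), (2, 4.25, "World")]))
```

Converting a real subtitle download would additionally require parsing the source XML, which is left out here.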
  • 4
    Provides optical character recognition (OCR) solutions for Vietnamese language.
    Leader badge
    Downloads: 159 This Week
  • 5
    Jvedio

    Jvedio is a local video management software

    ...The software supports tagging, filtering, and advanced search, enabling users to manage large collections efficiently. It integrates AI-based features such as actor recognition and translation of metadata, improving the usability and accessibility of stored content. Jvedio also includes media processing tools powered by FFmpeg, allowing users to generate screenshots and GIF previews directly from videos. Its plugin system enables customization through themes and synchronization tools, while its modern interface provides a smooth user experience. ...
    Downloads: 5 This Week
  • 6
    AutoSub

    A CLI script to generate subtitle files (SRT/VTT/TXT) for any video

    ...AutoSub leverages FFmpeg for media handling and integrates with speech recognition engines for transcription. It is particularly useful for content creators who want to quickly produce subtitles without manual effort. Overall, it simplifies the process of making media content accessible and searchable.
    Downloads: 10 This Week
  • 7
    Physics simulation software based on user sketches. Running a pattern recognition agent, this app can animate a physics sketch from a blackboard.
    Downloads: 0 This Week
  • 8
    gImageReader

    A graphical frontend to tesseract-ocr

    ...Features include:
    - Import PDF documents and images from disk, scanning devices, clipboard and screenshots
    - Process multiple images and documents in one go
    - Manual or automatic recognition area definition
    - Recognize to plain text or to hOCR documents
    - Recognized text displayed directly next to the image
    - Post-process the recognized text, including spellchecking
    - Generate PDF documents from hOCR documents

    Note: This page is only a mirror for the downloads. Development happens on GitHub at https://github.com/manisandro/gImageReader, and release binaries are also posted there.
    Downloads: 122 This Week
  • 9
    TimeSformer

    The official pytorch implementation of our paper

    TimeSformer is a vision transformer architecture for video that extends the standard attention mechanism into spatiotemporal attention. The model alternates attention along spatial and temporal dimensions (or designs variants like divided attention) so that it can capture both appearance and motion cues in video. Because the attention is global across frames, TimeSformer can reason about dependencies across long time spans, not just local neighborhoods. The official implementation in PyTorch...
    Downloads: 0 This Week
  • 10
    LTI-Lib is an object-oriented computer vision library written in C++ for Windows/MS-VC++ and Linux/gcc. It provides lots of functionality to solve mathematical problems, many image processing algorithms, some classification tools and much more...
    Downloads: 1 This Week
  • 11
    General C++ Library, with modules for Computer Vision, Pattern Recognition and much more.
    Downloads: 3 This Week
  • 12
    chatbot_chung
    chatbot chung is a simple entertainment chatbot based on a keyword-probability algorithm, with 3D talking OpenGL avatars, written in FreeBASIC. It can import simple AIML data (question/answer, question/random answers, or single-star/multi-srai) saved from the open-source "AIML_chung" application. An online HTML5/JavaScript version with automatic detection of 44 languages is available on the website (source included in the zip file). The desktop version adds an option for the SORT gentext text generation algorithm.
    Downloads: 2 This Week
  • 13
    JNIZ music notation audio to midi

    music composition and notation software, audio to midi converter

    The Jniz project is stopped. The new web version, JnizWeb, is now hosted on GitLab (under construction): https://gitlab.com/jniz70/jnizweb/ Demo: https://jniz70.gitlab.io/jnizweb/ Jniz is a piece of software designed for musicians as a support tool for musical composition. It allows you to build and harmonize several voices according to the rules of classical harmony. Sound/audio-to-MIDI converter: real-time conversion of any monophonic sound (voice, instrument etc.) into...
    Downloads: 1 This Week
  • 14
    LaueTools

    open source python packages for X-ray MicroLaue Diffraction analysis

    LaueTools is an open-source project for white beam Laue x-ray microdiffraction data analysis including tools in image processing, peaks searching & indexing, crystal structure solving (orientation & strain) and data & grain mapping visualisation. Python 3 Code and new features are now at: https://gitlab.esrf.fr/micha/lauetools
    Downloads: 2 This Week
  • 15
    Video Nonlocal Net

    Non-local Neural Networks for Video Classification

    ...Non-local blocks compute attention-like responses across all positions in space-time, allowing a feature at one frame and location to aggregate information from distant frames and regions. This formulation improves action recognition and spatiotemporal reasoning, especially for classes requiring context beyond short temporal windows. The repo provides training recipes and models for standard datasets, as well as ablations that show how many non-local blocks to insert and at which stages. Efficient implementations keep memory and compute manageable so the blocks can be added without rewriting the entire backbone. ...
    Downloads: 0 This Week
  • 16
    ILA - teachable voice assistant

    ILA is a fully customizable and teachable voice assistant for Java

    ILA stands for (kind of) intelligent, learning assistant and is a speech recognition system, aka voice assistant, very similar to Siri, Google Now and Cortana. ILA is fully customizable and you can teach her/him/it new things yourself, like executing system commands, opening web pages, programs and apps or just some basic conversation :-) ILA runs on Java and thus is compatible with Windows, Mac and Linux. It is designed to integrate with your home environment so that you can, for example, build your own free and open Amazon Echo replacement ;-) Right now the key components of ILA are the open source speech recognition CMU Sphinx-4, Google (Speech Recognition/Text-To-Speech) and MaryTTS (Text-To-Speech). ...
    Downloads: 1 This Week
  • 17
    OpenPR
    OpenPR stands for Open Pattern Recognition project and is intended to be an open source library for algorithms of image processing, computer vision, natural language processing, pattern recognition, machine learning and the related fields.
    Downloads: 3 This Week
  • 18
    Animal is AN IMAging Library written in C. Its simple API supports over 80 image formats, and is intended to make massive use of other image processing libraries. Animal aims at image analysis and recognition. It is mainly the C basis of the SIP toolbox.
    Downloads: 0 This Week
  • 19
    Gamera is a framework for the creation of structured document analysis applications by domain experts. It combines a programming library with GUI tools for the training and interactive development of recognition systems.
    Downloads: 0 This Week
  • 20
    libcrn is a document image processing library written in C++11 for Linux, Windows, Mac OS X and Google Android. It is a toolbox that makes it easy to create software such as OCR engines and layout analysis tools.
    Downloads: 1 This Week
  • 21
    MARF is a general cross-platform framework with a collection of algorithms for audio (voice, speech, and sound) and natural language text analysis and recognition along with sample applications (identification, NLP, etc.) of its use, implemented in Java.
    Downloads: 0 This Week
  • 22

    Tygamusic

    A pygame music lib.

    ...With this lib I want to create a layer that allows you to interact with the music the way you would expect. Currently featuring:
    - Playlist
    - Normal pausing and resuming (played time isn’t lost when a new song is loaded)
    - Automatic recognition of songs and adding them to a separate list
    Downloads: 0 This Week
  • 23

    High-order HMM in Matlab

    Implementation of duration high-order hidden Markov model in Matlab.

    Implementation of duration high-order hidden Markov model (DHO-HMM) in Matlab with application in speech recognition.
    Downloads: 0 This Week
  • 24
    jaivox

    Speech recognition application builder and library

    Java library and tools to create open source speech recognition applications. Generates dialogs for conversational interfaces. Works with a popular open source speech recognition library.
    Downloads: 0 This Week
  • 25

    Extract Objects from Image

    Connected Component Labeling Algorithm - Extracting Objects From image

    Fast connected component labeling algorithm (Java application) for extracting objects from an image
    Downloads: 0 This Week
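For readers unfamiliar with connected component labeling, the idea behind the last entry can be sketched in a few lines. This is a generic breadth-first flood-fill variant with 4-connectivity on a binary grid, written in Python for illustration; it is not the listed Java application's code, and the function name `label_components` and the list-of-lists image representation are assumptions for this example.

```python
from collections import deque


def label_components(grid):
    """Label 4-connected foreground regions of a binary grid.

    Returns (labels, count): labels has the same shape as grid, with 0 for
    background and 1..count identifying each extracted object.
    """
    rows = len(grid)
    cols = len(grid[0]) if rows else 0
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not grid[r][c] or labels[r][c]:
                continue  # background pixel, or already part of an object
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                # Visit the four direct neighbours (up, down, left, right).
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] and not labels[ny][nx]):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count


if __name__ == "__main__":
    image = [
        [1, 1, 0],
        [0, 0, 0],
        [0, 1, 1],
    ]
    labels, count = label_components(image)
    print(count)
    for row in labels:
        print(row)
```

Each distinct label then corresponds to one extracted object; faster two-pass, union-find-based variants exist, but this sketch favours readability.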