Showing 66 open source projects for "fuzzy word recognition"

  • 1
    whisper-timestamped

    Multilingual Automatic Speech Recognition with word-level timestamps

    Multilingual Automatic Speech Recognition with word-level timestamps and confidence. Whisper is a set of multi-lingual, robust speech recognition models trained by OpenAI that achieve state-of-the-art results in many languages. Whisper models were trained to predict approximate timestamps on speech segments (most of the time with 1-second accuracy), but they cannot originally predict word timestamps.
    Downloads: 0 This Week
  • 2
    Tesseract.js

    A pure Javascript Multilingual OCR

    Tesseract.js is a pure JavaScript port of the popular Tesseract OCR engine. The library supports more than 100 languages, automatic text orientation and script detection, and a simple interface for reading paragraph, word, and character bounding boxes. Tesseract.js can run either in a browser or on a server with Node.js. It extracts words in almost any language from images. The main Tesseract.js functions (e.g. recognize, detect) take an image...
    Downloads: 16 This Week
  • 3
    fzf

    A command-line fuzzy finder

    ...(You can override the default command with FZF_DEFAULT_COMMAND). fzf by default starts in fullscreen mode, but you can make it start below the cursor with the height option. Unless otherwise specified, fzf starts in "extended-search mode" where you can type in multiple search terms delimited by spaces. Fuzzy completion for files and directories can be triggered if the word before the cursor ends with the trigger sequence, which is by default **. Fuzzy completion for PIDs is provided for the kill command. In this case, there is no trigger sequence.
    Downloads: 7 This Week
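The space-delimited search terms described for fzf can be illustrated with a minimal sketch. This is not fzf's actual matching or scoring algorithm; it only assumes that every term must match a candidate as an in-order subsequence of characters. The names `is_subsequence` and `fuzzy_filter` are hypothetical.

```python
def is_subsequence(term: str, text: str) -> bool:
    """Return True if the characters of term appear in text in order."""
    remaining = iter(text.lower())
    # "ch in remaining" consumes the iterator, so order is enforced.
    return all(ch in remaining for ch in term.lower())


def fuzzy_filter(query: str, candidates: list[str]) -> list[str]:
    """Keep candidates that fuzzy-match every space-delimited term."""
    terms = query.split()
    return [c for c in candidates if all(is_subsequence(t, c) for t in terms)]


files = ["src/main.go", "src/matcher.go", "README.md", "docs/manual.txt"]
print(fuzzy_filter("src mg", files))
```

Both terms must match for a candidate to survive, which is why narrowing a search by typing another term never adds results in this sketch.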
  • 4
    Underthesea

    Underthesea - Vietnamese NLP Toolkit

    Underthesea is a Vietnamese NLP toolkit providing various text processing capabilities, including word segmentation, part-of-speech tagging, and named entity recognition.
    Downloads: 1 This Week
  • 5
    DocTR

    Library for OCR-related tasks powered by Deep Learning

    ...End-to-End OCR is achieved in docTR using a two-stage approach: text detection (localizing words), then text recognition (identify all characters in the word). As such, you can select the architecture used for text detection, and the one for text recognition from the list of available implementations.
    Downloads: 4 This Week
  • 6
    annyang!

    Speech recognition for your site

    annyang is a tiny JavaScript library that lets your visitors control your site with voice commands. annyang supports multiple languages, has no dependencies, weighs just 2 KB, and is free to use. annyang understands commands with named variables, splats, and optional words. Use named variables for one-word arguments in your command. Use splats to capture multi-word text at the end of your command (greedy). Use optional words or phrases to define a part of the command as optional. annyang plays nicely with all browsers, progressively enhancing browsers that support SpeechRecognition, while leaving users with older browsers unaffected. ...
    Downloads: 0 This Week
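The three command features described for annyang (named variables, splats, optional words) can be sketched as a translation from a command pattern to a regular expression. This is an illustration in Python, not annyang's implementation or API. It assumes the pattern syntax `:name` for one word, `*name` for the greedy remainder, and `(words)` for an optional phrase, and it assumes variable names are valid identifiers. The function `compile_command` is hypothetical.

```python
import re


def compile_command(pattern: str) -> re.Pattern:
    """Translate a command pattern into a compiled regular expression.

    Assumed syntax (illustrative only):
      :name    a single word
      *name    the rest of the phrase (greedy)
      (words)  an optional word or phrase
    """
    token = re.compile(r"\s*\(([^)]*)\)\s*|:(\w+)|\*(\w+)")
    parts = []
    pos = 0
    for m in token.finditer(pattern):
        # Literal text between tokens is matched verbatim.
        parts.append(re.escape(pattern[pos:m.start()]))
        if m.group(1) is not None:
            # Optional phrase, with flexible surrounding whitespace.
            parts.append(r"\s*(?:" + re.escape(m.group(1)) + r")?\s*")
        elif m.group(2) is not None:
            # Named variable: exactly one whitespace-free word.
            parts.append(r"(?P<" + m.group(2) + r">\S+)")
        else:
            # Splat: everything that remains.
            parts.append(r"(?P<" + m.group(3) + r">.+)")
        pos = m.end()
    parts.append(re.escape(pattern[pos:]))
    return re.compile("^" + "".join(parts) + "$", re.IGNORECASE)


match = compile_command("show me *tag").match("show me cute kittens")
print(match.group("tag"))
```

A recognized phrase would be tested against each compiled command in turn, and the captured groups passed to that command's handler.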
  • 7
    Textream

    Textream is a free macOS teleprompter app for streamers, interviewers

    Textream is an open-source, free macOS teleprompter application designed for streamers, podcasters, presenters, and interviewers who want a smooth, distraction-free way to stay on script. It runs natively on macOS and leverages on-device speech recognition to highlight each word in real time as you speak, keeping your focus where it belongs — on delivery rather than memorization. The interface supports multiple modes of use, such as classic constant-scroll auto-scrolling, voice-activated scrolling that pauses when you’re silent, and direct word tracking that syncs the displayed script to your spoken pace. ...
    Downloads: 24 This Week
  • 8
    Polyglot

    Cross-platform AI language practice app

    ...Users can define custom AI personas, choose languages, and configure their own OpenAI and Azure keys so they retain control over which backends they use. The app supports speech recognition with quick keyboard shortcuts, allowing learners to hold down a key to speak and release it to submit for recognition and response. It includes translation features, dark mode, playback of the user’s own recorded speech, and word highlighting that tracks the progress of synthesized audio to make following along easier. Polyglot also integrates additional AI providers, supports configurable conversation scenarios, and lets users personalize avatars, making the experience more engaging and flexible.
    Downloads: 4 This Week
  • 9
    HanLP

    Han Language Processing

    ...Built on TensorFlow 2.0, it was designed to advance state-of-the-art deep learning techniques and popularize the application of natural language processing in both academia and industry. HanLP is capable of lexical analysis (Chinese word segmentation, part-of-speech tagging, named entity recognition), syntax analysis, text classification, and sentiment analysis. It comes with pretrained models for numerous languages including Chinese and English. It offers efficient performance, clear structure and customizable features, with plenty more amazing features to look forward to on the roadmap.
    Downloads: 3 This Week
  • 10
    WhisperJAV

    Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD

    WhisperJAV is an open-source speech transcription pipeline designed specifically for generating subtitles for Japanese adult video content. The project addresses challenges that standard speech recognition models face when transcribing this type of audio, which often includes low signal-to-noise ratios and large numbers of non-verbal vocalizations. Traditional automatic speech recognition systems can misinterpret these sounds as words, leading to inaccurate transcripts. WhisperJAV introduces...
    Downloads: 11 This Week
  • 11
    gse

    Go efficient multilingual NLP and text segmentation

    Efficient multilingual NLP and text segmentation in Go, with support for English, Chinese, Japanese, and other languages. Gse implements jieba in Go and aims to add NLP support and more features. It supports multiple word segmentation modes: common, search engine, full mode, precise mode, and HMM mode. It supports user and embedded dictionaries, part-of-speech (POS) tagging, segment info analysis, and stop words and trim words. It supports Traditional Chinese. It supports HMM text cutting using the Viterbi algorithm. NLP support via TensorFlow (in progress). Named Entity Recognition (in progress). ...
    Downloads: 0 This Week
  • 12
    TTime

    Screenshots, word marking, OCR, AI, translation software

    TTime is a desktop productivity tool that combines translation, OCR, and screen capture capabilities into a unified application designed for fast and efficient text processing workflows. It allows users to translate text through multiple methods, including direct input, screenshot-based capture, and real-time word selection, making it versatile for both casual use and professional tasks. The software integrates a wide range of translation engines and OCR services, including cloud-based...
    Downloads: 1 This Week
  • 13
    Scriberr

    Self-hosted AI audio transcription

    ...Unlike cloud-based transcription services, Scriberr runs entirely on the user’s machine, ensuring that sensitive recordings are never sent to third-party servers and remain fully under user control. It leverages modern speech recognition models such as Whisper and other advanced architectures to deliver precise transcripts with word-level timing and speaker identification. The application includes a polished user interface that simplifies the management of recordings, transcripts, and annotations, making it suitable for both casual users and professionals handling large volumes of audio. ...
    Downloads: 7 This Week
  • 14
    spaCy

    Industrial-strength Natural Language Processing (NLP)

    spaCy is a library built on the very latest research for advanced Natural Language Processing (NLP) in Python and Cython. Since its inception it has been designed for real-world applications: building real products and gathering real insights. It comes with pretrained statistical models and word vectors, convolutional neural network models, easy deep learning integration and so much more. spaCy is the fastest syntactic parser in the world according to independent benchmarks, with...
    Downloads: 10 This Week
  • 15
    LLPlayer

    The media player for language learning, with dual subtitles

    LLPlayer is an open-source media player designed specifically for language learning through video content. Unlike traditional media players, the application focuses on advanced subtitle-related features that help learners understand and interact with foreign language media more effectively. The player supports dual subtitles so users can simultaneously view text in both the original language and their native language while watching videos. It can also automatically generate subtitles in real...
    Downloads: 16 This Week
  • 16
    spaCy models

    Models for the spaCy Natural Language Processing (NLP) library

    spaCy is designed to help you do real work, to build real products, or gather real insights. The library respects your time, and tries to avoid wasting it. It's easy to install, and its API is simple and productive. spaCy excels at large-scale information extraction tasks. It's written from the ground up in carefully memory-managed Cython. If your application needs to process entire web dumps, spaCy is the library you want to be using. Since its release in 2015, spaCy has become an industry...
    Downloads: 16 This Week
  • 17
    Rofi

    A window switcher, application launcher and dmenu replacement

    Rofi started as a clone of the simple switcher, written by Sean Pringle - a popup window switcher roughly based on a super switcher. Simpleswitcher laid the foundations, and therefore Sean Pringle deserves most of the credit for this tool. Rofi (renamed, as it lost the simple property) has been extended with extra features, like an application launcher and ssh-launcher, and can act as a drop-in menu replacement, making it a very versatile tool. Rofi, like dmenu, will provide the user with a...
    Downloads: 4 This Week
  • 18
    BaikalDB

    BaikalDB, A Distributed HTAP Database

    ...In a typical scenario, hundreds of millions of rows can be scanned and aggregated in a few seconds. BaikalDB also supports full-text search by building inverted indices after word segmentation. Users can enable the fuzzy search features simply by adding a FULLTEXT KEY type index when creating tables.
    Downloads: 0 This Week
  • 19
    flair

    A very simple framework for state-of-the-art NLP

    ...Developed by Humboldt University of Berlin and friends. A powerful NLP library. Flair allows you to apply our state-of-the-art natural language processing (NLP) models to your text, such as named entity recognition (NER), sentiment analysis, part-of-speech tagging (PoS), special support for biomedical texts, sense disambiguation and classification, with support for a rapidly growing number of languages. A text embedding library. Flair has simple interfaces that allow you to use and combine different word and document embeddings, including our proposed Flair embeddings and various transformers. ...
    Downloads: 0 This Week
  • 20
    match-sorter

    Simple, expected, and deterministic best-match sorting

    match-sorter is a small JavaScript library that takes a list of items and returns them sorted by how well they match a given search query. It is designed to produce “simple, expected, and deterministic” results so users see intuitive matches instead of opaque fuzzy scores. The core API accepts arrays of strings or objects and returns a filtered, ranked list, making it a natural fit for search boxes, autocomplete components, and table filtering. It supports a variety of advanced options, such...
    Downloads: 0 This Week
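The idea of deterministic best-match sorting described for match-sorter can be sketched with fixed match tiers instead of numeric fuzzy scores. This is a Python illustration, not match-sorter's API or its actual ranking rules. The tiers, the alphabetical tie-break, and the names `rank` and `match_sort` are assumptions made for the example.

```python
def rank(item: str, query: str) -> int:
    """Return a match tier: lower is better, -1 means no match."""
    if item == query:
        return 0  # exact, case-sensitive match
    a, q = item.lower(), query.lower()
    if a == q:
        return 1  # exact match ignoring case
    if a.startswith(q):
        return 2  # item starts with the query
    if any(word.startswith(q) for word in a.split()):
        return 3  # some word in the item starts with the query
    if q in a:
        return 4  # query appears anywhere in the item
    return -1


def match_sort(items: list[str], query: str) -> list[str]:
    """Filter out non-matches, then sort by tier and alphabetically."""
    ranked = [(rank(item, query), item) for item in items]
    return [item for tier, item in sorted(r for r in ranked if r[0] >= 0)]


fruits = ["pineapple", "Apple", "apple pie", "green apple", "banana"]
print(match_sort(fruits, "apple"))
```

Because each item lands in exactly one tier and ties are broken alphabetically, the same input always produces the same order.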
  • 21
    Stanza

    Stanford NLP Python library for many human languages

    Stanza is a collection of accurate and efficient tools for the linguistic analysis of many human languages. Starting from raw text to syntactic analysis and entity recognition, Stanza brings state-of-the-art NLP models to languages of your choosing. Stanza is a Python natural language analysis package. It contains tools, which can be used in a pipeline, to convert a string containing human language text into lists of sentences and words, to generate base forms of those words, their parts of...
    Downloads: 1 This Week
  • 22
    OmegaT - multiplatform CAT tool

    The free computer aided translation (CAT) tool for professionals

    OmegaT is a free and open source multiplatform Computer Assisted Translation tool with fuzzy matching, translation memory, keyword search, glossaries, and translation leveraging into updated projects.
    Downloads: 1,581 This Week
  • 23
    PDP Astronomical Image Framework

    Quantum Cosmology & Astrophysics Unified Suite (QCAUS)

    A collection of four interconnected open-source projects that explore the quantum nature of the universe, from the early cosmos to extreme astrophysical environments. Live demo (deployed on Streamlit Cloud): https://qcuas-quantum-cosmology-astrophysics-unified-suite.streamlit.app/ The projects: QCI AstroEntangle Refiner (FDM soliton physics & image processing); Magnetar QED Explorer (magnetar fields, dark photons & vacuum QED); Primordial...
    Downloads: 0 This Week
  • 24
    Gravia-5

    Gravia is a desktop AI virtual assistant

    Meet Gravia, the ultimate desktop AI virtual assistant designed to revolutionize your desktop experience. With Gravia by your side, you can streamline your tasks with ease and efficiency. Getting started with Gravia is easy - simply download and install the application on your desktop. For any questions or support, our dedicated team is here to assist you every step of the way. Customize and personalize your Gravia experience to match your unique preferences. Rest assured, your privacy...
    Downloads: 0 This Week
  • 25
    Kindle Mate(KMate)

    Kindle clippings and Kindle Vocabulary Builder manager

    KMate is the ultimate reading companion for Kindle users — and the all-new, cross-platform successor to Kindle Mate, the classic Kindle notes manager trusted by readers worldwide for over a decade. It is the only Kindle assistant that unifies cross-device import, cloud sync, vocabulary & dictionary management, flexible export, reading analytics, and AI-powered definitions — all in one app. ## KMate 3 for Windows latest (Store...
    Downloads: 24 This Week