42 projects for "ocr a" with 2 filters applied:

  • 1
    DeepSeek-OCR

    Contexts Optical Compression

    DeepSeek-OCR is an open-source optical character recognition solution built as part of the broader DeepSeek AI vision-language ecosystem. It is designed to extract text from images, PDFs, and scanned documents, and integrates with multimodal capabilities that understand layout, context, and visual elements beyond raw character recognition. The system treats OCR not simply as “read the text” but as “understand what the text is doing in the image”—for example distinguishing captions from body text, interpreting tables, or recognizing handwritten versus printed words. ...
    Downloads: 5 This Week
  • 2
    GLM-OCR

    Accurate × Fast × Comprehensive

    GLM-OCR is an open-source multimodal optical character recognition (OCR) model built on a GLM-V encoder–decoder foundation that brings robust, accurate document understanding to complex real-world layouts and modalities. Designed to handle text recognition, table parsing, formula extraction, and general information retrieval from documents containing mixed content, GLM-OCR excels across major benchmarks while remaining highly efficient with a relatively compact parameter size (~0.9B), enabling deployment in high-concurrency services and edge environments. ...
    Downloads: 4 This Week
  • 3
    Unlimited OCR Works

    Welcome the Era of One-shot Long-horizon Parsing

    Unlimited-OCR is an OCR and document parsing model project focused on one-shot long-horizon parsing. It is designed to push OCR beyond short, isolated image recognition and into longer document understanding workflows. The project supports single-image parsing as well as multi-page and PDF-style parsing by converting pages into images. It provides inference paths for Hugging Face Transformers, vLLM, and SGLang, which gives users several deployment options.
    Downloads: 6 This Week
  • 4
    DeepSeek-OCR 2

    Visual Causal Flow

    DeepSeek-OCR-2 is the second-generation optical character recognition system developed to improve document understanding by introducing a “visual causal flow” mechanism, enabling the encoder to reorder visual tokens in a way that better reflects semantic structure rather than strict raster scan order. It is designed to handle complex layouts and noisy documents by giving the model causal reasoning capabilities that mimic human visual scanning behavior, enhancing OCR performance on documents with rich spatial structure. ...
    Downloads: 3 This Week
  • 5
    OpenOCR

    An Open-Source Toolkit for General-OCR Research and Applications

    OpenOCR is an open-source General OCR toolkit developed by the OCR team at Fudan University for research and real-world document processing applications. It provides a unified platform for text detection, text recognition, formula recognition, table recognition, and document parsing. Built on advanced OCR technologies such as SVTRv2 and UniRec-0.1B, OpenOCR delivers high accuracy while maintaining efficient inference performance.
    Downloads: 0 This Week
  • 6
    dots.ocr

    Multilingual Document Layout Parsing in a Single Vision-Language Model

    ...Beyond standard OCR tasks, it extends its capabilities to parse complex visual elements such as charts, diagrams, and web interfaces, converting them into structured outputs like SVG code.
    Downloads: 0 This Week
  • 7
    Scribe.js

    JavaScript OCR and text extraction for images and PDFs

    Scribe.js is a JavaScript library that provides Optical Character Recognition (OCR) and text extraction capabilities for both images and PDF documents, aimed at developers who want to build OCR features directly into their applications. The library can take image files (such as PNG or JPEG) and recognize the text they contain, and it can also extract text from PDF files that either already contain text or are image-based scans, using modern web standards and WebAssembly under the hood. ...
    Downloads: 0 This Week
  • 8
    Chandra

    OCR model for complex documents with layout-aware structured outputs

    Chandra is an advanced OCR model designed to extract and structure information from complex documents such as tables, forms, handwritten notes, and mathematical content. It focuses on preserving full document layout, meaning that extracted text is accompanied by positional metadata like bounding boxes for each element. Chandra supports multiple output formats including Markdown, HTML, and JSON, making it suitable for downstream processing and integration into data pipelines.
    Downloads: 1 This Week
  • 9
    React Native ExecuTorch

    Declarative way to run AI models in React Native on device

    ...It is powered by ExecuTorch and provides a declarative approach to on-device model execution. The project supports a range of AI use cases, including large language models, computer vision, OCR, object detection, speech processing, segmentation, and embeddings. It helps React Native developers use local AI capabilities without needing deep native programming or machine learning infrastructure expertise. The library is especially relevant for privacy-first apps, offline experiences, and mobile products that need low-latency inference. ...
    Downloads: 0 This Week
  • 10
    Extractous

    Fast and efficient unstructured data extraction

    ...For broader format support, the system combines its Rust core with ahead-of-time compiled Apache Tika shared libraries, which allows it to extend parsing coverage while still avoiding traditional server-based overhead. It also supports OCR for images and scanned documents through Tesseract, making it useful for document ingestion pipelines that include image-based or scanned inputs.
    Downloads: 0 This Week
  • 11
    DocStrange

    Extract and convert data from any document: images, PDFs, Word docs

    DocStrange is an open-source document understanding and extraction library designed to convert complex files into structured, LLM-ready outputs such as Markdown, JSON, CSV, and HTML. Developed by Nanonets, the project combines OCR, layout detection, table understanding, and structured extraction into one end-to-end pipeline, which reduces the need to stitch together multiple separate services. It is built for developers who need high-quality parsing from scans, photos, PDFs, office files, and other document sources while preserving privacy and control over the processing flow. ...
    Downloads: 0 This Week
  • 12
    SimpleHTR

    Handwritten Text Recognition (HTR) system implemented with TensorFlow

    SimpleHTR is an open-source implementation of a handwriting text recognition system based on deep learning techniques. The project focuses on converting images of handwritten text into machine-readable digital text using neural networks. The system uses a combination of convolutional neural networks and recurrent neural networks to extract visual features and model sequential character patterns in handwriting. It also employs connectionist temporal classification (CTC) to align predicted...
    Downloads: 1 This Week
  • 13
    autoMate

    AI tool for automating desktop tasks via natural language input

    autoMate is an AI-powered local automation tool designed to enable users to control and automate their computers using natural language instructions instead of traditional scripting or rule-based systems. It combines large language models with computer vision techniques to interpret user intent and understand on-screen content, allowing it to interact with graphical interfaces similarly to a human user. autoMate follows an observe-decide-act workflow, where it analyzes the screen, plans...
    Downloads: 0 This Week
  • 14
    docext

    An on-premises, OCR-free unstructured data extraction

    docext is a document intelligence toolkit that uses vision-language models to extract structured information from documents such as PDFs, forms, and scanned images. The system is designed to operate entirely on-premises, allowing organizations to process sensitive documents without relying on external cloud services. Unlike traditional document processing pipelines that rely heavily on optical character recognition, docext leverages multimodal AI models capable of understanding both visual...
    Downloads: 0 This Week
  • 15
    Provides optical character recognition (OCR) solutions for Vietnamese language.
    Downloads: 234 This Week
  • 16
    chessPDFBrowser

    Chess application which allows working with chess PDF books and PGNs.

    Chess application which allows working with PDFs and PGNs. You can work with the chess games in a PDF and edit their variation trees. Features: graphical environment; standard PGN tags; PGN comments; OCR-like FEN string detection from chess board position images; connection to UCI chess engines (like Stockfish); position analysis and full game analysis. You can now play games against UCI engines. A pdf2pgn command-line tool is included, along with detailed documentation. Multilanguage support currently covers English, Spanish and Catalan. ...
    Downloads: 52 This Week
  • 17
    OpenKM Document Management - DMS

    Document Management System and Content Management System

    OpenKM Community Edition is a free Document Management System (DMS) that helps businesses control the production, storage, management and distribution of electronic documents, boosting effectiveness and productivity. It integrates document management, collaboration and advanced search into one easy-to-use solution, including administration tools for user roles, access control, security levels, activity logs and automation setup. With OpenKM Community Edition you can: Collect information...
    Downloads: 258 This Week
  • 18
    gImageReader

    A graphical frontend to tesseract-ocr

    gImageReader is a simple Gtk/Qt front-end to tesseract. Features include: importing PDF documents and images from disk, scanning devices, clipboard and screenshots; processing multiple images and documents in one go; manual or automatic recognition area definition; recognizing to plain text or to hOCR documents; displaying recognized text directly next to the image; post-processing the recognized text, including spellchecking; and generating PDF documents from hOCR documents. **Note**:...
    Downloads: 146 This Week
  • 19
    cintruder

    CIntruder - OCR Bruteforcing Toolkit

    Captcha Intruder is an automatic pentesting tool to bypass captchas.
    CIntruder-v0.4 (.zip): md5 = 6326ab514e329e4ccd5e1533d5d53967
    CIntruder-v0.4 (.tar.gz): md5 = 2256fccac505064f3b84ee2c43921a68
    Downloads: 0 This Week
  • 20
    A Java JNA wrapper for Tesseract OCR API
    Downloads: 44 This Week
  • 21
    Convolutional Recurrent Neural Network

    Convolutional Recurrent Neural Network (CRNN) for image-based sequence

    Convolutional Recurrent Neural Network provides an implementation of the Convolutional Recurrent Neural Network (CRNN) architecture, a deep learning model designed for image-based sequence recognition tasks such as optical character recognition and scene text recognition. The architecture combines convolutional neural networks for extracting visual features from images with recurrent neural networks that model sequential dependencies in the extracted features. This hybrid approach allows the...
    Downloads: 0 This Week
  • 22

    WebDjVuTextEd

    Edit the OCR text layer of DjVu documents in a web browser

    WebDjVuTextEd lets you edit the text layer of OCR'ed DjVu documents in a web browser. You can modify the structure (paragraphs, lines, words...), create, delete, and edit text nodes, adjust their container boxes with the mouse, and run a spellchecker. The program does not read DjVu files directly; it requires exported XML text data and images. When used without a web server, you can open and save local files but cannot take advantage of auto-save and spell checking. Note that current SVN...
    Downloads: 0 This Week
  • 23

    Immutable Sparse Wave Trees (WaveTree)

    Realtime bigdata tool for bit strings up to 2^63 based on AVL forest

    Realtime bigdata tool at the bit level based on an immutable AVL forest, which can run in memory or, in future versions, as a Merkle forest like a blockchain. The main object is a sparse bit string (Bits) that efficiently scales up to 2^63 bits and is normally compressed, since the forest has duplicated substrings. Bits objects support reading a bit, byte, short, int, or long (Java primitives) at any bit index in the 64-bit range. Example: instead of building a class to hold a header and then data, represent all of...
    Downloads: 0 This Week
  • 24
    CD+Graphics Magic
    Timeline based editor for creating Compact Disc Subcode Graphics (also known as CD+G or CDG). Both karaoke and multimedia styles of content are supported. Please visit cdgmagic.sf.net for examples playable directly in the HTML5 CD+G player. CD+Graphics Scribe utility (separate download -- click "Browse All Files" above) can now convert existing CDG karaoke content to CMP (CD+Graphics Magic Project), LRC (Enhanced Lyrics), and ASS (Advanced SubStation Alpha) format.
    Downloads: 24 This Week
  • 25

    File-em

    File-'em is an automatic receipts organizer implemented in Java & SWT.

    File-'em (pronounced like phylum) is an open source alternative to the software behind NeatReceipts®. It allows you to load in scanned receipts and automatically pulls the information out of the receipt using OCR and stores it in a SQLite database for easy reference, reports, and retrieval.
    Downloads: 0 This Week