Search Results for "java ocr extraction text"

Showing 108 open source projects for "java ocr extraction text"

View related business solutions
  • AI-powered service management for IT and enterprise teams Icon
    AI-powered service management for IT and enterprise teams

    Enterprise-grade ITSM, for every business

    Give your IT, operations, and business teams the ability to deliver exceptional services—without the complexity. Maximize operational efficiency with refreshingly simple, AI-powered Freshservice.
    Try it Free
  • Auth0 B2B Essentials: SSO, MFA, and RBAC Built In Icon
    Auth0 B2B Essentials: SSO, MFA, and RBAC Built In

    Unlimited organizations, 3 enterprise SSO connections, role-based access control, and pro MFA included. Dev and prod tenants out of the box.

    Auth0's B2B Essentials plan gives you everything you need to ship secure multi-tenant apps. Unlimited orgs, enterprise SSO, RBAC, audit log streaming, and higher auth and API limits included. Add on M2M tokens, enterprise MFA, or additional SSO connections as you scale.
    Sign Up Free
  • 1
    Umi-OCR

    Umi-OCR

    OCR software, free and offline

    Umi-OCR is a free and open-source optical character recognition (OCR) tool designed to provide fast, offline text extraction from images, screenshots, PDFs, and more without requiring a network connection. It includes a highly efficient offline OCR engine with built-in multilingual recognition libraries, so users can extract text across multiple languages with high accuracy directly on their machines.
    Downloads: 47 This Week
    Last Update:
    See Project
  • 2
    GLM-OCR

    GLM-OCR

    Accurate × Fast × Comprehensive

    GLM-OCR is an open-source multimodal optical character recognition (OCR) model built on a GLM-V encoder–decoder foundation that brings robust, accurate document understanding to complex real-world layouts and modalities. Designed to handle text recognition, table parsing, formula extraction, and general information retrieval from documents containing mixed content, GLM-OCR excels across major benchmarks while remaining highly efficient with a relatively compact parameter size (~0.9B), enabling deployment in high-concurrency services and edge environments. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 3
    Video-subtitle-extractor

    Video-subtitle-extractor

    A GUI tool for extracting hard-coded subtitle (hardsub) from videos

    ...Use local OCR recognition, no need to set up and call any API, and do not need to access online OCR services such as Baidu and Ali to complete text recognition locally. Support GPU acceleration, after GPU acceleration, you can get higher accuracy and faster extraction speed. (CLI version) No need for users to manually set the subtitle area, the project automatically detects the subtitle area through the text detection model.
    Downloads: 39 This Week
    Last Update:
    See Project
  • 4
    Scribe.js

    Scribe.js

    JavaScript OCR and text extraction for images and PDFs

    Scribe.js is a JavaScript library that provides Optical Character Recognition (OCR) and text extraction capabilities for both images and PDF documents, aimed at developers who want to build OCR features directly into their applications. The library can take image files (such as PNG or JPEG) and recognize the text they contain, and it can also extract text from PDF files that either already contain text or are image-based scans, using modern web standards and WebAssembly under the hood. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • Gemini 3 and 200+ AI Models on One Platform Icon
    Gemini 3 and 200+ AI Models on One Platform

    Access Google's best plus Claude, Llama, and Gemma. Fine-tune and deploy from one console.

    Build, govern, and optimize agents and models with Gemini Enterprise Agent Platform.
    Start Free
  • 5
    PaddleOCR-json

    PaddleOCR-json

    OCR offline image text recognition command line windows program

    PaddleOCR-json is an OCR engine based on the PaddleOCR project that provides a command-line interface and tools for extracting text from images and exporting results in structured JSON format. It wraps the PaddleOCR models, which are capable of detecting and recognizing text in a wide variety of languages and layouts, into a self-contained executable that can be run locally without needing a deep learning environment configured manually. This makes it practical for developers or system...
    Downloads: 9 This Week
    Last Update:
    See Project
  • 6
    OCRBase

    OCRBase

    MD/.JSON Document OCR and structured data extraction API

    OCRBase is a self-hostable document OCR and structured extraction system built to turn PDFs into machine-usable outputs at scale, aiming to bridge the gap between raw text extraction and production-ready pipelines. Instead of treating OCR as a one-off script, it presents an API-driven workflow where documents are submitted as jobs and processed through a queue-based architecture that can handle high throughput.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    TTime

    TTime

    Screenshots, word marking, OCR, AI, translation software

    TTime is a desktop productivity tool that combines translation, OCR, and screen capture capabilities into a unified application designed for fast and efficient text processing workflows. It allows users to translate text through multiple methods, including direct input, screenshot-based capture, and real-time word selection, making it versatile for both casual use and professional tasks. The software integrates a wide range of translation engines and OCR services, including cloud-based...
    Downloads: 2 This Week
    Last Update:
    See Project
  • 8
    HunyuanOCR

    HunyuanOCR

    OCR expert VLM powered by Hunyuan's native multimodal architecture

    HunyuanOCR is an open-source, end-to-end OCR (optical character recognition) Vision-Language Model (VLM) developed by Tencent‑Hunyuan. It’s designed to unify the entire OCR pipeline, detection, recognition, layout parsing, information extraction, translation, and even subtitle or structured output generation, into a single model inference instead of a cascade of separate tools. Despite being fairly lightweight (about 1 billion parameters), it delivers state-of-the-art performance across a...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 9
    Pot Desktop

    Pot Desktop

    A cross-platform software for text translation and recognition

    Pot-Desktop is a cross-platform productivity tool aimed at helping users quickly translate, perform OCR (optical character recognition), and synthesize speech for selected text or images — all with minimal friction. It supports picking text via mouse selection (“highlight-and-translate”), clipboard listening, or screenshot-based OCR; this makes it ideal for reading webpages, documents, images — or any on-screen text — and instantly getting translations or text extraction.
    Downloads: 4 This Week
    Last Update:
    See Project
  • Build Securely on AWS with Proven Frameworks Icon
    Build Securely on AWS with Proven Frameworks

    Lay a foundation for success with Tested Reference Architectures developed by Fortinet’s experts. Learn more in this white paper.

    Moving to the cloud brings new challenges. How can you manage a larger attack surface while ensuring great network performance? Turn to Fortinet’s Tested Reference Architectures, blueprints for designing and securing cloud environments built by cybersecurity experts. Learn more and explore use cases in this white paper.
    Download Now
  • 10
    Extractous

    Extractous

    Fast and efficient unstructured data extraction

    Extractous is a Rust-based unstructured data extraction library focused on fast local parsing of documents and other content-heavy files. Its purpose is to extract text and metadata efficiently from formats such as PDF, Word, HTML, email archives, images, and more, without depending on external APIs or separate parsing servers. The project emphasizes performance and low memory usage, and its maintainers describe it as a local-first alternative to heavier extraction stacks. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    NeMo Retriever Library

    NeMo Retriever Library

    Document content and metadata extraction microservice

    NeMo Retriever Library is a scalable microservice framework designed for extracting, structuring, and enriching content from documents to support downstream generative AI applications. It processes various document types by splitting them into components such as text, tables, charts, and images, and then applies OCR and contextual analysis to convert them into structured data formats. The system is built on NVIDIA NIM microservices, enabling high-performance parallel processing and efficient handling of large datasets. It supports multiple extraction strategies for different document formats, balancing accuracy and throughput depending on the use case. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    paperless-gpt

    paperless-gpt

    Use LLMs and LLM Vision (OCR) to handle paperless-ngx

    paperless-gpt is an AI-powered extension for document management systems that enhances the capabilities of paperless-ngx by integrating large language models and vision-based OCR to automate document processing and organization. It is designed to transform scanned or uploaded documents into structured, searchable, and intelligently categorized data without requiring manual tagging or sorting. The system uses OCR combined with LLM reasoning to extract text, classify documents, and generate...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    Sparrow

    Sparrow

    Structured data extraction and instruction calling with ML, LLM

    Sparrow is an open-source platform designed to extract structured information from documents, images, and other unstructured data sources using machine learning and large language models. The system focuses on transforming complex documents such as invoices, receipts, forms, and scanned pages into structured formats like JSON that can be processed by downstream applications. It combines several components, including OCR pipelines, vision-language models, and LLM-based reasoning modules to...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    Unredact

    Unredact

    A simple tool for reading in poorly redacted documents

    Unredact is a specialized tool that attempts to reconstruct redacted or obscured text in images, PDFs, or screenshots using a combination of image processing and generative AI inference to suggest plausible completions of blurred, black-boxed, or jumbled content. Unlike traditional optical character recognition (OCR), which only reads visible text, Unredact focuses on inferring missing content where redaction has been applied by analyzing surrounding context, font characteristics, and...
    Downloads: 25 This Week
    Last Update:
    See Project
  • 15
    OpenDataLoader PDF

    OpenDataLoader PDF

    PDF Parser for AI-ready data. Automate PDF accessibility

    OpenDataLoader PDF is an open-source document processing system designed to convert complex PDF files into structured, AI-ready formats such as Markdown, JSON, and HTML while preserving layout, hierarchy, and semantic meaning. It focuses on enabling downstream use cases like retrieval-augmented generation (RAG), knowledge extraction, and document intelligence pipelines by maintaining accurate reading order and spatial metadata through bounding boxes. The tool combines deterministic parsing...
    Downloads: 3 This Week
    Last Update:
    See Project
  • 16
    Scanopy

    Scanopy

    Clean network diagrams, One-time setup, zero upkeep

    Scanopy is a powerful multi-modal data capture and analysis toolkit that enables users to collect, process, and visualize structured and unstructured information from a variety of sources in a flexible pipeline. It is built to handle complex scanning tasks — such as OCR, document analysis, audio transcription, network data capture, and image extraction — while providing unified APIs and workflows that make managing heterogeneous data sources seamless. Developers can compose custom pipelines...
    Downloads: 5 This Week
    Last Update:
    See Project
  • 17
    deepdoctection

    deepdoctection

    A Repo For Document AI

    DeepDoctection is a document AI framework that applies deep learning techniques to analyze and extract structured data from scanned documents, PDFs, and images. deepdoctection is a Python library that orchestrates document extraction and document layout analysis tasks using deep learning models. It does not implement models but enables you to build pipelines using highly acknowledged libraries for object detection, OCR and selected NLP tasks and provides an integrated frameworks for fine-tuning, evaluating and running models. For more specific text processing tasks use one of the many other great NLP libraries.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    KaraKeep

    KaraKeep

    A self-hostable bookmark-everything app

    ...With support for image OCR and structured data extraction, Karakeep functions as a flexible personal knowledge base for researchers, content creators, and heavy bookmarkers.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    GROBID

    GROBID

    A machine learning software for extracting information

    GROBID is a machine learning library for extracting, parsing, and re-structuring raw documents such as PDF into structured XML/TEI encoded documents with a particular focus on technical and scientific publications. First developments started in 2008 as a hobby. In 2011 the tool has been made available in open source. Work on GROBID has been steady as a side project since the beginning and is expected to continue as such. Header extraction and parsing from article in PDF format. The...
    Downloads: 4 This Week
    Last Update:
    See Project
  • 20
    MiniCPM-o

    MiniCPM-o

    A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming

    MiniCPM-o 2.6 is a cutting-edge multimodal large language model (MLLM) designed for high-performance tasks across vision, speech, and video. Capable of running on end-side devices such as smartphones and tablets, it provides powerful features like real-time speech conversation, video understanding, and multimodal live streaming. With 8 billion parameters, MiniCPM-o 2.6 surpasses its predecessors in versatility and efficiency, making it one of the most robust models available. It supports...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 21
    ANTLR

    ANTLR

    Parser generator to read, process, or translate structured text

    ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files. It's widely used to build languages, tools, and frameworks. From a grammar, ANTLR generates a parser that can build and walk parse trees. It’s widely used in academia and industry to build all sorts of languages, tools, and frameworks. Twitter search uses ANTLR for query parsing, with over 2 billion queries a day. The languages for...
    Downloads: 8 This Week
    Last Update:
    See Project
  • 22
    SunnyCapturer

    SunnyCapturer

    A simple and beautiful cross-platform screenshot software

    SunnyCapturer is a simple and beautiful cross-platform screenshot software tool that supports OCR to extract text from images, image translation, and pinning images to the screen, among other features.
    Leader badge
    Downloads: 1 This Week
    Last Update:
    See Project
  • 23
    tika-python

    tika-python

    Python binding to the Apache Tika™ REST services

    A Python port of the Apache Tika library that makes Tika available using the Tika REST Server. This makes Apache Tika available as a Python library, installable via Setuptools, Pip and easy to install. To use this library, you need to have Java 7+ installed on your system as tika-python starts up the Tika REST server in the background. To get this working in a disconnected environment, download a tika server file (both tika-server.jar and tika-server.jar.md5, which can be found here) and set...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24

    UniversalTextExtractor

    Command-line toolset for extracting text from files

    Command-line toolset for extracting text from files (documents, images, archives) into SQLite with OCR support. Simple, expandable, one shell script only.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    chessPDFBrowser

    chessPDFBrowser

    Chess application whichs allows working with chess PDF books and PGNs.

    Chess application which allows working with PDFs and PGNs. You can work with the chess games of the PDF and edit their tree of variants. Graphical environment. Standard PGN TAGs. PGN comments. Ocr like (Fen string detection from chess board position images). Connection to Uci chess engines (like stockfish). Position analysis, full game analysis. You can now play games against uci engines. pdf2pgn command line command included. Detailed documentation. Multilanguage...
    Downloads: 32 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • 4
  • 5
  • Next
MongoDB Logo MongoDB