Showing 12 open source projects for "python text parser"

View related business solutions
  • Our Free Plans just got better! | Auth0 Icon
    Our Free Plans just got better! | Auth0

    With up to 25k MAUs and unlimited Okta connections, our Free Plan lets you focus on what you do best—building great apps.

    You asked, we delivered! Auth0 is excited to expand our Free and Paid plans to include more options so you can focus on building, deploying, and scaling applications without having to worry about your security. Auth0 now, thank yourself later.
    Try free now
  • Cloud-based help desk software with ServoDesk Icon
    Cloud-based help desk software with ServoDesk

    Full access to Enterprise features. No credit card required.

    What if You Could Automate 90% of Your Repetitive Tasks in Under 30 Days? At ServoDesk, we help businesses like yours automate operations with AI, allowing you to cut service times in half and increase productivity by 25% - without hiring more staff.
    Try ServoDesk for free
  • 1
    DeepSeek-OCR

    DeepSeek-OCR

    Contexts Optical Compression

    DeepSeek-OCR is an open-source optical character recognition solution built as part of the broader DeepSeek AI vision-language ecosystem. It is designed to extract text from images, PDFs, and scanned documents, and integrates with multimodal capabilities that understand layout, context, and visual elements beyond raw character recognition. The system treats OCR not simply as “read the text” but as “understand what the text is doing in the image”—for example distinguishing captions from body text, interpreting tables, or recognizing handwritten versus printed words. ...
    Downloads: 19 This Week
    Last Update:
    See Project
  • 2
    OCRmyPDF

    OCRmyPDF

    OCRmyPDF adds an OCR text layer to scanned PDF files

    OCRmyPDF adds an optical character recognition (OCR) text layer to scanned PDF files, allowing them to be searched. PDF is the best format for storing and exchanging scanned documents. Unfortunately, PDFs can be difficult to modify. OCRmyPDF makes it easy to apply image processing and OCR (recognized, searchable text) to existing PDFs.
    Downloads: 119 This Week
    Last Update:
    See Project
  • 3

    PaddleOCR

    Awesome multilingual OCR toolkits based on PaddlePaddle

    PaddleOCR offers exceptional, multilingual, and practical Optical Character Recognition (OCR) tools that can help users train better models and apply them into practice. Inspired by PaddlePaddle, PaddleOCR is an ultra lightweight OCR system, with multilingual recognition, digit recognition, vertical text recognition, as well as long text recognition. It features a PPOCR series of high-quality pre-trained models, which includes: ultra lightweight ppocr_mobile series models, general...
    Downloads: 58 This Week
    Last Update:
    See Project
  • 4
    Papermerge

    Papermerge

    Open Source Document Management System for Digital Archives

    Papermerge is an open source document management system (DMS) primarily designed for archiving and retrieving your digital documents. Instead of having piles of paper documents all over your desk, office or drawers - you can quickly scan them and configure your scanner to directly upload to Papermerge DMS. Store, organize and index scanned documents in PDF, JPEG and TIFF formats. Instantly find relevant information using full text, tags and metadata-based search. Papermerge is free and...
    Downloads: 14 This Week
    Last Update:
    See Project
  • Level Up Your Cyber Defense with External Threat Management Icon
    Level Up Your Cyber Defense with External Threat Management

    See every risk before it hits. From exposed data to dark web chatter. All in one unified view.

    Move beyond alerts. Gain full visibility, context, and control over your external attack surface to stay ahead of every threat.
    Try for Free
  • 5
    HunyuanOCR

    HunyuanOCR

    OCR expert VLM powered by Hunyuan's native multimodal architecture

    HunyuanOCR is an open-source, end-to-end OCR (optical character recognition) Vision-Language Model (VLM) developed by Tencent‑Hunyuan. It’s designed to unify the entire OCR pipeline, detection, recognition, layout parsing, information extraction, translation, and even subtitle or structured output generation, into a single model inference instead of a cascade of separate tools. Despite being fairly lightweight (about 1 billion parameters), it delivers state-of-the-art performance across a...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 6
    LayoutParser

    LayoutParser

    A Unified Toolkit for Deep Learning Based Document Image Analysis

    With the help of state-of-the-art deep learning models, Layout Parser enables extracting complicated document structures using only several lines of code. This method is also more robust and generalizable as no sophisticated rules are involved in this process. A complete instruction for installing the main Layout Parser library and auxiliary components. Learn how to load DL Layout models and use them for layout detection. The full list of layout models currently available in Layout Parser....
    Downloads: 0 This Week
    Last Update:
    See Project
  • 7
    Ozyr

    Ozyr

    Ozyr is a simple and easy to use OCR snipping tool

    Ozyr is a simple and easy to use OCR snipping tool to get text from images so you can copy and edit it. Source Code: https://github.com/PETEROLO291/Ozyr Installer: 117MB Program: 524MB Version: 1.0
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    Paperless-ng

    Paperless-ng

    A supercharged version of paperless, scan, index and archive docs

    Paperless is a simple Django application running in two parts, a Consumer (the thing that does the indexing) and a Web server (the part that lets you search & download already-indexed documents). Paper is a nightmare. Environmental issues aside, there’s no excuse for it in the 21st century. It takes up space, collects dust, doesn’t support any form of a search feature, indexing is tedious, it’s heavy and prone to damage & loss. I wrote this to make “going paperless” easier. I do not have to...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 9
    Linux-Intelligent-Ocr-Solution

    Linux-Intelligent-Ocr-Solution

    Easy-OCR solution and Tesseract trainer for GNU/Linux

    Linux-intelligent-ocr-solution Lios is a free and open source software for converting print in to text using either scanner or a camera, It can also produce text out of scanned images from other sources such as Pdf, Image, Folder containing Images or screenshot. Program is given total accessibility for visually impaired. A Tesseract Trainer GUI is also shipped with this package. Forum : https://groups.google.com/forum/#!forum/lios Video Tutorial :...
    Downloads: 10 This Week
    Last Update:
    See Project
  • Keep company data safe with Chrome Enterprise Icon
    Keep company data safe with Chrome Enterprise

    Protect your business with AI policies and data loss prevention in the browser

    Make AI work your way with Chrome Enterprise. Block unapproved sites and set custom data controls that align with your company's policies.
    Download Chrome
  • 10
    Simple Document management software written in PHP for linux-operating-systems supporting Bank-Account-Linking and OCR
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    Software to fit whole-sentence language models using the principle of maximum entropy. For developers of speech recognizers, text prediction interfaces, OCR, machine translation software.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    OpenOCR will be a commercial quality ocr engine with tools for pre- and post-processing of images and resulting text.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next