Showing 16 open source projects for "python"

View related business solutions
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • Train ML Models With SQL You Already Know Icon
    Train ML Models With SQL You Already Know

    BigQuery automates data prep, analysis, and predictions with built-in AI assistance.

    Build and deploy ML models using familiar SQL. Automate data prep with built-in Gemini. Query 1 TB and store 10 GB free monthly.
    Try Free
  • 1
    OCRmyPDF

    OCRmyPDF

    OCRmyPDF adds an OCR text layer to scanned PDF files

    OCRmyPDF adds an optical character recognition (OCR) text layer to scanned PDF files, allowing them to be searched. PDF is the best format for storing and exchanging scanned documents. Unfortunately, PDFs can be difficult to modify. OCRmyPDF makes it easy to apply image processing and OCR (recognized, searchable text) to existing PDFs.
    Downloads: 101 This Week
    Last Update:
    See Project
  • 2
    PaddleOCR

    PaddleOCR

    Awesome multilingual OCR toolkits based on PaddlePaddle

    PaddleOCR offers exceptional, multilingual, and practical Optical Character Recognition (OCR) tools that can help users train better models and apply them into practice. Inspired by PaddlePaddle, PaddleOCR is an ultra lightweight OCR system, with multilingual recognition, digit recognition, vertical text recognition, as well as long text recognition. It features a PPOCR series of high-quality pre-trained models, which includes: ultra lightweight ppocr_mobile series models, general...
    Downloads: 71 This Week
    Last Update:
    See Project
  • 3
    Umi-OCR

    Umi-OCR

    OCR software, free and offline

    Umi-OCR is a free and open-source optical character recognition (OCR) tool designed to provide fast, offline text extraction from images, screenshots, PDFs, and more without requiring a network connection. It includes a highly efficient offline OCR engine with built-in multilingual recognition libraries, so users can extract text across multiple languages with high accuracy directly on their machines. The software supports flexible usage patterns including screenshot capture OCR, batch...
    Downloads: 41 This Week
    Last Update:
    See Project
  • 4
    MinerU

    MinerU

    A high-quality tool for convert PDF to Markdown and JSON

    MinerU is an open-source, high-quality document extraction toolkit focused on converting PDFs (and other document formats) into structured Markdown and JSON. It leverages OCR and layout analysis to preserve semantic structure and metadata, ideal for research and data science workflows.
    Downloads: 18 This Week
    Last Update:
    See Project
  • Fully Managed MySQL, PostgreSQL, and SQL Server Icon
    Fully Managed MySQL, PostgreSQL, and SQL Server

    Automatic backups, patching, replication, and failover. Focus on your app, not your database.

    Cloud SQL handles your database ops end to end, so you can focus on your app.
    Try Free
  • 5
    GLM-OCR

    GLM-OCR

    Accurate × Fast × Comprehensive

    GLM-OCR is an open-source multimodal optical character recognition (OCR) model built on a GLM-V encoder–decoder foundation that brings robust, accurate document understanding to complex real-world layouts and modalities. Designed to handle text recognition, table parsing, formula extraction, and general information retrieval from documents containing mixed content, GLM-OCR excels across major benchmarks while remaining highly efficient with a relatively compact parameter size (~0.9B),...
    Downloads: 23 This Week
    Last Update:
    See Project
  • 6
    DeepSeek-OCR

    DeepSeek-OCR

    Contexts Optical Compression

    ...It supports local deployment, enabling organizations concerned about privacy or latency to run the pipeline on-premises rather than send sensitive documents to third-party cloud services. The codebase is written in Python with a focus on modularity: you can swap preprocessing, recognition, and post-processing components as needed for custom workflows.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 7
    Papermerge

    Papermerge

    Open Source Document Management System for Digital Archives

    Papermerge is an open source document management system (DMS) primarily designed for archiving and retrieving your digital documents. Instead of having piles of paper documents all over your desk, office or drawers - you can quickly scan them and configure your scanner to directly upload to Papermerge DMS. Store, organize and index scanned documents in PDF, JPEG and TIFF formats. Instantly find relevant information using full text, tags and metadata-based search. Papermerge is free and...
    Downloads: 24 This Week
    Last Update:
    See Project
  • 8
    DeepSeek-OCR 2

    DeepSeek-OCR 2

    Visual Causal Flow

    DeepSeek-OCR-2 is the second-generation optical character recognition system developed to improve document understanding by introducing a “visual causal flow” mechanism, enabling the encoder to reorder visual tokens in a way that better reflects semantic structure rather than strict raster scan order. It is designed to handle complex layouts and noisy documents by giving the model causal reasoning capabilities that mimic human visual scanning behavior, enhancing OCR performance on documents...
    Downloads: 7 This Week
    Last Update:
    See Project
  • 9
    Paper2GUI

    Paper2GUI

    Convert AI papers to GUI

    Convert AI papers to GUI,Make it easy and convenient for everyone to use artificial intelligence technology。让每个人都简单方便的使用前沿人工智能技术 Paper2GUI: An AI desktop APP toolbox for ordinary people. It can be used immediately without installation. It already supports 40+ AI models, covering AI painting, speech synthesis, video frame complementing, video super-resolution, object detection, and image stylization. , OCR recognition and other fields. Support Windows, Mac, Linux systems. Paper2GUI:...
    Downloads: 10 This Week
    Last Update:
    See Project
  • Add Two Lines of Code. Get Full APM. Icon
    Add Two Lines of Code. Get Full APM.

    AppSignal installs in minutes and auto-configures dashboards, alerts, and error tracking.

    Works out of the box for Rails, Django, Express, Phoenix, and more. Monitoring exceptions and performance in no time.
    Start Free
  • 10
    dots.ocr

    dots.ocr

    Multilingual Document Layout Parsing in a Single Vision-Language Model

    dots.ocr is a cutting-edge multilingual document parsing system built on a unified vision-language model that combines layout detection, text recognition, and structural understanding into a single architecture. Unlike traditional OCR pipelines that rely on multiple specialized components, dots.ocr integrates these processes end-to-end, reducing error propagation and improving consistency across tasks. The model is designed to recognize virtually any human script, making it highly effective...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    LayoutParser

    LayoutParser

    A Unified Toolkit for Deep Learning Based Document Image Analysis

    With the help of state-of-the-art deep learning models, Layout Parser enables extracting complicated document structures using only several lines of code. This method is also more robust and generalizable as no sophisticated rules are involved in this process. A complete instruction for installing the main Layout Parser library and auxiliary components. Learn how to load DL Layout models and use them for layout detection. The full list of layout models currently available in Layout Parser....
    Downloads: 9 This Week
    Last Update:
    See Project
  • 12
    qiji-font

    qiji-font

    Typeface from Ming Dynasty woodblock printed books

    Typeface from Ming Dynasty woodblock printed books. A Ming typeface. Extracted from Ming Dynasty woodblock printed books (凌閔刻本). Using semi-automatic computer vision and OCR. Open-source. A work in progress. Named in honor of 閔齊伋, a 16th-century printer. Intended to be used with Kenyan-lang, the Classical Chinese programming language. Download high-resolution PDFs and split pages into images. Manually lay a grid on top of each page to generate bounding boxes for characters (potentially...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    Paperless-ng

    Paperless-ng

    A supercharged version of paperless, scan, index and archive docs

    Paperless is a simple Django application running in two parts, a Consumer (the thing that does the indexing) and a Web server (the part that lets you search & download already-indexed documents). Paper is a nightmare. Environmental issues aside, there’s no excuse for it in the 21st century. It takes up space, collects dust, doesn’t support any form of a search feature, indexing is tedious, it’s heavy and prone to damage & loss. I wrote this to make “going paperless” easier. I do not have to...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 14
    cintruder

    cintruder

    CIntruder - OCR Bruteforcing Toolkit

    Captcha Intruder is an automatic pentesting tool to bypass captchas. -> CIntruder-v0.4 (.zip) -> md5 = 6326ab514e329e4ccd5e1533d5d53967 -> CIntruder-v0.4 (.tar.gz) ->md5 = 2256fccac505064f3b84ee2c43921a68 --------------------------------------------
    Downloads: 0 This Week
    Last Update:
    See Project
  • 15
    Software to fit whole-sentence language models using the principle of maximum entropy. For developers of speech recognizers, text prediction interfaces, OCR, machine translation software.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    OpenOCR will be a commercial quality ocr engine with tools for pre- and post-processing of images and resulting text.
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • Next
MongoDB Logo MongoDB