Showing 1265 open source projects for "latex-ocr"

View related business solutions
  • MongoDB Atlas runs apps anywhere Icon
    MongoDB Atlas runs apps anywhere

    Deploy in 115+ regions with the modern database for every enterprise.

    MongoDB Atlas gives you the freedom to build and run modern applications anywhere—across AWS, Azure, and Google Cloud. With global availability in over 115 regions, Atlas lets you deploy close to your users, meet compliance needs, and scale with confidence across any geography.
    Start Free
  • Try Google Cloud Risk-Free With $300 in Credit Icon
    Try Google Cloud Risk-Free With $300 in Credit

    No hidden charges. No surprise bills. Cancel anytime.

    Use your credit across every product. Compute, storage, AI, analytics. When it runs out, 20+ products stay free. You only pay when you choose to.
    Start Free
  • 1
    Rapid LaTeX OCR

    Rapid LaTeX OCR

    Formula recognition based on LaTeX-OCR and ONNXRuntime

    Formula recognition based on LaTeX-OCR and ONNXRuntime. rapid_latex_ocr is a tool to convert formula images to latex format. The reasoning code in the repo is modified from LaTeX-OCR, the model has all been converted to ONNX format, and the reasoning code has been simplified, Inference is faster and easier to deploy. The repo only has codes based on ONNXRuntime or OpenVINO inference in onnx format and does not contain training model codes.
    Downloads: 5 This Week
    Last Update:
    See Project
  • 2
    Tesseract OCR

    Tesseract OCR

    Open Source OCR Engine

    Tesseract is an open source OCR or optical character recognition engine and command line program. OCR is a technology that allows for the recognition of text characters within a digital image. With the latest version of Tesseract, there is a greater focus on line recognition, however it still supports the legacy Tesseract OCR engine which recognizes character patterns.
    Downloads: 3,156 This Week
    Last Update:
    See Project
  • 3
    Umi-OCR

    Umi-OCR

    OCR software, free and offline

    Umi-OCR is a free and open-source optical character recognition (OCR) tool designed to provide fast, offline text extraction from images, screenshots, PDFs, and more without requiring a network connection. It includes a highly efficient offline OCR engine with built-in multilingual recognition libraries, so users can extract text across multiple languages with high accuracy directly on their machines.
    Downloads: 58 This Week
    Last Update:
    See Project
  • 4
    Texify

    Texify

    Math OCR model that outputs LaTeX and markdown

    Texify is an OCR model that converts images or pdfs containing math into markdown and LaTeX that can be rendered by MathJax ($$ and $ are delimiters). It can run on CPU, GPU, or MPS.
    Downloads: 2 This Week
    Last Update:
    See Project
  • AI-powered service management for IT and enterprise teams Icon
    AI-powered service management for IT and enterprise teams

    Enterprise-grade ITSM, for every business

    Give your IT, operations, and business teams the ability to deliver exceptional services—without the complexity. Maximize operational efficiency with refreshingly simple, AI-powered Freshservice.
    Try it Free
  • 5
    DeepSeek-OCR

    DeepSeek-OCR

    Contexts Optical Compression

    DeepSeek-OCR is an open-source optical character recognition solution built as part of the broader DeepSeek AI vision-language ecosystem. It is designed to extract text from images, PDFs, and scanned documents, and integrates with multimodal capabilities that understand layout, context, and visual elements beyond raw character recognition. The system treats OCR not simply as “read the text” but as “understand what the text is doing in the image”—for example distinguishing captions from body text, interpreting tables, or recognizing handwritten versus printed words. ...
    Downloads: 9 This Week
    Last Update:
    See Project
  • 6
    GLM-OCR

    GLM-OCR

    Accurate × Fast × Comprehensive

    GLM-OCR is an open-source multimodal optical character recognition (OCR) model built on a GLM-V encoder–decoder foundation that brings robust, accurate document understanding to complex real-world layouts and modalities. Designed to handle text recognition, table parsing, formula extraction, and general information retrieval from documents containing mixed content, GLM-OCR excels across major benchmarks while remaining highly efficient with a relatively compact parameter size (~0.9B), enabling deployment in high-concurrency services and edge environments. ...
    Downloads: 8 This Week
    Last Update:
    See Project
  • 7
    Zerox OCR

    Zerox OCR

    PDF to Markdown with vision models

    A dead simple way of OCR-ing a document for AI ingestion. Documents are meant to be a visual representation after all. With weird layouts, tables, charts, etc. The vision models just make sense. ZeroX is an open-source machine learning framework designed for fast experimentation and production deployment, optimized for speed and ease of use.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 8
    DeepSeek-OCR 2

    DeepSeek-OCR 2

    Visual Causal Flow

    DeepSeek-OCR-2 is the second-generation optical character recognition system developed to improve document understanding by introducing a “visual causal flow” mechanism, enabling the encoder to reorder visual tokens in a way that better reflects semantic structure rather than strict raster scan order. It is designed to handle complex layouts and noisy documents by giving the model causal reasoning capabilities that mimic human visual scanning behavior, enhancing OCR performance on documents with rich spatial structure. ...
    Downloads: 14 This Week
    Last Update:
    See Project
  • 9
    Crowbook LaTeX

    Crowbook LaTeX

    Converts books written in Markdown to HTML, LaTeX/PDF and EPUB

    Crowbook's aim is to allow you to write a book in Markdown without worrying about formatting or typography and let the program generate HTML, PDF and EPUB output for you. Its focus is novels and fiction, and the default settings should (hopefully) generate readable books with correct typography without requiring you to worry about it.
    Downloads: 1 This Week
    Last Update:
    See Project
  • Go From AI Idea to AI App Fast Icon
    Go From AI Idea to AI App Fast

    One platform to build, fine-tune, and deploy ML models. No MLOps team required.

    Access Gemini 3 and 200+ models. Build chatbots, agents, or custom models with built-in monitoring and scaling.
    Try Free
  • 10
    LaTeX Examples

    LaTeX Examples

    Examples for the usage of LaTeX

    LaTeX-examples is a repository collecting a variety of example documents and snippets demonstrating LaTeX features, usage patterns, and common templates. It acts as a playground for learning LaTeX syntax, macros, formatting tricks, and document structuring practices. Files include sample articles, reports, book chapters, presentations (using Beamer), tables, mathematical typesetting examples (equations, aligned systems, integrals, matrices), custom macros, and styling. ...
    Downloads: 0 This Week
    Last Update:
    See Project
  • 11
    JupyterLab LaTeX

    JupyterLab LaTeX

    JupyterLab extension for live editing of LaTeX documents

    An extension for JupyterLab which allows for live-editing of LaTeX documents. To use, right-click on an open .tex document within JupyterLab, and select Show LaTeX Preview. This extension includes both a notebook server extension (which interfaces with the LaTeX compiler) and a lab extension (which provides the UI for the LaTeX preview). The Python package named jupyterlab_latex provides both of them as a prebuilt extension.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 12
    latex-action

    latex-action

    GitHub Action to compile LaTeX documents

    GitHub Action to compile LaTeX documents. It runs in a Docker container with a full TeXLive environment installed.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 13
    HunyuanOCR

    HunyuanOCR

    OCR expert VLM powered by Hunyuan's native multimodal architecture

    HunyuanOCR is an open-source, end-to-end OCR (optical character recognition) Vision-Language Model (VLM) developed by Tencent‑Hunyuan. It’s designed to unify the entire OCR pipeline, detection, recognition, layout parsing, information extraction, translation, and even subtitle or structured output generation, into a single model inference instead of a cascade of separate tools.
    Downloads: 1 This Week
    Last Update:
    See Project
  • 14
    OCRmyPDF

    OCRmyPDF

    OCRmyPDF adds an OCR text layer to scanned PDF files

    OCRmyPDF adds an optical character recognition (OCR) text layer to scanned PDF files, allowing them to be searched. PDF is the best format for storing and exchanging scanned documents. Unfortunately, PDFs can be difficult to modify. OCRmyPDF makes it easy to apply image processing and OCR (recognized, searchable text) to existing PDFs.
    Downloads: 97 This Week
    Last Update:
    See Project
  • 15
    LaTeX.CSS

    LaTeX.CSS

    LaTeX.css is a library that makes your website look like a LaTeX doc

    This almost class-less CSS library turns your HTML document into a website that looks like a LATEX document. Write semantic HTML, and you are good to go. The source code can be found on GitHub. LaTeX.css is a minimal, almost class-less CSS library that makes any website look like a LaTeX document. Add any optional classes to elements with special styles (author subtitle, abstract, lemmas, theorems, etc.). The labels of theorems, definitions, lemmas and proofs can be changed to other supported languages by including the snippet provided in addition to the main CSS file. ...
    Downloads: 1 This Week
    Last Update:
    See Project
  • 16
    Obsidian Latex Suite

    Obsidian Latex Suite

    Make typesetting LaTeX as fast as handwriting through snippets & text

    A plugin for Obsidian that aims to make typesetting LaTeX math as fast as handwriting.
    Downloads: 13 This Week
    Last Update:
    See Project
  • 17
    LaTeX Gboard Dictionary

    LaTeX Gboard Dictionary

    Importable dictionary for typing math symbols more easily

    Importable dictionary for typing math symbols more easily on your Android phone by using keyboard shortcuts inspired by LaTeX. Gboard Dictionary for easily typing unicode symbols with shortcuts based on LaTeX. Shortcuts for Gboard are supported on all Android devices. As of now, these shortcuts cannot be imported to iOS devices.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 18
    PaddleOCR

    PaddleOCR

    Awesome multilingual OCR toolkits based on PaddlePaddle

    PaddleOCR offers exceptional, multilingual, and practical Optical Character Recognition (OCR) tools that can help users train better models and apply them into practice. Inspired by PaddlePaddle, PaddleOCR is an ultra lightweight OCR system, with multilingual recognition, digit recognition, vertical text recognition, as well as long text recognition. It features a PPOCR series of high-quality pre-trained models, which includes: ultra lightweight ppocr_mobile series models, general ppocr_server series models, and ultra lightweight compression ppocr_mobile_slim series models. ...
    Downloads: 81 This Week
    Last Update:
    See Project
  • 19
    Markdown package LaTeX

    Markdown package LaTeX

    Package for converting and rendering markdown documents in TeX

    The Markdown package converts CommonMark markup to TeX commands. The functionality is provided both as a Lua module, and as plain TeX, LaTeX, and ConTeXt macro packages that can be used to directly typeset TeX documents containing markdown markup. Unlike other convertors, the Markdown package does not require any external programs and makes it easy to redefine how each and every markdown element is rendered. Creative abuse of the markdown syntax is encouraged.
    Downloads: 2 This Week
    Last Update:
    See Project
  • 20
    PaddleOCR-json

    PaddleOCR-json

    OCR offline image text recognition command line windows program

    ...This makes it practical for developers or system integrators who want reliable OCR output in JSON while avoiding the complexity of training or managing models by hand. Projects and wrappers built around PaddleOCR-json demonstrate how it can be integrated into other applications, such as desktop OCR utilities or language-specific bindings, because the JSON output is easy to parse and consume.
    Downloads: 6 This Week
    Last Update:
    See Project
  • 21
    LLM-Aided OCR Project

    LLM-Aided OCR Project

    Enhances Tesseract OCR output using LLMs (local or API)

    LLM Aided OCR is an open-source system designed to improve optical character recognition accuracy by combining traditional OCR tools with large language models. The project addresses common OCR challenges such as distorted text, unusual fonts, historical documents, and complex layouts that often produce inaccurate results with standard OCR pipelines.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    EasyOCR

    EasyOCR

    Ready-to-use OCR with 80+ supported languages

    Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and etc. EasyOCR is a python module for extracting text from image. It is a general OCR that can read both natural scene text and dense text in document. We are currently supporting 80+ languages and expanding. Second-generation models: multiple times smaller size, multiple times faster inference, additional characters and comparable accuracy to the first generation models. ...
    Downloads: 38 This Week
    Last Update:
    See Project
  • 23
    MinerU

    MinerU

    A high-quality tool for convert PDF to Markdown and JSON

    MinerU is an open-source, high-quality document extraction toolkit focused on converting PDFs (and other document formats) into structured Markdown and JSON. It leverages OCR and layout analysis to preserve semantic structure and metadata, ideal for research and data science workflows.
    Downloads: 10 This Week
    Last Update:
    See Project
  • 24
    LaTeX2e Kernel Code Repository
    LaTeX is a high-quality typesetting system; it includes features designed for the production of technical and scientific documentation. LaTeX is the de facto standard for the communication and publication of scientific documents. LaTeX is available as free software.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 25
    TinyTeX releases

    TinyTeX releases

    Windows/macOS/Linux binaries and installation methods of TinyTeX

    A lightweight, cross-platform, portable, and easy-to-maintain LaTeX distribution based on TeX Live. TinyTeX is a custom LaTeX distribution based on TeX Live that is small in size but functions well in most cases, especially for R users. If you run into the problem of missing LaTeX packages, it should be super clear to you what you need to do (in fact, R users won’t need to do anything). You only install LaTeX packages you actually need.
    Downloads: 41 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • 3
  • 4
  • 5
  • Next
MongoDB Logo MongoDB