recognition free download

Pix2Text

Open-Source Python3 tool for recognizing layouts, tables, and math

An open-source Python 3 tool for recognizing layouts, tables, math formulas, and text in images and converting them into Markdown. Pix2Text (P2T) aims to be a free, open-source Python alternative to Mathpix, and it can already accomplish Mathpix's core functionality. More than 80 languages are supported. Pix2Text (P2T) can recognize layouts, tables, images, text, and mathematical...

Downloads: 9 This Week

Last Update: 2026-02-07


OCRmyPDF

OCRmyPDF adds an OCR text layer to scanned PDF files

OCRmyPDF adds an optical character recognition (OCR) text layer to scanned PDF files, allowing them to be searched. PDF is the best format for storing and exchanging scanned documents. Unfortunately, PDFs can be difficult to modify. OCRmyPDF makes it easy to apply image processing and OCR (recognized, searchable text) to existing PDFs.

Downloads: 109 This Week

Last Update: 1 day ago



Rapid LaTeX OCR

Formula recognition based on LaTeX-OCR and ONNXRuntime

Formula recognition based on LaTeX-OCR and ONNXRuntime. rapid_latex_ocr is a tool that converts formula images to LaTeX. The inference code in the repo is adapted from LaTeX-OCR; all models have been converted to ONNX format and the inference code has been simplified, so inference is faster and deployment is easier. The repo contains only inference code for ONNX-format models, run with ONNXRuntime or OpenVINO, and does not include model training code.

Downloads: 1 This Week

Last Update: 2024-11-03


pdfly

CLI tool to extract (meta)data from PDF and manipulate PDF files

A Python library designed for manipulating PDF files with functionalities for extraction, transformation, and document generation.

Downloads: 0 This Week

Last Update: 2025-10-13


JSON Hero

JSON Hero is an open-source, beautiful JSON explorer for the web

JSON Hero is a beautiful and powerful JSON viewer designed for developers who work with large and complex JSON files. It runs as a web-based interface (and as a standalone app) that provides semantic, interactive rendering of JSON content, helping users understand the structure and meaning of data at a glance. JSON Hero automatically detects data types such as URLs, dates, colors, and base64 images, and presents them in meaningful ways. It’s designed for productivity and readability, with...

Downloads: 6 This Week

Last Update: 2025-07-17


Unredact

A simple tool for reading in poorly redacted documents

Unredact is a specialized tool that attempts to reconstruct redacted or obscured text in images, PDFs, or screenshots using a combination of image processing and generative AI inference to suggest plausible completions of blurred, black-boxed, or jumbled content. Unlike traditional optical character recognition (OCR), which only reads visible text, Unredact focuses on inferring missing content where redaction has been applied by analyzing surrounding context, font characteristics, and linguistic patterns to produce candidate reconstructions. It accepts a variety of input formats, automatically identifies redacted regions, and then generates text suggestions that are presented alongside visual overlays so users can choose or refine outputs.

Downloads: 5 This Week

Last Update: 2026-02-03


Google2SRT

Download, save and convert multiple subtitles from YouTube videos

Google2SRT allows you to download, save, and convert multiple subtitles and translations from YouTube and Google Video to SubRip (.srt) format, which is recognized by most video players. You can download XML subtitles or simply type the video's URL, and Google2SRT will do the rest.

33 Reviews

Downloads: 26 This Week

Last Update: 2025-01-11


SimpleXlsxWriter

C++ library for creating XLSX files for MS Excel 2007 and above.

This library is an XLSX file writer for Microsoft Excel 2007 and above. Its main feature is that it uses C++ standard file streams. On the one hand, this keeps memory and CPU consumption almost unnoticeable during processing (which can be very useful when saving large data arrays); on the other hand, it makes it infeasible to edit data that has already been written. Hence, when using this library, the structure of the future report should be sufficiently well known beforehand. The library...

9 Reviews

Downloads: 10 This Week

Last Update: 2023-04-24


html2canvas

A JavaScript HTML screenshot renderer

html2canvas is a JavaScript HTML renderer. The script lets you take screenshots of web pages directly in the browser. The screenshot is built from the DOM, so it may not be 100% accurate to the real rendering: it is not an actual screenshot but a reconstruction based on the data and information available on the page. The script renders the page as a canvas image by reading the DOM and the styles applied to its elements. It...

Downloads: 10 This Week

Last Update: 2023-09-07


Budou

Budou is an auto organizer tool for beautiful line breaking in CJK

...The tool supports multiple segmentation backends, including Google Cloud Natural Language API, MeCab, and TinySegmenter, enabling flexibility for both cloud-based and offline processing. Budou can be used via command line, in Python scripts, or integrated into web applications, and it provides advanced options such as caching and entity recognition for improved segmentation accuracy.

Downloads: 0 This Week

Last Update: 2025-10-11


Highlight

Source code to formatted text converter

Highlight converts source code to HTML, XHTML, RTF, ODT, LaTeX, TeX, SVG, BBCode, Pango markup, and terminal escape sequences with colored syntax highlighting. Language definitions and color themes are customizable. Highlight was designed to offer a flexible but easy-to-use syntax highlighter for several output formats. No syntax or coloring information is hardcoded; instead, all relevant data is stored in configuration scripts. These Lua scripts may be altered and enhanced with plug-in...

Downloads: 1 This Week

Last Update: 2024-05-15


MathOCR

A scientific document recognition system

MathOCR is a printed scientific document recognition system written in pure Java. It is still in the pre-alpha stage, so recognition results may not be good enough for practical purposes. MathOCR provides image preprocessing, layout analysis, and character recognition, in particular the ability to recognize mathematical expressions.

Downloads: 0 This Week

Last Update: 2024-05-16


Search Results for "recognition"

Showing 12 open source projects for "recognition"

