text processing free download

DeepSeek-OCR

Contexts Optical Compression

DeepSeek-OCR is an open-source optical character recognition solution built as part of the broader DeepSeek AI vision-language ecosystem. It is designed to extract text from images, PDFs, and scanned documents, and it integrates multimodal capabilities that understand layout, context, and visual elements beyond raw character recognition. The system treats OCR not simply as “read the text” but as “understand what the text is doing in the image”, for example distinguishing captions from body text, interpreting tables, or recognizing handwritten versus printed words. ...

Downloads: 8 This Week

Last Update: 2026-01-27

DeepSeek-OCR 2

Visual Causal Flow

DeepSeek-OCR-2 is the second-generation optical character recognition system developed to improve document understanding by introducing a “visual causal flow” mechanism, enabling the encoder to reorder visual tokens in a way that better reflects semantic structure rather than strict raster scan order. It is designed to handle complex layouts and noisy documents by giving the model causal reasoning capabilities that mimic human visual scanning behavior, enhancing OCR performance on documents...

Downloads: 10 This Week

Last Update: 2026-02-03

Common Resource Grep - crgrep

Common Resource Grep

CRGREP searches for matching text in databases, various document formats, archives, and other difficult-to-access resources. It is a command-line tool for name and content text matching in database tables, plain files, MS Office documents, PDF, archives, MP3 audio, image metadata, scanned documents, Maven dependencies, and web resources. CRGREP will search resources within resources in any arbitrary combination or depth, for example text within a document within a zip archive. Here you...

3 Reviews

Downloads: 1 This Week

Last Update: 2023-04-23
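
The "resources within resources" search described for CRGREP can be illustrated with a minimal Python sketch. This is not crgrep's implementation or command-line interface; it covers only zip archives and UTF-8 text, and the function name `grep_nested` and the slash-joined path format are hypothetical choices made for illustration.

```python
import io
import re
import zipfile


def grep_nested(data: bytes, pattern: str, path: str = "") -> list[str]:
    """Return paths of entries whose text matches `pattern`.

    Hypothetical helper: descends into zip archives nested at any depth.
    Non-zip entries are decoded as UTF-8 text (undecodable bytes ignored).
    """
    hits = []
    buf = io.BytesIO(data)
    if zipfile.is_zipfile(buf):
        with zipfile.ZipFile(buf) as zf:
            for name in zf.namelist():
                if name.endswith("/"):  # skip directory entries
                    continue
                child_path = f"{path}/{name}" if path else name
                # Recurse: the entry may itself be another archive.
                hits.extend(grep_nested(zf.read(name), pattern, child_path))
    else:
        text = data.decode("utf-8", errors="ignore")
        if re.search(pattern, text):
            hits.append(path)
    return hits
```

Calling `grep_nested(open("outer.zip", "rb").read(), "needle", "outer.zip")` would list every matching entry by its nested path. Handling other containers (PDF, Office documents, databases) would need format-specific readers, which this sketch leaves out.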

pdfsandwich

pdfsandwich generates "sandwich" OCR PDF files: PDF files that contain only images (and no editable text) are processed by optical character recognition (OCR), and the recognized text is added to each page invisibly "behind" the images. pdfsandwich is a command-line tool intended for OCR of scanned books or journals. It can recognize the page layout even for multicolumn text. Essentially, pdfsandwich is a wrapper script that calls the following binaries: convert, unpaper, tesseract, gs, and hocr2pdf (if tesseract < 3.03). ...

8 Reviews

Downloads: 319 This Week

Last Update: 2018-08-12

WebDjVuTextEd

Edit the OCR text layer of DjVu documents in a web browser

WebDjVuTextEd lets you edit the text layer of OCR'ed DjVu documents in a web browser. You can modify the structure (paragraphs, lines, words...), create, delete, and edit text nodes, adjust their container boxes with the mouse, and run a spellchecker. The program does not read DjVu files directly; it requires exported XML text data and images. When used without a web server, you can open and save local files, but cannot take advantage of auto-save and spell checking. Note that current SVN...

Downloads: 1 This Week

Last Update: 2015-11-21

YagpoOCRUnicode c++library

OCR C++ library. Includes: contour recognition; vectorisation; matrix letter feature recognition; automatic page segmentation and rotation detection; SS3 ASM core; XML base; web-based GUI; 99.6% printed Unicode text recognition; letter base of up to 1200 letters.

Downloads: 0 This Week

Last Update: 2013-04-08

eBookFormatter

Got any emails with obnoxious inline text? Long text stories with bad formatting? Files that an OCR didn't quite translate right? RTF format files and no easy way to read or modify them? Then eBookFormatter is for you!

Downloads: 0 This Week

Last Update: 2013-03-12

OpenOCR

OpenOCR will be a commercial-quality OCR engine with tools for pre- and post-processing of images and the resulting text.

Downloads: 0 This Week

Last Update: 2015-07-12

Search Results for "text processing"

8 projects for "text processing" with 2 filters applied:

DeepSeek-OCR

DeepSeek-OCR 2

Common Resource Grep - crgrep

pdfsandwich

WebDjVuTextEd

YagpoOCRUnicode c++library

eBookFormatter

OpenOCR
