source-han free download

53 projects for "source-han" with 2 filters applied:

Filters: OCR, BSD

1

DeepSeek-OCR

Contexts Optical Compression

DeepSeek-OCR is an open-source optical character recognition solution built as part of the broader DeepSeek AI vision-language ecosystem. It is designed to extract text from images, PDFs, and scanned documents, and integrates with multimodal capabilities that understand layout, context, and visual elements beyond raw character recognition. The system treats OCR not simply as “read the text” but as “understand what the text is doing in the image”—for example distinguishing captions from body text, interpreting tables, or recognizing handwritten versus printed words. ...

Downloads: 11 This Week

Last Update: 2026-01-27
2

DeepSeek-OCR 2

Visual Causal Flow

DeepSeek-OCR-2 is the second-generation optical character recognition system developed to improve document understanding by introducing a “visual causal flow” mechanism, enabling the encoder to reorder visual tokens in a way that better reflects semantic structure rather than strict raster scan order. It is designed to handle complex layouts and noisy documents by giving the model causal reasoning capabilities that mimic human visual scanning behavior, enhancing OCR performance on documents...

Downloads: 7 This Week

Last Update: 2026-02-03
3

Scribe.js

JavaScript OCR and text extraction for images and PDFs

Scribe.js is a JavaScript library that provides Optical Character Recognition (OCR) and text extraction capabilities for both images and PDF documents, aimed at developers who want to build OCR features directly into their applications. The library can take image files (such as PNG or JPEG) and recognize the text they contain, and it can also extract text from PDF files that either already contain text or are image-based scans, using modern web standards and WebAssembly under the hood. In...

Downloads: 4 This Week

Last Update: 7 days ago
4

GLM-OCR

Accurate × Fast × Comprehensive

GLM-OCR is an open-source multimodal optical character recognition (OCR) model built on a GLM-V encoder–decoder foundation that brings robust, accurate document understanding to complex real-world layouts and modalities. Designed to handle text recognition, table parsing, formula extraction, and general information retrieval from documents containing mixed content, GLM-OCR excels across major benchmarks while remaining highly efficient with a relatively compact parameter size (~0.9B), enabling deployment in high-concurrency services and edge environments. ...

Downloads: 2 This Week

Last Update: 2026-04-08
5

dots.ocr

Multilingual Document Layout Parsing in a Single Vision-Language Model

dots.ocr is a cutting-edge multilingual document parsing system built on a unified vision-language model that combines layout detection, text recognition, and structural understanding into a single architecture. Unlike traditional OCR pipelines that rely on multiple specialized components, dots.ocr integrates these processes end-to-end, reducing error propagation and improving consistency across tasks. The model is designed to recognize virtually any human script, making it highly effective...

Downloads: 1 This Week

Last Update: 2026-03-24
6

OpenKM Document Management - DMS

Document Management System and Content Management System

...It integrates document management, collaboration and advanced search into one easy-to-use solution, including administration tools for user roles, access control, security levels, activity logs and automation setup. With OpenKM Community Edition you can: Collect information from any digital source. Collaborate with colleagues on documents and projects. Capitalize on accumulated knowledge by locating documents and information sources. Control business processes with an embedded workflow engine. Automate tasks. For a complete feature list visit: http://goo.gl/au8cQy

32 Reviews

Downloads: 360 This Week

Last Update: 2026-05-14
7

VietOCR

Provides optical character recognition (OCR) solutions for Vietnamese language.

24 Reviews

Downloads: 187 This Week

Last Update: 7 days ago
8

gscan2pdf

A GUI to ease the process of producing a multipage PDF from a scan. gscan2pdf should work on almost any Linux/BSD machine.

22 Reviews

Downloads: 136 This Week

Last Update: 2025-11-05
9

chessPDFBrowser

Chess application which allows working with chess PDF books and PGNs.

Chess application for working with PDFs and PGNs. You can work with the chess games in a PDF and edit their variation trees. Graphical environment. Standard PGN tags. PGN comments. OCR-like recognition (FEN string detection from images of chess board positions). Connection to UCI chess engines (such as Stockfish). Position analysis and full game analysis. You can now play games against UCI engines. pdf2pgn command-line tool included. Detailed documentation. Multilanguage...

1 Review

Downloads: 38 This Week

Last Update: 2026-04-04
10

Common Resource Grep - crgrep

Common Resource Grep

CRGREP searches for matching text in databases, various document formats, archives and other difficult-to-access resources. A command-line tool for name and content text matching in database tables, plain files, MS Office documents, PDF, archives, MP3 audio, image metadata, scanned documents, Maven dependencies and web resources. CRGREP will search resources nested within resources in any combination and to any depth, such as text within a document within a zip archive. Here you...

3 Reviews

Downloads: 5 This Week

Last Update: 2023-04-23
11

gImageReader

A graphical frontend to tesseract-ocr

gImageReader is a simple Gtk/Qt front-end to tesseract. Features include:
- Import PDF documents and images from disk, scanning devices, clipboard and screenshots
- Process multiple images and documents in one go
- Manual or automatic recognition area definition
- Recognize to plain text or to hOCR documents
- Recognized text displayed directly next to the image
- Post-process the recognized text, including spellchecking
- Generate PDF documents from hOCR documents
**Note**:...

27 Reviews

Downloads: 102 This Week

Last Update: 2022-01-28
12

Linux-Intelligent-Ocr-Solution

Easy-OCR solution and Tesseract trainer for GNU/Linux

...forum/lios Video Tutorial : https://www.youtube.com/playlist?list=PLn29o8rxtRe1zS1r2-yGm1DNMOZCgdU0i Tesseract Training Tutorial (beta) : https://www.youtube.com/watch?v=qLpCld4cdtk Source Code Github : https://github.com/Nalin-x-Linux/lios-3 Gitlab : https://gitlab.com/Nalin-x-Linux/lios-3 User guide is available in download page

5 Reviews

Downloads: 1 This Week

Last Update: 2020-10-19
13

cintruder

CIntruder - OCR Bruteforcing Toolkit

Captcha Intruder is an automatic pentesting tool to bypass captchas.
CIntruder-v0.4 (.zip): md5 = 6326ab514e329e4ccd5e1533d5d53967
CIntruder-v0.4 (.tar.gz): md5 = 2256fccac505064f3b84ee2c43921a68
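
The MD5 digests published above can be used to check that a downloaded archive arrived intact. The following is a minimal sketch, assuming Python's standard `hashlib` module; the helper names `md5_of_file` and `matches_published` and the archive file name in the usage note are hypothetical and not part of the project.

```python
import hashlib

# MD5 digests exactly as published in the listing for CIntruder-v0.4.
PUBLISHED_MD5 = {
    "zip": "6326ab514e329e4ccd5e1533d5d53967",
    "tar.gz": "2256fccac505064f3b84ee2c43921a68",
}


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, kind):
    """Compare a local archive with the published digest for its kind.

    `kind` is "zip" or "tar.gz" (hypothetical labels chosen for this sketch).
    """
    return md5_of_file(path) == PUBLISHED_MD5[kind]
```

For example, `matches_published("CIntruder-v0.4.zip", "zip")` would return `True` only if the local file (the file name is assumed here) hashes to the published value. A matching MD5 digest indicates the file was not corrupted in transit; MD5 is not collision-resistant, so it should not be treated as proof against deliberate tampering.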

Downloads: 0 This Week

Last Update: 2020-07-25
14

pdfsandwich

pdfsandwich generates "sandwich" OCR PDF files: PDF files that contain only images (and no editable text) are processed by optical character recognition (OCR), and the recognized text is added to each page invisibly "behind" the images. pdfsandwich is a command-line tool intended for OCR of scanned books or journals. It can recognize the page layout even for multicolumn text. Essentially, pdfsandwich is a wrapper script which calls the following binaries:...

8 Reviews

Downloads: 306 This Week

Last Update: 2018-08-12
15

Tess4J

A Java JNA wrapper for Tesseract OCR API

9 Reviews

Downloads: 42 This Week

Last Update: 2018-05-26
16

WebDjVuTextEd

Edit the OCR text layer of DjVu documents in a web browser

WebDjVuTextEd lets you edit the text layer of OCR'ed DjVu documents in a web browser. You can modify the structure (paragraphs, lines, words...), create, delete, and edit text nodes, adjust their container boxes with the mouse, and run a spellchecker. The program does not read DjVu files directly; it requires exported XML text data and images. When used without a web server, you can open and save local files, but cannot take advantage of auto-save and spell checking. Note that current SVN...

Downloads: 0 This Week

Last Update: 2015-11-21
17

DJVU++

The complete DjVu solution, with OCR technology (Arabic, English).

DjVu++ is a user-friendly program used to manipulate DjVu files such as eBooks, with plenty of editing features. The program offers a free replacement for the proprietary PDF format with similar resolution and smaller file size. DjVu++ also supports OCR to handle text in scanned books and images. The program shows good performance for English as well as Arabic, and aims to lead free and commercial software in this area. The main features of DjVu++ program are: o...

4 Reviews

Downloads: 4 This Week

Last Update: 2015-08-24
18

yagf

YAGF is a tesseract and cuneiform wrapper and helper

YAGF is a graphical front-end for cuneiform and tesseract OCR tools. With YAGF you can open already scanned image files or obtain new images via XSane (scanning results are automatically passed to YAGF). Once you have a scanned image you can prepare it for recognition, select particular image areas for recognition, set the recognition language and so on. Recognized text is displayed in an editor window where it can be corrected, saved to disk or copied to the clipboard. YAGF also provides some...

2 Reviews

Downloads: 2 This Week

Last Update: 2016-11-25
19

Immutable Sparse Wave Trees (WaveTree)

Realtime bigdata tool for bit strings up to 2^63 based on AVL forest

Realtime big-data tool at the bit level, based on an immutable AVL forest which can be run in memory or, in future versions, as a Merkle forest like a blockchain. The main object is a sparse bit string (Bits) that scales efficiently up to 2^63 bits and is normally compressed, because the forest has duplicated substrings. Bits objects support reading a bit, byte, short, int, or long (Java primitives) at any bit index in the 64-bit range. Example: instead of building a class to hold a header and then data, represent all of...

Downloads: 0 This Week

Last Update: 2015-03-02
20

hocr - Hebrew OCR

hocr - Hebrew OCR c/c++ library

Downloads: 0 This Week

Last Update: 2014-06-09
21

phpSANE

Web-Based Frontend for SANE

phpSANE is a web-based frontend for SANE written in HTML/PHP so you can scan with your web-browser. It also supports OCR.

13 Reviews

Downloads: 4 This Week

Last Update: 2013-10-24
22

Linux for Beagleboard-xm

A Tailored Small Linux for Beagleboard-xm

Beagleboard-xm is a powerful chip with a Cortex-A8 CPU and a DSP. I plan to build an OCR gadget using it with Linux. As a by-product, I will post my tailored Linux kernel and u-boot, and all relevant material, here from now on. I was shocked by the blocking of Chinese citizens from accessing some of the contents on sourceforge. I deeply regret the outrageous action initiated, even though I fully understand the reasoning behind it.

Downloads: 0 This Week

Last Update: 2015-11-13
23

CD+Graphics Magic

Timeline based editor for creating Compact Disc Subcode Graphics (also known as CD+G or CDG). Both karaoke and multimedia styles of content are supported. Please visit cdgmagic.sf.net for examples playable directly in the HTML5 CD+G player. CD+Graphics Scribe utility (separate download -- click "Browse All Files" above) can now convert existing CDG karaoke content to CMP (CD+Graphics Magic Project), LRC (Enhanced Lyrics), and ASS (Advanced SubStation Alpha) format.

4 Reviews

Downloads: 20 This Week

Last Update: 2013-07-25
24

File-em

File-'em is an automatic receipts organizer implemented in Java & SWT.

File-'em (pronounced like phylum) is an open source alternative to the software behind NeatReceipts®. It allows you to load in scanned receipts and automatically pulls the information out of the receipt using OCR and stores it in a SQLite database for easy reference, reports, and retrieval.

Downloads: 0 This Week

Last Update: 2016-11-26
25

Plasma OCR

An omnifont OCR engine. The long-term goal is recognition of formulas.

Downloads: 0 This Week

Last Update: 2014-06-09