VietOCR / News: Recent posts

jTessBoxEditor v1.2.1 Release

jTessBoxEditor is a box editor and trainer for Tesseract OCR, providing editing of box data of both Tesseract 2.0x and 3.0x formats and full automation of Tesseract training. It can read images of common image formats, including multi-page TIFF. The program requires Java Runtime Environment 7.0 or later.

This release fixes a regression introduced by RTL training support: the Unicode character directionality fix for unicharset is now applied only when RTL is selected...

Posted by Quan Nguyen 2014-11-21

jTessBoxEditor v1.2 Release

jTessBoxEditor is a box editor and trainer for Tesseract OCR, providing editing of box data of both Tesseract 2.0x and 3.0x formats and full automation of Tesseract training. It can read images of common image formats, including multi-page TIFF. The program requires Java Runtime Environment 7.0 or later.

This release includes the following improvements:

  • Break up the training process to allow flexible, incremental training
  • Incorporate logging...

Posted by Quan Nguyen 2014-11-07

jTessBoxEditor v1.1 Release

jTessBoxEditor is a box editor and trainer for Tesseract OCR, providing editing of box data of both Tesseract 2.0x and 3.0x formats and full automation of Tesseract training. It can read images of common image formats, including multi-page TIFF. The program requires Java Runtime Environment 7.0 or later.

This release includes the following improvements:

  • Add training support for Right-to-Left (RTL) text
  • Add horizontal box split using modifier keys
  • Add split multi-page TIFF function...

Posted by Quan Nguyen 2014-11-07

VietOCR & VietOCR.NET v3.5

A Java/.NET GUI frontend for Tesseract OCR engine. The releases include the following improvements:

  • Upgrade to Tesseract 3.02.03 (r866)
  • Enhance Bulk ops with subdirectory support
  • Incorporate image filters to enhance images for OCR
  • Implement Auto Crop and Undo functions
  • Additional translations
  • Update Tess4J library; JNA to v4.0; JACOB to v1.17 (Java only)

http://vietocr.sf.net

Posted by Quan Nguyen 2014-01-25

jTessBoxEditor v1.0 Release

jTessBoxEditor is a box editor and trainer for Tesseract OCR, providing editing of box data of both Tesseract 2.0x and 3.0x formats and full automation of Tesseract training. It can read images of common image formats, including multi-page TIFF. The program requires Java Runtime Environment 6.0 or later.

This release includes the following improvements:

  • Integrate support for full automation of Tesseract training
  • Bundle Tesseract Windows training executables (r866), English data, and config files
  • Fix an issue with generated TIFF missing metadata
  • Optionally add noise to generated image
  • Bug fixes and improvements...

Posted by Quan Nguyen 2013-11-16

jTessBoxEditor v0.9 Release

jTessBoxEditor is a box editor for Tesseract OCR data, providing editing of box data of both Tesseract 2.0x and 3.0x formats. It can read images of common image formats, including multi-page TIFF. The program requires Java Runtime Environment 6.0 or later.

This release includes the following improvements:

  • Enhance Generate TIFF/Box functionality to support combining symbols that precede the base character, in addition to those that follow it
  • Fix a bug that failed to persist changes to table in edit mode
  • Find function now supports partial matches
  • Fix a problem where the table did not scroll along when the row header had focus

Posted by Quan Nguyen 2013-04-30

jTessBoxEditor v0.8 Release

jTessBoxEditor is a box editor for Tesseract OCR data, providing editing of box data of both Tesseract 2.0x and 3.0x formats. It can read images of common image formats, including multi-page TIFF. The program requires Java Runtime Environment 6.0 or later.

This release includes the following improvements:

  • Add row number header
  • Char cell now editable
  • Convert Unicode escape sequences where possible
  • Find box now displays Unicode characters and allows search using Unicode escape sequences
  • Improve Generate TIFF/Box functionality:
    -- automatically combine boxes that have the same coordinates or completely enclose one another
    -- automatically combine boxes that are combining symbols, specified in an external file, with the main, base character
    -- retain last-modified exp number in Generate TIFF/Box window...

Posted by Quan Nguyen 2013-04-17
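
The box-combining rule in the v0.8 notes above (boxes with identical coordinates, or one completely enclosing the other, are merged) can be illustrated with a minimal Java sketch. The `CharBox` and `BoxCombiner` names are hypothetical and are not taken from jTessBoxEditor's source. The sketch also assumes that only consecutive boxes are compared and that the merged box keeps the outer rectangle.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical box record; names are illustrative, not jTessBoxEditor's own classes. */
final class CharBox {
    final String text;
    final int left, bottom, right, top; // Tesseract box coordinates (origin at bottom-left)

    CharBox(String text, int left, int bottom, int right, int top) {
        this.text = text;
        this.left = left;
        this.bottom = bottom;
        this.right = right;
        this.top = top;
    }

    /** True if this box covers the other one entirely; identical coordinates also qualify. */
    boolean encloses(CharBox o) {
        return left <= o.left && bottom <= o.bottom && right >= o.right && top >= o.top;
    }
}

final class BoxCombiner {
    /** Merges each box into its predecessor when one of the two encloses the other. */
    static List<CharBox> combine(List<CharBox> boxes) {
        List<CharBox> out = new ArrayList<>();
        for (CharBox next : boxes) {
            int last = out.size() - 1;
            if (last >= 0) {
                CharBox prev = out.get(last);
                if (prev.encloses(next) || next.encloses(prev)) {
                    CharBox outer = prev.encloses(next) ? prev : next;
                    // Assumption: text is concatenated in file order; the outer rectangle is kept.
                    out.set(last, new CharBox(prev.text + next.text,
                            outer.left, outer.bottom, outer.right, outer.top));
                    continue;
                }
            }
            out.add(next);
        }
        return out;
    }
}
```

For example, a base letter at (10, 10, 20, 30) followed by a combining mark at (12, 25, 18, 30) collapses into a single box that keeps the base letter's coordinates.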

VietOCR v3.4.2 & VietOCR.NET v3.4 Releases

A Java/.NET GUI frontend for Tesseract OCR engine. The releases include the following improvements:

Java:

  • Update Tesseract 3.02 to r820
  • Add hOCR support for Bulk & Batch and command-line operations
  • Update links to dictionary files
  • Update JNA to v3.5.1

.NET:

  • Upgrade to Tesseract 3.02 .NET wrapper (r820)
  • Add hOCR support for Bulk & Batch and command-line operations
  • Update links to dictionary files...

Posted by Quan Nguyen 2013-01-08

VietOCR v3.4.1 & VietOCR.NET v3.3.1 Releases

A Java/.NET GUI frontend for Tesseract OCR engine. The releases include the following improvements:

Java:
- Add Bulk OCR process
- Update Tesseract 3.02 to r806

.NET:
- Fit Image now retains image aspect ratio
- Add Bulk OCR process

http://vietocr.sf.net

Posted by Quan Nguyen 2012-11-29

VietOCR v3.4 Release

A Java GUI frontend for Tesseract OCR engine. The release includes the following improvements:

- Upgrade Tesseract engine to v3.02
- Enable text entry in the combobox for Tesseract 3.02's multi-language OCR support
- Fit Image now retains image aspect ratio
- Add optional support for using Tess4J library
- Update JACOB to 1.16.1 version
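
Retaining the aspect ratio when fitting an image into a viewport usually comes down to applying one uniform scale factor, the smaller of the width ratio and the height ratio. The Java sketch below shows that general technique only; it is an assumption about how such a feature can work, not VietOCR's actual code, and `FitImage.fit` is a hypothetical name.

```java
import java.awt.Dimension;

/** Hypothetical helper; illustrates aspect-ratio-preserving fit, not VietOCR's implementation. */
final class FitImage {
    /** Returns the largest size that fits inside the viewport while keeping the image's proportions. */
    static Dimension fit(int imgW, int imgH, int viewW, int viewH) {
        if (imgW <= 0 || imgH <= 0 || viewW <= 0 || viewH <= 0) {
            throw new IllegalArgumentException("All dimensions must be positive");
        }
        // One scale for both axes: the tighter of the two constraints wins.
        double scale = Math.min((double) viewW / imgW, (double) viewH / imgH);
        int w = Math.max(1, (int) Math.round(imgW * scale));
        int h = Math.max(1, (int) Math.round(imgH * scale));
        return new Dimension(w, h);
    }
}
```

With a 2000 x 1000 image and a 500 x 500 viewport, the width ratio (0.25) is the tighter constraint, so the fitted size is 500 x 250.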

http://vietocr.sf.net

Posted by Quan Nguyen 2012-11-03

VietOCR v3.3 & VietOCR.NET v3.3 Releases

A Java/.NET GUI frontend for Tesseract OCR engine. The releases include the following improvements:

- Download Language Data will warn if the user does not have write access to tessdata folder
- Use icons from FatCow (fatcow.com/free-icons) and by Rimas Kudelis
- Fix a bug with scrollbars visible when Fit Image selected (Java only)
- Update Hunspell to v1.3.2 (Java only)

http://vietocr.sf.net

Posted by Quan Nguyen 2012-02-26

VietOCR.NET v3.2 Release

A .NET GUI frontend for Tesseract OCR engine. The release includes the following improvements:

- Update Tesseract 3.01 to r639 (final release version)
- Remove unneeded liblept168.dll
- Update lists of language codes
- Add PSM support to execution from command line

http://vietocr.sf.net

Posted by Quan Nguyen 2011-11-26

VietOCR v3.2 Release

A Java GUI frontend for Tesseract OCR engine. The release includes the following improvements:

- Update Tesseract 3.01 to r638 (final release version)
- Remove unneeded liblept168.dll
- Update lists of language codes
- Update JACOB to 1.16-M1 version
- Add PSM support to execution from command line

http://vietocr.sf.net

Posted by Quan Nguyen 2011-10-22

VietOCR.NET v3.1 Release

A .NET GUI frontend for Tesseract OCR engine. The release includes the following fixes and improvements:

* Integrate tesseractdotnet .NET wrapper DLL x86 (r48+) based on Tesseract 3.01 (r597)
* Remove tesseract.exe file
* Trap OutOfMemory exceptions that intermittently occur during drawing of selection boxes
* Refactor

http://vietocr.sf.net

Posted by Quan Nguyen 2011-08-03

VietOCR v3.1.4 Release

A Java GUI frontend for Tesseract OCR engine. The release includes the following improvements:

* Update Tesseract 3.01 to r597

http://vietocr.sf.net

Posted by Quan Nguyen 2011-08-03

VietOCR.NET v3.0 Release

A .NET GUI frontend for Tesseract OCR engine. The new release includes the following features:

- Use command-line process to invoke Tesseract 3.01 (r585) binary executable
- Include improved Vietnamese language pack

http://vietocr.sf.net

Posted by Quan Nguyen 2011-06-26

VietOCR v3.1.3 Release

A Java GUI frontend for Tesseract OCR engine. The release includes the following fixes and improvements:

* Refactoring
* Improve program usability, enabling image navigation and manipulation with keyboard
* Fix an EOL issue that broke Remove Line Breaks functionality on Windows
* Integrate Linux SANE scanning support
* Fix an issue with restart notification after language pack downloads
* Update Tesseract 3.01 to r585
* Replace Vietnamese language pack with an improved version...
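
The notes do not say what the EOL bug was. One common way to make this kind of clean-up independent of platform line endings is to normalize them before processing. The Java sketch below assumes that Remove Line Breaks joins the lines of a paragraph with spaces and keeps blank lines as paragraph separators; that behavior and the `LineBreaks` name are assumptions for illustration, not VietOCR's code.

```java
/** Hypothetical helper; shows EOL-independent line-break removal under the stated assumptions. */
final class LineBreaks {
    static String removeLineBreaks(String text) {
        // Normalize Windows (\r\n) and bare \r line endings to \n first.
        String normalized = text.replaceAll("\\r\\n?", "\n");
        // Turn a newline into a space unless it touches another newline (a paragraph break).
        return normalized.replaceAll("(?<!\\n)\\n(?!\\n)", " ");
    }
}
```

Because the line endings are normalized up front, input saved on Windows and input saved on Linux produce the same result.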

Posted by Quan Nguyen 2011-06-04

VietOCR v2.0.3 & VietOCR.NET v2.0.3 Releases

A Java/.NET GUI frontend for Tesseract OCR engine. The releases include the following fixes and improvements:

- Refactoring
- Improve program usability, enabling image navigation and manipulation with keyboard
- Fix an installation issue that prevented uninstalling previous versions (.NET only)
- Fix an EOL issue that broke Remove Line Breaks functionality on Windows (Java only)
- Integrate Linux SANE scanning support...

Posted by Quan Nguyen 2011-06-04

jTessBoxEditor v0.4 Release

jTessBoxEditor is a box editor for Tesseract OCR data, providing editing of box data of both Tesseract 2.0x and 3.0x formats. It can read images of common image formats, including multi-page TIFF.

The release includes the following enhancements:

- Add a utility function for merging images into a multi-page TIFF

http://vietocr.sourceforge.net/training.html

Posted by Quan Nguyen 2011-05-28

jTessBoxEditor v0.3 Release

jTessBoxEditor is a box editor for Tesseract OCR data, providing editing of box data of both Tesseract 2.0x and 3.0x formats. It can read images of common image formats, including multi-page TIFF.

The release includes the following enhancements:

- Provide a close-up view of current box

http://vietocr.sourceforge.net/training.html

Posted by Quan Nguyen 2011-04-26

jTessBoxEditor v0.2 Release

jTessBoxEditor is a box editor for Tesseract OCR data, providing editing of box data of both Tesseract 2.0x and 3.0x formats. It can read images of common image formats, including multi-page TIFF.

The release includes the following enhancements:

- Add a provision to set font for the Box Coordinates table
- Set table row height to match font
- Incorporate a pangram into the Font dialog

http://vietocr.sourceforge.net/training.html

Posted by Quan Nguyen 2011-04-15

jTessBoxEditor v0.1 Release

jTessBoxEditor is a box editor for Tesseract OCR data, providing editing of box data of both Tesseract 2.0x and 3.0x formats. It can read images of common image formats, including multi-page TIFF.

The initial release includes the following features:

- Support editing box data of both Tesseract 2.0x and 3.0x formats
- Implement box select & merge/split/insert/delete operations
- Implement box size change function via spinners
- Support Unicode conversion for the text field
- Include box search function...
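
As an illustration of the two box formats mentioned above: per the Tesseract training documentation, a 3.0x box line has six whitespace-separated fields (symbol, left, bottom, right, top, page), while a 2.0x line omits the trailing page number. The Java sketch below parses either form. `BoxLine` is a hypothetical name, and the use of -1 for a missing page is an assumption of this sketch, not jTessBoxEditor's behavior.

```java
/** Hypothetical parser for one line of a Tesseract box file; not jTessBoxEditor's own class. */
final class BoxLine {
    final String symbol;
    final int left, bottom, right, top;
    final int page; // -1 when the line has no page field (2.0x-style)

    private BoxLine(String symbol, int left, int bottom, int right, int top, int page) {
        this.symbol = symbol;
        this.left = left;
        this.bottom = bottom;
        this.right = right;
        this.top = top;
        this.page = page;
    }

    static BoxLine parse(String line) {
        String[] f = line.trim().split("\\s+");
        if (f.length != 5 && f.length != 6) {
            throw new IllegalArgumentException("Unexpected box line: " + line);
        }
        // The sixth field, when present, is the zero-based page index.
        int page = (f.length == 6) ? Integer.parseInt(f[5]) : -1;
        return new BoxLine(f[0],
                Integer.parseInt(f[1]), Integer.parseInt(f[2]),
                Integer.parseInt(f[3]), Integer.parseInt(f[4]), page);
    }
}
```

Calling `BoxLine.parse("s 734 494 751 519 0")` yields a box on page 0, while the same line without the final field is read as a 2.0x-style entry with `page == -1`.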

Posted by Quan Nguyen 2011-04-15

VietOCR v2.0.2/3.1.2 & VietOCR.NET v2.0.2 Releases

A Java/.NET GUI frontend for Tesseract OCR engine. The releases include the following fixes and improvements:

- Incorporate deskew functionality using GMSE Deskew algorithm
- Fix a MissingResourceException associated with Font dialog (Java only)

http://vietocr.sf.net

Posted by Quan Nguyen 2011-03-13

VietOCR v2.0.1/3.1.1 & VietOCR.NET v2.0.1 Releases

A Java/.NET GUI frontend for Tesseract OCR engine. The releases include the following fixes and improvements:

* Fix a bug that hangs the program if x.DangAmbigs.txt contains entries starting with an equals sign
* Improve postprocessing performance by caching the word list used; reload only when it changes
* Fix a bug that crashes the program when inline spellcheck suggests on empty text (.NET only)
* Incorporate Apple Java Extensions (Java only)...
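
The word-list caching mentioned above can be sketched as a small cache keyed on the file's last-modified time. This is only one plausible approach: the `WordListCache` class is hypothetical, and using the modification timestamp as the change signal is an assumption, since the notes do not say how VietOCR detects changes.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.List;

/** Hypothetical cache; reloads the word list only when the file's timestamp changes. */
final class WordListCache {
    private final Path file;
    private FileTime loadedStamp; // timestamp of the version currently held in memory
    private List<String> words;
    private int loadCount;        // number of actual disk reads, exposed for inspection

    WordListCache(Path file) {
        this.file = file;
    }

    synchronized List<String> get() throws IOException {
        FileTime stamp = Files.getLastModifiedTime(file);
        if (words == null || !stamp.equals(loadedStamp)) {
            words = Files.readAllLines(file, StandardCharsets.UTF_8);
            loadedStamp = stamp;
            loadCount++;
        }
        return words;
    }

    synchronized int loadCount() {
        return loadCount;
    }
}
```

Repeated calls to `get()` cost one timestamp lookup each; the file is read again only after it has been modified.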

Posted by Quan Nguyen 2011-03-07

VietOCR v2.0/3.1 & VietOCR.NET v2.0 Releases

A Java/.NET GUI frontend for Tesseract OCR engine. The releases include the following fixes and improvements:

* Upgrade JACOB library to version 1.15-M4 (Java only)
* Add support for spellcheck suggestion in context menu
* Improve program accessibility and usability
* Add support for downloading and installing language data packs and appropriate spell dictionaries
* Add UI localization for Lithuanian and Slovak
* Refactor by breaking up large classes into smaller ones
* Update Tesseract OCR engine to 3.01 (r551) (v3.1 only)...

Posted by Quan Nguyen 2011-02-06