#20 Batch process and Viewing results

Status: open
Owner: nobody
Labels: None
Priority: 5
Updated: 2004-12-07
Created: 2004-12-07
Creator: Anonymous
Private: No

The Google Search Appliance uses pdftohtml version
0.33a, which is unable to read some OCR'ed files,
so the appliance does not index them (the converted
output is blank). We have approximately 10,000
files that we want to run through 0.33a. Those that
come out blank will be re-scanned with different
software.

Do you know of a way to use your software to
process multiple files? Additionally, how can we
tell which converted HTML files are blank, other
than by opening and viewing each one?

Thank you,
wongn@metro.net
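
One possible approach to both questions, sketched in Python rather than taken from the pdftohtml project itself: loop over the PDFs, call pdftohtml once per file, then parse each result and flag it as blank if the body holds no visible text. The function names (`is_blank_html`, `convert_and_check`) are hypothetical, and the `-noframes` and `-i` options and the output-file naming are assumptions that should be checked against the usage message of the 0.33a build in use.

```python
import subprocess
from html.parser import HTMLParser
from pathlib import Path


class _TextExtractor(HTMLParser):
    """Collects text outside <head>, <title>, <script> and <style>."""

    SKIPPED = {"head", "title", "script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)


def is_blank_html(html: str) -> bool:
    """Return True if the document has no visible, non-whitespace text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    # split() with no arguments also treats non-breaking spaces as whitespace.
    return not "".join(parser.chunks).split()


def convert_and_check(pdf_dir: str, out_dir: str) -> list:
    """Convert every *.pdf in pdf_dir and return the names of blank results."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    blank = []
    for pdf in sorted(Path(pdf_dir).glob("*.pdf")):
        target = out / (pdf.stem + ".html")
        # Assumed invocation: single HTML file per PDF, images ignored.
        subprocess.run(
            ["pdftohtml", "-noframes", "-i", str(pdf), str(target)],
            check=False,
        )
        # A missing output file is treated the same as a blank one.
        if not target.exists() or is_blank_html(
            target.read_text(errors="replace")
        ):
            blank.append(pdf.name)
    return blank
```

Calling `convert_and_check("pdfs", "html")` would then return the list of PDFs to re-scan. The check ignores the page title and markup such as anchors and rules, so a file containing only those still counts as blank.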
