Hi, awesome program. It seems like the one thing it might be lacking is OCR. I also noticed that there's an open-source OCR engine called Tesseract, which has a .NET project - see http://code.google.com/p/tesseractdotnet/
Incidentally, what compression scheme do you use? I'm still learning about PDFs but based on this http://blogs.adobe.com/acrolaw/2009/08/reducing-the-file-size-of-scanned-pdfs/ post it seems like allowing compression scheme options might be good, and JPBG2 seems like a good default.
Hi Ben,
Thanks for pointing out that project. OCR is something that I'm looking to add in the future (almost certainly using Tesseract), though it might be a while before I get around to implementing it.
Images in NAPS2's PDF files are encoded in either JPEG or PNG format. If the "Maximum quality" option is checked, then it's always PNG; otherwise, it's PNG for black/white, and JPEG for grayscale and color.
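To make that rule concrete, here is a minimal sketch of the selection logic in Python. It is only an illustration of the behavior described above, not NAPS2's actual code; the names `BitDepth` and `choose_pdf_image_format` are hypothetical.

```python
from enum import Enum


class BitDepth(Enum):
    """Scan color modes mentioned in the post (hypothetical type for illustration)."""
    BLACK_WHITE = "black/white"
    GRAYSCALE = "grayscale"
    COLOR = "color"


def choose_pdf_image_format(bit_depth: BitDepth, maximum_quality: bool) -> str:
    """Return the image format used for a page, following the rule in the post.

    Assumption: this mirrors the described behavior only; the real
    implementation may be structured differently.
    """
    # "Maximum quality" checked: always PNG.
    if maximum_quality:
        return "PNG"
    # Otherwise, black/white pages are stored as PNG...
    if bit_depth is BitDepth.BLACK_WHITE:
        return "PNG"
    # ...and grayscale and color pages are stored as JPEG.
    return "JPEG"
```

For example, `choose_pdf_image_format(BitDepth.COLOR, maximum_quality=False)` gives `"JPEG"`, while any bit depth with `maximum_quality=True` gives `"PNG"`.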
I might look at JBIG2 in the future (I think JPBG2 was a typo on that site), though I think PNG is good enough for most people, since black/white images have inherently small file sizes.