I was using ocropus in the past, but since I upgraded to Ubuntu 12.04 ocropus is no longer available, so naturally I can't use it with gscan2pdf. It appears this was the result of a bug in ocropus. The bug has since been fixed with the release of ocropus 0.5, but that version isn't in the repo yet. Not really a gscan2pdf issue, but I figured I'd bring it up.
I switched to tesseract, but this gives me issues. Some of my pages give me this error:
utf8 "\x80" does not map to Unicode at /.../lib/Gscan2pdf.pm line 921, <>
Thus, when I try to save the resulting file as a DjVu, it says there are bad characters and hangs. I cancel the save in gscan2pdf and then delete the bad pages. The problem is that gscan2pdf doesn't completely cancel the save: when I try to save again, it just says it is doing process 1 of 2, and I assume process 1 is the previous save, since it continues to hang.
I'm using gscan2pdf 1.0.4.
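
In case it helps narrow things down, here is a minimal sketch of what I think the error means. It is written in Python rather than Perl and is not gscan2pdf's actual code. The sample bytes and the function names are made up for illustration. My assumption is that the OCR output for those pages contains a stray 0x80 byte, which is not valid on its own in UTF-8, so a strict decode fails while a tolerant decode substitutes a replacement character.

```python
# Hypothetical sample: OCR output containing a stray 0x80 byte.
# 0x80 is a UTF-8 continuation byte and is invalid when it appears on its own.
raw = b"scanned text \x80 more text"


def decode_strict(data: bytes) -> str:
    """Decode as UTF-8 and raise UnicodeDecodeError on any invalid byte."""
    return data.decode("utf-8")


def decode_tolerant(data: bytes) -> str:
    """Decode as UTF-8, replacing each invalid byte with U+FFFD."""
    return data.decode("utf-8", errors="replace")


try:
    decode_strict(raw)
    strict_ok = True
except UnicodeDecodeError:
    # Roughly the situation the error message above seems to describe.
    strict_ok = False

cleaned = decode_tolerant(raw)
print(strict_ok)
print(repr(cleaned))
```

If the OCR text were decoded tolerantly like this before saving, I would expect the bad pages to save with a placeholder character instead of hanging, but that is only a guess on my part.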