
3.5 not loading Hindi PDF (made with MS Word); 4.0 beta not doing any OCR at all

Help
Rawat
2014-09-05
2017-08-14
  • Rawat

    Rawat - 2014-09-05

    I am using 3.5 on Windows 8.

    It was working fine, but recently I also installed the 4.0 beta, which sits in a separate folder and runs.


    1. In 3.5, when I loaded a Hindi PDF file (made using MS Word) today, it gave an error message about tmp1B43.tif.

    Everything else, English as well as Hindi images, is getting OCR-ed well. Only the Hindi PDF is not loading.

    What is it and how to resolve it?


    2. In 4.0 beta, any file I load, whether PDF or image, loads fine. But when I run OCR on it, I get an error message. No error number or description, just a terse "error occurred".

    It is not doing any OCR at all, and it gives this message every time.

    What is it and how to resolve it?

    Thanks.

    Rawat

     
  • Quan Nguyen

    Quan Nguyen - 2014-09-05

    Can you attach some sample files for our investigation of the issue?

     
  • Rawat

    Rawat - 2014-09-06

    This is the error 4.0 beta throws whenever I OCR any file.

    A JPG or PDF file loads and displays fine, but does not get OCR-ed at all.

     
  • Quan Nguyen

    Quan Nguyen - 2014-09-06

    Can't reproduce the issue in either version. The recognition took a little long, but it output the text below:

    जीवन में उच्च स्तरीय सत्यनिष्ठच्चा का निर्वाह
    कबीर का एक दोहा है…
    सांच बराबर प्तप नहीं, झूठ बराबर पाप ।

    जाके हिरदै सांच है, ताके हिरदै झप 1।
    ...

     
  • Rawat

    Rawat - 2014-09-06

    :-) How does that help in resolving the problem on my system?

    Anyway, I did some trials and found that if the name of the PDF file is in Unicode Hindi, then 3.5 does not load it and gives the above error.

    When I changed the name of the PDF to Latin (English) characters only, it loaded and the OCR ran.

    So that issue is solved.
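
The renaming workaround described above can be sketched in code. This is only an illustration, not part of either program: the helper names `is_ascii_name` and `ascii_safe_copy` and the generic file name `input` are hypothetical, and the sketch assumes that only the file name (not the folder path) needs to be ASCII, which is all the thread reports.

```python
import shutil
import tempfile
from pathlib import Path


def is_ascii_name(path: Path) -> bool:
    """Return True if the file name contains only ASCII characters."""
    return path.name.isascii()


def ascii_safe_copy(src: Path, work_dir: Path) -> Path:
    """Return a path whose file name is ASCII-only.

    If the name is already ASCII, the original path is returned unchanged.
    Otherwise the file is copied into work_dir under a generic name
    ("input" plus the original extension) and the copy's path is returned.
    """
    if is_ascii_name(src):
        return src
    # Keep the extension only if it is ASCII itself (hypothetical policy).
    suffix = src.suffix if src.suffix.isascii() else ""
    target = work_dir / f"input{suffix}"
    shutil.copyfile(src, target)
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        tmp_dir = Path(tmp)
        hindi_pdf = tmp_dir / "दस्तावेज़.pdf"  # file name in Devanagari
        hindi_pdf.write_bytes(b"%PDF-1.4 placeholder")
        safe = ascii_safe_copy(hindi_pdf, tmp_dir)
        print(safe.name)  # prints "input.pdf"
```

The copy, rather than the original, would then be the file opened for OCR, so the original keeps its Hindi name.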

    --

    However, I think a PDF with a Hindi name loaded fine in 4.0 beta, but I am not sure.

    --

    But the 4.0 beta is not working. All files load, but none get OCR-ed. I just unzipped the archive to a separate folder and moved my language files from 3.5 to 4.0 beta. I don't know if some file or something else is still missing.
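
One thing that can be ruled out after moving language files by hand is a missing data file. The sketch below is a hedged illustration, not a diagnosis of the error in this thread. It assumes the Tesseract convention that each language is stored as `<code>.traineddata` (for example `hin` for Hindi, `eng` for English) inside a folder named `tessdata`; the function name `missing_language_files` and the folder location are hypothetical, and the check does not verify that a file is compatible with the engine version.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List


def missing_language_files(tessdata_dir: Path, lang_codes: Iterable[str]) -> List[str]:
    """Return the language codes that have no <code>.traineddata file in tessdata_dir."""
    return [
        code
        for code in lang_codes
        if not (tessdata_dir / f"{code}.traineddata").is_file()
    ]


if __name__ == "__main__":
    # Demo on a throwaway folder; point tessdata at the real folder to use it.
    with tempfile.TemporaryDirectory() as tmp:
        tessdata = Path(tmp) / "tessdata"
        tessdata.mkdir()
        (tessdata / "eng.traineddata").write_bytes(b"")  # placeholder file
        print(missing_language_files(tessdata, ["eng", "hin"]))  # prints ['hin']
```

An empty list only means the files are present under the expected names.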

    I guess I shall wait for the final .exe release of 4.0.

    Thanks.

    Rawat

     
  • Quan Nguyen

    Quan Nguyen - 2014-09-06

    I tried your PDF file on Windows 8.1, and both 3.5 and 4.0 beta had no problem reading and recognizing it.

     
