#179 OCR error yielding a double byte unicode

Milestone: 4.X
Status: closed
Owner: cb4960
Updated: 2022-03-19
Created: 2021-08-15
Creator: Sammy MoK
Private: No

I am running into intermittent OCR errors where the ' character comes back as the Unicode character U+2019 (’, right single quotation mark) instead of the ASCII apostrophe. The text I am running the OCR on is "MEISSNER'S CORPUSCLE". Any suggestions on the best way to work around this, or at least a way to keep non-ASCII characters out of the result?

I have attached a BMP file of the source text image.

1 Attachments

Discussion

  • Sammy MoK

    Sammy MoK - 2021-08-15

    BTW, I am using Capture2Text_CLI.exe version 4.6.2. I invoke it from an AutoHotkey script and get the result back through the clipboard.

     
  • cb4960

    cb4960 - 2022-03-19
    • status: open --> closed
    • assigned_to: cb4960
     
  • cb4960

    cb4960 - 2022-03-19

    In v4.6.3, replaced Unicode single quote (’) with an ASCII single quote ('). Also replaced (“) and (”) with (").
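
For anyone still on 4.6.2 or earlier, the same kind of substitution can be applied to the OCR result after the fact. The following is only a minimal sketch in Python, assuming the recognized text is already available as a string. The helper name `normalize_quotes` is hypothetical and not part of Capture2Text, and the left single quote (U+2018) is included as an assumption since the fix above only mentions (’), (“) and (”).

```python
# Hypothetical post-processing helper; NOT part of Capture2Text.
# Maps typographic (curly) quotes to their ASCII equivalents.
QUOTE_MAP = str.maketrans({
    "\u2018": "'",  # left single quotation mark (assumed, not mentioned in the fix)
    "\u2019": "'",  # right single quotation mark
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
})


def normalize_quotes(text: str) -> str:
    """Return text with curly quotes replaced by ASCII quotes."""
    return text.translate(QUOTE_MAP)


if __name__ == "__main__":
    # The escape \u2019 stands in for the curly apostrophe in the OCR result.
    print(normalize_quotes("MEISSNER\u2019S CORPUSCLE"))
```

This only touches the four quote characters listed in the table. Any other non-ASCII characters in the OCR result pass through unchanged.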

     
