#87 Wrong character encoding

Milestone: version_0.10

Status: open

Owner: nobody

Labels: None

Priority: 3

Updated: 2023-02-04

Created: 2015-12-21

Creator: Mikhail Linyuchev

Private: No

Hi,

We are having trouble with the encoding of some QR codes on Android devices.
For example, a QR code containing the single lowercase Russian letter "м" is decoded incorrectly.
The QR code is attached.

Other users have reported the same issue:
https://sourceforge.net/p/zbar/bugs/73/

Any ideas how to fix this?

Thanks in advance.
Mike

1 Attachment

wrong_encoding.png

Discussion

Mikhail Linyuchev - 2015-12-24

It looks like this is not quite a bug, but a combination of factors that can lead to incorrect decoding of QR codes produced by most online QR generators:

1) Most online generators encode text in UTF-8 but do not add a BOM prefix.
2) If the BOM is absent, ZBar tries the SJIS decoder first by default (before 8859-1 and UTF-8).
3) When converting from SJIS, iconv may accept some Cyrillic text in UTF-8 as valid input, so we get Japanese characters in the output.
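
To illustrate point 3, here is a minimal sketch (not ZBar or iconv code) of why the UTF-8 bytes of "м" (0xD0 0xBC) pass as Shift-JIS. In Shift-JIS, single bytes in the range 0xA1..0xDF are half-width katakana, which correspond to U+FF61..U+FF9F. The helper names below are hypothetical and only model that one byte range, not the full Shift-JIS table.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: single bytes 0xA1..0xDF are half-width katakana
   in Shift-JIS. */
static int sjis_is_halfwidth_katakana(uint8_t b)
{
    return b >= 0xA1 && b <= 0xDF;
}

/* Map such a byte to its Unicode code point (U+FF61..U+FF9F).
   Returns 0 for bytes outside the range. */
static uint32_t sjis_halfwidth_to_codepoint(uint8_t b)
{
    if (!sjis_is_halfwidth_katakana(b))
        return 0;
    return 0xFF61u + (uint32_t)(b - 0xA1);
}

/* Returns 1 if every byte of buf would be accepted as a single-byte
   half-width katakana character, 0 otherwise. */
static int all_halfwidth_katakana(const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (!sjis_is_halfwidth_katakana(buf[i]))
            return 0;
    }
    return 1;
}
```

Both bytes of "м" fall inside that range (0xD0 maps to U+FF90 and 0xBC to U+FF7C), so an SJIS decoder has no reason to reject them and the fallback to UTF-8 is never reached.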

So changing the default decoder to UTF-8 resolved my issue.
In file qrdectxt.c, function qr_code_data_list_extract_text:
Before:
enc_list[0]=sjis_cd;
enc_list[1]=latin1_cd;
enc_list[2]=utf8_cd;
After:
enc_list[2]=sjis_cd;
enc_list[1]=latin1_cd;
enc_list[0]=utf8_cd;
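
One reason trying UTF-8 first tends to be the safer order is that well-formed UTF-8 is strict: most SJIS or Latin-1 byte sequences containing non-ASCII bytes fail validation, so the other decoders still get their turn. The sketch below is a standalone validator written for illustration, assuming nothing about ZBar's internals; `is_valid_utf8` is a hypothetical name, not a function from qrdectxt.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if s[0..len) is well-formed UTF-8, 0 otherwise.
   Rejects stray continuation bytes, truncated sequences, overlong
   forms, surrogates, and code points above U+10FFFF. */
static int is_valid_utf8(const uint8_t *s, size_t len)
{
    size_t i = 0;
    while (i < len) {
        uint8_t c = s[i];
        size_t n;          /* number of continuation bytes expected */
        uint32_t cp, min;  /* decoded code point, smallest legal value */

        if (c < 0x80) { i++; continue; }
        else if ((c & 0xE0) == 0xC0) { n = 1; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { n = 2; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { n = 3; cp = c & 0x07; min = 0x10000; }
        else return 0;     /* stray continuation or invalid lead byte */

        if (len - i <= n)
            return 0;      /* sequence is cut off at end of input */
        for (size_t k = 1; k <= n; k++) {
            if ((s[i + k] & 0xC0) != 0x80)
                return 0;  /* not a continuation byte */
            cp = (cp << 6) | (uint32_t)(s[i + k] & 0x3F);
        }
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return 0;      /* overlong, out of range, or surrogate */
        i += n + 1;
    }
    return 1;
}
```

With this check, the bytes 0xD0 0xBC ("м") are accepted, while a lone Latin-1 0xDF ("ß") or the SJIS pair 0x82 0xA0 is rejected and would fall through to the next decoder in the list.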

Last edit: Mikhail Linyuchev 2015-12-29

Adalbert Hanßen - 2023-02-04

Does zbarimg emit UTF-8? According to the German Wikipedia, UTF-8 encoded files do not require a Byte Order Mark (BOM). According to Wikipedia, a BOM also causes trouble with several text editors.

Moreover, if zbarimg writes its output to the standard output device, why should it add a BOM?

Is your bug the same as #73, the inability to correctly emit some accented characters, umlauts, and the German character ß?
