You should read some é characters, but you get Chinese characters instead.
MEBKM:TITLE:探そうモビで専門学校探し!;URL:http¥://sagasou.mobi;;
Maybe it is not a problem with the reader but with the generator of the QR.
I am using Google Chart API as well as Android Barcode Scanner to generate the QRCode.
With the QR code pictured on the Wikipedia page, I can read the Japanese without any encoding error :)
I have tried to change the encoding but without success.
http://chart.apis.google.com/chart?cht=qr&chs=200x200&choe=UTF-8&chl=R%C3%A9my+Hubscher
http://chart.apis.google.com/chart?cht=qr&chs=200x200&choe=ISO-8859-1&chl=R%C3%A9my+Hubscher
http://chart.apis.google.com/chart?cht=qr&chs=200x200&choe=SHIFT_JIS&chl=R%C3%A9my+Hubscher
Almost no QR code encoders actually include a marker indicating the character set (I have never actually seen one do so), so we have to guess. It is very difficult to reliably detect the proper character set in a QR code, because the strings are often so short that they are valid in multiple different character sets. Your particular string is valid SJIS (which is very commonly used) both when encoded as UTF-8 and as ISO-8859-1. However, SJIS does not have an é character, so you cannot convert your string to SJIS.
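To make the ambiguity concrete, here is a minimal Python sketch (Python is used purely for illustration; `shift_jis` is Python's codec name for SJIS, and `decode_as_sjis` is a made-up helper, not part of any scanner library). It suggests that the bytes of the example string decode as SJIS without any error, whether they were written as UTF-8 or as ISO-8859-1, and that é itself cannot be encoded to SJIS.

```python
def decode_as_sjis(text: str, source_encoding: str) -> str:
    """Encode text with source_encoding, then (wrongly) decode the bytes as SJIS."""
    raw = text.encode(source_encoding)
    # Raises UnicodeDecodeError if the bytes are not valid SJIS.
    return raw.decode("shift_jis")


name = "Rémy Hubscher"

# Neither call raises: both byte sequences happen to be valid SJIS.
from_utf8 = decode_as_sjis(name, "utf-8")
from_latin1 = decode_as_sjis(name, "iso-8859-1")
print(from_utf8)
print(from_latin1)

# The other direction fails: é has no SJIS representation.
try:
    "é".encode("shift_jis")
    e_acute_in_sjis = True
except UnicodeEncodeError:
    e_acute_in_sjis = False
print(e_acute_in_sjis)
```

Because both decodes succeed, a decoder that only checks "is this valid SJIS?" has no way to tell that the text was meant to be something else.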
Encoders aside, the standard does not actually provide a mechanism for marking the data as UTF-8 (it only supports the various ISO standards, CP437, and SJIS). However, one thing you _can_ do is include a UTF-8 BOM (byte order mark) at the front of the text you feed to the encoder (see http://en.wikipedia.org/wiki/Byte-order_mark). This allows the decoder to reliably identify the text as UTF-8, and doesn't require any changes to the encoder. The following URL demonstrates this:
http://chart.apis.google.com/chart?cht=qr&chs=200x200&choe=UTF-8&chl=%EF%BB%BFR%C3%A9my+Hubscher
Both zbar and zxing, at least, should properly recognize the data as UTF-8, and strip the BOM from the decoded text.
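The BOM trick can be sketched in Python as follows. This is only an illustration under stated assumptions: `chart_url_with_bom` and `decode_payload` are hypothetical helper names, and the SJIS fallback in the decoder is a stand-in guess, not a description of how zbar or zxing actually choose a character set.

```python
from urllib.parse import quote_plus

BOM = "\ufeff"  # U+FEFF; its UTF-8 encoding is the bytes EF BB BF


def chart_url_with_bom(text: str) -> str:
    """Build a chart URL whose QR payload starts with a UTF-8 BOM."""
    payload = quote_plus(BOM + text)  # percent-encodes as UTF-8 by default
    return ("http://chart.apis.google.com/chart"
            "?cht=qr&chs=200x200&choe=UTF-8&chl=" + payload)


def decode_payload(raw: bytes) -> str:
    """Decoder side: a leading BOM marks the bytes as UTF-8 and is stripped."""
    if raw.startswith(BOM.encode("utf-8")):
        return raw.decode("utf-8-sig")  # utf-8-sig drops the leading BOM
    return raw.decode("shift_jis")  # assumed fallback guess, illustration only


url = chart_url_with_bom("Rémy Hubscher")
print(url)

decoded = decode_payload((BOM + "Rémy Hubscher").encode("utf-8"))
print(decoded)
```

The encoder never needs to know about the BOM; it is just three extra bytes at the start of the payload.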
Ok, it works as expected. :)
I had the same problem (with German umlauts like ü rather than French accented characters):
(This is for everyone who embeds ZBar in their own app, as I do with the iOS SDK!)
Nobody/Anonymous is right, in a way. It's SJIS.
When the QR code doesn't include any marker for the character set, ZBar apparently assumes SJIS, maybe because Kanji is somehow the default (as QR codes were invented in Japan)?
Anyway.
He's wrong to assume that you can always influence the encoding of the QR code yourself.
I get QR codes with vCards from someone else, already printed, so I have no chance of getting them to include such BOMs.
What I do about that in my app:
take the NSString from ZBarSymbol.data and convert it like this:
const char *sjis = [zdata cStringUsingEncoding:NSShiftJISStringEncoding];
and then convert it again with
result = [NSString stringWithCString:sjis encoding:NSUTF8StringEncoding];
This way I get the two Kanji characters (encoded with three bytes) "back" into a two-byte umlaut.
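The same round trip can be sketched in Python to show what happens at the byte level. This is an illustration, not a port of the Objective-C lines above: `repair_sjis_misdecoded` is a hypothetical name, and the fallback to the unchanged text when the conversion fails is an added assumption, so that genuinely Japanese content is left alone.

```python
def repair_sjis_misdecoded(text: str) -> str:
    """Undo a wrong SJIS guess: recover the raw bytes, then read them as UTF-8.

    Returns the text unchanged if it cannot be re-encoded as SJIS or if the
    resulting bytes are not valid UTF-8 (for example, real Japanese text).
    """
    try:
        raw = text.encode("shift_jis")
        return raw.decode("utf-8")
    except UnicodeError:  # covers both the encode and the decode failure
        return text


# Simulate what the scanner hands back: UTF-8 bytes misread as SJIS.
scanned = "Müller".encode("utf-8").decode("shift_jis")

repaired = repair_sjis_misdecoded(scanned)
untouched = repair_sjis_misdecoded("探そう")
print(repaired)
print(untouched)
```

The check matters because the repair is itself a guess: it should only be applied when the re-encoded bytes really form valid UTF-8.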
Just for the record.
Last edit: LeadSuccess 2013-11-08