
#21 Scan QRCode with Unicode information inside

Milestone: latest_mercurial
Status: closed
Labels: QR Code (7)
Priority: 5
Updated: 2013-11-08
Created: 2009-10-23
Creator: Natim
Private: No

When I try to read a QR code to share a contact, I run into an encoding issue. Do you know how I can work around this?

The expected result is: Rémy Hubscher
The actual result is: Rテゥmy Hubscher

Thank you for your help

Discussion

  • Natim

    Natim - 2009-10-23

    The reader should return an é character, but it returns Chinese characters instead.

     
  • Natim

    Natim - 2009-10-23

    MEBKM:TITLE:探そうモビで専門学校探し!;URL:http¥://sagasou.mobi;;

     
  • Natim

    Natim - 2009-10-23

    Maybe the problem is not with the reader but with the QR code generator.
    I am using the Google Chart API as well as Android Barcode Scanner to generate the QR code.

    With the QR code pictured on the Wikipedia page, I can read the Japanese text without any encoding error :)

     
  • spadix

    spadix - 2009-10-23
    • labels: --> QR Code
    • milestone: --> latest_mercurial
    • assigned_to: nobody --> tterribe
     
  • Nobody/Anonymous

    Almost no QR code encoders actually include a marker indicating the character set (I have never actually seen one do so), so we have to guess. It is very difficult to reliably detect the proper character set in a QR code, because the strings are often so short that they are valid in multiple different character sets. The bytes of your particular string are valid SJIS (which is very commonly used) both when the string is encoded as UTF-8 and when it is encoded as ISO-8859-1. However, SJIS does not have an é character, so your string cannot be converted to SJIS.
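
The ambiguity described above can be reproduced in a few lines. This is a minimal Python sketch, not ZBar's actual detection logic: it assumes Python's built-in `shift_jis` codec is a fair stand-in for the SJIS tables a decoder uses, and `looks_like` is a hypothetical helper name.

```python
def looks_like(data: bytes, encoding: str) -> bool:
    """Return True if the bytes decode cleanly in the given encoding."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


# Bytes a UTF-8 encoder would store for the contact name.
payload = "Rémy Hubscher".encode("utf-8")

# The same bytes pass as both UTF-8 and Shift JIS, so a decoder has to guess.
utf8_ok = looks_like(payload, "utf-8")
sjis_ok = looks_like(payload, "shift_jis")

# Guessing Shift JIS turns the two bytes of é (0xC3 0xA9)
# into two half-width katakana characters.
misread = payload.decode("shift_jis")

# Shift JIS has no é, so the original text cannot be stored as SJIS.
try:
    "é".encode("shift_jis")
    e_acute_in_sjis = True
except UnicodeEncodeError:
    e_acute_in_sjis = False
```

Because both checks succeed on such a short string, validity alone cannot tell the decoder which character set was intended.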

    Encoders aside, the standard does not actually provide a mechanism for marking the data as UTF-8 (it only supports the various ISO standards, CP437, and SJIS). However, one thing you _can_ do is include a UTF-8 BOM (byte order mark) at the front of the text you feed to the encoder (see http://en.wikipedia.org/wiki/Byte-order_mark). This allows the decoder to reliably identify the text as UTF-8, and doesn't require any changes to the encoder. The following URL demonstrates this:

    http://chart.apis.google.com/chart?cht=qr&chs=200x200&choe=UTF-8&chl=%EF%BB%BFR%C3%A9my+Hubscher

    Both zbar and zxing, at least, should properly recognize the data as UTF-8, and strip the BOM from the decoded text.
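
To make the BOM trick concrete, here is a small Python sketch of how the `chl` value in the URL above can be produced, and how a decoder could recognize and strip the BOM. It uses only the standard library; the variable names are illustrative, and it does not show how zbar or zxing implement this internally.

```python
import codecs
from urllib.parse import quote_plus

text = "Rémy Hubscher"

# Prepend U+FEFF; encoded as UTF-8 it becomes the three BOM bytes EF BB BF.
with_bom = "\ufeff" + text

# Percent-encode the UTF-8 bytes; spaces become '+'.
chl = quote_plus(with_bom)

# Decoder side: a payload that starts with the UTF-8 BOM is treated as UTF-8.
payload = with_bom.encode("utf-8")
has_bom = payload.startswith(codecs.BOM_UTF8)

# The 'utf-8-sig' codec drops a leading BOM while decoding.
decoded = payload.decode("utf-8-sig")
```

Here `chl` comes out as `%EF%BB%BFR%C3%A9my+Hubscher`, matching the query string in the demonstration URL.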

     
  • Natim

    Natim - 2009-10-23

    Ok, it works as expected. :)

     
  • spadix

    spadix - 2009-10-29
    • status: open --> closed
     
  • LeadSuccess

    LeadSuccess - 2013-11-08

    I had the same problem (with German umlauts like ü rather than accented French characters), so here is a note for anyone who embeds ZBar in their own app, as I do with the iOS SDK.

    Nobody/Anonymous is right, in a way: it is SJIS.
    When the QR code does not include any marker for the character set, ZBar apparently assumes SJIS, maybe because Kanji is in some sense the default (QR codes were invented in Japan)?
    Anyway.

    He is wrong to assume that you can always influence the encoding of the QR code yourself.

    I receive QR codes with vCards from someone else, already printed, so I have no chance of getting them to include such BOMs.

    What I do about it in my app:
    take the NSString from ZBarSymbol.data and convert it like this:
    const char *sjis = [zdata cStringUsingEncoding:NSShiftJISStringEncoding];

    and then convert it back with
    result = [NSString stringWithCString:sjis encoding:NSUTF8StringEncoding];

    This way I get two Kanji characters (encoded with three bytes) "back" into a two-byte umlaut.

    Just for the record.
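
The same round trip can be sketched outside of iOS. The Python below only illustrates the idea, with `repair_sjis_misread` as a hypothetical helper name. It assumes the scanner decoded UTF-8 bytes as Shift JIS, and it returns the input unchanged when the round trip fails, since text that really was Shift JIS will usually not re-decode as UTF-8.

```python
def repair_sjis_misread(text: str) -> str:
    """Undo a UTF-8 payload that was wrongly decoded as Shift JIS."""
    try:
        # Recover the raw bytes, then decode them as what they really were.
        return text.encode("shift_jis").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not a misread UTF-8 payload (or not recoverable): leave it alone.
        return text


# Simulate what a scanner returns for "Müller" when it guesses Shift JIS:
# the UTF-8 bytes C3 BC of ü show up as two half-width katakana characters.
misread = "Müller".encode("utf-8").decode("shift_jis")
fixed = repair_sjis_misread(misread)
```

The fallback matters in practice: a code that genuinely contains Japanese text should pass through untouched rather than being mangled by the repair step.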

     

    Last edit: LeadSuccess 2013-11-08
