Unicode problems

Help
Anonymous
2003-05-10
2012-09-19
  • Anonymous

    Anonymous - 2003-05-10

    Using version 0.92 I'm hitting what seems to be a serious issue with Unicode support.  When using the unicode="latin1" keyword in a database connection I'm seeing two different problems:

    1 - Some data selected from the database is returned as a raw string, and some as Unicode.  For example, a string containing '\x92' is brought back as a raw string, whereas data containing '\xfb' is brought back as Unicode.  Given that both are invalid ASCII, shouldn't they both be brought back as Unicode strings?

    2 - If the conversion from Unicode to latin1 fails I get back an exception.  It would be helpful if you could specify in the constructor the error policy ('strict', 'ignore', or 'replace') that you want to use.

    Does anyone know how to tackle #1?  It looks like I'm going to have to stop using the Unicode keyword and do the work myself.
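A minimal sketch of what "doing the work myself" with an explicit error policy could look like, using only Python's built-in codecs. The helper names `to_latin1` and `from_latin1` are hypothetical and not part of MySQLdb, and the sketch assumes a Python where text and byte strings are distinct types.

```python
def to_latin1(text, errors="strict"):
    """Encode text for a latin1 column; errors is 'strict', 'ignore' or 'replace'."""
    return text.encode("latin-1", errors)


def from_latin1(raw, errors="strict"):
    """Decode raw bytes read from a latin1 column into a text string."""
    return raw.decode("latin-1", errors)


# U+2019 (right single quotation mark) has no latin1 equivalent.
sample = u"caf\u00e9 \u2019"
replaced = to_latin1(sample, "replace")  # unencodable character becomes '?'
ignored = to_latin1(sample, "ignore")    # unencodable character is dropped

# Every byte value is a valid latin1 code point, so both bytes from the
# question decode without error.
decoded_92 = from_latin1(b"\x92")
decoded_fb = from_latin1(b"\xfb")
```

With 'strict', the same `to_latin1(sample)` call raises UnicodeEncodeError, which matches the exception described in point 2.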

     
    • Anonymous

      Anonymous - 2003-05-10

      An update on what's going on here: It seems that the Unicode conversion is only done for columns of type VARCHAR, not type LONGTEXT.

      This sounds like a bug to me. Does anyone else think so?

       
      • Andy Dustman

        Andy Dustman - 2003-05-12

        In MySQL, LONGTEXT is exactly the same datatype as BLOB (or maybe LONGBLOB). Because of this, those columns are treated as BLOBs. If you don't use BLOBs, you can update your type converter dictionary so that those columns are returned as unicode strings. Arguably, BLOBs should be returned as character arrays (for details, type help("array") in the interpreter), but it's impossible to please everyone in this case.
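A rough sketch of the converter-dictionary idea described above. It assumes the dictionary maps MySQL column type codes to callables that receive the raw column value as bytes; this may not hold for every MySQLdb version. The function name `with_text_blobs` is hypothetical, and the numeric codes are the MySQL protocol values that `MySQLdb.constants.FIELD_TYPE` is expected to expose. Note that genuinely binary BLOB columns would be decoded too, so this only fits the "you don't use BLOBs" case.

```python
# MySQL column type codes for the BLOB/TEXT family (assumed to match
# MySQLdb.constants.FIELD_TYPE.TINY_BLOB, MEDIUM_BLOB, LONG_BLOB, BLOB).
TINY_BLOB, MEDIUM_BLOB, LONG_BLOB, BLOB = 249, 250, 251, 252


def with_text_blobs(conversions, charset="latin1", errors="strict"):
    """Return a copy of a converter dict in which BLOB-family columns are
    decoded to text strings instead of being returned as raw bytes."""
    def decode(raw):
        return raw.decode(charset, errors)

    conv = dict(conversions)  # copy, so the caller's dict is left untouched
    for code in (TINY_BLOB, MEDIUM_BLOB, LONG_BLOB, BLOB):
        conv[code] = decode
    return conv


# Hypothetical usage, assuming the `conv` keyword of MySQLdb.connect:
#   import MySQLdb, MySQLdb.converters
#   conv = with_text_blobs(MySQLdb.converters.conversions)
#   db = MySQLdb.connect(db="test", conv=conv)
```

Copying the dictionary rather than editing it in place keeps the module-level defaults intact for other connections in the same process.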

         
