
#37 set utf8 for its database default character set

4.0
closed
i18n/l10n (13)
9
2012-02-20
2012-02-07
No

In release 3.64, the install script does not specify the database's CHARSET or each field's COLLATION, so input and output characters are assumed to be Latin1 (the collation is latin1_swedish_ci). This does not matter as long as only ISO-8859-1 (Latin1) characters are used.

But it does matter for other characters, such as Japanese (EUC-JP), because MySQL converts each query and its result between the client-defined (PHP) character set and the database-defined (MySQL) character set. For details, please refer to the MySQL manual:
http://dev.mysql.com/doc/refman/5.0/en/charset-connection.html

With this conversion, all Japanese characters are mapped to unknown characters in a Latin1 database, so this setting is useless for non-Latin languages.
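The loss described above can be imitated outside MySQL. The following is a minimal sketch that uses Python's own codecs as a stand-in for the server-side conversion. It is an analogy, not MySQL's actual code path, and the sample string is only an illustration.

```python
# Sample Japanese text; any string outside Latin1 behaves the same way.
text = "日本語"

# Latin1 has no representation for these characters, so a lossy
# conversion substitutes a placeholder for each one.
as_latin1 = text.encode("latin-1", errors="replace")
print(as_latin1)  # b'???'

# UTF-8 can represent them, so the round trip is lossless.
as_utf8 = text.encode("utf-8")
print(as_utf8.decode("utf-8") == text)  # True
```

Once the placeholders are stored, the original characters cannot be recovered, which is why the character set has to be right before any data is written.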

If the database is set to UTF-8, characters outside Latin1 are stored correctly. There is strong demand for setting a UTF-8 collation for the database.

Discussion

  • Anonymous - 2012-02-08
    • summary: set utf8-related collation for its database --> set utf8 for its database default character set
     
  • Anonymous - 2012-02-08
    • status: open --> closed
     
  • Anonymous - 2012-02-18

    If a plugin creates its own tables without specifying a charset/collation, the database's default charset/collation is applied. The MySQL server's default settings are the responsibility of the server administrator.

    Then the core tables are utf8_general_ci but the plugin's tables are, for example, latin1_swedish_ci.
    (latin1_swedish_ci is MySQL's default collation because MySQL was originally created by a Swedish company.)

    This causes characters to be lost during the automatic conversion of queries and return values.
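One way a plugin could avoid depending on the server default is to state the charset and collation in its own CREATE TABLE statement. The sketch below only builds such a statement as a string. The helper name `create_table_sql`, the table name, and the columns are hypothetical and do not come from the project's code. The `DEFAULT CHARACTER SET ... COLLATE ...` table options are standard MySQL syntax.

```python
import re


# Hypothetical helper, not part of the project's code base.
def create_table_sql(table, columns_sql, charset="utf8", collation="utf8_general_ci"):
    """Return a MySQL CREATE TABLE statement with an explicit charset and collation."""
    for name in (table, charset, collation):
        # Accept only simple identifiers, because these values are spliced into SQL.
        if not re.fullmatch(r"[A-Za-z0-9_]+", name):
            raise ValueError(f"unsafe identifier: {name!r}")
    return (
        f"CREATE TABLE `{table}` ({columns_sql}) "
        f"DEFAULT CHARACTER SET {charset} COLLATE {collation}"
    )


# Example with an assumed table name and column list.
print(create_table_sql("plugin_example", "id INT PRIMARY KEY, title VARCHAR(255)"))
```

With the options stated explicitly, the plugin's table matches the core tables regardless of how the server administrator configured the defaults.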

     
  • Anonymous - 2012-02-18
    • status: closed --> open
     
  • gRegor - 2012-02-20
    • status: open --> closed
     
