#37 set utf8 for its database default character set

i18n/l10n (13)

In release 3.64, the install script doesn't specify the database's CHARSET or each field's COLLATION. Input and output characters are therefore expected to be Latin1 (the collation is latin1_swedish_ci). This doesn't matter as long as only ISO-8859-1 (Latin1) characters are used.

But it matters when using other characters, such as Japanese (EUC-JP), because MySQL converts queries and their results between the client-defined (PHP) character set and the database-defined (MySQL) character set. For details, please refer to the MySQL manual.

With this conversion, all Japanese characters are mapped to unknown characters in a Latin1 database. This setting is useless for non-Latin languages.

If the database is set to UTF-8, characters outside Latin1 are stored correctly. There is strong demand for setting a UTF-8 collation for the database.
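
To illustrate the kind of loss described above, here is a minimal sketch. It uses Python's own codecs as a stand-in for the server-side conversion, so it is an analogy and not MySQL's actual code path. The sample string is an assumption chosen for illustration.

```python
# Illustration only: Python codecs stand in for the database's conversion.
text = "日本語"  # sample Japanese text (assumed example)

# Latin1 cannot represent these characters; each one is replaced by "?".
as_latin1 = text.encode("latin-1", errors="replace")
print(as_latin1)

# UTF-8 can represent them, so encoding and decoding is lossless.
as_utf8 = text.encode("utf-8")
restored = as_utf8.decode("utf-8")
print(restored == text)
```

Once the characters have been replaced in the Latin1 form, the original text cannot be recovered, which is why the storage character set matters more than any later conversion.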


  • Anonymous - 2012-02-08
    • summary: set utf8-related collation for its database --> set utf8 for its database default character set
  • Anonymous - 2012-02-08
    • status: open --> closed
  • Anonymous - 2012-02-18

    If a plugin creates its own tables without specifying a charset/collation, the database's default charset/collation is applied. The default settings of the MySQL server are the responsibility of the server administrator.

    As a result, the core tables are utf8_general_ci but a plugin's tables are, for example, latin1_swedish_ci.
    (latin1_swedish_ci is MySQL's default collation because MySQL was originally created by a Swedish company.)

    This causes characters to be lost during the automatic conversion of queries and return values.

  • Anonymous - 2012-02-18
    • status: closed --> open
  • gRegor - 2012-02-20
    • status: open --> closed
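
For the plugin-table mismatch raised in the 2012-02-18 comment, one possible approach is for a plugin to state the charset and collation explicitly instead of inheriting the server default. The sketch below only builds the SQL text. The helper name `create_table_sql`, the table name `plugin_example`, and the column list are hypothetical, and the helper does no identifier escaping.

```python
def create_table_sql(table, columns_sql, charset="utf8", collation="utf8_general_ci"):
    """Build a CREATE TABLE statement that pins the charset and collation.

    Hypothetical helper for illustration; inputs are trusted and not escaped.
    """
    return (
        f"CREATE TABLE {table} ({columns_sql}) "
        f"DEFAULT CHARSET={charset} COLLATE={collation}"
    )


# Hypothetical plugin table; the columns are placeholders.
sql = create_table_sql("plugin_example", "id INT PRIMARY KEY, body TEXT")
print(sql)
```

With the table options spelled out, the plugin's tables would no longer depend on whatever default the server administrator has configured.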