#37 set utf8 for its database default character set

4.0
closed
Mocchi
i18n/l10n (13)
9
2012-02-20
2012-02-07
Mocchi
No

In release 3.64, the install script does not specify the database's CHARSET or each field's COLLATION. Input and output characters are then treated as Latin1 (the collation is latin1_swedish_ci). This does not matter as long as only ISO-8859-1 (Latin1) characters are used.

But it matters when other characters are used, such as Japanese (EUC-JP), because MySQL converts the query and its result between the client-defined (PHP) character set and the database-defined (MySQL) character set. For details, please refer to the MySQL manual.
http://dev.mysql.com/doc/refman/5.0/en/charset-connection.html

With this conversion, all Japanese characters are mapped to unknown characters in a Latin1 database. This setting is useless for non-Latin languages.

If the database is set to UTF-8, characters outside Latin1 are stored correctly. There is strong demand for a UTF-8 character set and collation for the database.
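The loss described above can be illustrated with a minimal sketch. It uses Python's codecs as a stand-in for MySQL's conversion into the column character set, which is an assumption made for illustration only; the function name `store_and_fetch` is hypothetical. Characters that cannot be represented in Latin1 come back as `?`, while UTF-8 round-trips them.

```python
# Sketch only: Python's codecs stand in for MySQL's charset conversion.
# store_and_fetch is a hypothetical helper, not part of Nucleus or MySQL.

def store_and_fetch(value: str, column_charset: str) -> str:
    # Convert the text into the column's character set; characters that
    # have no representation there are replaced with '?'.
    stored = value.encode(column_charset, errors="replace")
    # Read the stored bytes back using the same character set.
    return stored.decode(column_charset)

text = "日本語"  # three Japanese characters

latin1_result = store_and_fetch(text, "latin-1")
utf8_result = store_and_fetch(text, "utf-8")

print(latin1_result)  # ???
print(utf8_result)    # 日本語
```

The original text cannot be recovered from the Latin1 result, which is why the character set has to be right before any data is written.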

Discussion

  • Mocchi
    Mocchi
    2012-02-08

    • summary: set utf8-related collation for its database --> set utf8 for its database default character set
     
  • Mocchi
    Mocchi
    2012-02-08

    • status: open --> closed
     
  • Mocchi
    Mocchi
    2012-02-18

    If a plugin creates its own tables without specifying a charset/collation, the database's default charset/collation is applied. The default settings of the MySQL server are the responsibility of the server administrator.

    As a result, the core tables are utf8_general_ci but a plugin's table is, for example, latin1_swedish_ci.
    (latin1_swedish_ci is MySQL's default collation because MySQL was originally created by a Swedish company.)

    This causes characters to be lost during automatic conversion of queries and return values.
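One way to avoid relying on the server default is to state the character set and collation in every CREATE statement. The following is a rough sketch of that idea, not the actual Nucleus install script; the helper names, the database name `nucleus_example`, and the table `nucleus_plugin_example` are hypothetical.

```python
# Sketch only: builds MySQL DDL strings with an explicit charset/collation.
# All names below are hypothetical and chosen for illustration.

def create_database_sql(name: str, charset: str = "utf8",
                        collation: str = "utf8_general_ci") -> str:
    # The database-level default applies to tables that do not override it.
    return (f"CREATE DATABASE `{name}` "
            f"DEFAULT CHARACTER SET {charset} COLLATE {collation};")

def create_table_sql(name: str, columns: list[str], charset: str = "utf8",
                     collation: str = "utf8_general_ci") -> str:
    # The table-level setting makes the table independent of whatever
    # default the database or server happens to have.
    cols = ", ".join(columns)
    return (f"CREATE TABLE `{name}` ({cols}) "
            f"DEFAULT CHARACTER SET {charset} COLLATE {collation};")

db_sql = create_database_sql("nucleus_example")
table_sql = create_table_sql(
    "nucleus_plugin_example",
    ["id INT NOT NULL PRIMARY KEY", "title VARCHAR(255) NOT NULL"],
)

print(db_sql)
print(table_sql)
```

With the table-level clause in place, a plugin table keeps the intended collation even on a server whose default is latin1_swedish_ci.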

     
  • Mocchi
    Mocchi
    2012-02-18

    • status: closed --> open
     
  • gRegor
    gRegor
    2012-02-20

    I added the default charset to the install script's "create database" SQL in this revision: http://nucleuscms.svn.sourceforge.net/viewvc/nucleuscms?revision=1670&view=revision

    I'm marking this as closed since the installation SQL now sets the charset explicitly for database creation as well as table creation.

     
  • gRegor
    gRegor
    2012-02-20

    • status: open --> closed