#108 Character encoding error when using mod_perl with mysql

open
nobody
None
5
2010-01-15
2010-01-15
Tobias Mieth
No

Hi,

I have discovered the following error when using Codestriker with mod_perl and MySQL on Windows. After a restart of the web server all tasks in Codestriker run fine, including the ones that query the database. However, after a while, maybe a day or so, the MySQL client character set switches from UTF8 to latin1, which is wrong because the database stores and returns everything in UTF8. As a result the code under review gets mangled: UTF8 characters become unreadable because they are decoded as latin1.

This behaviour can be observed by checking the MySQL session variables:

before:

character_set_client = utf8
character_set_connection = utf8
character_set_database = utf8
character_set_filesystem = binary
character_set_results = utf8
character_set_server = utf8
character_set_system = utf8

after:

character_set_client = latin1
character_set_connection = latin1
character_set_database = utf8
character_set_filesystem = binary
character_set_results = latin1
character_set_server = utf8
character_set_system = utf8
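
The check above can also be scripted. The following is only a minimal sketch: client_charset_mismatches is a hypothetical helper name (it is not part of Codestriker), and it assumes $dbh is the very same DBI handle that Codestriker uses, because session variables are per connection.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Codestriker): returns the client-side
# character set variables of the current session whose value differs from
# $expected (default 'utf8'). $dbh is an already connected DBI handle.
sub client_charset_mismatches {
    my ($dbh, $expected) = @_;
    $expected = 'utf8' unless defined $expected;

    # Each returned row is [Variable_name, Value].
    my $rows = $dbh->selectall_arrayref(
        "SHOW SESSION VARIABLES LIKE 'character_set%'");

    # Only these three are changed by SET NAMES / character_set_results.
    my %client_side = map { $_ => 1 } qw(
        character_set_client
        character_set_connection
        character_set_results
    );

    return map  { "$_->[0] = $_->[1]" }
           grep { $client_side{ $_->[0] } && $_->[1] ne $expected }
           @$rows;
}
```

For the "after" state shown above this would return the three latin1 entries. For a healthy connection it returns an empty list.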

Although "MySQL.pm" explicitly sets the client character encoding by issuing:

$dbh->do("SET NAMES 'utf8'");
$dbh->do("SET character_set_results='utf8'");

in getConnection(), these values are not persistent. The reason is the auto-reconnect feature of MySQL (http://dev.mysql.com/doc/refman/5.0/en/auto-reconnect.html). According to the DBD::mysql documentation (http://search.cpan.org/~capttofu/DBD-mysql-4.013/lib/DBD/mysql.pm) this feature is disabled by default when running plain Perl. However, "if either the GATEWAY_INTERFACE or MOD_PERL environment variable is set, DBD::mysql will turn mysql_auto_reconnect on", which is the case here under mod_perl. This wouldn't be a problem if MySQL kept the character set settings when reconnecting, but it doesn't.

Therefore, when an auto-reconnect occurs (by default after the connection has been idle for 8 hours and the server has closed it, see http://dev.mysql.com/doc/refman/5.0/en/server-system-variables.html#sysvar_wait_timeout), "Session variables are reinitialized to the values of the corresponding variables. This also affects variables that are set implicitly by statements such as SET NAMES." and we're back to latin1.

To fix the issue on my installation, I have made the following modifications to Codestriker:

1.) I disabled mysql_auto_reconnect in "MySQL.pm" (see attached file)
2.) I modified "DBI.pm" to deal with timed-out connections (see attached file)
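
The idea behind the two changes can be sketched roughly as follows. This is not the code from the attached files, only an illustration of the approach: get_connection and its parameters are hypothetical names, and it assumes a cached handle per process, as is common under mod_perl. It uses the documented DBD::mysql handle attribute mysql_auto_reconnect and the standard DBI ping method.

```perl
use strict;
use warnings;
use DBI;

# Cached handle, reused across requests within one process.
my $dbh;

# Hypothetical helper (not the attached patch). $dsn, $user and $password
# stand in for the Codestriker database settings.
sub get_connection {
    my ($dsn, $user, $password) = @_;

    # Reuse the cached handle only while the server still answers. With
    # auto-reconnect off, ping fails once the server has dropped the
    # idle connection.
    return $dbh if defined $dbh && $dbh->ping;

    $dbh = DBI->connect($dsn, $user, $password,
                        { RaiseError => 1, PrintError => 0 });

    # Turn off the silent reconnect that DBD::mysql enables under mod_perl.
    $dbh->{mysql_auto_reconnect} = 0;

    # Set the session character set on every fresh connection.
    $dbh->do("SET NAMES 'utf8'");
    $dbh->do("SET character_set_results='utf8'");

    return $dbh;
}
```

Because every new connection goes through the same code path, the SET NAMES statements are always issued again after a timeout instead of being lost in a transparent reconnect.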

My setup:

Codestriker 1.9.10
Windows 2003 Server Standard Edition
Apache 2.2
mod_perl 2.000004
ActivePerl 5.10.1
MySQL 5.1.39

I would be happy if someone could confirm my observations.

Greetings Tobias

Discussion

  • Tobias Mieth
    2010-01-15
    Attachments
  • Tobias Mieth
    2010-01-15
    Attachments