
#38 incorrect character handling

Status: open
Labels: Interface (13)
Priority: 5
Updated: 2003-08-09
Created: 2003-07-23
Private: No

CVSMonitor incorrectly assumes a character encoding for logs.

It should either use Unicode or allow the character set to
be specified.

The best way of doing this is to use Unicode for every
page cvsmonitor produces.
Log entries should be converted from a given character
set to Unicode, based on a choice made in the admin panel
separately for every module it handles.

Discussion

  • Adam Kennedy

    Adam Kennedy - 2003-08-09

    Logged In: YES
    user_id=153576

    If I were to make a move towards fixing this, would you be
    willing to provide advice and testing for any fixes?

    I do have some experience with Unicode, but assistance would
    be essential to fixing this.

     
  • Adam Kennedy

    Adam Kennedy - 2003-08-09
    • assigned_to: nobody --> adamkennedy
     
  • Bartosz Zapalowski

    Logged In: YES
    user_id=734772

    I will provide testing and advice as much as I can.

    As far as I can tell, you can just use the following code
    to convert the strings to Unicode:
    use Encode qw(from_to);
    from_to($string, $string_encoding, "utf-8");

    where $string_encoding is the encoding specified for the
    CVS module.
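
    The conversion described above could be sketched roughly as
    follows. This is only an illustration: log_to_utf8, the
    %module_encoding table, and the module name "website" are
    made-up names, and the sample bytes are a hand-written
    ISO-8859-2 string rather than real CVS output. It relies on
    from_to from the core Encode module, which converts its first
    argument in place.

    ```perl
    use strict;
    use warnings;
    use Encode qw(from_to);

    # Hypothetical per-module settings; in cvsmonitor these would
    # presumably come from the admin panel.
    my %module_encoding = ( website => 'iso-8859-2' );

    # Return a UTF-8 encoded copy of $bytes, which is assumed to be
    # encoded in $encoding. The caller's string is left untouched.
    sub log_to_utf8 {
        my ($bytes, $encoding) = @_;
        my $copy = $bytes;
        # from_to converts in place and returns undef on error.
        defined from_to($copy, $encoding, 'utf-8')
            or die "Could not convert from $encoding";
        return $copy;
    }

    # "Zażółć" written as ISO-8859-2 bytes (sample data).
    my $log_bytes  = "Za\xBF\xF3\xB3\xE6";
    my $utf8_bytes = log_to_utf8($log_bytes, $module_encoding{website});
    ```

    Working on a copy keeps the original log bytes available in
    case the configured encoding turns out to be wrong.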

    Strings that originate in cvsmonitor itself, rather than in
    CVS logs and the like, just have to be recoded to Unicode.

    There is, however, one problem that was too hard for me to
    trace. At the beginning of the code that outputs the HTML I
    added the statement:
    print "Content-type: text/html; charset=iso-8859-2\n\n";

    Then, when I pointed my web browser at cvsmonitor, the first
    line of the HTML output was
    Content-type: text/html; charset=ISO-8859-1

    and cvsmonitor stopped working (its database got broken).

    However, the cvsmonitor file contains no such string as
    'ISO-8859-1', regardless of case. I couldn't find the module
    that forced this line to be printed.

     
  • Adam Kennedy

    Adam Kennedy - 2003-08-09

    Logged In: YES
    user_id=153576

    The charset thing is an option to CGI::header. The CGI
    module defaults to iso-8859-1, but you can set it manually
    if needed.
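
    As a rough illustration of setting it manually, the header
    method of the CGI module accepts a -charset argument. The
    sketch below assumes the CGI module is installed (it is no
    longer bundled with recent Perl releases) and that cvsmonitor
    emits its header through CGI; where that call actually lives
    in cvsmonitor is not shown in this thread.

    ```perl
    use strict;
    use warnings;
    use CGI ();

    my $q = CGI->new;

    # Ask CGI for a header that declares UTF-8 instead of relying on
    # the module's iso-8859-1 default.
    my $header = $q->header(
        -type    => 'text/html',
        -charset => 'utf-8',
    );

    print $header;
    ```

    Setting the charset here, rather than printing a second
    Content-type line by hand, avoids sending two conflicting
    headers.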

    The database is probably going to be the hardest part of
    this. I'll need to check up on some things before we attempt it.

     
  • Bartosz Feński

    Logged In: YES
    user_id=770596

    I'm sure that he will help you with testing these features.
    I will try to help you too. It's important to me to have Polish
    characters properly encoded on webpages generated by
    CVS Monitor.
    Take a look at
    http://debian.linux.org.pl/cgi-bin/cvsmonitor/cvsmonitor.pl
    It doesn't look good, especially the log entries describing
    changes. If you create patches or fixes to handle this
    properly, I'll gladly test them.

     
