
translation to a non-iso8859-1 language?

2002-08-22
2012-10-11
  • Michael Bravo

    Michael Bravo - 2002-08-22

    Hello,

    I'm considering making a Russian language translation for phpwiki. However, whichever encoding you pick for Russian (and there are at least two major ones: windows-1251 for the majority of Windows users, and koi8-r for Unix/Linux users), it is not compatible with iso8859-1. Would that affect the usability of the translation significantly? Would the fact that I will pick only one encoding somehow undermine the gettext assumption that there is only one encoding for a given country code?
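To make the incompatibility concrete, here is a minimal Python sketch. The sample word is an assumption chosen for illustration and does not come from the thread. It shows that the same Russian text maps to different bytes in koi8-r and windows-1251, and that iso8859-1 cannot represent it at all.

```python
# Illustration only: the sample word is an assumption, not taken from the thread.
text = "Привет"

# The same text encoded in the two common single-byte Russian charsets.
koi8 = text.encode("koi8_r")
cp1251 = text.encode("cp1251")

# One byte per letter in both cases, but the byte values differ.
print(koi8.hex(), cp1251.hex())

# Decoding koi8-r bytes as windows-1251 yields readable-looking garbage.
garbled = koi8.decode("cp1251")
print(garbled)

# iso8859-1 has no Cyrillic letters, so encoding fails outright.
try:
    text.encode("iso8859_1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False
print("fits in iso8859-1:", latin1_ok)
```

So a page stored in one charset only displays correctly if the browser is told which charset it is.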

     
    • Geoffrey T. Dairiki

      I think translating into a language which uses another character set is basically straightforward, but there are some issues:

      PhpWiki has not been tested well with CHARSET set to other than iso-8859-1.   I'm sure some code fixes will be required.

      There's been talk before of making PhpWiki dynamically pick its language based on headers ('Accept-Language', I think it is) sent by the browser.  I don't think dynamically switching character sets would be easy though.  (Not a big issue, I guess --- you would only get to dynamically switch among those languages which share the configured charset.)
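The restriction described above (negotiate the language from 'Accept-Language', but only among translations that share the configured charset) could be sketched roughly as follows. This is not PhpWiki code: the `TRANSLATION_CHARSETS` table, the function names, and the fallback to English are all assumptions made for illustration.

```python
# Hypothetical table: the charset each available translation is stored in.
TRANSLATION_CHARSETS = {
    "en": "iso-8859-1",
    "de": "iso-8859-1",
    "fr": "iso-8859-1",
    "ru": "koi8-r",
}


def parse_accept_language(header):
    """Return the language tags of an Accept-Language header, best first."""
    prefs = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        quality = 1.0  # a tag without an explicit q-value counts as q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat a malformed q-value as "not acceptable"
        prefs.append((quality, tag.strip().lower()))
    # Python's sort is stable, so tags with equal q keep their header order.
    prefs.sort(key=lambda pref: pref[0], reverse=True)
    return [tag for quality, tag in prefs if quality > 0]


def pick_language(header, configured_charset, default="en"):
    """Pick the best requested language whose translation uses the configured charset."""
    for tag in parse_accept_language(header):
        language = tag.split("-")[0]  # "ru-RU" -> "ru"
        if TRANSLATION_CHARSETS.get(language) == configured_charset:
            return language
    # Assumed fallback: English strings are plain ASCII, which both charsets contain.
    return default
```

With the table above, a browser sending `ru, de;q=0.8, en;q=0.5` to a wiki configured for iso-8859-1 would get German, because the Russian catalog is in a different charset.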

      (Some of the PhpWiki developers may not have the requisite font sets installed, and so may have trouble testing/debugging any code problems associated with the charset change.)

      I'm sure there are other issues which I've not thought about.
      I'd suggest moving this discussion to the phpwiki-talk mailing list --- I think more people will see it over there...

      Anyhow, in answer to your first question: I think a Russian translation would be useful --- though there may be a few surmountable difficulties...

      I do not understand your second question.  ("Would the fact that...")  Please clarify.

       
      • Michael Bravo

        Michael Bravo - 2002-08-22

        I'm running phpWiki 1.3.3 with CHARSET="koi8-r" - it seems to be ok, but I didn't really test it. I need more content in Russian, and then I'll have to test stuff that uses regexps and other searches - I'm pretty sure there will be glitches.

        As to the second question - gettext somehow presupposes that there is only one .po file for a given country code. For Russian language files, it is usual to have at least two translated files - one in koi8-r, and another in cp1251. This is probably an entirely extraneous question, I just want to better understand how it all fits together.

        To attempt my own answer to this: making one translation in koi8-r will probably be enough to serve web users on arbitrary platforms, by having Apache announce the charset via the AddDefaultCharset directive. All major browsers these days can autoconvert these Russian charsets while rendering.

         
        • Geoffrey T. Dairiki

          As for using gettext with multiple charsets, that's no real problem.  The solution is to use different posix LOCALEs for each charset.  A fully specified LOCALE is something like "ru_RU.koi8r". (The first ru is the language, the second RU is the country, and then the part after the dot is, obviously, the charset.)
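As a small illustration of the naming scheme just described, the following Python sketch splits a locale name into its three parts. The `parse_locale` helper is hypothetical, and the second locale name in the usage loop is only an example of what a cp1251 variant might be called; exact charset spellings vary between systems.

```python
def parse_locale(name):
    """Split a 'language_COUNTRY.charset' locale name into its three parts.

    Missing parts come back as None. Modifiers such as '@euro' are not handled.
    """
    base, _, charset = name.partition(".")
    language, _, country = base.partition("_")
    return language, country or None, charset or None


# One message catalog per locale: same language and country, different charset.
for locale_name in ("ru_RU.koi8r", "ru_RU.cp1251"):
    language, country, charset = parse_locale(locale_name)
    print(locale_name, "->", language, country, charset)
```

Because the charset is part of the locale name, each charset variant can have its own translated catalog without conflicting with the other.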

          However, as pointed out by jmpoure, unless you hack PhpWiki to handle UTF-8 somehow, the page text will be stored in a specific single-byte character set --- unless you provide some way to translate (which, I suppose is possible), you're kind of stuck serving up the pages in that single character set...  I think.
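The "some way to translate" mentioned above could, in principle, be a transcoding step applied when a page is served. The sketch below shows the idea in Python rather than in PhpWiki's own code; the `transcode` helper and the sample text are assumptions for illustration.

```python
def transcode(page_bytes, stored_charset, target_charset):
    """Re-encode page text from the charset it is stored in to the one requested."""
    text = page_bytes.decode(stored_charset)
    # koi8-r contains box-drawing characters that windows-1251 lacks, so
    # unmappable characters are replaced with '?' instead of raising an error.
    return text.encode(target_charset, errors="replace")


# Assumed example: a page stored as koi8-r, served as windows-1251 or UTF-8.
stored = "Привет, мир".encode("koi8_r")
as_cp1251 = transcode(stored, "koi8_r", "cp1251")
as_utf8 = transcode(stored, "koi8_r", "utf-8")
print(len(stored), len(as_cp1251), len(as_utf8))
```

The Cyrillic letters survive the round trip between the two single-byte charsets, while the UTF-8 form is longer because each Cyrillic letter takes two bytes there.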

           
    • Jean-Michel POURE

      You need to convert phpwiki to full UTF-8. Therefore:
      1) Install PHP with the mb_ extensions and turn "override" on to override the Latin1 text functions with their multi-byte equivalents.
      2) Use a UTF-8 database. MySQL is not UTF-8 compatible and will never be. Use PostgreSQL.
      3) Your web pages need to be encoded in UTF-8.

      Good luck,
      Jean-Michel POURE jm.poure@freesurf.fr

       
