Internationalization (I18N) comments

Tom Morris
2004-02-23
2004-02-24
  • Tom Morris
    2004-02-23

    Here are some initial comments on the I18N Standards document dated 18 Jan 2004.

    Section 2 - I don't think the assumption of using the system default locale is a good one.  The user should be able to choose an application-specific locale which is different from the system default.

    For a genealogy application, multiple locales need to be supported -- at least one for the UI and a different one for the current report (i.e. multilingual researchers will have a preferred language for the UI, but will generate reports in the language of their correspondent).

    If any of the core routines are used in a multithreaded, multiuser server environment (e.g. a web server), they'll need to support a locale per connected user.
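To make the two points above concrete, here is a minimal sketch that assumes each user session carries its own pair of locales and passes them explicitly to the formatting calls. `ReportContext` and its method names are made up for illustration and are not taken from the standards document. The point is only that nothing reads or changes the JVM-wide default locale, so several users can be served at once.

```java
import java.text.NumberFormat;
import java.util.Locale;

// Hypothetical holder for the two locales that one user session needs.
final class ReportContext {
    private final Locale uiLocale;      // language of menus and dialogs
    private final Locale reportLocale;  // language of the report being generated

    ReportContext(Locale uiLocale, Locale reportLocale) {
        this.uiLocale = uiLocale;
        this.reportLocale = reportLocale;
    }

    // Formats a count for display in the UI, using the UI locale.
    String formatForUi(long count) {
        return NumberFormat.getIntegerInstance(uiLocale).format(count);
    }

    // Formats a count for the report, using the report locale.
    String formatForReport(long count) {
        return NumberFormat.getIntegerInstance(reportLocale).format(count);
    }
}
```

A researcher with an English UI writing to a German correspondent would use `new ReportContext(Locale.US, Locale.GERMANY)`, and the same number would then be grouped differently in the UI and in the report.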

    Character set issues need to be addressed.  Perhaps this all happens magically in the supporting Java classes, but I wouldn't bet on it.  BIDI and MBCS need to be supported.

    Section 3 - Having a single huge resource file doesn't scale (I'm dealing with this now in another genealogy application that I'm translating).  It's much better to have a file per package, perhaps with some hierarchical searching/defaulting.  This also automatically allows plugins to provide their own resources.
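As a rough sketch of what "hierarchical searching/defaulting" could look like: a key is looked up in the bundle for the requesting package first, then in each parent package in turn. `PackageMessages` is a hypothetical class, and the in-memory maps stand in for per-package properties files, which a real application would more likely load with `ResourceBundle.getBundle`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.MissingResourceException;

// Hypothetical registry with one message table per Java package.
final class PackageMessages {
    private final Map<String, Map<String, String>> byPackage = new HashMap<>();

    // A plugin can register the table for its own package here.
    void register(String packageName, Map<String, String> messages) {
        byPackage.put(packageName, messages);
    }

    // Looks in packageName first, then walks up through the parent packages.
    String lookup(String packageName, String key) {
        String pkg = packageName;
        while (true) {
            Map<String, String> messages = byPackage.get(pkg);
            if (messages != null && messages.containsKey(key)) {
                return messages.get(key);
            }
            int dot = pkg.lastIndexOf('.');
            if (dot < 0) {
                throw new MissingResourceException(
                        "No message for key " + key, packageName, key);
            }
            pkg = pkg.substring(0, dot);  // drop the last package segment
        }
    }
}
```

With this arrangement a package only needs to define the strings that are specific to it, and common strings such as button labels live once in a parent package.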

    Never, EVER, use string concatenation.  Any use of "+", "+=", or StringBuffer.append() for user-visible text is WRONG.  (Yup, I'm shouting.)

    Using string concatenation for keys is bad too because it makes it hard for tools to help you find mismatches between your properties file and your code.
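A small sketch of the alternative to concatenation, using `java.text.MessageFormat`. The class name and the example patterns are invented for illustration. In practice the pattern would come from a resource file under a literal key, so the translator controls the word order and tools can match keys against the code.

```java
import java.text.MessageFormat;
import java.util.Locale;

// Hypothetical helper: the whole sentence is one translatable pattern.
final class ChildCountMessage {
    // pattern uses {0} for the person's name and {1} for the number of children.
    static String format(String pattern, Locale locale, String name, int children) {
        return new MessageFormat(pattern, locale).format(new Object[] {name, children});
    }
}
```

For example, `"{0} has {1} children."` and `"{1} children are recorded for {0}."` both work with the same call, which is exactly what something like `name + " has " + n + " children"` cannot offer a translator.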

    There are standard formatting tasks commonly done on genealogy objects that should be supported directly to help coders avoid the temptation to do their own.  Things like <surname>, <given names> (ID#) should be coded once in the supporting class (and localized/internationalized there).

    Section 5.1 - Strings need to be either tagged with their character set or translated to some canonical character set.  The former is probably preferable.  That also means that there are potentially input and output transformations required if the user interface is using a different character set.
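To illustrate the "tagged" option, here is a minimal sketch in which stored text keeps its original bytes together with the character set they are encoded in, and is converted only at the input and output boundaries. `TaggedText` is a hypothetical name, not something from the standards document.

```java
import java.nio.charset.Charset;

// Hypothetical value type: raw bytes plus the charset they are encoded in.
final class TaggedText {
    private final byte[] bytes;
    private final Charset charset;

    TaggedText(byte[] bytes, Charset charset) {
        this.bytes = bytes.clone();  // defensive copy so the tag stays accurate
        this.charset = charset;
    }

    // Input transformation: decode to a Java String using the tagged charset.
    String decode() {
        return new String(bytes, charset);
    }

    // Output transformation: re-encode for a UI or file that uses another charset.
    byte[] encodeFor(Charset target) {
        return decode().getBytes(target);
    }
}
```

A record read as ISO-8859-1 can then be written to a UTF-8 page with `text.encodeFor(StandardCharsets.UTF_8)` without guessing how the bytes were encoded.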

    Section 6 - I don't know enough about database technology to comment knowledgeably, but I'm suspicious of assuming that the database will collate correctly without external assistance.  In particular, I'm pretty sure that collating sequences are locale-specific, so they need to change when the user selects a different output language/locale for a report.
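On the Java side, locale-specific collation is available through `java.text.Collator`. The sketch below, with the hypothetical class name `NameSort`, sorts in the application using the report locale instead of relying on the database's ordering. Whether that is the right place to sort is an open design question, not something the standards document settles.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: sort names using the collation rules of the report locale.
final class NameSort {
    static List<String> sortedFor(Locale locale, List<String> names) {
        List<String> copy = new ArrayList<>(names);  // leave the caller's list untouched
        copy.sort(Collator.getInstance(locale));     // Collator is a Comparator
        return copy;
    }
}
```

Plain `String.compareTo` orders by code unit, so an accented name would land after "zebra". A collator treats the accent as a minor difference, and the exact ordering it produces can differ from one locale to another.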

     
    • Tom,
      just one little comment here. What is magical about surnames? For the majority of Norwegian genealogy, the concept of surnames just doesn't apply.

       
      • Tom Morris
        2004-02-24

        OK, that was a bad example.  The main point is that in every culture there's one or a small number of standard ways that genealogy software packages format names and the knowledge of how to do that formatting should be encapsulated in a single place.
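A sketch of what "encapsulated in a single place" might mean in code: one class owns the name layout, and the layout itself is a pattern that can differ per culture. `NameFormatter`, the argument order and the example patterns are all assumptions made for illustration.

```java
import java.text.MessageFormat;

// Hypothetical single home for name formatting; callers never build names by hand.
final class NameFormatter {
    // Pattern arguments: {0} family or other primary name, {1} given names, {2} record ID.
    private final String pattern;

    NameFormatter(String pattern) {
        this.pattern = pattern;
    }

    String format(String primaryName, String givenNames, String id) {
        return MessageFormat.format(pattern, primaryName, givenNames, id);
    }
}
```

A culture that lists the family name first could use `"{0}, {1} ({2})"`, while one built on patronymics could use `"{1} {0} ({2})"`, and the calling code would be the same for both.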

         
    • Ed Ridpath
      2004-02-24

      Let's see if I can address these - I can tell many of these threads will require thorough review to ensure we capture or conclude on them.

      Anyway, section 2 default locale - I think I agree.  Which locale is in use for the UI would be decided in this order: 1) Specific change, i.e. the user selected a new locale from a menu 2) Application specific, i.e. taken from the GeneaPro config file 3) System default.  Either way, once the locale is set, there is no need to worry about it as long as we use the locale-aware output functions.  And I see these as part of the UI code for the most part, so they would therefore support multiple simultaneous users with different locales.
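The three-step order above could be sketched as follows. `LocaleResolver` is a hypothetical name, and `null` is used here to mean "not set" purely to keep the example short.

```java
import java.util.Locale;

// Hypothetical resolver for the precedence: user choice, then config file, then system.
final class LocaleResolver {
    static Locale resolve(Locale userSelected, Locale fromConfig, Locale systemDefault) {
        if (userSelected != null) {
            return userSelected;   // 1) chosen from a menu in this session
        }
        if (fromConfig != null) {
            return fromConfig;     // 2) taken from the application config file
        }
        return systemDefault;      // 3) typically Locale.getDefault()
    }
}
```

The caller would pass `Locale.getDefault()` as the last argument. Keeping it as a parameter means the resolver itself never touches JVM-wide state.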

      I think different locales can be supported for the UI and output reports - that is a very good point.

      Character sets are listed as a possible TODO - we are expecting the database or Java classes to handle them, but as you mention, this may not be a good assumption.

      Resource files - it makes sense that in a large program these become unwieldy - separating by package might make sense, although the UI package will still have a large file - this is worth more research.

      As far as standard formatting objects go, in principle I agree, and as Leif pointed out, we might want to make them more generic (i.e. format a name and then provide different name types).

       
    • Tom Morris
      2004-02-24

      After I wrote this I came across a properties file which had a comment to the effect of "don't put things here, put them in the Java class."

      This seems exactly backwards to me.  It is much easier for translators to deal with property files than Java source modules and they're much less likely to mess it up.

      I think the only time you want to be creating Java resource classes is when there are data types which can't be represented in a properties file.
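A short sketch of the distinction, with invented key names. Text goes in a properties-format bundle, which translators can edit without touching source code. A `ListResourceBundle` subclass is kept for values that a properties file cannot hold, such as an `int[]`. The properties text is read from a string here only so that the example is self-contained.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ListResourceBundle;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

final class BundleKinds {
    // Translatable text: plain key=value lines, normally stored in a .properties file.
    static ResourceBundle textBundle(String propertiesText) {
        try {
            return new PropertyResourceBundle(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Non-string data: only this kind of resource needs a Java class.
    static final class LayoutBundle extends ListResourceBundle {
        @Override
        protected Object[][] getContents() {
            return new Object[][] {
                {"report.columnWidths", new int[] {12, 30, 8}},  // hypothetical key
            };
        }
    }
}
```

Both kinds are read through the same `ResourceBundle` interface (`getString` for text, `getObject` for other types), so moving a value from a class to a properties file should not disturb the callers.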