#5058 Locale guessing of msgcat fails on (some) Windows 7

Milestone: obsolete: 8.5.11
Status: closed-fixed
Priority: 5
Updated: 2012-07-02
Created: 2012-06-21
Private: No

On a Swiss German Windows 7 computer, I observed the following:
<session on tcl 8.5.11>
% package require msgcat
1.4.4
% msgcat::mcpreferences
en_us en {}
</session>

(the last result should be "de_ch de {}")

The reason is that the language is detected by reading the
registry key:
[HKEY_CURRENT_USER\Control Panel\International\Locale]
It contains a numerical locale identifier (LCID).

Possible values are in the registry subtree:
[HKEY_LOCAL_MACHINE\SOFTWARE\Classes\MIME\Database\Rfc1766]

Extract:
0409 en_us <- this is set
0407 de <- this might be acceptable
0807 de_ch <- this would be correct

The observed value is "0409" which is en_us.
The correct value would be "de_ch". Still acceptable is "de".

I found other reports that this registry key is not reliable on
Windows 7:
http://social.technet.microsoft.com/Forums/en/w7itproinstall/thread/8597901e-47a8-4457-a8dc-653a260808b3

Following the links at the end:
http://blogs.msdn.com/b/michkap/archive/2010/03/19/9980203.aspx
it is proposed that:
- there is a new key (since Vista, I think) which directly contains the
IETF language tag (which is close to the locale):
[HKEY_CURRENT_USER\Control Panel\International\LocaleName]

In this case, this would help. The contained value "de-CH" is correct.

----

Anyway, the recommended method is to use the system call:
GetSystemDefaultLCID()
and from Vista on:
GetUserDefaultLocaleName()

---

Two possible methods to solve the issue:

1) Extend msgcat to look at LocaleName first.
This is not officially supported, but would solve the problem in this case.
The registry entry "LocaleName" is seen as more reliable, as all modern APIs use this instead of LCIDs.
A patch for this case is attached.

2) Extend Tcl to return the current system locale via an API.
This requires binary code.
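
Method 1 can be sketched roughly as follows. This is a minimal illustration and not the attached patch: the proc names `localeNameToMsgcat` and `readLocaleName` are hypothetical, and the sketch assumes that LocaleName can be read as a value under the `Control Panel\International` key with the registry package.

```tcl
# Hypothetical helper (not from the patch): map an IETF-style tag such as
# "de-CH" to the lower-case, underscore-separated form that msgcat reports.
proc localeNameToMsgcat {tag} {
    return [string tolower [string map {- _} $tag]]
}

# Hypothetical helper: return the locale derived from LocaleName, or an
# empty string when the registry package or the value is not available,
# so that the caller can fall back to the LCID lookup.
proc readLocaleName {} {
    set key {HKEY_CURRENT_USER\Control Panel\International}
    if {[catch {package require registry}]
        || [catch {registry get $key LocaleName} tag]} {
        return ""
    }
    return [localeNameToMsgcat $tag]
}
```

With the value reported above, `localeNameToMsgcat de-CH` yields `de_ch`, which is the first element expected from `msgcat::mcpreferences`.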

Discussion

1 2 > >> (Page 1 of 2)
  • Harald Oehlmann

    Harald Oehlmann - 2012-06-21

    Patch for msgcat 1.4.4 resulting in msgcat-1.4.5

     
  • Jan Nijtmans

    Jan Nijtmans - 2012-06-22

    committed to branch bug-3536888

     
  • Don Porter

    Don Porter - 2012-06-22
    • assigned_to: dgp --> oehhar
     
  • Jan Nijtmans

    Jan Nijtmans - 2012-06-28

    Fixed two things in the bug-3536888 branch:
    - the variable "key" was used before defined
    - didn't work on cygwin, now it does.

    Harald, I think it's ready to be merged to
    core-8-5-branch. Do you agree?
    Please test it on your german-swiss windows 7
    machine. To me everything looks fine now.

     
  • Harald Oehlmann

    Harald Oehlmann - 2012-06-29

    Dear Jan,
    thank you for the two modifications.
    "key" was lost somewhere...
    The test was successful.

    Two remarks:
    1) for me, the purpose of '[info sharedlibextension] ne ".dll"' is not obvious.
    I would like a comment like: "on windows but not cygwin"
    Or I would like a more explicit "windows and not cygwin" check using the platform array/packages

    What exactly happens on Cygwin? Is there no registry package?

    2) Small optimisation:
    Replace:
    #
    # The rest of this routine is special processing for Windows;
    # all other platforms, get out now.
    #
    if {[info sharedlibextension] ne ".dll"} {
        mclocale C
        return
    }
    #
    # On Windows, try to set locale depending on registry settings,
    # or fall back on locale of "C".
    #
    if {[catch {
        package require registry
    }]} {
        mclocale C
        return
    }
    by:
    # The rest of this routine is special processing for Windows;
    # all other platforms, get out now.
    #
    if {[info sharedlibextension] ne ".dll"
        || [catch {package require registry}]
    } {
        mclocale C
        return
    }
    #
    # On Windows, try to set locale depending on registry settings,
    # or fall back on locale of "C".
    -END-

    Thank you,
    Harald

     
  • Harald Oehlmann

    Harald Oehlmann - 2012-06-29

    Patch of 2012-06-29 against msgcat-1.4.4

     
  • Jan Nijtmans

    Jan Nijtmans - 2012-06-29

    I agree with your comments. The registry package works on
    Cygwin as on Windows, only $tcl_platform(platform) is "unix",
    so msgcat didn't try to use the registry package. The easiest
    way to say "Windows or Cygwin" is [info sharedlibextension],
    because those two platforms are the only ones using ".dll".
    So, comments adapted according to that.

    Suggested optimization is OK to me too.

    Now updated in bug-3536888 branch.

     
  • Harald Oehlmann

    Harald Oehlmann - 2012-06-29

    Or we could use
    package present registry
    instead of
    [info sharedlibextension] ne ".dll"

    This would clearly indicate what is required, the registry package.

    ----
    Additional future thoughts:
    - We could replace the fixed translation table LCID->locale by registry access, as this table is contained in the registry:
    [HKEY_LOCAL_MACHINE\SOFTWARE\Classes\MIME\Database\Rfc1766]

    I have to check whether this table is available on Win XP; it is OK since Vista.

    I would only do this in Tcl 8.6, as it might introduce some incompatibilities.
    So that's another story...

    - Also I would prefer the registry methods over the environ variable method on windows, as:
    - LANG is sometimes set, for example by an installed CYGWIN
    - LANG is normally less detailed - the country is missing while the registry method extracts the country.

     
  • Jan Nijtmans

    Jan Nijtmans - 2012-06-29

    > Or we could use
    > package present registry
    > instead
    > [info sharedlibextension] ne ".dll"

    That wouldn't work: If "registry" is available
    but not loaded yet, this would return false.

    > - Also I would prefer the registry methods over the environ variable method
    > on windows, as:
    > - LANG is sometimes set, for example by an installed CYGWIN
    > - LANG is normally less detailed - the country is missing while the
    > registry method extracts the country.

    By default, LANG is not set in Cygwin unless the user
    explicitly sets it. So, this is useful as a way to override
    the registry setting. I wouldn't change that.

     
  • Harald Oehlmann

    Harald Oehlmann - 2012-06-29

    Both points ok !
    Thanks,
    Harald

     
  • Jan Nijtmans

    Jan Nijtmans - 2012-06-29

    So, will you do the merge, or do you prefer
    that I do it for you? (I don't know how familiar
    you are already with fossil; this would be
    a good test.)

     
  • Harald Oehlmann

    Harald Oehlmann - 2012-06-29

    Please do it.

    I am not ready yet, no login etc...

     
  • Jan Nijtmans

    Jan Nijtmans - 2012-06-29

    merged to core-8-5-branch and trunk

     
  • Jan Nijtmans

    Jan Nijtmans - 2012-06-29
    • status: open --> closed-fixed
     
  • Donal K. Fellows

    Basing off a [package require registry] would be acceptable to me; that is a package that is definitively not available on non-Windows platforms. The cost of populating the package database will have already been borne too; this is *inside* a package in the first place, so it will be possible to say definitively whether the package is there or not with fairly low cost.

     
  • Harald Oehlmann

    Harald Oehlmann - 2012-06-29

    I personally don't like to do a
    if {[catch {package require registry}]} { ...}
    as this systematically throws an error on non-Windows platforms.
    This will pollute the ::errorInfo variable, which is IMHO not good practice for a core package.

     
  • Jan Nijtmans

    Jan Nijtmans - 2012-06-29

    Did a quick compare with the table in the registry:
    HKEY_LOCAL_MACHINE\SOFTWARE\Classes\MIME\Database\Rfc1766
    found 3 locales which were missing, so added them now.

    >Basing off a [package require registry] would be acceptable to me
    A [package require] is expensive when the package is not found, as
    it has to traverse all possible directories. So I fully agree with Harald.
    [info sharedlibextension] is the cheapest way I know of.

     
  • Harald Oehlmann

    Harald Oehlmann - 2012-06-29

    Thank you for the discussion.

    My point is not speed but clarity.
    I don't like to do a "package require" in cases where I know it will fail and will thus pollute ::errorInfo without a real error.

    I don't like [info sharedlibextension] too much, as we use a side effect to detect the Windows platform.
    Is there no other way to detect the Windows platform?
    If not, I would love to have this in the platform package.
    That's why I asked Jan to add a comment, and that's what he did.

    Wiki page
    http://wiki.tcl.tk/1649
    still states that Cygwin Tcl returns platform "windows", but this was changed as far as I know.
    Could someone add/correct this page with respect to Cygwin?

    Further development of msgcat: I also don't like that the locale search in Init throws an error to report a non-matching locale. This may pollute ::errorInfo too.
    IMHO this is OK for user code. Packages should not do that.

    Just 2 cents,
    Harald

     
  • Donal K. Fellows

    Well, actually doing:

    if {"registry" in [package names]} {...

    would work since we're after the first [package require].

     
  • Harald Oehlmann

    Harald Oehlmann - 2012-06-29

    I found that the table to translate Windows language IDs to locales sometimes uses the modifier, which was not yet on my radar:
    Example:
    43 uz
    0443 uz_UZ@latin
    0843 uz_UZ@cyrillic

    I suppose that this information is contained in the LocaleName "script" part.

    I have found:
    http://msdn.microsoft.com/en-us/library/aa226765%28v=sql.80%29.aspx
    which says:

    uz-UZ-Cyrl Uzbek (Cyrillic) - Uzbekistan
    uz-UZ-Latn Uzbek (Latin) - Uzbekistan

    What puzzles me is that
    http://tools.ietf.org/html/rfc5646
    uses this format:
    uz-Cyrl-UZ Uzbek (Cyrillic) - Uzbekistan
    uz-Latn-UZ Uzbek (Latin) - Uzbekistan
    which is what I have implemented. But I currently ignore the script field.

    On my Windows Vista, those keys are not present in the registry:
    [HKEY_LOCAL_MACHINE\SOFTWARE\Classes\MIME\Database\Rfc1766]
    A modifier is never used.

    Well, quite funny all that.
    Richard Suchenwirth would know...

     
  • Jan Nijtmans

    Jan Nijtmans - 2012-06-29

    > Wiki page
    > http://wiki.tcl.tk/1649
    > still states, that cygwin tcl returns platform "windows", but this was
    > changed as far as I know.
    > Could someone add/correct this page in respect to Cygwin ?

    Modified now. Since February, Cygwin started to base its
    port on 'unix' while previously it was based on win32.
    That's why all fields changed. I changed the Cygwin
    port to use the Win32 functions again, so now
    most fields return the same as before, except
    the tcl_platform(platform), which became "unix".

     
  • Harald Oehlmann

    Harald Oehlmann - 2012-06-29

    Thanks for the update of the Wiki page.
    Thus
    tcl_platform(os) eq "Windows NT"
    or
    string equal -length 3 "Win" $::tcl_platform(os)
    to also get "Win31" and "Windows 95"
    would do the job?

     
  • Donal K. Fellows

    Try [string match Win*] for that...

     
  • Jan Nijtmans

    Jan Nijtmans - 2012-06-29

    ...Until there comes a system named "ReactOS", which uses .dll's,
    or a system named "Winnux" which turns out to be a Linux derivative...

    Please leave it as is! Maybe it doesn't look as nice,
    but it's the cheapest and most trustworthy compared
    to the alternatives. My bikeshed is green ...... ;-)

     
  • Harald Oehlmann

    Harald Oehlmann - 2012-06-29

    Following
    http://msdn.microsoft.com/en-us/library/windows/desktop/dd373814%28v=vs.85%29.aspx

    the script parameter is now translated to a modifier: sr-Latn-CS -> sr_cs@latin

    For now, only two script values are supported:
    Latn -> latin
    Cyrl -> cyrillic
    Others which were found:
    ???? -> modern (in msgcat lang id translation table: 0c0a es_ES@modern)
    I suppose, this is not a script
    Hant -> ? (in RFC4646: zh-Hant (Chinese written using the Traditional Chinese script))
    Hans -> ? (in RFC4646: zh-Hans (Chinese written using the Simplified Chinese script))

    The attached patch is against the current trunk to implement this feature.

    -Harald
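
The script-to-modifier translation described in this comment might look roughly like the sketch below. It is an illustration only and not the attached patch: the proc name `tagToLocale` and the regular expression are assumptions, and only the two script values listed above (Latn, Cyrl) are mapped. Any other script, such as Hant or Hans, is simply dropped.

```tcl
# Hypothetical helper (not from the patch): convert a tag of the form
# language[-Script][-REGION] into a msgcat-style locale, turning a known
# four-letter script subtag into a modifier.
proc tagToLocale {tag} {
    # Assumption: two- or three-letter language, optional four-letter
    # script, optional two-letter region. Other tag shapes return "".
    set re {^([a-z]{2,3})(?:-([a-z]{4}))?(?:-([a-z]{2}))?$}
    if {![regexp -nocase $re $tag -> lang script region]} {
        return ""
    }
    set locale [string tolower $lang]
    if {$region ne ""} {
        append locale _ [string tolower $region]
    }
    # Only the two script values named in the comment are translated.
    set modifiers {latn latin cyrl cyrillic}
    set script [string tolower $script]
    if {[dict exists $modifiers $script]} {
        append locale @ [dict get $modifiers $script]
    }
    return $locale
}
```

For the example given above, `tagToLocale sr-Latn-CS` produces `sr_cs@latin`, while a tag without a script part such as `de-CH` produces `de_ch`.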

     