From: Bruno H. <br...@cl...> - 2011-08-23 22:35:35

Hi Sam,

> on a CentOS system I see
> $ ./clisp -E utf-8 -q -norc -L french -x '(sys::text "Bye.")'
> "À bientôt!"
> on a Ubuntu system I see
> $ ./clisp -E utf-8 -q -norc -L french -x '(sys::text "Bye.")'
> "Bye."
> How do I debug this?

The checklist from the gettext FAQ at
<http://www.gnu.org/software/gettext/FAQ.html#integrating_noop> should
provide the answer.

Bruno

From: Bruno H. <br...@cl...> - 2011-08-24 07:16:37

Sam wrote:
> setlocale(5, "fr_FR") = NULL

This explains it.

> Apparently because "locale -a" does not list fr_FR.

But it will likely have the locale "fr_FR.UTF-8" (or "fr_FR.utf8",
which is the name under which it is stored on disk)?

I think the code in spvw_language.d lines 137..140 should be
changed from

    setenv ("LC_MESSAGES", locale, 1);
    setlocale (LC_MESSAGES, locale);

to

    char locale_utf8[32];
    strcpy (locale_utf8, locale);
    strcat (locale_utf8, ".UTF-8");
    setenv ("LC_MESSAGES", locale_utf8, 1);
    if (setlocale (LC_MESSAGES, locale_utf8) == NULL) {
      setenv ("LC_MESSAGES", locale, 1);
      setlocale (LC_MESSAGES, locale);
    }

so that it tries first fr_FR.UTF-8 and then, if that fails, fr_FR.

By the way, in line 132, the Danish locale is "da_DK", not "da_DA".

> Does setlocale set errno? (the man page is silent).

No, setlocale simply returns NULL when it fails.

> I think init_language is called too early - before the memory is
> initialized, so we cannot signal lisp errors yet.
> I think it should be called via C_set_current_language after the memory
> is initialized and it should raise lisp errors on any failure

This is debatable. The usual approach with missing translations and
missing locales is to be lenient. If you signal an error, clisp will
not start up at all, and that is not useful for the user.

Bruno

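[Editorial sketch] The fallback proposed in the message above (try "fr_FR.UTF-8" first, then "fr_FR") can be written with a bounded snprintf instead of strcpy/strcat. This is only an illustration under stated assumptions: the function name set_messages_locale, the 64-byte buffer and the integer return convention are hypothetical and do not come from spvw_language.d.

```c
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not taken from spvw_language.d.
   Returns 2 if "<base>.UTF-8" was accepted, 1 if plain "<base>" was,
   and 0 if setlocale() rejected both. */
static int set_messages_locale(const char *base)
{
    char with_utf8[64];
    int n = snprintf(with_utf8, sizeof with_utf8, "%s.UTF-8", base);

    /* Only try the suffixed name if it fit into the buffer. */
    if (n > 0 && (size_t)n < sizeof with_utf8) {
        /* setenv takes the variable name as a string. */
        setenv("LC_MESSAGES", with_utf8, 1);
        if (setlocale(LC_MESSAGES, with_utf8) != NULL)
            return 2;
    }

    /* Fall back to the plain name, as in the proposal above. */
    setenv("LC_MESSAGES", base, 1);
    return setlocale(LC_MESSAGES, base) != NULL ? 1 : 0;
}
```

A caller would pass the base name it already selects today, for example set_messages_locale("fr_FR"), and treat a return value of 0 as "locale not installed". Whether to warn in that case is the question debated in the rest of the thread.
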
From: Sam S. <sd...@gn...> - 2011-08-24 14:13:28

> * Bruno Haible <oe...@py...> [2011-08-24 09:16:11 +0200]:
>
> Sam wrote:
>> setlocale(5, "fr_FR") = NULL
> This explains it.

what's inexplicable is that the error is silently ignored.

>> Apparently because "locale -a" does not list fr_FR.
>
> But it will likely have the locale "fr_FR.UTF-8" (or "fr_FR.utf8"
> which is the name under which it is stored on disk)?

no, all I have is C, POSIX and a bunch of en_* locales (I am an
English-only bigot, remember? ;-)

> I think the code in spvw_language.d lines 137..140 should be
> changed from
>     setenv ("LC_MESSAGES", locale, 1);
>     setlocale (LC_MESSAGES, locale);
> to
>     char locale_utf8[32];
>     strcpy (locale_utf8, locale);
>     strcat (locale_utf8, ".UTF-8");
>     setenv ("LC_MESSAGES", locale_utf8, 1);
>     if (setlocale (LC_MESSAGES, locale_utf8) == NULL) {
>       setenv ("LC_MESSAGES", locale, 1);
>       setlocale (LC_MESSAGES, locale);
>     }
> so that it tries first fr_FR.UTF-8 and then, if that fails, fr_FR.

is it possible to have fr_FR.UTF-8 and not fr_FR?
let's see, on a machine with many locales:

$ locale -a | wc
    650     650    6924
$ locale -a | egrep '\.' | cut -d. -f1 | sort -u > with-dot
$ locale -a | egrep -v '[.@]' | sort > no-dot
$ comm -23 with-dot no-dot
as_IN
az_AZ
tt_RU
$ locale -a | grep tt_RU
tt_RU.utf8
$ locale -a | grep as_IN
as_IN.utf8
$ locale -a | grep az_AZ
az_AZ.utf8
$

interesting. so it is possible.
fine, but since both locale and locale_utf8 are known constants, I will
do it without strcpy et al.
thanks for the suggestion!

> By the way, in line 132, the Danish locale is "da_DK", not "da_DA",

fixed, thanks.

>> Does setlocale set errno? (the man page is silent).
>
> No, setlocale simply returns NULL when it fails.
>
>> I think init_language is called too early - before the memory is
>> initialized, so we cannot signal lisp errors yet.
>> I think it should be called via C_set_current_language after the memory
>> is initialized and it should raise lisp errors on any failure
>
> This is debatable. The usual approach with missing translations and
> missing locales is to be lenient. If you signal an error, clisp will
> not start up at all, and that is not useful for the user.

Okay, so when called during startup it will be a warning, not an error.
My point is that ignoring errors is always wrong.
If setting locale fails, the user should be notified about it and not
left wondering why (sys::text "Bye.") returns "Bye." - is it because the
word was not translated or something else?

At any rate, apparently you do not object to moving init_language past
initmem and letting it indicate a failure as long as it does not prevent
clisp from starting up.

Now, what _is_ a failure?

1. setlocale --> NULL
2. textdomain --> NULL ==> errno
3. bindtextdomain --> NULL ==> errno
4. bind_textdomain_codeset --> errno != 0 (may return NULL on success!)

is this right?

--
Sam Steingold (http://sds.podval.org/) on CentOS release 5.6 (Final) X 11.0.60900031
http://pmw.org.il http://truepeace.org http://openvotingconsortium.org
http://iris.org.il http://www.PetitionOnline.com/tap12009/
Experience comes with debts.

From: Sam S. <sd...@gn...> - 2011-08-25 04:57:56

> Okay, so when called during startup it will be a warning, not an error.
> My point is that ignoring errors is always wrong.

implemented.
Bruno, please check that you like the error messages.
thanks.

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 11.04 (natty) X 11.0.11001000
http://memri.org http://mideasttruth.com http://thereligionofpeace.com
http://dhimmi.com http://jihadwatch.org http://camera.org
Vegetarians eat Vegetables, Humanitarians are scary.

From: Sam S. <sd...@gn...> - 2011-08-24 14:41:52

> * Bruno Haible <oe...@py...> [2011-08-24 09:16:11 +0200]:
>
> Sam wrote:
>> setlocale(5, "fr_FR") = NULL
> This explains it.

BTW, could you please explain this comment:

    /* Given the above, the following line is only needed for those platforms
       for which gettext is compiled with HAVE_LOCALE_NULL defined. */

also, could you please confirm that this is necessary:

    /* Invalidate the gettext internal caches. */
    textdomain(textdomain(NULL));

--
Sam Steingold (http://sds.podval.org/) on CentOS release 5.6 (Final) X 11.0.60900031
http://openvotingconsortium.org http://ffii.org http://pmw.org.il
http://truepeace.org http://jihadwatch.org http://iris.org.il http://camera.org
Just because you're paranoid doesn't mean they AREN'T after you.

From: Bruno H. <br...@cl...> - 2011-08-24 16:01:59

Sam wrote:
> BTW, could you please explain this comment:
>
>     /* Given the above, the following line is only needed for those platforms
>        for which gettext is compiled with HAVE_LOCALE_NULL defined. */

On glibc systems, setlocale(cat,NULL) is known to return a locale string in the
expected format, and locales can be created by users. Therefore gettext()
uses setlocale(); on the other platforms gettext() relies on the environment
variables and the setlocale() call is therefore actually redundant.

> also, could you please confirm that this is necessary:
>
>     /* Invalidate the gettext internal caches. */
>     textdomain(textdomain(NULL));

textdomain(textdomain(NULL)) has the effect of incrementing the (hidden)
variable _nl_msg_cat_cntr. gettext has a cache that stores translations,
indexed by (domain, category) pairs. This cache is invalidated when
_nl_msg_cat_cntr is incremented. This is necessary when the encoding has
changed.

> >> setlocale(5, "fr_FR") = NULL
> > This explains it.
>
> what's inexplicable is that the error is silently ignored.

The error is silently ignored because the return value of setlocale() is
ignored.

> >> Apparently because "locale -a" does not list fr_FR.
> >
> > But it will likely have the locale "fr_FR.UTF-8" (or "fr_FR.utf8"
> > which is the name under which it is stored on disk)?
>
> no, all I have is C, POSIX and a bunch of en_* locales (I am an
> English-only bigot, remember? ;-)

OK, then my proposed code would not help on your system. But some Linux
distros are now shipping only *.UTF-8 locales, no fr_FR locale any more.

> so when called during startup it will be a warning, not an error.
> My point is that ignoring errors is always wrong.
> If setting locale fails, the user should be notified about it

OK, fine with me.

> At any rate, apparently you do not object to moving init_language past
> initmem and letting it indicate a failure as long as it does not prevent
> clisp from starting up.

I'm not sure there aren't GETTEXT or GETTEXTL calls during initmem
or the rest of the start-up. If there are some, it seems safer to me to
  - continue to call init_language as early as it is now,
  - but store its result or success in some variable, and
  - emit the warning message after everything is initialized.

> Now, what _is_ a failure?
>
> 1. setlocale --> NULL

Yes, this is a failure. It means the system does not support translations
in the specified locale.

> 2. textdomain --> NULL ==> errno

Indicates a malloc failure.

> 3. bindtextdomain --> NULL ==> errno

Likewise.

> 4. bind_textdomain_codeset --> errno != 0 (may return NULL on success!)

This function can fail if malloc fails. I'm not aware that it could return
NULL on success.

Bruno

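[Editorial sketch] The three-step suggestion above (call init_language early, store its result, warn once everything is initialized) could look roughly like this. It is a sketch under stated assumptions, not clisp code: the enum, the struct and both function names are hypothetical, and only the libc/libintl calls (setlocale, bindtextdomain, textdomain) are real.

```c
#include <errno.h>
#include <libintl.h>
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical names throughout; none of them come from the clisp sources. */
enum lang_step { LANG_OK, LANG_SETLOCALE, LANG_BINDTEXTDOMAIN, LANG_TEXTDOMAIN };

struct lang_status {
    enum lang_step failed_step; /* LANG_OK if every call succeeded */
    int saved_errno;            /* errno captured right after the failing call */
};

/* Early phase: make the calls, remember the first failure, print nothing. */
static struct lang_status init_language_early(const char *locale,
                                              const char *domain,
                                              const char *localedir)
{
    struct lang_status st = { LANG_OK, 0 };
    if (setlocale(LC_MESSAGES, locale) == NULL) {
        /* Per the thread, setlocale just returns NULL; errno is not used. */
        st.failed_step = LANG_SETLOCALE;
    } else if (bindtextdomain(domain, localedir) == NULL) {
        st.saved_errno = errno;
        st.failed_step = LANG_BINDTEXTDOMAIN;
    } else if (textdomain(domain) == NULL) {
        st.saved_errno = errno;
        st.failed_step = LANG_TEXTDOMAIN;
    }
    return st;
}

/* Late phase: write the deferred warning into buf.
   Returns 1 if there is something to report, 0 otherwise. */
static int format_language_warning(const struct lang_status *st,
                                   char *buf, size_t size)
{
    switch (st->failed_step) {
    case LANG_SETLOCALE:
        snprintf(buf, size, "setlocale failed: locale not installed?");
        return 1;
    case LANG_BINDTEXTDOMAIN:
        snprintf(buf, size, "bindtextdomain failed: %s",
                 strerror(st->saved_errno));
        return 1;
    case LANG_TEXTDOMAIN:
        snprintf(buf, size, "textdomain failed: %s",
                 strerror(st->saved_errno));
        return 1;
    default:
        if (size > 0)
            buf[0] = '\0';
        return 0;
    }
}
```

The point of splitting the work this way is that the early phase needs nothing beyond libc, so it can run before the Lisp heap exists, while the late phase can hand the formatted text to whatever warning mechanism is available by then.
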
From: Sam S. <sd...@gn...> - 2011-08-24 16:33:56

> * Bruno Haible <oe...@py...> [2011-08-24 18:01:09 +0200]:
>
> Sam wrote:
>> BTW, could you please explain this comment:
>>
>>     /* Given the above, the following line is only needed for those platforms
>>        for which gettext is compiled with HAVE_LOCALE_NULL defined. */
>
> On glibc systems, setlocale(cat,NULL) is known to return a locale string in the
> expected format, and locales can be created by users. Therefore gettext()
> uses setlocale(); on the other platforms gettext() relies on the environment
> variables and the setlocale() call is therefore actually redundant.

So what happens on those non-glibc systems where this call is redundant?
Is NULL returned?

>> At any rate, apparently you do not object to moving init_language past
>> initmem and letting it indicate a failure as long as it does not prevent
>> clisp from starting up.
>
> I'm not sure there aren't GETTEXT or GETTEXTL calls during initmem
> or the rest of the start-up.

yes, there are some.

> If there are some, it seems safer to me to
>   - continue to call init_language as early as it is now,
>   - but store its result or success in some variable, and
>   - emit the warning message after everything is initialized.

I want the warning to be more specific, so, I guess, init_language will
have to take a flag arg indicating whether to warn with fprintf or
signal with ERROR.

>> 4. bind_textdomain_codeset --> errno != 0 (may return NULL on success!)
>
> This function can fail if malloc fails. I'm not aware that it could return
> NULL on success.

    RETURN VALUE
       If successful, the bind_textdomain_codeset function returns the
       current codeset for domain domainname, after possibly changing it.
       The resulting string is valid until the next bind_textdomain_codeset
       call for the same domainname and must not be modified or freed. If a
       memory allocation failure occurs, it sets errno to ENOMEM and returns
       NULL. If no codeset has been set for domain domainname, it returns
       NULL.

I find the combination of the last two sentences confusing.
They appear to indicate that either
-- NULL return value does not necessarily indicate failure (e.g., if
   nothing has changed)
-- It is possible that NULL is returned (i.e., the call failed) but
   errno is not set.
The first option seems to actually contradict the first sentence, so, I
guess, the second is correct. Right?
Or, just like with bindtextdomain & textdomain, NULL return value ==
failure, and the manual page should be fixed?

--
Sam Steingold (http://sds.podval.org/) on CentOS release 5.6 (Final) X 11.0.60900031
http://memri.org http://www.PetitionOnline.com/tap12009/ http://dhimmi.com
http://truepeace.org http://palestinefacts.org http://www.memritv.org
Computers are like air conditioners: they don't work with open windows!

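[Editorial sketch] One defensive way to handle the ambiguity in the quoted RETURN VALUE text is to clear errno before the call and treat NULL as a failure only when errno was set afterwards. The sketch assumes that a NULL return with errno left at 0 means "no codeset has been set" (the query case with a NULL codeset argument). The thread itself leaves that question open, and both wrapper names are hypothetical.

```c
#include <errno.h>
#include <libintl.h>
#include <stddef.h>

/* Hypothetical wrapper: returns 0 on success, otherwise the errno value
   left behind by bind_textdomain_codeset (ENOMEM per the man page). */
static int set_domain_codeset(const char *domain, const char *codeset)
{
    errno = 0;
    if (bind_textdomain_codeset(domain, codeset) != NULL)
        return 0;
    /* NULL result: report a failure only if errno was actually set. */
    return errno;
}

/* Hypothetical query helper: with a NULL codeset argument the call only
   reads the current value, and NULL then means "none set", not an error. */
static const char *query_domain_codeset(const char *domain)
{
    return bind_textdomain_codeset(domain, NULL);
}
```

With this split, init_language-style code would call set_domain_codeset("clisp", "UTF-8") and warn only on a nonzero result, while a NULL from query_domain_codeset stays a non-error.
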
From: Sam S. <sd...@gn...> - 2011-08-24 01:46:49

Hi Bruno,

> * Bruno Haible <oe...@py...> [2011-08-24 00:10:06 +0200]:
>
>> on a CentOS system I see
>> $ ./clisp -E utf-8 -q -norc -L french -x '(sys::text "Bye.")'
>> "À bientôt!"
>> on a Ubuntu system I see
>> $ ./clisp -E utf-8 -q -norc -L french -x '(sys::text "Bye.")'
>> "Bye."
>> How do I debug this?
>
> The checklist from the gettext FAQ at
> <http://www.gnu.org/software/gettext/FAQ.html#integrating_noop> should
> provide the answer.

$ ltrace ./clisp -q -norc -L french -x '(sys::text "Bye.")' 2>&1 | grep setlocale
setlocale(0, "C") = "C"
setlocale(5, "") = "C"
setlocale(0, "") = "C"
setlocale(2, "") = "C"
setlocale(3, "") = "C"
setlocale(4, "") = "C"
setlocale(5, "fr_FR") = NULL
$ ltrace ./clisp -q -norc -L french -x '(sys::text "Bye.")' 2>&1 | grep textdomain
textdomain(NULL) = "messages"
textdomain("messages") = "messages"
bindtextdomain("clisp", "/home/sds/src/clisp/current/buil"...) = "/home/sds/src/clisp/current/buil"...
bindtextdomain("clisplow", "/home/sds/src/clisp/current/buil"...) = "/home/sds/src/clisp/current/buil"...
bind_textdomain_codeset(0x6204ce, 0x6204dd, 5, 0, 0) = 0x1a34580
$ ltrace ./clisp -q -norc -L french -x '(sys::text "Bye.")' 2>&1 | grep gettext
dgettext(0x6204ce, 0x7fffb2046fa0, 0x7fffb2046fa0, 147, 0x7fffb20472e0) = 0x7fffb2046fa0
dgettext(0x6204ce, 0x7fffb2047b40, 0x7fffb2047b40, 4, 0x7fffb2047bb0) = 0x7fffb2047b40
dgettext(0x6204ce, 0x7fffb2046fa0, 0x7fffb2046fa0, 147, 0x7fffb20472e0) = 0x7fffb2046fa0

So, why does setlocale fail?
Apparently because "locale -a" does not list fr_FR.
Does setlocale set errno? (the man page is silent).

I think init_language is called too early - before the memory is
initialized, so we cannot signal lisp errors yet.
I think it should be called via C_set_current_language after the memory
is initialized and it should raise lisp errors on any failure so that we
will never see *current-language* = FRENCH and messages delivered in
English.

WDYT?

--
Sam Steingold (http://sds.podval.org/) on Ubuntu 11.04 (natty) X 11.0.11001000
http://jihadwatch.org http://www.PetitionOnline.com/tap12009/
http://iris.org.il http://ffii.org http://thereligionofpeace.com
If at first you don't suck seed, try and suck another seed.