Re: [Tuxpaint-i18n] [Tuxpaint-devel] Aa, qx, QX, qy, QY and other strings
From: Albert C. <aca...@gm...> - 2009-06-02 09:14:07
On Mon, Jun 1, 2009 at 1:16 PM, Bill Kendrick <nb...@so...> wrote:
> For a while now, Tux Paint has had a feature whereby it tries to
> remove or re-order fonts such that the most useful ones (to users of the
> current locale) are presented first.
>
> In other words, if a font doesn't have the characters necessary to type
> in the current locale's language, we remove (or at least deemphasize) that
> font.

This is especially important when scrolling is disabled. In that case,
only the topmost fonts may be used.

> * Does it have both uppercase and lowercase letters
>   (if that makes sense in the locale).
>
>   We test this in English by seeing if "qx" and "QX", and "qy" and "QY"
>   both render. In your locale, it'd be helpful to translate these to
>   uppercase and lowercase characters common to your locale.
>
>   * If the locale cannot support ASCII characters, both the "qx"/"QX"
>     and "qy"/"QY" pairs need to be translated to something in the local language.
>     (That way, fonts that don't support your locale will be filtered out.)
>
>   * If the locale does support ASCII, then only translate the "qx"/"QX" lines.
>     LEAVE THE "qy"/"QY" ones untranslated (or simply enter "qy" and "QY"
>     for them as the translations).
>     (This way, fonts that don't support your locale, but are still useful
>     because your locale supports ASCII, will remain.)

Yes. This should work OK. Perhaps it would be better to split the ASCII
and non-ASCII apart, then flag languages according to how much they
value ASCII. Oh well; the current code seems to do a decent job.

> * We gather a score for a font based on whether it supports a variety of
>   strings. For your locale, translate the following into whatever makes
>   sense:
>
>   oO - Test whether uppercase and lowercase characters work
>        (it's ok if it does not, but is scored lower).
>
>   `\%_@$~#{}<>^&* - "Uncommon" punctuation. (In European locales,
>                     you might want to check for the Euro symbol too,
>                     for example.)

This is stuff you could live without in a novelty font. It's commonly
missing.

>   ,.?! - Common punctuation. (In Spanish locales, for example,
>          you'd want the upside-down ? and ! )

This is really critical for using the font.

>   017 - Digits. (Honestly, I'm not sure how one would localize this.)

Some particularly lame novelty fonts lack the digits. Some languages do
not use ASCII digits.

BTW, in case digits show up somewhere in the UI, glibc can translate
them if you use the "I" (upper case eye) modifier. Like this: "%Id"

>   O0 - Distinct circle-like characters. (I admit I don't understand how
>        scoring actually applies to this test. Albert?)
>
>   1Il| - Distinct line-like characters. (Ditto)

This is to prefer fonts with distinct characters. It's confusing if you
can't tell the difference. It's not so easy to explain why the computer
has all these symbols if they all look the same. In general, indistinct
characters are a sign of a poor font.

> grep -C 2 "qx" po/*.po | grep msgstr | grep -v "qx" | grep -v \"\"
> grep -C 2 "QX" po/*.po | grep msgstr | grep -v "QX" | grep -v \"\"
> # a number of locales translate, but only a fraction of all locales
>
> grep -C 2 "qy" po/*.po | grep msgstr | grep -v "qy" | grep -v \"\"
> grep -C 2 "QY" po/*.po | grep msgstr | grep -v "QY" | grep -v \"\"
> # norwegian locales translate this -- not sure if that's appropriate...?

I think it's an error, because plain ASCII is slightly useful. Norwegian
probably should translate one pair of these only. Translate both
whenever plain ASCII fonts are of zero value.

> grep -C 2 "oO" po/*.po | grep msgstr | grep -v "oO" | grep -v \"\"
> # swedish checks for a variety of accented chars.
> # korean checks for a pair of korean chars.

Probably this isn't the best. The code might be improved by having
distinct non-translatable test strings for ASCII, and a way to indicate
the importance of ASCII.

Swedish loses the ability to prefer ASCII-only fonts with case
distinction over ASCII-only fonts that lack it.
It gains the ability to single out fonts that lack case distinction for
Swedish accented letters. It's pretty unlikely that a font would have
case distinction for ASCII but not also for any accented characters.
In other words, testing ASCII is highly likely to take care of accented
characters as well.

I have no idea what the Korean translation is doing. In my xterm those
characters look like spaces.

It'd be reasonable to not use gettext on this string. Translation is
only really useful if all of these apply:

1. the language does not use the Latin alphabet
2. the alphabet has a case distinction
3. a font fails to provide the case distinction
4. a font fails to provide ASCII (and was unblacklisted)

For example, suppose that there are two Cyrillic fonts which completely
lack the ASCII letters. If one of those fonts is also lacking case
distinction, then it should be scored lower than the other. If this
situation really does exist, then translation could be useful. The same
goes for Greek. Note that this situation cannot exist unless the
translator also disables the blacklisting of fonts that lack ASCII
letters (by translating "qx", "QX", "qy", and "QY").

> grep -C 2 "%_" po/*.po | grep msgstr | grep -v "%_" | grep -v \"\"
> # no translations
>
> grep -C 2 ",\.?\!" po/*.po | grep msgstr | grep -v \"\" | grep -v ",\.?\!"
> # only arabic checks (for right-to-left variations of these chars)
> # I think spanish should check for upside-down ? and !
> # I think many should check for their quote characters (e.g., French)

I think that Arabic is doing the right thing. Other languages using
non-Latin end-of-sentence punctuation ought to do this.

Given what little I know about Spanish, I agree with you. If the
upside-down '?' is really critical for tolerable sentences, then it
should be checked.

Quote characters are not critical. If the French translator is having
score problems related to them, then they might best be added to the
low-priority punctuation string.
Quote characters are far less important than things like the period and
question mark.

> grep -C 2 "017" po/*.po | grep msgstr | grep -v \"\" | grep -v "017"
> # gujarati checks (has its own set of digits)

Farsi and Arabic do too, AFAIK.

> ... and similar for the other strings. (But, again, I don't _quite_ understand
> their use.)

The O0 string appears to be wrongly translated in all cases except
po/gu.po-msgstr, which simply suppresses the unused ASCII digit. None of
the other translations should exist.

I think some East Asian languages have an end-of-sentence character that
might need to be added on to the end of "O0". Any sort of large circle
character should be in the string (and likewise for the vertical line
characters).
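
To make the blacklist-then-score idea discussed above concrete, here is a rough sketch in Python. It is not the Tux Paint code. The font model (a dict from character to glyph id), the function names, and the weights are all made up for illustration. The rule that a font survives if either the "qx"/"QX" pair or the "qy"/"QY" pair renders is my reading of the description in this thread, not something checked against the source.

```python
import string

# Hypothetical model: a "font" is a dict mapping each character it covers
# to an opaque glyph id. Real code would ask the font renderer instead.

def renders(font, text):
    """True if the font has a glyph for every character in text."""
    return all(ch in font for ch in text)

def distinct(font, text):
    """True if text renders and no two different characters share a glyph."""
    return renders(font, text) and len({font[ch] for ch in text}) == len(set(text))

def is_blacklisted(font, pair_a=("qx", "QX"), pair_b=("qy", "QY")):
    """Drop the font only if neither pair of test strings renders in full.

    A translator would replace pair_a (and pair_b too, if plain ASCII is
    of no value in the locale) with locale-specific strings.
    """
    ok_a = all(renders(font, s) for s in pair_a)
    ok_b = all(renders(font, s) for s in pair_b)
    return not (ok_a or ok_b)

# (test string, check, weight). The weights are illustrative assumptions.
SCORE_TESTS = [
    (",.?!", renders, 3),              # common punctuation: critical
    ("oO", renders, 2),                # case distinction
    ("017", renders, 2),               # digits
    ("`\\%_@$~#{}<>^&*", renders, 1),  # uncommon punctuation: nice to have
    ("O0", distinct, 1),               # circle-like glyphs must differ
    ("1Il|", distinct, 1),             # line-like glyphs must differ
]

def score(font, tests=SCORE_TESTS):
    """Sum the weights of every test the font passes."""
    return sum(weight for text, check, weight in tests if check(font, text))

# A full ASCII font, and a caps-only novelty font that reuses "O" for zero.
ascii_font = {ch: ch for ch in
              string.ascii_letters + string.digits + string.punctuation}
caps_only = {ch: ch for ch in string.ascii_uppercase + string.digits + ",.?!"}
caps_only["0"] = caps_only["O"]

print(is_blacklisted(ascii_font), score(ascii_font))
print(is_blacklisted(caps_only), score(caps_only))
```

In this model a locale that translates only the first pair keeps both locale-specific fonts and plain ASCII fonts, while a locale that translates both pairs keeps only fonts that cover its own letters.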