From: Adam B. <kil...@an...> - 2012-05-11 22:35:47
dat/descript/ was the low-hanging fruit. dat/database/ is mostly a logistics issue (it doesn't fit into the Transifex workflow), but it raises no real questions about code that lives inside Crawl itself.

So, let's go for the biggest part: translating non-fixed messages.

My plan is as follows: messages, desc snippets, etc. are given in a printf()-like form. The format part goes through something akin to gettext[1], with arguments using "grammar functions" like (monster-on-monster combat with a weapon):

    "$actor(%s) hits $actor(%s) with $item(%s)."

which translates to:

    "$actor(%s,nom) uderza $actor(%s,acc) $item(%s,ins)."

These functions need, in general, to be Turing-complete, although they will often resort to some kind of simple table lookup. Examples of such functions in English are:

* pluralise() -- 120 lines
* apostrophise() -- trivial
* article() -- could subsume DESC_THE/DESC_A
* possessive() and friends

Such grammar functions would need to support recursion: for example, in Polish, the nominative for "orc skeleton" (i.e., "$actor(orc skeleton,nom)") resolves to "szkielet $actor(orc,gen)" and that in turn to "szkielet orka". Even English has that if we generalize weapon brand naming, which is probably required for translating them in a sane way.

Default arguments: very often, a function depends on some property of a neighbouring word. For example, a verb needs to agree with the noun that precedes it. It would greatly simplify things if the verb function could pick its default based on context. So, I'd pass the list of functions and their arguments one level up:

    "$noun(%s) $verb(%s)"

would be a shorthand for:

    "$noun(%s) $verb(%s,$gender($noun(%1$s)))"

as the second argument to $verb (the noun's gender) would be guessed automatically.

This example raises another issue: printf() syntax for reordering fields is too ugly to live. Also, support for mixing types of arguments would require complex, error-prone code.
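To make the recursion in grammar functions concrete, here is a minimal sketch of a table-driven resolver for the "orc skeleton" example above. Everything in it is an assumption made for illustration: the names polish_actor and resolve(), the (name, case) table layout, and the fallback to the untranslated name are hypothetical, not existing Crawl code. It also ignores names that themselves contain parentheses or commas.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical table: (English name, grammatical case) -> translated form.
// A form may itself contain $actor(...) calls, which get resolved in turn.
using Key = std::pair<std::string, std::string>;

static const std::map<Key, std::string> polish_actor = {
    {{"orc skeleton", "nom"}, "szkielet $actor(orc,gen)"},
    {{"orc", "gen"}, "orka"},
};

// Expand every $actor(name,case) call in `text`, recursing into the result.
std::string resolve(const std::string &text, int depth = 0)
{
    assert(depth < 16); // guard against cyclic table entries
    const std::string open = "$actor(";
    std::string out;
    std::size_t pos = 0;
    for (;;)
    {
        const std::size_t start = text.find(open, pos);
        if (start == std::string::npos)
            return out + text.substr(pos); // no more calls: copy the rest
        const std::size_t args = start + open.size();
        const std::size_t close = text.find(')', args);
        const std::size_t comma =
            close == std::string::npos ? close : text.rfind(',', close);
        if (comma == std::string::npos || comma < args)
            return out + text.substr(pos); // malformed call: keep verbatim
        out += text.substr(pos, start - pos); // literal text before the call
        const std::string name = text.substr(args, comma - args);
        const std::string gcase = text.substr(comma + 1, close - comma - 1);
        const auto it = polish_actor.find({name, gcase});
        // Unknown entries fall back to the untranslated name.
        out += it != polish_actor.end() ? resolve(it->second, depth + 1) : name;
        pos = close + 1;
    }
}
```

With this table, resolve("$actor(orc skeleton,nom)") yields "szkielet orka". A real implementation would need arbitrary code per language rather than a flat table, which is why the functions have to be Turing-complete in general.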
Thus, my idea for the format strings is:

* some-file.cc would write
      translation_message_function("You have %d %s.", item->quantity, item->name().c_str())
  (more on the function's name later)
* but the string would then be turned into "You have $1 $2."!

This seems complex, but I believe it's worth it: we'd have the compiler check the C++ code for us, and translators would never see the printf() syntax.

Capitalization: mprf() vs mprf_nocap() was pretty much a disaster. I'd replace it with explicit capitalization/decapitalization: $. would uppercase the next letter, $, would keep its case. Messages start in $. mode by default.

UNRESOLVED ISSUES:

1. What about messages that differ in English by only a small part?

       mprf("Your intelligent allies are %sforbidden to pick up anything at all.", now.c_str());

   -- now may be "now " or "". This is likely to change the whole sentence in many languages, and thus can't be handled by substituting a local part. Does anyone have a better idea than duplicating the whole thing into two strings, one with "now" and one without?

2. How to name the equivalents of mprf() and simple_monster_message()? Don't laugh, we'll use them in thousands of places, so the name is important. It shouldn't be long, and it needs to be greppable. I see no use for an mpr(): the format string is always static, the optimization would be negligible, and code duplication is bad. The new simple_monster_message() I'd move into the actor class; again, with only a printf()-like version.

[1]. Can't use real gettext inside Crawl proper, but can use its tools.

--
// If you believe in so-called "intellectual property", please immediately
// cease using counterfeit alphabets. Instead, contact the nearest temple
// of Amon, whose priests will provide you with scribal services for all
// your writing needs, for Reasonable and Non-Discriminatory prices.
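For reference, the rewriting of "You have %d %s." into "You have $1 $2." proposed above could look roughly like the following. This is only a sketch under stated assumptions: to_positional() is a hypothetical name, and it handles just the common printf() conversions, numbering them in the order they appear.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: rewrite printf()-style conversions as positional
// placeholders, numbered in order of appearance.
std::string to_positional(const std::string &fmt)
{
    const std::string conversions = "diouxXcsfFeEgGp";
    std::string out;
    int next_arg = 1;
    for (std::size_t i = 0; i < fmt.size(); ++i)
    {
        if (fmt[i] != '%')
        {
            out += fmt[i]; // ordinary character: copy as-is
            continue;
        }
        if (i + 1 < fmt.size() && fmt[i + 1] == '%')
        {
            out += '%'; // "%%" is a literal percent sign
            ++i;
            continue;
        }
        // Skip flags, width, precision and length modifiers up to the
        // conversion character itself.
        const std::size_t end = fmt.find_first_of(conversions, i + 1);
        assert(end != std::string::npos); // malformed format string
        out += '$' + std::to_string(next_arg++);
        i = end;
    }
    return out;
}
```

The C++ call site keeps the original printf() string, so the compiler can still check the arguments against it. Only the translator-facing string uses $1, $2, and a translation is free to reorder them.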