From: Silas S. B. <ss...@ca...> - 2009-01-16 12:55:43
Dear eSpeak listers,

I originally advised jonsd that, in cases where there are multiple tone 3s in Mandarin Chinese, a tone 3 should be changed into a tone 2 if and only if there is an ODD number of consecutive tone 3s immediately after it. Hence ni3hao3 is spoken as ni2hao3, hao3lao3ban3 as hao3lao2ban3, and ba3bao3xian3qu3xiao1 as ba2bao3xian2qu3xiao1. This understanding was based on the rules set out in beginners' Mandarin textbooks.

However, it recently came to my attention that the phrase 把保险取消 (ba3 bao3xian3 qu3xiao1, "to cancel an insurance policy") is spoken more like "ba3 bao2xian3 qu3xiao1" by every Chinese native I asked, even though this leaves two consecutive tone 3s (xian3 qu3). One said my naive rendition "ba2 bao3 xian2 qu3 xiao1" is "definitely wrong" (not just "acceptable as a variation", but "definitely wrong"), and 2 others independently said (without any prompting on my part) that I had probably been taught a rule never to put two tone 3s together, but that I should now forget this rule.

A Wikipedia author recently had this to say on http://en.wikipedia.org/wiki/Standard_Mandarin :

----------------------------------------------
Pronunciation also varies with context according to the rules of tone sandhi. The most prominent phenomenon of this kind is when there are two third tones in immediate sequence, in which case the first of them changes to a rising tone, the second tone. In the literature, this contour is often called two-thirds tone or half-third tone, though generally, in Standard Mandarin, the "two-thirds tone" is the same as the second tone. If there are three third tones in series, the tone sandhi rules become more complex, and depend on word boundaries, stress, and dialectal variations.

Tone sandhi rules at a glance

1. When there are two 3rd tones (˨˩˦) in a row, the first syllable becomes 2nd tone (˧˥), and the second syllable becomes half-3rd tone (˨˩).
   ex: 老鼠 (lǎoshǔ) becomes [lao˧˥ʂu˨˩]

2. When there are three 3rd tones in a row, things get more complicated.

   If the first word is two syllables, and the second word is one syllable, the first two syllables become 2nd tones, and the last syllable stays 3rd tone:
   ex: 保管好 (bǎoguǎn hǎo) becomes [pao˧˥kuan˧˥xao˨˩˦]

   If the first word is one syllable, and the second word is two syllables, the first syllable becomes half-3rd tone (˨˩), the second syllable becomes 2nd tone, and the last syllable stays 3rd tone:
   ex: 老保管 (lǎo bǎoguǎn) becomes [lao˨˩pao˧˥kuan˨˩˦]

3. When a 3rd tone is followed by a first, second or fourth tone, or most neutral tone syllables, it usually becomes a half-3rd tone. The half-3rd tone is a tone that only falls but does not rise.
   ex: 美妙 (měimiào) becomes [mei˨˩miao˥˩]
----------------------------------------------

Rule 2 is a nightmare. It means we have to know where the word boundaries are, at least in the case of two-syllable words that start or end with tone 3.

When I made zh_listx, I did so on the assumption that we don't really have to know the word boundaries. Getting word boundaries right automatically is difficult - Wenlin always throws back "segmentation ambiguities" at the user, pinyinannotator.com often gets it wrong, etc. I thought it wouldn't matter if, in most cases, the location of the word break does not affect the actual pronunciation, so I made no attempt to get word breaks right (and in fact a couple of the rules I've recently added, 的地 and 地的, deliberately cut across word breaks, as a hack to try to ensure 地 is pronounced correctly when used in literature with unknown place names).

I can certainly modify my zh_listx generator script so that it always includes two-syllable words that start or end with tone 3, even if these are not strictly needed as "exceptions". Then the above rule 2 can be implemented by looking at the spaces as well as the tones.

BUT I'm not convinced that will fully solve the problem, because the rules as stated above are not complete.
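To make the two candidate rule sets concrete, here is a minimal Python sketch. `naive_sandhi` follows the odd-count textbook rule described at the top of this message, and `rule2_sandhi` covers only the two three-syllable cases quoted from Wikipedia. The function names, the space-separated numbered-pinyin input, and the decision to keep writing the half-3rd tone as 3 are all assumptions made for illustration; none of this is eSpeak code, and syllables written with "ü" or "u:" are not handled.

```python
import re

# One numbered-pinyin syllable: lowercase letters followed by a tone digit.
SYLLABLE = re.compile(r"([a-z]+)([1-5])")


def parse(phrase):
    """'ba3 bao3xian3' -> [[('ba', 3)], [('bao', 3), ('xian', 3)]]"""
    return [[(base, int(tone)) for base, tone in SYLLABLE.findall(word)]
            for word in phrase.split()]


def render(words, new_tones):
    """Rebuild the phrase with new tones, keeping the original word spacing."""
    tones = iter(new_tones)
    return " ".join("".join(f"{base}{next(tones)}" for base, _ in word)
                    for word in words)


def naive_sandhi(phrase):
    """Tone 3 -> tone 2 iff an odd number of tone 3s immediately follow it.

    Word boundaries are ignored; only the original tones are inspected.
    """
    words = parse(phrase)
    flat = [tone for word in words for _, tone in word]
    new_tones = []
    for i, tone in enumerate(flat):
        following = 0
        for later in flat[i + 1:]:
            if later != 3:
                break
            following += 1
        new_tones.append(2 if tone == 3 and following % 2 == 1 else tone)
    return render(words, new_tones)


def rule2_sandhi(phrase):
    """Quoted rule 2 only: exactly three tone 3s, split as 2+1 or 1+2.

    Returns None for anything else, since the quoted rules do not cover it.
    The half-3rd tone is still written as 3 (an assumption of this sketch).
    """
    words = parse(phrase)
    shape = [len(word) for word in words]
    tones = [tone for word in words for _, tone in word]
    if tones != [3, 3, 3] or shape not in ([2, 1], [1, 2]):
        return None
    new_tones = [2, 2, 3] if shape == [2, 1] else [3, 2, 3]
    return render(words, new_tones)


print(naive_sandhi("ba3 bao3xian3 qu3xiao1"))  # ba2 bao3xian2 qu3xiao1
print(rule2_sandhi("bao3guan3 hao3"))          # bao2guan2 hao3
print(rule2_sandhi("lao3 bao3guan3"))          # lao3 bao2guan3
print(rule2_sandhi("ba3 bao3xian3 qu3xiao1"))  # None
```

The first function reproduces the rendition that natives called "definitely wrong" for 把保险取消, and the second returns None for that phrase and for every other pattern the quoted rules leave unspecified.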
For example, what if there are two two-syllable words, each with two 3rd tones? Or a one-syllable word followed by two two-syllable words, all with 3rd tones? Or a 3-syllable word where not everyone agrees on whether and how to break it up? The rules above do not say anything about those cases, or how things are prioritised when two or more cases overlap (as they do in ba3 bao3xian3 qu3xiao1).

And it gets worse. A current PhD candidate in Cambridge who's investigating tone teaching got 2 professionals to record the phrase 好老板 (hao3 lao3ban3, "good boss") for one of her experiments, and one said it a bit like hao2 lao2ban3 (but with the first tone 2 at a higher starting and ending pitch than the second), which is sort-of an application of rule 2 above, while the other said hao3 lao2ban3, which is applying my initial naive understanding of the rules. Perhaps he had read this and was trying to be "right" even at the expense of naturalness? (That's a problem you can have if you tell people to record words very slowly for learners, especially if they're from regions that normally have accents and therefore might over-correct in their attempts to be "standard".)

I do not have permission to attach the recordings for the eSpeak development archive (I've only got them because I gave her some technical assistance) but I can try to summarise:

32m / 32f - 好老板 hao3 lao3ban3 (nice boss) - as discussed above.

33m / 33f - I think this is supposed to be 老捕懂 but I'm not sure. It is 3 * 1-syllable words. And the voices do exactly what they did on the previous recording. It is possible that their pronunciation of this was influenced by the fact that they had just pronounced something similar (when reading lists it's easy to get stuck in a pattern and pronounce words differently from how you'd normally read them in non-list contexts).
34m / 34f - 保守党 bao3shou3dang3 (Conservative Party) - BOTH speakers change the first 2 tone 3s in this case (the 1st 2 become rising tones, with the 1st rising tone starting and ending at a slightly higher pitch than the 2nd, in BOTH voices). This is a 3-syllable word that cannot really be broken down into 2+1 or 1+2, but if it CAN be broken down then I think it would break into 2+1 (保守 + 党). So I have no idea what is going on here wrt. the sandhi rules.

35m / 35f - 小老虎 xiao3 lao3hu3 (little tiger cub) - very similar to hao3 lao3ban3. But note that some dictionaries list xiao3lao3hu3 as ONE WORD, so it's debatable whether it's xiao3+lao3hu3 or xiao3lao3hu3. See what I mean about the problem with automatic segmentation? I would argue that, in this case, IF it can be broken down into 2 words THEN the break WOULD go after the 1st syllable. But how are we going to get this kind of data for ALL 3+-syllable words in the language? In some cases I can get it automatically by querying a dictionary for both the 1st 2 and the last 2 syllables and seeing whether either of them makes a word with a pronunciation that's similar enough to their use in the 3-syllable word, but I have a feeling that there will be an awful lot of cases when BOTH 1+2 and 2+3 can make valid words - and then what do we do? Parse the definitions and compare semantics? Find free frequency data on all words? I was really hoping that basic pronunciation would not be affected by something this complicated.

36m / 36f - 小老头 xiao3 lao3tou2 ("little old man") - both voices change the xiao3 into xiao2 and leave the rest unchanged (no surprises there).

AND - the PhD candidate who is doing all this (who is a native Chinese from Guilin) thinks that the Wikipedia article is WRONG, although she is not yet able to explain how to correct it. She says she's going to have to think very seriously about this kind of thing in her thesis.
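The dictionary lookup idea mentioned under 35m / 35f (query the first two and the last two syllables of a 3-syllable item and see which pair forms a word) could be sketched roughly as below. The function name `guess_split`, the use of a plain Python set as the "dictionary", and the toy entries are hypothetical, and the sketch requires an exact pinyin match rather than a pronunciation that is merely "similar enough".

```python
def guess_split(syllables, lexicon):
    """Guess where a 3-syllable item breaks, using two-syllable lookups.

    syllables: three numbered-pinyin syllables, e.g. ["xiao3", "lao3", "hu3"]
    lexicon:   set of known two-syllable words written as joined pinyin
    Returns "2+1", "1+2", "ambiguous" (both halves are words) or "unknown".
    """
    if len(syllables) != 3:
        raise ValueError("expected exactly three syllables")
    first_two = "".join(syllables[:2])   # syllables 1+2
    last_two = "".join(syllables[1:])    # syllables 2+3
    head_is_word = first_two in lexicon
    tail_is_word = last_two in lexicon
    if head_is_word and tail_is_word:
        return "ambiguous"
    if head_is_word:
        return "2+1"
    if tail_is_word:
        return "1+2"
    return "unknown"


# Toy lexicon built only from words that appear in this message.
TOY_LEXICON = {"bao3shou3", "lao3hu3", "lao3ban3"}

print(guess_split(["bao3", "shou3", "dang3"], TOY_LEXICON))  # 2+1
print(guess_split(["xiao3", "lao3", "hu3"], TOY_LEXICON))    # 1+2
```

The "ambiguous" result is the troublesome case: the lookup alone cannot choose between the two splits, so something extra (frequency data, semantics) would be needed.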
So, to summarize,

(1) The rules for 3 or more consecutive 3rd-tones are more complicated than we thought.

(2) They might depend on where the word boundaries are (which are difficult to find accurately when the input is in Chinese hanzi).

(3) We still don't know for sure exactly what the rules are, and we do have cases that are interpreted differently by different natives. But natives do think our basic interpretation is "wrong", at least in the case of 把保险取消.

So what do we do now? Wait for the thesis? Help her write it? ..... 我糊涂了 (I'm feeling muddled)

Silas

(The Chinese-takeaway kitchen worker who wanted help cancelling her mobile phone insurance had no idea what this was going to lead to!)

--
Silas S Brown http://people.pwf.cam.ac.uk/ssb22