From: Silas S. B. <ss...@ca...> - 2009-01-18 15:23:27
Got some more data here:

(1) wǒ xiǎng qǐng nín
    Natives 1 and 2: wó xiáng qǐng nín
    Native 3, and SinoVoice demo on
    http://www.sinovoice.com.cn/english/tiyan.html :
    wó xiǎng qǐng nín

(2) Měiyǔ bǔxíbān
    Natives 1 and 3 and SinoVoice: Méiyǔ bǔxíbān
    Native 2: Méiyú bǔxíbān

(3) kěyǐ tǎolùn
    Natives 1 and 3 and SinoVoice: kéyǐ tǎolùn

(4) ..gěi nǐ bǎochí
    Natives 1 and 3 and SinoVoice: ..géi nǐ bǎochí

(5) jiàodǎo nǐ shǐ nǐ dé yìchu
    Natives 1 and 3 and SinoVoice (and previous group):
    jiàodǎo nǐ shí nǐ dé yìchu

(6) zhǐyǒu shǎoshùrén
    Natives 1 and 3 and SinoVoice: zhíyǒu shǎoshùrén

(7) kěyǒu-kěwú
    Native 1 and SinoVoice: kéyóu-kěwú
    Native 3: kéyǒu-kěwú

(8) zhìshǎo yǒu liǎng ge
    Natives 1 and 3 and SinoVoice: zhìshǎo yóu liǎng ge

(9) lìngrén nányǐ lǐjiě de
    Native 1 and SinoVoice: lìngrén nányǐ líjiě de
    Native 3: lìngrén nányí líjiě de

(10) kěyǐ zěnyàng
    Native 1: kéyí zěnyàng
    Native 3 and SinoVoice: kéyǐ zěnyàng

(11) shénme fāngfǎ kěyǐ gǎishàn
    Natives 1 and 3 (and SinoVoice although it seemed to
    fault on the "me"): shénme fāngfǎ kéyǐ gǎishàn
    Native 2 and previous group: shénme fāngfǎ kéyí gǎishàn

(12) kěyǐ gǎishàn
    Native 1: kéyí gǎishàn
    Native 3 and SinoVoice: kéyǐ gǎishàn

On the surface, it seems that most of the above can be
explained by 3 rules:

Rule 1. The pattern "33 3..." is changed to "23 3...", and
this is fixed in the output before the rest of the input is
evaluated. This covers examples 2, 3, 6, 10, 11, 12.

Rule 2. In the pattern "(not-3) 3 3 3 2", irrespective of
word boundaries, the 3332 is changed to 2332. This covers
examples 1, 2, 4, 7.

Rule 3. Apply the beginners' rule to the rest. This covers
examples 8 and 9.

BUT those rules do not catch example 5 properly. We still
haven't got to the bottom of this.

Silas

--
Silas S Brown http://people.pwf.cam.ac.uk/ssb22

"What is now proved was once only imagined" - William Blake
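
For anyone wanting to experiment with the three rules above, here is a rough Python sketch. It is not taken from any existing code and rests on several assumptions that the message does not confirm: a phrase is given as a list of words, each word a list of tone numbers (0 for neutral tone); "33 3..." in Rule 1 is read as a two-syllable 3-3 word followed by a word starting with tone 3; the start of the phrase counts as "not-3" for Rule 2; the syllables matched by Rule 2 are left alone afterwards; and the "beginners' rule" is taken to be a right-to-left scan in which a tone 3 directly before another tone 3 becomes tone 2. The function name `apply_rules` is made up for illustration.

```python
def apply_rules(words):
    """Apply the three proposed sandhi rules to a phrase.

    words: list of words, each a list of tone numbers (0 = neutral).
    Returns the flat list of tones after the rules are applied.
    All interpretation choices here are assumptions (see lead-in).
    """
    tones = [t for w in words for t in w]
    fixed = [False] * len(tones)  # syllables no later rule may touch

    # Rule 1 (assumed reading): a 3-3 word followed by a word that
    # starts with tone 3 becomes 2-3, and that result is fixed.
    pos = 0
    for i, w in enumerate(words):
        nxt = words[i + 1] if i + 1 < len(words) else []
        if w == [3, 3] and nxt[:1] == [3]:
            tones[pos] = 2
            fixed[pos] = fixed[pos + 1] = True
        pos += len(w)

    # Rule 2: "3 3 3 2" not preceded by a tone 3, ignoring word
    # boundaries, becomes "2 3 3 2". Start of phrase is treated as
    # "not-3" (assumption), and the four syllables are then fixed.
    for i in range(len(tones) - 3):
        if (tones[i:i + 4] == [3, 3, 3, 2]
                and (i == 0 or tones[i - 1] != 3)
                and not any(fixed[i:i + 4])):
            tones[i] = 2
            for j in range(i, i + 4):
                fixed[j] = True

    # Rule 3 (assumed "beginners' rule"): scanning right to left,
    # an unfixed tone 3 directly before a tone 3 becomes tone 2.
    for i in range(len(tones) - 2, -1, -1):
        if not fixed[i] and tones[i] == 3 and tones[i + 1] == 3:
            tones[i] = 2

    return tones


if __name__ == "__main__":
    # Example 8: zhìshǎo yǒu liǎng ge
    print(apply_rules([[4, 3], [3], [3], [0]]))
    # Example 5: jiàodǎo nǐ shǐ nǐ dé yìchu
    print(apply_rules([[4, 3], [3], [3], [3], [2], [4, 0]]))
```

Under these assumptions the sketch gives 4 3 2 3 0 for example 8, matching the reported zhìshǎo yóu liǎng ge, but for example 5 it gives 4 2 3 2 3 2 4 0, which differs from the reported jiàodǎo nǐ shí nǐ dé yìchu on the second syllable. A different reading of any of the rules could of course change these results.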