
#104 stack corruption on extra long east-asian words

Milestone: v1.0 (example)
Status: closed-fixed
Labels: None
Priority: 5
Updated: 2013-12-20
Created: 2013-07-16
Private: No

eSpeak can crash due to stack corruption when given a long sequence of Japanese or Chinese characters preceded by one or more English letters.
Example: attached crash.txt.
This file contains "this will crash" followed by a word consisting of an 'A' and 52 U+65E5 (日) Chinese characters.
espeak -b 1 -f crash.txt
Running this command on Windows speaks "this will crash" and then eSpeak crashes.
Passing the same string to espeak_synthesize in a compiled eSpeak DLL also crashes the host application.

Some debugging suggests that transposeAlphabet is writing past the end of its 'buf' local variable, i.e. N_WORD_BYTES is too small for this example.

With fewer than 52 Chinese characters it does not crash.

Also, if the English character is not at the beginning of the word, it does not crash. Nor does it crash if the language is set to zh_TW.

I'm guessing that the rules for what constitutes a word are different in these situations.

This has been tested in eSpeak 1.47.11 and 1.47.11D.

This bug impacts NVDA: simply reading many random Chinese phrases while using an English voice will cause NVDA to crash. E.g. this tweet from a Japanese speaker:
"さっきNVDA日本語開発チームの技術屋さんで全盲は一人と書いたけど技術屋に翻訳とアプリケーションモジュール開発者も含めると全盲は二人です。"
(Translation: "Earlier I wrote that only one of the engineers on the NVDA Japanese development team is totally blind, but if translators and application module developers count as engineers, then two are totally blind.")

1 Attachment

Discussion

  • Michael Curran

    Michael Curran - 2013-11-06

    Any luck with this one? Let me know if you need more info. This is high priority for NVDA, as it causes a crash. I've debugged it as far as I can, but I'm lost from here.

     
  • Jonathan Duddington

    • status: open --> open-fixed
    • assigned_to: Jonathan Duddington
     
  • Jonathan Duddington

    This should be fixed now in eSpeak 1.47.14 at
    http://espeak.sf.net/test/latest.html

    Please confirm.

     
  • Michael Curran

    Michael Curran - 2013-11-29

    Changing line 2902 of dictionary.cpp so that the 'buf' local variable in transposeAlphabet has length N_WORD_BYTES+5 rather than N_WORD_BYTES seems to fix this bug, no matter how many Chinese characters I test with. Interestingly, the length of the text passed into transposeAlphabet never seems to exceed 159 bytes (N_WORD_BYTES-1), yet when handling Chinese characters this way transposeAlphabet clearly needs the extra 6 bytes.

     
  • Michael Curran

    Michael Curran - 2013-11-29

    Err, sorry, I didn't see your last comment. I'll test and see. Thanks.

     
  • Michael Curran

    Michael Curran - 2013-11-29

    Yes, 1.47.14 fixes the bug. Thanks!

     
  • Jonathan Duddington

    • status: open-fixed --> closed-fixed
     
