
Unicode language support?

Help
UF grad
2008-09-04
2012-09-22
  • UF grad

    UF grad - 2008-09-04

    I searched through the forums but found only one comment, which says that you can train Unicode languages by converting the characters to some ASCII representation.

    http://sourceforge.net/forum/message.php?msg_id=3126838

    I converted my transcription, phone list, and dic to UTF-8, but it still did not work. So I guess it is true that the Sphinx training module currently does not support Unicode, right?

     
    • Nickolay V. Shmyrev

      > http://sourceforge.net/forum/message.php?msg_id=3126838

      I don't think you understood it properly. Only the phoneset must be ASCII; the rest can be UTF-8.

      > I converted my transcription, phone list, and dic to UTF-8, but it still did not work

      'did not work' is not a good description of the problem. It's not possible to help unless you provide more information.

       
      • UF grad

        UF grad - 2008-09-05

        Found the problem now...

        The log file said that there was extra whitespace in the phone list, but I could not see it in vi. Then I noticed that vi showed the file as [dos] format. :D

        I ran dos2unix on both the dic and phone files, and everything is working now. :-)
        I created those files on Windows and was looking for the problem in the wrong place because I thought Sphinx did not support UTF-8 :)
        Thank you for confirming.
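For anyone hitting the same error, the two checks discussed in this thread can be sketched in Python. This is only a rough illustration, not part of Sphinx: the function names are made up for the example, and it assumes the phone list is a UTF-8 text file with one phone per line. `to_unix` does roughly what dos2unix does for CRLF line endings.

```python
from pathlib import Path


def has_crlf(path):
    """Return True if the file contains any DOS-style CRLF line ending."""
    return b"\r\n" in Path(path).read_bytes()


def to_unix(path):
    """Rewrite the file in place with LF line endings."""
    p = Path(path)
    p.write_bytes(p.read_bytes().replace(b"\r\n", b"\n"))


def non_ascii_phones(path):
    """Return the phones that contain non-ASCII characters.

    Assumes one phone per line (hypothetical layout for this sketch).
    """
    text = Path(path).read_text(encoding="utf-8")
    phones = [line.strip() for line in text.splitlines()]
    return [p for p in phones if p and not p.isascii()]
```

Running `has_crlf` on the dic and phone files before training should catch the invisible trailing carriage return that the log reported as extra space, and `non_ascii_phones` should return an empty list if the phoneset is plain ASCII.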

         
