How to output Chinese in Windows

Forum: Help
Creator: Youzhi Yu
Created: 2011-01-06
Updated: 2012-09-22
  • Youzhi Yu

    Youzhi Yu - 2011-01-06

    I am running pocketsphinx_continuous on Windows. English recognition works
    fine, but with Chinese recognition the output is garbled (unreadable
    characters). I suspect the output is not being interpreted with the correct
    character set. I tried converting the result of ps_get_hyp() with
    WideCharToMultiByte(CP_ACP, ...) and stepped through the code, but that did
    not work. Could anyone tell me which character set the internal functions
    use for Chinese strings, and how to write them correctly to standard output
    or to a file? Many thanks.

     
  • Nickolay V. Shmyrev

    Pocketsphinx outputs UTF-8 characters. You can change your console codepage to
    UTF-8 using

    chcp 65001
    

    command. You can also convert the string to wide characters and then to the
    encoding you need with a pair of calls: MultiByteToWideChar followed by
    WideCharToMultiByte.
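
    The two-step conversion described above could look roughly like the sketch
    below. It assumes the input is the UTF-8 string returned by ps_get_hyp();
    the helper name utf8_to_codepage is hypothetical and not part of
    pocketsphinx. Outside Windows the sketch simply copies the string, on the
    assumption that the terminal already accepts UTF-8.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: convert a NUL-terminated UTF-8 string to the given
 * Windows code page (for example CP_ACP, or 936 for Simplified Chinese GBK).
 * Returns a malloc'ed string that the caller must free, or NULL on failure. */
char *utf8_to_codepage(const char *utf8, unsigned int codepage)
{
    assert(utf8 != NULL);
#ifdef _WIN32
    /* Step 1: UTF-8 -> UTF-16. The first call only reports the size needed. */
    int wlen = MultiByteToWideChar(CP_UTF8, 0, utf8, -1, NULL, 0);
    if (wlen <= 0)
        return NULL;
    wchar_t *wide = malloc((size_t)wlen * sizeof(wchar_t));
    if (wide == NULL)
        return NULL;
    MultiByteToWideChar(CP_UTF8, 0, utf8, -1, wide, wlen);

    /* Step 2: UTF-16 -> target code page, again asking for the size first. */
    int mlen = WideCharToMultiByte(codepage, 0, wide, -1, NULL, 0, NULL, NULL);
    char *out = (mlen > 0) ? malloc((size_t)mlen) : NULL;
    if (out != NULL)
        WideCharToMultiByte(codepage, 0, wide, -1, out, mlen, NULL, NULL);
    free(wide);
    return out;
#else
    /* Assumption: non-Windows terminals take UTF-8, so just copy the bytes. */
    (void)codepage;
    size_t n = strlen(utf8) + 1;
    char *out = malloc(n);
    if (out != NULL)
        memcpy(out, utf8, n);
    return out;
#endif
}
```

    A call such as utf8_to_codepage(hyp, 936) would then give a GBK string for
    a Simplified Chinese console. Calling WideCharToMultiByte directly on the
    UTF-8 bytes skips step 1, which may be why the attempt in the first post
    failed.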

     
  • Youzhi Yu

    Youzhi Yu - 2011-01-07

    Thanks nshmyrev, I have managed to output the Chinese result to the console
    by using MultiByteToWideChar and WriteConsole. But the recognition accuracy
    is poor. With the default language model file in the install folder, the
    result was never fully correct (a few were close). After defining a Chinese
    version 'goforward' grammar file and applying it, the result was 80% correct.
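
    For other readers, the console output step (MultiByteToWideChar plus
    WriteConsole) might be sketched as follows. This is not the exact code used
    here; the function name print_utf8 is hypothetical, and the fallback to
    plain fwrite when stdout is not a console (or not on Windows) is an
    assumption of this sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: write a UTF-8 string to standard output.
 * Returns 0 on success, -1 on failure. */
int print_utf8(const char *utf8)
{
    assert(utf8 != NULL);
#ifdef _WIN32
    HANDLE console = GetStdHandle(STD_OUTPUT_HANDLE);
    DWORD mode;
    /* GetConsoleMode succeeds only when stdout is a real console. */
    if (GetConsoleMode(console, &mode)) {
        int wlen = MultiByteToWideChar(CP_UTF8, 0, utf8, -1, NULL, 0);
        if (wlen <= 0)
            return -1;
        wchar_t *wide = malloc((size_t)wlen * sizeof(wchar_t));
        if (wide == NULL)
            return -1;
        MultiByteToWideChar(CP_UTF8, 0, utf8, -1, wide, wlen);

        fflush(stdout); /* keep ordering with earlier stdio output */
        DWORD written = 0;
        /* wlen counts the terminating NUL, which is not printed. */
        BOOL ok = WriteConsoleW(console, wide, (DWORD)(wlen - 1), &written, NULL);
        free(wide);
        return ok ? 0 : -1;
    }
#endif
    /* Redirected output or non-Windows: write the UTF-8 bytes unchanged. */
    size_t n = strlen(utf8);
    return fwrite(utf8, 1, n, stdout) == n ? 0 : -1;
}
```

    With this fallback, output redirected to a file stays UTF-8, so the file
    should be opened as UTF-8 in an editor.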

    What's the best result of Chinese recognition to your knowledge (maybe with
    high-end devices, better accent, trained acoustic model)?

     
  • daniel chen

    daniel chen - 2011-01-07

    After defining a Chinese version 'goforward' grammar file and applying it, the
    result was 80% correct.

    How do you define a Chinese version of the 'goforward' grammar file? Thanks!

     
  • daniel chen

    daniel chen - 2011-01-07

    I see the same thing:
    "But the recognition result is not too good. With the default language model
    file in the install folder, the result wasn't correct even once"

    So please tell me how to define a Chinese version of the 'goforward' grammar
    file. Thanks!

     
  • daniel chen

    daniel chen - 2011-01-07

    Thanks in advance!

    My MSN: danielchendc@live.cn

    I research ASR and TTS too. You can add me on MSN if you want. Thanks!

     
  • daniel chen

    daniel chen - 2011-01-07

    After defining a Chinese version 'goforward' grammar file and applying it, the
    result was 80% correct.
    ===
    I run pocketsphinx_continuous.exe on Windows and pass the command-line
    arguments -hmm "hmm path" -lm "lm path" -dict "dic path", but the ASR result
    is seldom right. How do you pass the grammar file to the program?

     
  • daniel chen

    daniel chen - 2011-01-07

    Command line:
    pocketsphinx_continuous.exe
        -hmm D:\PocketSphinx\pocketsphinx\model\hmm\zh\tdt_sc_8k
        -lm D:\PocketSphinx\pocketsphinx\model\lm\zh_CN\gigatdt.5000.DMP
        -dict D:\PocketSphinx\pocketsphinx\model\lm\zh_CN\mandarin_notone.dic

     
  • Nickolay V. Shmyrev

    What's the best result of Chinese recognition to your knowledge (maybe with
    high-end devices, better accent, trained acoustic model)?

    The best result depends on the task. Since I don't know which task you are
    targeting, it's hard to give good advice. Overall, I think you'll want to
    read the tutorial first:

    http://cmusphinx.sourceforge.net/wiki/tutorial

     
