
the recognition takes A VERY LONG TIME

Help
Terry
2012-08-15
2012-09-22
  • Terry

    Terry - 2012-08-15

    Hi!
    I used a grammar like this to recognize a Chinese sentence with PocketSphinx:

    #JSGF V1.0;

    grammar testgrammar;
    public <sentence> = 昔 歲 逢 太 平 | 山 林 二 十 年 | 泉 源 在 庭 戶 | 洞 壑 當 門 前 | 井 稅 有 常 期
    | 日 晏 猶 得 眠 | 忽 然 遭 世 變 | 數 歲 親 戎 旃 | 今 來 典 斯 郡 | 山 夷 又 紛 然 | 城 小 賊 不 屠 | 人 貧
    傷 可 憐 | 是 以 陷 鄰 境 | 此 州 獨 見 全 | 使 臣 將 王 命 | 豈 不 如 賊 焉 | 令 彼 徵 歛 者 | 迫 之 如 火 煎
    | 誰 能 絕 人 命 | 以 作 時 世 賢 | 思 欲 委 符 節 | 引 竿 自 刺 船 | 將 家 就 魚 麥 | 歸 老 江 湖 邊 | 石 魚
    湖 | 似 洞 庭 | 夏 水 欲 滿 君 山 青 | 山 為 樽 | 水 為 沼 | 酒 徒 歷 歷 坐 洲 島 | 長 風 連 日 作 大 浪 | 不
    能 廢 人 運 酒 舫 | 我 持 長 瓢 坐 巴 邱 | 酌 飲 四 座 以 散 愁 | 謝 公 最 小 偏 憐 女 | 自 嫁 黔 婁 百 事 乖 |
    顧 我 無 衣 搜 藎 篋 | 泥 他 沽 酒 拔 金 釵 | 野 蔬 充 膳 甘 長 藿 | 落 葉 添 薪 仰 古 槐 | 今 日 俸 錢 過 十 萬
    | 與 君 營 奠 復 營 齋 | 昔 日 戲 言 身 後 事 | 今 朝 都 到 眼 前 來 | 衣 裳 已 施 行 看 盡 | 針 線 猶 存 未 忍
    開 | 尚 想 舊 情 憐 婢 僕 | 也 曾 因 夢 送 錢 財 | 誠 知 此 恨 人 人 有 | 貧 賤 夫 妻 百 事 哀 | ........ ;

    The rule contains 3211 sentences, and I just want to recognize one of them.

    But it takes far too long...

    I have to wait between 1 and 3 minutes before I get the result. PocketSphinx
    always seems to take more than 1 minute.

    Is that normal? Is my grammar wrong? Is it not possible to put 3211
    sentences in one rule because it's too long?

    What can I do to reduce the processing time?

    Thank you!

    p.s.

    the environment was:

    OS: Ubuntu 12.04 32-bit
    CPU: Intel Core i5-2410M 2.3GHz (Acer Aspire Timeline X 4830TG)

    I also used the fixed-point version of pocketsphinx, not the floating-point one.

     
  • bic-user

    bic-user - 2012-08-15

    Have you tried limiting your grammar to 50 sentences, at least as a test? How
    long does recognition take then? 3211 is really a lot. Try creating a
    statistical language model from your sentences. Read the tutorial to see how:
    http://cmusphinx.sourceforge.net/wiki/tutoriallm
    It will also be useful to read this:
    http://cmusphinx.sourceforge.net/wiki/pocketsphinxhandhelds?s=pocketsphinx
    Hope this helps.
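
The language-model tutorial linked above trains from a plain-text corpus, usually one sentence per line wrapped in `<s>` and `</s>` markers. A minimal sketch of preparing such a file from the poem lines follows; the function name `to_lm_corpus` is hypothetical, and the exact input format should be checked against whichever training tool is used.

```python
def to_lm_corpus(sentences):
    """Return corpus text with one '<s> ... </s>' line per non-empty sentence.

    Tokens are re-joined with single spaces, so stray whitespace in the
    input does not leak into the corpus.
    """
    lines = []
    for sentence in sentences:
        tokens = sentence.split()
        if tokens:  # skip blank entries
            lines.append("<s> " + " ".join(tokens) + " </s>")
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    # Two lines taken from the grammar in the first post.
    print(to_lm_corpus(["昔 歲 逢 太 平", "山 林 二 十 年"]), end="")
```

The resulting text can be saved as UTF-8 and passed to the language-model tools described in the tutorial.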

     
  • Terry

    Terry - 2012-08-15

    Because these sentences are ancient Chinese poems. Chinese children recite
    them at school.

    Thank you for your reply. So PocketSphinx's limit is 50 sentences? We can't
    use a grammar that contains over 1000 sentences? Are there other solutions?

     
  • bic-user

    bic-user - 2012-08-15

    Because these sentences are ancient Chinese poems.

    I didn't ask "why".

    So PocketSphinx's limit is 50 sentences?

    I didn't say anything about limits. I asked you to limit your grammar to
    50 sentences and see how long recognition takes on the smaller grammar.
    If that doesn't make it faster, your configuration is bad.

    Are there other solutions?

    I've already told you one.
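
The experiment suggested here (time a 50-sentence grammar, then compare with the full one) is easier to repeat if the grammar is generated rather than edited by hand. A minimal sketch, assuming the sentences are already available as a list of space-separated strings; `build_jsgf` and its `limit` parameter are hypothetical names, not part of PocketSphinx.

```python
def build_jsgf(sentences, limit=None, name="testgrammar"):
    """Return a JSGF grammar whose single public rule lists the sentences
    as alternatives. If limit is given, only the first `limit` are used."""
    chosen = sentences[:limit]  # a slice with None keeps every sentence
    alternatives = " | ".join(" ".join(s.split()) for s in chosen)
    return (
        "#JSGF V1.0;\n"
        "\n"
        f"grammar {name};\n"
        f"public <sentence> = {alternatives};\n"
    )


if __name__ == "__main__":
    poems = ["昔 歲 逢 太 平", "山 林 二 十 年", "泉 源 在 庭 戶"]
    # Print a grammar limited to the first two sentences.
    print(build_jsgf(poems, limit=2), end="")
```

Generating grammars with `limit=50`, `limit=500`, and no limit gives a quick picture of how decoding time grows with grammar size.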

     
  • Terry

    Terry - 2012-08-16

    Sorry about my poor English :P

    Thank you, I will try.

     
  • Terry

    Terry - 2012-08-16

    I've tried a few things:

    1. It still took a long time when I used a statistical language model.

    2. I tried some of the arguments from the website, and some of them work. I added these arguments:
      -maxwpf 5
      -maxhmmpf 3000
      -kdmaxdepth 7
      -kdmaxbbi 16

    Now every audio file takes about 1 minute. I also tried other values, but
    the time is still about 1 minute. I really don't know how to improve the
    speed significantly... Is it possible to cut the time further, for instance
    to 5 seconds? 1 minute is still too long...

    3. It is fast if I put only 50 sentences in a rule. But I want to recognize
    one sentence among 3211, because I want to compare the recognition rate with
    another speech recognition program. That program can take all 3211 sentences
    as input, and it does not take long at all.

    4. I tried to use this kind of grammar:

    #JSGF V1.0;

    grammar testgrammar;
    <1> = 昔 歲 逢 太 平;
    <2> = 山 林 二 十 年;
    <3> = 泉 源 在 庭 戶;
    <4> = 洞 壑 當 門 前;
    <5> = 井 稅 有 常 期;
    <6> = 日 晏 猶 得 眠;
    <7> = 忽 然 遭 世 變;
    <8> = 數 歲 親 戎 旃;
    <9> = 今 來 典 斯 郡;
    <10> = 山 夷 又 紛 然;
    <11> = 城 小 賊 不 屠;
    <12> = 人 貧 傷 可 憐;
    <13> = 是 以 陷 鄰 境;
    <14> = 此 州 獨 見 全;
    <15> = 使 臣 將 王 命;
    <16> = 豈 不 如 賊 焉;
    <17> = 令 彼 徵 歛 者;
    <18> = 迫 之 如 火 煎;
    .....

    It was very fast, but the recognition result was a tragedy. How should I
    write the grammar when I have so many sentences?

    Thank you!
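
In JSGF, the rules marked `public` are the entry points a recognizer can start from, and the listing above is elided, so it is not clear whether it contained one. A minimal sketch of a generator that keeps one rule per sentence and adds a public rule listing them as alternatives; the rule names `<s1>`, `<s2>`, ... and the function name are hypothetical, and this is not claimed to explain the poor accuracy reported here.

```python
def build_rule_per_sentence(sentences, name="testgrammar"):
    """Return a JSGF grammar with one private rule per sentence and a
    public rule that accepts any one of them."""
    lines = ["#JSGF V1.0;", "", f"grammar {name};"]
    rule_names = []
    for index, sentence in enumerate(sentences, start=1):
        rule = f"<s{index}>"  # hypothetical naming scheme
        rule_names.append(rule)
        lines.append(f"{rule} = {' '.join(sentence.split())};")
    # The public rule is the entry point: exactly one sentence rule must match.
    lines.append("public <sentence> = " + " | ".join(rule_names) + ";")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_rule_per_sentence(["昔 歲 逢 太 平", "山 林 二 十 年"]), end="")
```

This accepts the same set of sentences as a single rule with all alternatives written inline; it only changes how the grammar file is organized.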

     
  • bic-user

    bic-user - 2012-08-16

    It was very fast, but the recognition result was a tragedy.

    You mean the accuracy?

    How should I write the grammar when I have so many sentences?

    Don't write a grammar; create a statistical language model. They are
    different things. Read the tutorial carefully:
    http://cmusphinx.sourceforge.net/wiki/tutoriallm

     
  • Terry

    Terry - 2012-08-16

    Yes, the accuracy was very bad.

    I wonder whether a statistical model will work or not... I thought the
    accuracy of a statistical model would be worse than that of a grammar. But
    I will try. Thanks.

     
  • Terry

    Terry - 2012-08-16

    I used a statistical model... but it takes too long, even when I add the new
    arguments.
    It takes over 2 minutes to finish the recognition job.

     
  • bic-user

    bic-user - 2012-08-16

    Accuracy will be lower, but your poem collection is too big for a grammar. Can
    you please provide the language model you're trying to use and its dictionary?
    How are you launching recognition? With pocketsphinx_continuous? Also provide
    the configuration you initialize the decoder with. A language model works well
    (fast) with much bigger corpora.

     
  • Terry

    Terry - 2012-08-16

    I use the example program from the official website to launch recognition, so
    the configuration is in the program. I did make some changes to the program.

    Can you provide an e-mail address? I can't post the hyperlink here because the
    system thinks my reply is spam...

    By the way, I added the argument "-lponlybeam 7e-15" and the speed is faster.
    But the accuracy doesn't look good...

    Thank you!

     
  • bic-user

    bic-user - 2012-08-16

    Use a file-sharing service; it won't be flagged as spam. Which example program
    did you use? pocketsphinx_continuous? What changes did you make? Accuracy is
    another question. Please also provide the log output from the terminal so I
    can reproduce your test.

    and the speed is faster

    How long does it take to recognize a sentence?

     
  • Terry

    Terry - 2012-08-16

    I use the basic usage example from the website:
    http://cmusphinx.sourceforge.net/wiki/tutorialpocketsphinx
    From "initialization" to "decoding a file stream." My changes have
    nothing to do with recognition, just some input/output code. You can read
    my program.

    Here are all the files:
    https://dl.dropbox.com/u/4064482/sphinx/sphinx_problems.zip
    See readme.txt for details on the files.

    It takes about 3 or 4 seconds. But the recognition rate on the 3200 sentences
    is very low... only 4.62%.

    Thank you!

     
  • bic-user

    bic-user - 2012-08-17

    3-4 seconds is OK. Sorry, dude, it's hard for me to discuss recognition
    accuracy with you. I've never used that model, and I don't know whether it is
    good enough. I noticed that you don't specify "-samprate 8000". Your acoustic
    model is 8 kHz, and by default pocketsphinx uses 16000. So launch your
    executable with this option. Learn more about the pocketsphinx parameters;
    some of them can also improve performance. And for the new question
    (recognition accuracy), please start a new thread. Cheers.
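
A sample-rate mismatch like the one described above can be caught before decoding. A minimal sketch using Python's standard `wave` module, assuming the recordings are WAV files (the thread does not say which audio format was used); the function names are hypothetical, and 8000 is the model rate mentioned in the post.

```python
import wave


def wav_sample_rate(path):
    """Return the sample rate (frames per second) stored in a WAV header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def matches_model_rate(path, model_rate=8000):
    """Return True if the recording's sample rate equals the rate the
    acoustic model was trained on (8000 Hz for the model in this thread)."""
    return wav_sample_rate(path) == model_rate


# Example use (the file name is hypothetical):
#   if not matches_model_rate("utterance.wav", 8000):
#       print("resample the audio or use a matching acoustic model")
```

If the file and the model agree on 8000 Hz, the decoder still has to be told, which is what the "-samprate 8000" option above does.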

     
