CMU Sphinx / Forums / Help: the recognition takes A VERY LONG TIME

Terry - 2012-08-15

Hi!
I used a grammar like this to recognize a Chinese sentence with PocketSphinx:

#JSGF V1.0;

grammar testgrammar;
public <sentence> = 昔歲逢太平 | 山林二十年 | 泉源在庭戶 | 洞壑當門前 | 井稅有常期
| 日晏猶得眠 | 忽然遭世變 | 數歲親戎旃 | 今來典斯郡 | 山夷又紛然 | 城小賊不屠
| 人貧傷可憐 | 是以陷鄰境 | 此州獨見全 | 使臣將王命 | 豈不如賊焉 | 令彼徵歛者
| 迫之如火煎 | 誰能絕人命 | 以作時世賢 | 思欲委符節 | 引竿自刺船 | 將家就魚麥
| 歸老江湖邊 | 石魚湖 | 似洞庭 | 夏水欲滿君山青 | 山為樽 | 水為沼
| 酒徒歷歷坐洲島 | 長風連日作大浪 | 不能廢人運酒舫 | 我持長瓢坐巴邱
| 酌飲四座以散愁 | 謝公最小偏憐女 | 自嫁黔婁百事乖 | 顧我無衣搜藎篋
| 泥他沽酒拔金釵 | 野蔬充膳甘長藿 | 落葉添薪仰古槐 | 今日俸錢過十萬
| 與君營奠復營齋 | 昔日戲言身後事 | 今朝都到眼前來 | 衣裳已施行看盡
| 針線猶存未忍開 | 尚想舊情憐婢僕 | 也曾因夢送錢財 | 誠知此恨人人有
| 貧賤夫妻百事哀 | ........ ;

There are 3211 sentences in the rule. I just want to recognize one sentence
among them.

But it took so much time...

I have to wait between 1 and 3 minutes before I get the result. PocketSphinx
always seems to take more than 1 minute.

Is that normal? Is the grammar I wrote wrong? Can't I put 3211 sentences in
one rule because it is too long?

What can I do to reduce the processing time?

Thank you!

p.s.

the environment was:

OS: Ubuntu 12.04 32-bit
CPU: Intel Core i5-2410M 2.3GHz (Acer Aspire Timeline X 4830TG)

And I used the fixed-point version of PocketSphinx, not the floating-point one.
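
A grammar of this shape is tedious to maintain by hand once it reaches thousands of alternatives. The following is a minimal sketch, assuming the sentences are available as a Python list, of how a single-rule JSGF file like the one above could be generated. The function name `build_jsgf` is hypothetical and not part of PocketSphinx; the output uses the `#JSGF V1.0;` header that the JSGF format specifies.

```python
def build_jsgf(sentences, grammar_name="testgrammar", rule_name="sentence"):
    """Return JSGF text with one public rule listing every sentence as an alternative."""
    cleaned = [s.strip() for s in sentences if s.strip()]
    if not cleaned:
        raise ValueError("need at least one sentence")
    # Break the alternatives at '|' so no sentence is ever split across lines.
    alternatives = "\n    | ".join(cleaned)
    return (
        "#JSGF V1.0;\n\n"
        f"grammar {grammar_name};\n"
        f"public <{rule_name}> =\n      {alternatives}\n    ;\n"
    )


# Three sentences taken from the grammar in the post.
grammar_text = build_jsgf(["昔歲逢太平", "山林二十年", "泉源在庭戶"])
print(grammar_text)
```

Generating the file this way only removes typing and line-wrapping mistakes; it does not change how large the search space is for the decoder.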

bic-user - 2012-08-15

Have you tried limiting your grammar to, say, 50 sentences? How long does
recognition take then? 3211 is really a lot. Try creating a statistical
language model from your sentences. Read the tutorial to see how to do that:
http://cmusphinx.sourceforge.net/wiki/tutoriallm
It will also be useful to read this:
http://cmusphinx.sourceforge.net/wiki/pocketsphinxhandhelds?s=pocketsphinx
Hope this helps.

Terry - 2012-08-15

Because these sentences are ancient Chinese poems. Chinese children recite
them in school.

Thank you for your reply. So PocketSphinx's limit is 50 sentences? We can't
use a grammar that contains over 1000 sentences? Are there other solutions?

bic-user - 2012-08-15

> Because these sentences are ancient Chinese poems.

I didn't ask "why".

> So PocketSphinx's limit is 50 sentences?

I didn't say anything about limits. I asked you to limit your grammar to 50
sentences and see how long recognition takes with the smaller grammar. If that
doesn't speed it up, your configuration is bad.

> Are there other solutions?

I've told you one.
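
The experiment suggested here (shrink the grammar, measure, compare) can be sketched as a small timing harness. This is only an illustration: `decode` stands for whatever function actually runs the recognizer on a grammar built from the given sentences, and `fake_decode` is a hypothetical stand-in so the sketch runs without PocketSphinx installed.

```python
import time


def time_decoder(decode, grammar_sizes, sentences):
    """Call decode(subset) for each grammar size and return {size: seconds}."""
    timings = {}
    for size in grammar_sizes:
        subset = sentences[:size]
        start = time.perf_counter()
        decode(subset)
        timings[size] = time.perf_counter() - start
    return timings


def fake_decode(subset):
    """Hypothetical stand-in for a real decoder call; it only does work per sentence."""
    total = 0
    for sentence in subset:
        total += sum(ord(ch) for ch in sentence)
    return total


# Placeholder sentences; a real run would load the 3211 poem lines instead.
sentences = [f"sentence {i}" for i in range(3211)]
timings = time_decoder(fake_decode, [50, 500, 3211], sentences)
for size, seconds in timings.items():
    print(f"{size} sentences: {seconds:.4f} s")
```

If the time does not drop with the 50-sentence grammar, the grammar size is probably not the bottleneck and the decoder configuration deserves a closer look.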

Terry - 2012-08-16

Sorry for my poor English :P

Thank you, I will try.

Terry - 2012-08-16

I've tried a few things:

It still took a long time when I used a statistical language model.

I tried some arguments from the website, and some of them work. I added these arguments:
-maxwpf 5
-maxhmmpf 3000
-kdmaxdepth 7
-kdmaxbbi 16
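
When trying many combinations of these options, it can help to keep them in one place and turn them into an argument list programmatically. A minimal sketch, assuming the flags are passed argv-style as in the list above; `build_args` is a hypothetical helper, not a PocketSphinx function, and the values are simply the ones quoted in this post.

```python
def build_args(options):
    """Flatten {"-flag": value} into an argv-style list of strings."""
    args = []
    for flag, value in options.items():
        if not flag.startswith("-"):
            raise ValueError(f"flag must start with '-': {flag!r}")
        args.extend([flag, str(value)])
    return args


# Values copied from the post; they are starting points, not recommendations.
tuning = {"-maxwpf": 5, "-maxhmmpf": 3000, "-kdmaxdepth": 7, "-kdmaxbbi": 16}
args = build_args(tuning)
print(" ".join(args))
```

Keeping the options in a dictionary makes it easy to change one value at a time and record which setting produced which decoding time.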

Now every audio file takes about 1 minute. I also tried other values, but the
time is still about 1 minute. I really don't know how to improve the speed
significantly... Is it possible to cut the time further, for instance to
5 seconds? 1 minute is still too long...

It is fast if I put 50 sentences in a rule. But I want to recognize one sentence among 3211, because I want to compare the recognition rate with another speech recognition program. That program accepts 3211 sentences and does not take long at all.

I tried to use this kind of grammar:

#JSGF V1.0;

grammar testgrammar;
<1> = 昔歲逢太平;
<2> = 山林二十年;
<3> = 泉源在庭戶;
<4> = 洞壑當門前;
<5> = 井稅有常期;
<6> = 日晏猶得眠;
<7> = 忽然遭世變;
<8> = 數歲親戎旃;
<9> = 今來典斯郡;
<10> = 山夷又紛然;
<11> = 城小賊不屠;
<12> = 人貧傷可憐;
<13> = 是以陷鄰境;
<14> = 此州獨見全;
<15> = 使臣將王命;
<16> = 豈不如賊焉;
<17> = 令彼徵歛者;
<18> = 迫之如火煎;
.....

It was very fast, but the recognition results were terrible. How should I
write the grammar if I have so many sentences?

Thank you!
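
In JSGF, only rules marked `public` serve as entry points for recognition. The numbered grammar above, at least in the part that was posted, declares no public rule and nothing joins the numbered rules together. Whether that explains the poor results is not established in this thread. The sketch below only shows one way to keep one rule per sentence while still exposing a single public rule; the function name and the `s1`, `s2`, ... rule names are hypothetical.

```python
def build_split_jsgf(sentences, grammar_name="testgrammar"):
    """Return JSGF text with one private rule per sentence plus a public rule joining them."""
    if not sentences:
        raise ValueError("need at least one sentence")
    lines = ["#JSGF V1.0;", "", f"grammar {grammar_name};"]
    names = []
    for index, sentence in enumerate(sentences, start=1):
        name = f"s{index}"  # illustrative naming scheme, not taken from the thread
        names.append(f"<{name}>")
        lines.append(f"<{name}> = {sentence};")
    # The public rule is the single entry point that references every private rule.
    lines.append("public <sentence> = " + " | ".join(names) + ";")
    return "\n".join(lines) + "\n"


print(build_split_jsgf(["昔歲逢太平", "山林二十年", "泉源在庭戶"]))
```

A grammar written this way accepts exactly the same sentences as the single-rule version, so it should not be expected to decode any faster.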

bic-user - 2012-08-16

> It was very fast, but the recognition results were terrible.

Do you mean accuracy?

> How should I write the grammar if I have so many sentences?

Do not write a grammar; create a statistical language model. They are
different things. Read the tutorial http://cmusphinx.sourceforge.net/wiki/tutoriallm carefully.
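
Language model training tools generally expect a plain-text corpus with one sentence per line and space-separated tokens, commonly wrapped in `<s>` and `</s>` markers. A minimal sketch of preparing such a file from the poem lines follows. Splitting each sentence into single characters is an assumption made here for illustration; the tokens must match the entries of the pronunciation dictionary actually in use, and the helper names are hypothetical.

```python
def to_corpus_line(sentence, segment=list):
    """Wrap one sentence in <s> ... </s> with space-separated tokens."""
    # segment defaults to list(), which splits the string into single characters.
    tokens = [t for t in segment(sentence.strip()) if not t.isspace()]
    if not tokens:
        raise ValueError("empty sentence")
    return "<s> " + " ".join(tokens) + " </s>"


def write_corpus(sentences, path):
    """Write one marked-up sentence per line, encoded as UTF-8."""
    with open(path, "w", encoding="utf-8") as handle:
        for sentence in sentences:
            handle.write(to_corpus_line(sentence) + "\n")


print(to_corpus_line("石魚湖"))
```

A different segmenter (for example one that splits into words rather than characters) can be passed through the `segment` parameter without changing the rest of the code.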

Terry - 2012-08-16

Yes, the accuracy was very bad.

I wonder whether the statistical model will work or not... I thought a
statistical model is less accurate than a grammar. But I will try. Thanks.

Terry - 2012-08-16

I used a statistical model, but it still takes too long, even with the new
arguments. Recognition takes over 2 minutes to finish.

bic-user - 2012-08-16

Accuracy will be lower, but your poem collection is too big for a grammar. Can
you please provide the language model you're trying to use and its dictionary?
How are you launching recognition? With pocketsphinx_continuous? Also provide
the configuration you use to initialize the decoder. A language model works
well (fast) with much bigger corpora.

Terry - 2012-08-16

I use the example program from the official website to launch recognition, so
the configuration is in the program. But I made some changes to the program.

Can you provide an e-mail address? I can't post the hyperlink here because the
system thinks my reply is spam...

By the way, I added the argument "-lponlybeam 7e-15" and the speed is faster.
But the accuracy doesn't look good...

Thank you!

bic-user - 2012-08-16

Use a file-sharing service; it won't be flagged as spam. Which example program
did you use? pocketsphinx_continuous? What changes did you make? Accuracy is
another question. Please also provide the log output from the terminal so I
can reproduce your test.

> and the speed is faster

How long does it take to recognize a sentence?

Terry - 2012-08-16

I use the basic usage example from the website:
http://cmusphinx.sourceforge.net/wiki/tutorialpocketsphinx
(from "initialization" to "decoding a file stream"). My changes have nothing
to do with the recognition, just some input/output code. You can read my
program.

Here are all the files:
https://dl.dropbox.com/u/4064482/sphinx/sphinx_problems.zip
See readme.txt for details of the files.

It takes about 3 or 4 seconds. But the recognition rate over the 3200
sentences is very low... only 4.62%.

thank you!

bic-user - 2012-08-17

3-4 seconds is OK. Sorry, dude, it's hard for me to discuss recognition
accuracy with you. I've never used that model, and I don't know whether it is
good enough. I noticed that you don't specify "-samprate 8000". Your acoustic
model is 8 kHz, and by default PocketSphinx uses 16000, so launch your
executable with this option. Learn more about the PocketSphinx parameters;
some of them can also improve performance. And for the new question
(recognition accuracy), please start a new thread. Cheers.
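
A mismatch between the audio sample rate and the rate the acoustic model expects is easy to check before decoding. The following is a minimal sketch, assuming the test audio is stored as WAV files (headerless raw files carry no rate information, so this check would not apply to them). The helper names are hypothetical; only the `-samprate` flag comes from the thread.

```python
import wave


def wav_sample_rate(path):
    """Return the sample rate in Hz stored in a WAV file header."""
    with wave.open(path, "rb") as wav:
        return wav.getframerate()


def samprate_args(path, model_rate):
    """Return ["-samprate", rate] if the file matches the model rate, else raise."""
    rate = wav_sample_rate(path)
    if rate != model_rate:
        raise ValueError(
            f"{path} is {rate} Hz but the acoustic model expects {model_rate} Hz"
        )
    return ["-samprate", str(model_rate)]
```

For an 8 kHz model, `samprate_args("utterance.wav", 8000)` would return `["-samprate", "8000"]` when the file really is 8 kHz and raise an error otherwise; the file name is a placeholder.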
