error, when run perl ./scripts_pl/50.cd_hmm_

Forum: Help

Creator: nguyen duy nam

Created: 2012-02-04

Updated: 2012-09-22

nguyen duy nam - 2012-02-04

Hi, I have a problem when I run perl ./scripts_pl/50.cd_hmm_tied/slave_convg.pl.
When I look in the logdir at 50.cd_hmm_tied/words.1.1-1.bw, I see:
"model_def_io.c", line 436: Unable to open
/Application/Data/speechtotext/demo/mothai/model_architecture/words.1000.mdef
for reading; No such file or directory
I don't know why the file words.1000.mdef is missing.
Please help me.

Nickolay V. Shmyrev - 2012-02-04

Please check logs for details. Not just the last log, but previous logs too.
That's the common rule for training, see troubleshooting section in

http://cmusphinx.sourceforge.net/wiki/tutorialam
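
A minimal sketch of the "check all logs, not just the last one" advice: it walks a log directory in sorted order and lists every line that looks like an error, so the earliest failure is reported first. The `logdir` location, the `.log` suffix, and the marker strings are assumptions for illustration; adjust them to what your training run actually writes.

```python
from pathlib import Path

# Substrings treated as error markers (assumed; not an official list).
MARKERS = ("FATAL", "ERROR", "Unable to open")


def find_errors(logdir, suffix=".log"):
    """Return (path, line_number, text) for every suspicious log line.

    Files are visited in sorted path order, so numbered stage directories
    are reported from the earliest stage to the latest.
    """
    hits = []
    for path in sorted(Path(logdir).rglob("*" + suffix)):
        with open(path, errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if any(marker in line for marker in MARKERS):
                    hits.append((str(path), number, line.rstrip()))
    return hits


if __name__ == "__main__":
    for path, number, text in find_errors("logdir"):
        print(f"{path}:{number}: {text}")
```

A missing file in a late stage (such as the .mdef file above) is often the consequence of an earlier stage failing, which is why the first hit is usually the one to read.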

nguyen duy nam - 2012-02-04

Thanks, I got past it, and training completed 100% OK with 39 audio files, each
7 s long. But when I run pocketsphinx_continuous
it recognizes incorrectly: I say a word and the program returns a different word.
Can you give me some advice?
Can you tell me when to use words.cd_cont_1000 and when to use
words.cd_cont_1000_2, words.cd_cont_1000_1 ...? When training finished, it
generated these in the folder model_parameters:

words.cd_cont_1000
words.cd_cont_1000_1
words.cd_cont_1000_2
words.cd_cont_1000_4
words.cd_cont_1000_8
words.cd_cont_initial
words.cd_cont_untied
words.ci_cont
words.ci_cont_flatinitial

Thanks

nguyen duy nam - 2012-02-04

Can you test it for me? My training folder:
http://www.mediafire.com/?9lip6lgx59u7f57

nguyen duy nam - 2012-02-05

Hi Nickolay,
Can you test it and give me some advice?
Can you also tell me how to train one word with many (e.g. 10) speakers, 10
times each? Do I understand correctly that I need to set up the fileids file like this:

speaker_1/file_1
speaker_2/file_2
speaker_3/file_3
speaker_4/file_4
speaker_5/file_5
speaker_6/file_6
speaker_7/file_7
speaker_8/file_8
speaker_9/file_9
speaker_10/file_10

and the transcription file like this:

ONE file_1
ONE file_2
ONE file_3
ONE file_4
ONE file_5
ONE file_6
ONE file_7
ONE file_8
ONE file_9
ONE file_10

and the dic file has only one word, "ONE O N E". Sorry if I am wrong; please
correct me.
Thanks
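
For comparison, a sketch that writes the two control files for this one-word, ten-speaker case. It assumes the layout shown in the tutorial linked in this thread, where each transcription line wraps the words in `<s>` and `</s>` and ends with the utterance id in parentheses. The `words_train` file names and the `make_control_files` helper are hypothetical; check the tutorial for the exact naming your setup expects.

```python
from pathlib import Path


def make_control_files(utterances, out_dir, db_name="words"):
    """Write <db_name>_train.fileids and <db_name>_train.transcription.

    utterances: list of (relative_path_without_extension, text) pairs,
    for example ("speaker_1/file_1", "ONE").
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    fileids = []
    transcripts = []
    for rel_path, text in utterances:
        utt_id = rel_path.split("/")[-1]  # last path component, e.g. file_1
        fileids.append(rel_path)
        transcripts.append(f"<s> {text} </s> ({utt_id})")
    (out / f"{db_name}_train.fileids").write_text("\n".join(fileids) + "\n")
    (out / f"{db_name}_train.transcription").write_text(
        "\n".join(transcripts) + "\n"
    )
    return fileids, transcripts


# Ten speakers, one recording of the word ONE each (as in the post above).
utterances = [(f"speaker_{i}/file_{i}", "ONE") for i in range(1, 11)]
```

Calling `make_control_files(utterances, "etc")` would write both files under `etc/`. Ten repetitions per speaker would need 100 entries with distinct file names rather than the ten shown here.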

Nickolay V. Shmyrev - 2012-02-05

Sorry, in your database there are so many things done wrong that I really
suspect you missed the tutorial

http://cmusphinx.sourceforge.net/wiki/tutorialam

If you go there and read the tutorial text carefully you could get it done
right. Now it's all plain wrong.

nguyen duy nam - 2012-02-05

Thanks for your reply. Yes, the tutorial
http://cmusphinx.sourceforge.net/wiki/tutorialam has many things that are
not clear to me. I will read it again to understand them.

nguyen duy nam - 2012-02-05

Hi Nickolay,
If my audio database has no test audio, do I still need the files test.fileids and
test.transcript?

nguyen duy nam - 2012-02-07

Hi Nickolay,
I read the tutorial text again, and I think I have set things up correctly, but when I run
pocketsphinx_continuous -hmm model_parameters/words.cd_cont_untied -lm etc/words.lm.DMP -dict etc/words.dic
it still returns wrong results. Can you test it again for me?
My training folder:
http://www.mediafire.com/?d1d1m838z9xlkql
I record the audio in Audacity (mono, 16 000 Hz, 16-bit PCM) and save the files as
Microsoft WAV 16 bit.
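
The recording format stated here (mono, 16 000 Hz, 16-bit PCM) can be verified on the saved files rather than trusted from the export dialog. A minimal sketch using Python's standard `wave` module; the `check_wav` helper is hypothetical, and the expected values are simply the ones quoted in this post.

```python
import wave


def check_wav(path, rate=16000, channels=1, sample_width=2):
    """Return a list of mismatches between the file and the expected format.

    sample_width is in bytes, so 2 means 16-bit samples.
    An empty list means the file matches.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != channels:
            problems.append(
                f"channels: {wav.getnchannels()} (expected {channels})"
            )
        if wav.getframerate() != rate:
            problems.append(
                f"sample rate: {wav.getframerate()} Hz (expected {rate})"
            )
        if wav.getsampwidth() != sample_width:
            problems.append(
                f"sample width: {wav.getsampwidth()} bytes (expected {sample_width})"
            )
    return problems
```

Running `check_wav` over every file listed in the fileids file would catch a recording that was exported with a different rate or in stereo.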

nguyen duy nam - 2012-02-07

I don't know why the error rate is 100%:

./scripts_pl/decode/slave.pl
MODULE: DECODE Decoding using models previously trained
Decoding 2 segments starting at 0 (part 1 of 1)
0%
Aligning results to find error rate
SENTENCE ERROR: 100.0% (2/2) WORD ERROR RATE: 100.0% (2/2)
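
To make the numbers in this report concrete, here is a sketch of how a word error rate can be computed from reference and hypothesis transcripts with word-level edit distance. This is the generic textbook definition, (substitutions + deletions + insertions) divided by the number of reference words, and is not taken from the SphinxTrain alignment script; the function names are hypothetical.

```python
def word_errors(reference, hypothesis):
    """Minimum number of word substitutions, deletions and insertions."""
    ref = reference.split()
    hyp = hypothesis.split()
    # previous[j] holds the distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,         # deletion (reference word missed)
                current[j - 1] + 1,      # insertion (extra hypothesis word)
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def word_error_rate(pairs):
    """pairs: list of (reference, hypothesis) strings. Returns WER in percent."""
    errors = sum(word_errors(ref, hyp) for ref, hyp in pairs)
    words = sum(len(ref.split()) for ref, _ in pairs)
    return 100.0 * errors / words
```

With two single-word test utterances that are both recognized as the wrong word, this gives 2 errors over 2 reference words, which is the 100.0% (2/2) shape seen in the log.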

nguyen duy nam - 2012-02-07

Please help me.

Nickolay V. Shmyrev - 2012-02-08

Well, now it's better, but you missed the first paragraph of the tutorial:

When you need to train

You want to create an acoustic model for new language/dialect
OR you need specialized model for small vocabulary application
AND you have plenty of data to train on:
    1 hour of recording for command and control for single speaker
    5 hours of recordings of 200 speakers for command and control for many speakers
    10 hours of recordings for single speaker dictation
    50 hours of recordings of 200 speakers for many speakers dictation
AND you have knowledge on phonetic structure of the language
AND you have time to train the model and optimize parameters (1 month)
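
Applied to the numbers given earlier in this thread (39 files of about 7 s each), the gap to the 1-hour guideline for single-speaker command and control can be worked out directly. A small sketch; it assumes every file really is 7 s long.

```python
FILES = 39
SECONDS_PER_FILE = 7           # as stated in the earlier post; assumed uniform
TARGET_SECONDS = 1 * 60 * 60   # 1 hour: single speaker, command and control

total_seconds = FILES * SECONDS_PER_FILE
fraction = total_seconds / TARGET_SECONDS

print(f"recorded: {total_seconds} s ({total_seconds / 60:.2f} min)")
print(f"share of the 1-hour guideline: {fraction:.1%}")
```

That is 273 s, about 4.5 minutes, or roughly 7.6% of the recommended amount.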

nguyen duy nam - 2012-02-09

OK, thank you very much.
I knew about that. The training folder I sent you was only to test whether my setup
is right or wrong. I am now recording the database.
Thanks

nguyen duy nam - 2012-02-09

Hi Nickolay,
1. Can you tell me when to use words.cd_cont_1000 and when to use words.cd_cont_1000_2, words.cd_cont_1000_1 ...? When training finished, it generated these in the folder model_parameters:

words.cd_cont_1000
words.cd_cont_1000_1
words.cd_cont_1000_2
words.cd_cont_1000_4
words.cd_cont_1000_8
words.cd_cont_initial
words.cd_cont_untied
words.ci_cont
words.ci_cont_flatinitial

2. Another question: how can I decrease the SENTENCE ERROR and WORD ERROR RATE?
Thanks

Nickolay V. Shmyrev - 2012-02-09

> Can you tell me when to use words.cd_cont_1000 and when to use
> words.cd_cont_1000_2, words.cd_cont_1000_1 ...? When training finished, it
> generated these in the folder model_parameters:

See http://cmusphinx.sourceforge.net/wiki/tutorialam#using_the_model
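
As a hedged illustration of what that section covers: the usual reading is that the folder named with only the senone count (`words.cd_cont_1000` here) holds the final model, while the `_1`, `_2`, `_4` and `_8` folders are intermediate stages. Treat that as an assumption and confirm it against the tutorial. The sketch below only assembles the `pocketsphinx_continuous` command line used earlier in this thread with that folder; the `build_decode_command` helper is hypothetical.

```python
import shlex


def build_decode_command(db_name="words", senones=1000):
    """Assemble the decoder command with the (assumed) final model folder."""
    hmm_dir = f"model_parameters/{db_name}.cd_cont_{senones}"
    return [
        "pocketsphinx_continuous",
        "-hmm", hmm_dir,
        "-lm", f"etc/{db_name}.lm.DMP",
        "-dict", f"etc/{db_name}.dic",
    ]


command = build_decode_command()
print(shlex.join(command))
```

The list form can be passed to `subprocess.run(command)` from the training directory if you prefer to launch the decoder from Python.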

> Another question: how can I decrease the SENTENCE ERROR and WORD ERROR RATE?

Train better model

nguyen duy nam - 2012-02-10

Thanks.

> Train better model

Which parameters should I set to get a better model?

Nickolay V. Shmyrev - 2012-02-10

> Which parameters should I set to get a better model?

Parameters are described in the tutorial. A model becomes better mainly because of
bigger and more carefully annotated data, not because of parameters.
