CMU Sphinx / Forums / Help: Problem with word-based model ...plz help

anupam - 2010-11-03

Hi,
I am working on a project where my aim is to recognize Indian place names
(I want to recognize a SINGLE WORD AT A TIME).

1) I am training my AM (acoustic model) on Fedora Linux and integrating it with a
Java application on Windows.

2) I am using sphinxbase, SphinxTrain, and sphinx3 (all nightly versions).

3) I read and followed the instructions at
http://cmusphinx.sourceforge.net/wiki/tutorialam and
used word-dependent phones, as the tutorial suggests.

4) The training went well, but there are a few points I would like to mention.

4a) In MODULE: 50 Training Context dependent models,
at Phase 3: Forward-Backward, I get this line:
This step had 780 ERROR messages and 0 WARNING messages. Please check the log
file for details.

The status is nevertheless reported as "COMPLETED".

5) Similarly, at the decoding step I see the following:

Decoding 4 segments starting at 0 (part 1 of 1)
sphinx3_decode Log File
This step had 4 ERROR messages and 5 WARNING messages. Please check the log
file for details
Again, the status is printed as completed.
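
Since a step can report "COMPLETED" while still logging hundreds of errors, it helps to read the actual messages. The following is a minimal sketch, assuming the logs are plain-text `*.log` files under a directory such as `logdir` and that problem lines contain the literal markers `ERROR` or `WARNING`. The function names are hypothetical helpers, not part of SphinxTrain.

```python
from collections import Counter
from pathlib import Path


def count_log_messages(log_text):
    """Count lines that contain an ERROR or WARNING marker."""
    counts = Counter()
    for line in log_text.splitlines():
        if "ERROR" in line:
            counts["ERROR"] += 1
        elif "WARNING" in line:
            counts["WARNING"] += 1
    return counts


def first_messages(log_text, marker="ERROR", limit=5):
    """Return the first few lines that contain the given marker."""
    hits = [line.strip() for line in log_text.splitlines() if marker in line]
    return hits[:limit]


def scan_logdir(logdir):
    """Print a per-file summary for every log file that contains errors."""
    for path in sorted(Path(logdir).rglob("*.log")):
        text = path.read_text(errors="replace")
        counts = count_log_messages(text)
        if counts["ERROR"]:
            print(f"{path}: {counts['ERROR']} ERROR, {counts['WARNING']} WARNING")
            for line in first_messages(text):
                print("   ", line)
```

Calling `scan_logdir("logdir")` from the training directory would list the files to look at first. The first few error lines usually say more than the totals.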

6) When I integrate the AM with the Java application on Windows (using
NetBeans 6.8), with the corresponding changes in the configuration file, it throws a
NullPointerException saying that the HMM for S_SUCHITRA was not found.
S_SUCHITRA is a phone in my phone list.

I am posting the whole AM jar file here: http://www.mediafire.com/?btwtr6z6bbc52zg

It is named acmt2.jar and contains the following:
acmt2.html
bin
bwaccumdir
etc
falignout
feat
logdir
model_architecture
model_parameters
python
qmanager
result
scripts_pl
trees
wav

Please consider only the last entries in acmt2.html.

7) One more thing: I recorded the wav files at 8 kHz, but when I got an error I
changed the sampling frequency to 16 kHz using GoldWave, and the training then
completed successfully.
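
A quick way to confirm which sampling rate the training files actually have is to read their headers. This is a minimal sketch, assuming the recordings are standard PCM WAV files (the standard-library `wave` module does not read raw or NIST Sphere audio) stored under a directory such as `wav`. The helper names and the 16000 Hz default are illustrative assumptions.

```python
import wave
from pathlib import Path


def wav_sample_rate(path):
    """Return the sampling rate in Hz stored in a PCM WAV header."""
    with wave.open(str(path), "rb") as w:
        return w.getframerate()


def find_mismatched(wav_dir, expected_rate=16000):
    """List (path, rate) pairs for files whose rate differs from expected_rate."""
    mismatched = []
    for path in sorted(Path(wav_dir).rglob("*.wav")):
        rate = wav_sample_rate(path)
        if rate != expected_rate:
            mismatched.append((str(path), rate))
    return mismatched
```

`find_mismatched("wav")` returns an empty list when every file matches. Note that upsampling an 8 kHz recording to 16 kHz changes the rate in the header but cannot restore frequency content above 4 kHz.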

Please bear with me, I am new to this. If I am posting my question in the wrong way, let me know.
If more details are needed, please let me know.
Please help me.
Regards,
Anupam

Nickolay V. Shmyrev - 2010-11-03

You don't have enough data to train. You seem to have ignored the tutorial:

When you need to train

You want to create an acoustic model for a new language/dialect
OR you need a specialized model for a small-vocabulary application
AND you have plenty of data to train on:
1 hour of recordings for command and control for a single speaker
5 hours of recordings of 200 speakers for command and control for many speakers
10 hours of recordings for single-speaker dictation
50 hours of recordings of 200 speakers for many-speaker dictation
AND you have knowledge of the phonetic structure of the language
AND you have time to train the model and optimize parameters (1 month)

When you don't need to train

You need to improve accuracy - do acoustic model adaptation instead
You don't have enough data - do acoustic model adaptation instead
You don't have enough time
You don't have enough experience

anupam - 2010-11-03

Thank you very much, but can you please suggest an approach that would let me
recognize one word at a time when the words are NOT ENGLISH WORDS?

Thanks once again.
Regards,
Anupam

Nickolay V. Shmyrev - 2010-11-03

The approach is the same; you just need more training data: at least 1 hour of
recordings.
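
To see how far a data set is from that 1-hour mark, the total audio duration can be summed from the file headers. This is a minimal sketch, assuming the recordings are standard PCM WAV files under one directory. The function names and the `required_hours` parameter are hypothetical, and the 1.0 default simply mirrors the figure quoted above.

```python
import wave
from pathlib import Path


def wav_duration_seconds(path):
    """Duration of one PCM WAV file: frame count divided by frame rate."""
    with wave.open(str(path), "rb") as w:
        return w.getnframes() / w.getframerate()


def total_hours(wav_dir):
    """Total duration, in hours, of all *.wav files under wav_dir."""
    seconds = sum(wav_duration_seconds(p) for p in Path(wav_dir).rglob("*.wav"))
    return seconds / 3600.0


def enough_data(wav_dir, required_hours=1.0):
    """True if the directory holds at least required_hours of audio."""
    return total_hours(wav_dir) >= required_hours
```

For example, `total_hours("wav")` reports the amount of audio in the `wav` directory, and `enough_data("wav")` compares it against the 1-hour minimum.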

anupam - 2010-11-03

Thank you very much :D
