n_tied_num.mdef and alltriphones.mdef for reading was not found

Speech Recognition Toolkit

Brought to you by: air, arthchan2003, awb, bhiksha, and 5 others

Forum: Help

Creator: Anonymous

Created: 2017-03-25

Updated: 2017-03-25

Anonymous - 2017-03-25

While training an acoustic model for my native language (Amharic) with SphinxTrain-5prealpha, a fatal error occurred at the tree pruning stage and while training the context-dependent models.

The training is supposed to create these files by itself. Is there a configuration setting that is required, or one that I have missed?

FYI: the training audio is about 19 hours, with WAVFILE_SRATE set to 16000.

See this link for the sphinx_train.cfg file.

See the attachments for the error and log file in PNG format.

Last edit: Anonymous 2017-03-25

- Nickolay V. Shmyrev - 2017-03-25
  
  You need to share the whole model training folder.
  

Anonymous - 2017-03-26

Hi Nickolay, here is the model training folder: https://www.dropbox.com/s/u64ajzwpvipibtd/amharic.zip?dl=0

Thanks in advance for your help.

Last edit: Anonymous 2017-03-26

- Nickolay V. Shmyrev - 2017-03-26
  
  You are using syllable units to train the acoustic model. While that is possible, training such a model needs a lot of memory, which you probably don't have on your machine.
  
  You need to rework your phonetic dictionary to use real phones, not syllables.
  
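To make the memory point concrete, here is a minimal Python sketch (not part of SphinxTrain) that counts the distinct pronunciation units in a dictionary and computes the upper bound on triphone contexts, which grows as the cube of the unit count. It assumes the usual `WORD UNIT1 UNIT2 ...` dictionary layout; the function names and the sample entries are hypothetical.

```python
def count_units(dict_lines):
    """Return the set of distinct pronunciation units in dictionary lines.

    Each line is assumed to look like: WORD UNIT1 UNIT2 ...
    """
    units = set()
    for line in dict_lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        units.update(fields[1:])
    return units


def max_triphones(n_units):
    """Upper bound on (left, base, right) contexts: n_units cubed."""
    return n_units ** 3


# Hypothetical sample entries, for illustration only.
sample = [
    "selam s e l a m",
    "bet b e t",
]
units = count_units(sample)
print(len(units), max_triphones(len(units)))
```

With about 40 phones the bound is 64,000 contexts, while a few hundred syllable-sized units push it into the tens of millions. Only a fraction of those contexts occur in real data, but the gap suggests why a syllable-based unit set is so much more expensive to train.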

Anonymous - 2017-03-27

Was my problem caused by not having enough memory? I now have 6 GB, and I don't know what you mean by a lot.

I tried to solve the problem by copying untied.mdef and renaming the copy to alltriphones.mdef, and by copying ci.mdef and renaming the copy to n_tied_num.mdef. The originals still exist. I re-ran the training and it finished without any errors, but I got:
WORD ERROR RATE: 42%
SENTENCE ERROR RATE: 94%

Is there any problem with copying and renaming the above mdef files?

Last edit: Anonymous 2017-03-27

- Nickolay V. Shmyrev - 2017-03-27
  
  > Is there any problem with copying and renaming the above mdef files?
  
  Yes, you get a context-independent system instead of a context-dependent one.
  
  > Was that my problem because I don't have enough memory?
  
  Your problem is the wrong phoneset.
  

Anonymous - 2017-03-30

Hi Nickolay, thanks for the heads-up. Of course, the error was in the dictionary: it contained words that do not appear in the training transcription. I had to remove all of those words from the dictionary. Now the mdef file is created successfully.
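The cleanup described above could be sketched in Python roughly as follows. This is an illustration, not the script actually used in this thread: it assumes transcription lines of the form `<s> word1 word2 </s> (utt_id)` and dictionary lines of the form `WORD UNIT1 UNIT2 ...`, and the function names and sample data are hypothetical.

```python
import re


def transcript_words(transcript_lines):
    """Collect the words used in transcription lines.

    Lines are assumed to look like: <s> word1 word2 </s> (utt_id)
    """
    words = set()
    for line in transcript_lines:
        line = re.sub(r"\([^()]*\)\s*$", "", line)  # drop trailing (utt_id)
        for token in line.split():
            if token not in ("<s>", "</s>"):
                words.add(token)
    return words


def filter_dictionary(dict_lines, words):
    """Keep only dictionary entries whose word occurs in `words`.

    Alternate pronunciations such as word(2) are matched by their base word.
    """
    kept = []
    for line in dict_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        base = re.sub(r"\(\d+\)$", "", fields[0])
        if base in words:
            kept.append(line)
    return kept


# Hypothetical sample data, for illustration only.
transcript = ["<s> selam bet </s> (utt_001)"]
dictionary = [
    "selam s e l a m",
    "bet b e t",
    "bet(2) b ee t",
    "wuha w u h a",
]
filtered = filter_dictionary(dictionary, transcript_words(transcript))
print("\n".join(filtered))
```

Writing the filtered entries to a new file and keeping the original dictionary makes the change easy to undo.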

And Nickolay, I will try to fix the phoneset by using real phones. Is there any resource that would guide me through creating a phoneme set for a language?

Last edit: Anonymous 2017-03-30

