Hello,
I'm training an Arabic acoustic model using sphinxbase, pocketsphinx, and sphinxtrain. The data set is about 10 hours of recordings: 415 sentences in total (367 training, 48 testing), recorded by 40 speakers (20 male and 20 female) and covering 663 words.
This is the output of MODULE: 00 verify training files
Phase 1: Checking to see if the dict and filler dict agrees with the phonelist file.
Found 2485 words using 44 phones
passed
Phase 2: Checking to make sure there are not duplicate entries in the dictionary
passed
Phase 3: Check general format for the fileids file; utterance length (must be positive); files exist
passed
Phase 4: Checking number of lines in the transcript file should match lines in fileids file
passed
Phase 5: Determine amount of training data, see if n_tied_states seems reasonable.
Estimated Total Hours Training: 10.6419638888889
Rule of thumb suggests 3000, however there is no correct answer
WARNING
Phase 6: Checking that all the words in the transcript are in the dictionary
Words in dictionary: 2482
Words in filler dictionary: 3
passed
Phase 7: Checking that all the phones in the transcript are in the phonelist, and all phones in the phonelist appear at least once
passed
After setting up the database, I changed the configuration file to vary the number of tied states (senones) and the number of densities: starting with senones = 100 and densities = 2, 4, 8, ... 64, then senones = 150 and densities = 2, 4, ... 64, and so on up to senones = 500 and beyond, to find the combination that gives the best accuracy. The best accuracy was at senones = 100 and densities = 128 (Recognition Rate (%) = 96.87, Word Error Rate (WER) = 3.37). Is this result acceptable? How do I choose the number of senones and densities relative to the number of hours of training data?
I attached a Word document containing the results of all combinations.
Thank you.
Last edit: safia hammad 2015-07-25
Yes
It is covered in the tutorial:
http://cmusphinx.sourceforge.net/wiki/tutorialam#configure_model_type_and_model_parameters
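The sweep described in the question (try each senone count with densities 2, 4, 8, ... and keep the combination with the lowest WER) could be organised roughly like this. It is only a sketch: the function names are made up for illustration, the WER values have to come from your own decoding runs, and it assumes the two settings being varied are `$CFG_N_TIED_STATES` and `$CFG_FINAL_NUM_DENSITIES` in sphinx_train.cfg.

```python
from itertools import product


def density_steps(max_densities):
    """Powers of two from 2 up to max_densities (2, 4, 8, ...)."""
    steps = []
    d = 2
    while d <= max_densities:
        steps.append(d)
        d *= 2
    return steps


def build_grid(senone_values, max_densities):
    """All (senones, densities) pairs to train and decode (hypothetical helper)."""
    return list(product(senone_values, density_steps(max_densities)))


def best_setting(results):
    """Pick the pair with the lowest WER.

    `results` maps (senones, densities) -> WER in percent, measured on the
    test set. Ties are broken in favour of the smaller model
    (senones * densities), which is an assumption of this sketch.
    """
    return min(results, key=lambda k: (results[k], k[0] * k[1]))
```

To use it, train and decode once per pair returned by `build_grid`, record each WER in a dict keyed by `(senones, densities)`, and pass that dict to `best_setting`. With only 48 test sentences, small differences in WER between neighbouring settings may not be meaningful, so it is worth comparing the chosen setting against the range suggested in the tutorial.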