I did some training for an Arabic acoustic model using 73 hours of training data.
My dictionary contains 17,000 words.
The transcription contains 100,000 words.
Senones: 3000
Gaussians: 16
The obtained likelihood is about 10.2.
When I decode the same training data, I get 20% WER and 55% SER.
What do you think? Is that OK?
Is the training data enough for 17,000 words in the dictionary?
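For reference, WER is usually computed as word-level edit distance divided by the reference length, while SER counts any utterance containing at least one word error, which is why SER is always much higher than WER. A minimal sketch of both metrics (plain Python; the function names are mine, not from any Sphinx tool):

def word_error_rate(ref, hyp):
    """WER = (substitutions + deletions + insertions) / number of
    reference words, via word-level Levenshtein distance."""
    r, h = ref.split(), hyp.split()
    # d[i][j] = edit distance between first i ref words and first j hyp words
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[len(r)][len(h)] / len(r)

def sentence_error_rate(refs, hyps):
    """SER = fraction of utterances with at least one word error."""
    wrong = sum(1 for ref, hyp in zip(refs, hyps) if ref.split() != hyp.split())
    return wrong / len(refs)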
Mostly fine. I'd use 4000 senones; the rest should be tuned for this particular database. Things like forced alignment, VTLN, or a proper language weight should give a few more percent.
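For reference, those knobs are set in SphinxTrain's etc/sphinx_train.cfg. A sketch of the relevant entries, as I recall the variable names from SphinxTrain; the values here are only illustrative starting points, not tuned for this database:

$CFG_N_TIED_STATES = 4000;        # number of senones
$CFG_FINAL_NUM_DENSITIES = 16;    # Gaussians per senone
$CFG_FORCEDALIGN = 'yes';         # force-align transcripts before final training
$CFG_VTLN = 'yes';                # vocal tract length normalization
$DEC_CFG_LANGUAGEWEIGHT = "10";   # language weight used by the decoding step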
I did some training for Arabic, but with a total of just 1.15328803418804 hours of training data, and my dictionary contains just 26 words.
But I get some errors which I don't know where they come from, like:
ERROR: "........\src\libs\libmodinv\gauden.c", line 1700: var (mgau= 117, feat= 0, density=0, component=35) < 0
Do you have any idea about this kind of error? If I have this error, will my model not work with sphinx4?
Thank you for any reply or response.
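That error typically means a Gaussian density ended up with almost no data behind it during reestimation: with roughly an hour of audio, many densities see so few frames that the variance estimate degenerates, and floating-point cancellation in var = E[x^2] - mean^2 can even push it below zero, which is the condition gauden.c is reporting. A minimal one-dimensional sketch of the idea (plain Python, not SphinxTrain code; the varfloor value is only illustrative):

def reestimate_gaussian(weights, frames, varfloor=1e-4):
    """Diagonal-covariance reestimation as in Baum-Welch:
    mean = E[x], var = E[x^2] - mean^2, clamped to a variance
    floor so a starved density cannot yield var <= 0."""
    total = sum(weights)  # occupancy count for this density
    mean = sum(w * x for w, x in zip(weights, frames)) / total
    ex2 = sum(w * x * x for w, x in zip(weights, frames)) / total
    var = ex2 - mean * mean  # cancellation can make this <= 0
    if var < varfloor:       # variance floor: the standard guard
        var = varfloor
    return mean, var

# A density that effectively sees one repeated value has zero variance
# until the floor kicks in:
print(reestimate_gaussian([1.0, 1.0], [1e6, 1e6]))  # -> (1000000.0, 0.0001)

A model trained through such warnings will usually still load, since the variances get floored, but accuracy will suffer badly with this little data; the more practical fix is to reduce the number of senones and densities per senone rather than to touch the floor.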
This forum has a search function. I encourage you to use it:
https://sourceforge.net/forum/message.php?msg_id=6292641