Welcome to Open Discussion
Hi Daniel, this is Rohit. I have replicated the rm/s4 scripts on a database in my local spoken language and got some unbelievable results:
1. I first trained the whole system with SphinxTrain; the WER is 26.8 (on all approaches, including VTLN+MLLR).
2. The same experiment conducted with HTK gives a WER of 23.4.
3. The same experiment done with the rm/s4 scripts gives a maximum WER of 6.4 and a minimum of 1.49.
Is this a valid result? I also tried another language, and that gave similarly good results. Why does Kaldi give so much better results than the others?
It's good that it's better but I would not expect it to be that much better. Are you sure that you are scoring correctly and that you didn't include your testing data in your training data?
Dan
Actually those WER numbers you quote sound to me like the numbers from the actual RM data. There must be some mixup.
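One way to rule out a scoring mixup is to recompute WER independently of any toolkit and compare. Below is a minimal sketch, assuming reference and hypothesis transcripts are already available as word lists; the function name is hypothetical and this is not the scoring code of Kaldi, HTK or Sphinx.

```python
def word_error_rate(ref, hyp):
    """Return WER in percent: (substitutions + deletions + insertions) / len(ref)."""
    if not ref:
        raise ValueError("reference must contain at least one word")
    # Word-level Levenshtein distance, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # match or substitution
        prev = cur
    return 100.0 * prev[-1] / len(ref)
```

If the three toolkits are scored with the same function on the same test transcripts, any remaining gap comes from the recognizers rather than from the scoring.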
I didn't use the RM data; I prepared data from a local language. I'm sure that there is no testing data inside the training data, and only 900 words of the testing data (4151 words in full) also occur in the training data.
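The overlap described here can be checked mechanically from the transcripts. The following is a rough sketch, assuming each transcript is a plain string of space-separated words; the function and key names are made up for illustration. It only compares text, so it cannot detect the same audio appearing in both sets under different transcripts.

```python
def overlap_report(train_utts, test_utts):
    """Count test utterances and test vocabulary that also appear in training."""
    # Normalize whitespace so that formatting differences do not hide duplicates.
    train_sents = {" ".join(u.split()) for u in train_utts}
    test_sents = [" ".join(u.split()) for u in test_utts]
    train_vocab = {w for u in train_utts for w in u.split()}
    test_vocab = {w for u in test_utts for w in u.split()}
    return {
        "test_utts_seen_in_train": sum(1 for s in test_sents if s in train_sents),
        "test_vocab": len(test_vocab),
        "test_vocab_seen_in_train": len(test_vocab & train_vocab),
    }
```

A high count of whole test utterances seen in training would be more worrying than shared vocabulary, which is expected for a small-vocabulary task.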
I got a 25% absolute WER reduction with SGMM on some Hindi data. So it's not totally impossible with SGMM. GMM results were in the same ballpark.
Hi nobody (I can't find your name here),
can you share the statistics of your data?
I would like to ask how to generate recognition results with time boundaries (segmentation points). Thank you.
The "alignment" (.ali) output of the non-lattice-generating decoders has word-boundary information in it but you need to do some work to get the actual word boundaries.
There is probably a program like ali-to-ctm that gets the word boundaries for you. One of the example scripts has a script score_sclite.sh which will generate ctm's, you can look at that script for an example. Probably this one works with lattices rather than alignments though.
Dan
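To illustrate the "some work" involved, here is a sketch of the last step only: turning a per-frame word labelling into CTM-style lines. It assumes you have already mapped each frame of the alignment to a word label (that mapping is the hard part and is not shown), a 10 ms frame shift, and `<eps>` as the label for frames belonging to no word. The function name is hypothetical and this is not a Kaldi program.

```python
from itertools import groupby

def frames_to_ctm(utt_id, frame_words, frame_shift=0.01, silence="<eps>"):
    """Convert per-frame word labels into '<utt> <channel> <start> <dur> <word>' lines."""
    lines = []
    frame = 0
    # Each run of identical labels is treated as one word occurrence.
    # Limitation: the same word spoken twice with no gap is merged into one entry.
    for word, run in groupby(frame_words):
        n_frames = len(list(run))
        if word != silence:
            start = frame * frame_shift
            dur = n_frames * frame_shift
            lines.append(f"{utt_id} 1 {start:.2f} {dur:.2f} {word}")
        frame += n_frames
    return lines
```

For example, three silence frames followed by five frames of "bat" yield a single entry starting at 0.03 s with duration 0.05 s.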
Reply to Dan:
Thank you, I have read these scripts. Now I would like to build a phone decoder. There are 12 basic phones in total, not counting sil. This is my lexicon.txt:
bat bat
bila bila
bn bn
corl corl
fat fat
fn fn
labi labi
mat mat
none none
radl radl
When I compile the graph, the following error occurs:
FATAL: FstCompiler: Symbol "#2" is not mapped to any integer arc ilabel, symbol table = data/phones_disambig.txt, source = standard input, line = 6
ERROR: FstHeader::Read: Bad FST header: standard input
What is happening? Thank you.
My phones_disambig.txt is:
<eps> 0
bat 1
bila 2
bn 3
corl 4
fat 5
fn 6
labi 7
mat 8
none 9
radl 10
sil 11
#1 12
You'll have to look in the script and see where #2 got involved, and why. This disambiguation symbol does not exist in your setup but possibly some file you provided had it in for some reason and some script was expecting to see it.
Dan
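A quick way to find where such a symbol comes from is to scan the text files fed to the FST compiler for `#N` tokens that the symbol table does not define. This is a small sketch under the assumption that the symbol table has one `symbol integer` pair per line, as in the listing above; both function names are hypothetical.

```python
def load_symbols(table_text):
    """Collect symbol names from 'symbol integer' lines of a symbol table."""
    symbols = set()
    for line in table_text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            symbols.add(parts[0])
    return symbols

def unmapped_disambig(symbols, text_lines):
    """Map each '#N' token missing from the table to the line numbers using it."""
    missing = {}
    for lineno, line in enumerate(text_lines, start=1):
        for tok in line.split():
            if tok.startswith("#") and tok[1:].isdigit() and tok not in symbols:
                missing.setdefault(tok, []).append(lineno)
    return missing
```

Running this over each intermediate file the graph-compilation script produces should show which one first mentions `#2`, which narrows down the script step that introduced it.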
Reply to Dan:
Thank you very much. I have solved this problem. It was caused by the multi-pronunciation word definitions: there are at least 2 multi-pronunciations in Kaldi, but my lexicon.txt did not have any multiple pronunciations. Actually, I think it is a bug in Kaldi. Do you think so?
I don't know what you mean by "there are at least two multi-pronunciation in Kaldi". The question is, which file is it taking the symbol #2 from and how did it get there?
Dan
Reply to Dan:
You are right! I have solved this problem. The problem was in a script. I checked and revised that script file, and now it is OK!
Thank you, Dan!