Dear all,
I wrote the following grammar:
<s> = <hi> <names>;
<hi> = hi | hello;
<names> = gary | mary | sophie | tony | scott;
and converted it to the following text-format FST:
0 1 hi hi
0 1 hello hello
1 2 gary gary
1 2 mary mary
1 2 sophie sophie
1 2 tony tony
1 2 scott scott
2 0
I used the following command to convert the text FST to binary format:
cat text.fst | eps2disambig.pl | s2eps.pl | fstcompile --isymbols=isyms.txt --osymbols=osyms.txt --keep_isymbols=false --keep_osymbols=false | fstrmepsilon > G.fst
I drew a picture of the resulting G.fst:
http://i.imgur.com/s1F5CCL.png
Then I followed kaldi/egs/yesno to prepare the input files lexicon.txt and lexicon_nosil.txt.
Before calling run.sh to generate HCLG.fst, I put my own G.fst in place instead of building one from an ARPA file.
My lexicon.txt:
<SIL> SIL
hi HH AY1
hello HH AH0 L OW1
gary G EH1 R IY0
mary M EH1 R IY0
sophie S OW1 F IY0
tony T OW1 N IY0
scott S K AA1 T
When decoding, if I say "hi sophie", I get the answer "gary sophie".
But my grammar does not allow that word sequence. Is something wrong with my G.fst, or is the problem somewhere else?
Thanks.
Last edit: gary 2015-07-21
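The claim that the grammar has no path for "gary sophie" can be checked against the text FST from the post without any OpenFst tools. The following is only a sketch: the function name `accepts` is illustrative, and it assumes the text format shown in the post (arc lines `src dst input output`, final-state lines `state [weight]`), a start state equal to the source state of the first line, and at most one arc per state and input word.

```shell
#!/usr/bin/env bash
# Sketch: walk a text-format FST by input label and report whether a word
# sequence ends in a final state. Assumes the FST is deterministic on inputs.
accepts() {
  local fst=$1; shift            # $1: text FST file; remaining args: words
  awk -v sentence="$*" '
    NR == 1 { start = $1 }                  # start state = source of first line
    NF >= 4 { arc[$1, $3] = $2; next }      # arc: src dst input output [weight]
    NF == 1 || NF == 2 { final[$1] = 1 }    # final state: state [weight]
    END {
      n = split(sentence, w, " ")
      s = start
      for (i = 1; i <= n; i++) {
        # No arc for this word from the current state: not in the grammar.
        if (!((s, w[i]) in arc)) { print "rejected"; exit }
        s = arc[s, w[i]]
      }
      print ((s in final) ? "accepted" : "rejected")
    }
  ' "$fst"
}
```

With the FST from the post saved as text.fst, `accepts text.fst hi sophie` prints `accepted` and `accepts text.fst gary sophie` prints `rejected`. That agrees with the poster's reading: the grammar itself does not license the wrong output, so the problem has to come from a later stage.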
Everything looks right in what you described. Possibly there was a
mismatch in a words.txt somewhere, or maybe you somehow ended up decoding
using a different G.fst than what you described in your post.
Dan
On Mon, Jul 20, 2015 at 6:58 PM, gary gary2015@users.sf.net wrote:
Dear Dan
Thank you very much.
I solved this problem.
The cause was as you said: a mismatch in words.txt between L.fst and G.fst.
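For anyone hitting the same symptom, a mismatch of this kind can be spotted by comparing the symbol table passed to fstcompile with the words.txt used for L.fst. The thread does not show the actual tables, so this is a sketch under assumptions: the function name `compare_symtabs` is made up here, both files are assumed to be in the usual `symbol integer-id` format with one entry per line, and the first table is assumed to be non-empty.

```shell
#!/usr/bin/env bash
# Sketch: compare two symbol tables ("symbol id" per line) and print every
# symbol whose id differs or that appears in only one of the tables.
compare_symtabs() {
  # $1: table used when compiling G.fst (isyms.txt in the post)
  # $2: table used for L.fst (words.txt in the lang directory)
  awk '
    NF < 2 { next }                         # skip blank or malformed lines
    NR == FNR { a[$1] = $2; next }          # first file: remember symbol -> id
    {
      seen[$1] = 1
      if (!($1 in a))       print $1 ": only in second table (id " $2 ")"
      else if (a[$1] != $2) print $1 ": id " a[$1] " vs " $2
    }
    END {
      for (s in a)
        if (!(s in seen)) print s ": only in first table (id " a[s] ")"
    }
  ' "$1" "$2" | LC_ALL=C sort
}
```

As a hypothetical example, if isyms.txt maps `hi 1` and `gary 3` while words.txt maps `gary 1` and `hi 3`, then `compare_symtabs isyms.txt words.txt` prints `gary: id 3 vs 1` and `hi: id 1 vs 3`. In that situation the arc that G.fst stores for "hi" would be read back as "gary" through the other table, which is one way a result like "gary sophie" could appear. No output means the two tables agree.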