Hi~
I am using sphinx v3.7 on Linux.
I want to try allphone mode, but the result is always "SIL".
If I use the default mode, the result is OK.
But if I add " -mode allphone ", the result is like:
Backtrace(arctic_0006)
FV:arctic_0006> WORD SFrm EFrm AScr(UnNorm) LMScore AScr+LScr AScale
fv:arctic_0006> SIL 0 134 4057135 0 4057135 5829650
FV:arctic_0006> TOTAL 4057135 0
Here is how I use sphinx3_decode:
sphinx3_decode \
-lm ./my_dic/alarm.sent.arpabo.DMP \
-dict ./my_dic/alarm.dic \
-fdict ./my_dic/lm_giga_5k_nvp.sphinx.filler \
-mdef /usr/src/sphinx/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd/mdef \
-mean /usr/src/sphinx/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd/means \
-var /usr/src/sphinx/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd/variances \
-mixw /usr/src/sphinx/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd/mixture_weights \
-tmat /usr/src/sphinx/sphinx3/model/hmm/hub4_cd_continuous_8gau_1s_c_d_dd/transition_matrices \
-ctl a \
-cepdir /usr/src/sphinx/adapt \
-mode allphone
Could someone please tell me what I am missing?
Thank you very much.
In allphone mode the language model should contain phones:
Language model created by QuickLM on Fri Jun 5 00:20:32 MSD 2009
Copyright (c) 1996-2002
Carnegie Mellon University and Alexander I. Rudnicky
This model based on a corpus of 105 sentences and 31 words
The (fixed) discount mass is 0.5
\data\
ngram 1=31
ngram 2=131
ngram 3=184
\1-grams:
-2.1690 AA -0.2526
-1.3450 AE -0.2537
-1.1276 AH -0.1877
-2.3450 AO -0.2948
-3.1232 AW -0.2813
-2.5211 AY -0.2813
-1.0979 B -0.1708
-1.3523 D -0.1890
-1.9471 EH -0.2332
-1.8010 ER -0.1791
-1.6760 EY -0.1936
-3.1232 F -0.2941
-2.8222 G -0.2918
Thank you, Nickolay.
Would you please tell me how to make a LM just like yours?
I have tried QuickLM http://www.speech.cs.cmu.edu/tools/lm.html
I input some sentences and got a result like this:
Language model created by QuickLM on Sun Jun 7 22:09:00 EDT 2009
Carnegie Mellon University (c) 1996
This model based on a corpus of 2 sentences and 26 words
The (fixed) discount mass is 0.5
\data\
ngram 1=22
ngram 2=23
ngram 3=22
\1-grams:
-1.7160 Because -0.2840
-1.7160 This -0.2926
-1.4150 a -0.2840
-1.7160 based -0.2926
-1.7160 be -0.2926
I don't know how to make the LM contain phones.
I have also tried the allphone mode of Sphinx v3.6. It doesn't need an input LM,
and it worked just fine.
> Would you please tell me how to make a LM just like yours?
Create a text with phonetic transcription:
<s> W AH N </s>
then use QuickLM to generate a language model from it. You can also get one in an4:
sphinx3/model/lm/an4/an4.tg.phone.arpa
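That expansion step is easy to script. Here is a rough sketch in Python (the dictionary below is a tiny made-up fragment; in real use take the pronunciations from cmudict or your own .dic file), producing lines you can feed straight into QuickLM:

```python
# Sketch: expand word sentences into phone-level transcripts for QuickLM.
# DICT is a hypothetical fragment; load real pronunciations from cmudict.
DICT = {
    "ONE": "W AH N",
    "TWO": "T UW",
    "THREE": "TH R IY",
}

def to_phone_transcript(sentence):
    """Replace each word with its phone string and wrap in <s> ... </s>."""
    phones = " ".join(DICT[w] for w in sentence.upper().split())
    return "<s> %s </s>" % phones

for line in ["one two", "three one"]:
    print(to_phone_transcript(line))
    # -> <s> W AH N T UW </s>
    # -> <s> TH R IY W AH N </s>
```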
> I have also tried allphone of Sphinx v3.6. It doesn't need to input LM.
> And it worked just fine.
It also works just fine if you don't pass -lm at all. But to increase accuracy you need a phone LM, and if you do pass -lm, it must be a proper LM containing phones.
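A quick way to check whether an LM is a phone LM is to look at its unigram vocabulary. A small sketch that pulls the tokens out of the \1-grams: section of an ARPA file (plain-text ARPA format assumed, like the models shown above):

```python
# Sketch: list the unigram vocabulary of an ARPA LM so you can check
# that it contains phones (AA, AE, B, ...) rather than words.
def arpa_unigrams(arpa_text):
    """Return the tokens from the \\1-grams: section of an ARPA model."""
    vocab, in_unigrams = [], False
    for line in arpa_text.splitlines():
        line = line.strip()
        if line == "\\1-grams:":
            in_unigrams = True
            continue
        if in_unigrams and line.startswith("\\"):  # \2-grams: or \end\
            break
        if in_unigrams and line:
            fields = line.split()      # prob, token[, backoff]
            if len(fields) >= 2:
                vocab.append(fields[1])
    return vocab

sample = """\\data\\
ngram 1=3
\\1-grams:
-2.1690 AA -0.2526
-1.3450 AE -0.2537
-1.0979 B -0.1708
\\end\\
"""
print(arpa_unigrams(sample))  # ['AA', 'AE', 'B']
```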
I have tried the LM of sphinx3/model/lm/an4/an4.tg.phone.arpa, and it really works.
Thanks a lot.
I want to build a system to check the accuracy of a user's pronunciation.
For example, when the user says "apple", I want to find out whether he said
"AE P AH L" or "AA P AH L".
Is allphone mode applicable to this job, or should I find another way?
Thank you.
> Is the allphone mode applicable to this job, or I should find another way?
Not quite, it's better to use forced alignment and include all possible variants in the dictionary.
Thank you very much.
I will try forced alignment.
It would be very nice if you could tell me where I can find detailed information about "forced alignment".
Use sphinx3_align. It takes a sentence and a dictionary and checks which of the variants suggested in the dictionary the spoken recording matches. Say you have two variants:
HELLO H AH L OW
HELLO(2) H OH L OW
After the sphinx3_align run you get the actual result:
HELLO(2)
For more information on how to run it and what forced alignment is, please read the docs or google it.
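If you want to process the alignment output automatically, splitting the variant marker off a word is a one-liner. A sketch (the HELLO(2) notation follows the dictionary convention above; check the actual format in your output transcript):

```python
# Sketch: pull the chosen pronunciation variant out of an aligned word.
# sphinx3_align marks alternate pronunciations like HELLO(2); a bare
# word means the first variant was used.
import re

def split_variant(word):
    """'HELLO(2)' -> ('HELLO', 2); 'HELLO' -> ('HELLO', 1)."""
    m = re.match(r"^(.+?)\((\d+)\)$", word)
    if m:
        return m.group(1), int(m.group(2))
    return word, 1

print(split_variant("HELLO(2)"))  # ('HELLO', 2)
print(split_variant("HELLO"))     # ('HELLO', 1)
```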
I got it. I'll study that.
Thank you.