If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
I am so sorry, forgot to mention that i am doing it for Indian English accent. I have used both MLLR and MAP adaptation and created 'en-us-adapt' folder using MAP.
I didn't get Indian English dictionary thats why i am using UK English phonetic words for dictionary and trying to append it with US English dictionary. can you tell me how do i train Indian English model? from where? i am not able to get it. I will add more data for adaptation.
Thank you for your response.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Thank you so much for your response. Due to time constraint I will send you the transcribed Indian English data soon. Can you share the all steps which you will use to train the model.? Actually i need to know that for future use. how i can thank you. I appreciate all of your help.
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Hi Nickolay,
On Immediate basis i am attaching a sample data file of Indian English accent. How much data you need to train the model for better accuracy? let me know i will provide you the large data asap.
Kindly go through the link. https://drive.google.com/drive/folders/0B_IHphmLx3m7Zk9ZdXpkVXpnUWM?usp=sharing
There are two files of same data. One is in simple text format and the other is in transcription file format i.e.
Also i have attached some wav files for the testing purpose.If you need more files then please let me know.
Thanking you..
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Thank you for the rhelp. i have read in this tutorial : http://cmusphinx.sourceforge.net/wiki/tutorialam
there are no. of hours along with the no. of speakers define to train the model. maximum hours 50 is given. I am just clearing my doubt Once you train the model from 100-200 hours of transcribed data. will it be generic model? I am sorry to bother you. i just want to learn this from you and want to clear my all doubts. At present, i dont have so much data but i will provide you soon.
Thank you very much..
If you would like to refer to this comment somewhere else in this project, copy and paste the following link:
Hi Nickolay,
I am following the CMUSphinx tutorial and have added and replaced some words in the existing dictionary. For example:
1 W AH N
10 W AH N Z IY R OW
11 W AH N W AH N
11TH W AH N W AH N T IY EY CH
12 W AH N T UW
19 W AH N AY N
1939 W AH N AY N TH R IY N AY N
1946 W AH N AY N F OW R S IH K S
1984 W AH N AY N EY T F OW R
1985 W AH N AY N EY T F AY V
1988 W AH N AY N EY T EY T
1989 W AH N AY N EY T N AY N
1ST W AH N EH S T IY
2 T UW
25 T UW F AY V
3 TH R IY
39 TH R IY N AY N
4 F OW R
7000 S EH V AH N Z IY R OW Z IY R OW Z IY R OW
72 S EH V AH N T UW
7500 S EH V AH N F AY V Z IY R OW Z IY R OW
8 EY T
9 N AY N
I have replaced '8' with 'EIGHT'.
output before replacement : YOU TO ALL IS THAT IS THAT IS THE MOTOR USE AUTOS YOU TO AS BY THE LORD WHO VEHICLE ACT. THE PROVISIONS OF CHAPTER 8 THAT IS REGARDING THE TP THE THE SHORT IS WAS IN THE EFFECT DEAL WITH EFFECT FROM JULY AMENDMENT IN WHAT IS THE THAT STILL TALKING LORD THE MOTOR VEHICLE THAT 19 IN
output after replacement : YOU TO ALL IS THAT IS THAT IS THE MOTOR USE AUTOS YOU TO AS BY THE LORD WHO VEHICLE ACT. THE PROVISIONS OF CHAPTER THE THAT IS REGARDING THE TP THE THE SHORT IS WAS IN THE EFFECT DEAL WITH EFFECT FROM JULY AMENDMENT IN WHAT IS THE THAT STILL TALKING LORD THE MOTOR VEHICLE THAT 19 IN
In the second output, 'EIGHT' should appear instead of 'THE'. I am unable to add or replace words in the dictionary.
Thank you
You need to update the language model, not just the dictionary.
Hi Nickolay,
Thank you so much for the reply.
I have updated the language model, replaced '8' with 'EIGHT', and tested. It's working now.
Output: YOU TO ALL IS THAT IS THAT IS THE MOTOR USE AUTOS YOU TO AS BY THE LORD WHO VEHICLE ACT
THE PROVISIONS OF CHAPTER EIGHT THAT IS REGARDING THE TP THE THE SHORT IS WAS IN THE EFFECT DEAL WITH EFFECT FROM JULY AMENDMENT IN WHAT IS THE THAT STILL TALKING LORD THE MOTOR VEHICLE THAT 19 IN
I checked with one word, but what if I have multiple words to add to the dictionary? How do I update the language model with those words?
Thank you
It is covered in http://cmusphinx.sourceforge.net/wiki/tutoriallmadvanced
For numbers it is recommended to postprocess the text output, not insert numbers in the language model.
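A minimal sketch of such postprocessing, assuming the recognizer outputs spoken digit words (ONE, NINE, ...) and that numbers are read digit by digit, as in the dictionary entries earlier in this thread. The function name and the word table are illustrative and not part of CMUSphinx.

```python
# Hypothetical postprocessing step; names are illustrative, not a CMUSphinx API.
DIGIT_WORDS = {
    "ZERO": "0", "ONE": "1", "TWO": "2", "THREE": "3", "FOUR": "4",
    "FIVE": "5", "SIX": "6", "SEVEN": "7", "EIGHT": "8", "NINE": "9",
}


def digits_from_words(hypothesis: str) -> str:
    """Replace each run of consecutive digit words with one digit string."""
    out = []
    run = []  # digits collected from the current run of digit words
    for token in hypothesis.split():
        digit = DIGIT_WORDS.get(token.upper())
        if digit is not None:
            run.append(digit)
            continue
        if run:
            # A non-digit word ends the run, so flush it as a single number.
            out.append("".join(run))
            run = []
        out.append(token)
    if run:
        out.append("".join(run))
    return " ".join(out)


print(digits_from_words("THE PROVISIONS OF CHAPTER EIGHT THAT IS"))
print(digits_from_words("WITH EFFECT FROM JULY ONE NINE FOUR SIX"))
```

This simple rule also rewrites ONE when it is used as a pronoun, so a real postprocessor would need more context than this sketch uses.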
Thanks for the reply.
Earlier I used the LM tool to create a language model. For testing purposes I created an LM from 20 sentences. Can I use the LM tool instead of the SRILM tool for a large-scale language model?
Thank you
No
Hi Nickolay,
Thank you for your response. I have installed the SRILM tool and used a command -
Output: a new language model called 'newlm.lm' was generated.
If a word, e.g. 'arms', is in the new language model but not in the dictionary, can I add it to the dictionary directly with one line, e.g. arms AA R M Z, or do I have to do something else to update it?
Thank you
Last edit: Nickolay V. Shmyrev 2016-11-06
Yes you can
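To find which words need such a line, one option is to compare the language model vocabulary with the dictionary. The sketch below assumes an ARPA-format text LM (unigrams listed under `\1-grams:`) and a dictionary with one `word PH1 PH2 ...` entry per line; the function names are hypothetical, and the comparison is case-sensitive.

```python
import re


def arpa_unigrams(lm_lines):
    """Collect the words listed in the \\1-grams: section of an ARPA LM."""
    words = set()
    in_unigrams = False
    for line in lm_lines:
        line = line.strip()
        if line.startswith("\\"):
            # Section markers look like \data\, \1-grams:, \2-grams:, \end\
            in_unigrams = line == "\\1-grams:"
            continue
        if in_unigrams and line:
            fields = line.split()
            if len(fields) >= 2:
                words.add(fields[1])  # fields: logprob, word, optional backoff
    return words


def dictionary_words(dict_lines):
    """Collect the headwords of a 'word PH1 PH2 ...' dictionary."""
    words = set()
    for line in dict_lines:
        fields = line.split()
        if fields:
            # Alternative pronunciations are written as word(2), word(3), ...
            words.add(re.sub(r"\(\d+\)$", "", fields[0]))
    return words


def missing_from_dictionary(lm_lines, dict_lines):
    """Return LM words that have no dictionary entry, ignoring LM markers."""
    markers = {"<s>", "</s>", "<unk>"}
    return sorted(arpa_unigrams(lm_lines) - dictionary_words(dict_lines) - markers)
```

For real files, pass the open file objects, for example `missing_from_dictionary(open("newlm.lm"), open("cmudict-en-us.dict"))`, and then add a pronunciation line for each word it reports.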
Hi, I am so sorry to bother you. I am not familiar with this.
Earlier I used the online LM tool and created both the language model and the dictionary at the same time. http://www.speech.cs.cmu.edu/tools/lmtool-new.html
Now I am using the SRILM tool for a large-scale language model. I used the above 'ngram-count' command to create the language model. Is it possible to create the phonetic dictionary using SRILM as well?
Thank you so much.
No
You can use https://github.com/cmusphinx/g2p-seq2seq instead.
Okay, thanks. I will use 'g2p-seq2seq' for the dictionary.
For a better, large-scale language model, where can I get the text?
Thank you
Crawl the web
Hey,
Thank you for your response.
For testing purposes, I created a language model using the SRILM tool, and right now I am using the cmudict-en-us.dict dictionary. I followed the CMUSphinx tutorial and trained the acoustic model.
I used this command to recognize the audio file:
pocketsphinx_continuous -hmm en-us-adapt -lm chapter.lm.bin -dict cmudict-en-us.dict -mllr mllr_matrix -infile Chapter_2_2.wav
Sentence in transcription:
you to all these factors practice of motor insurance is influenced by the motor vehicle act the provisions of chapter 8 that is regarding the TP insurance was made effective with effect from July 1946 still talking about the motor vehicles act of 19
Output by using the above command:
to the law this that you is that is award various autos the was why is the motor vehicle act
the provisions of chapter the that is regarding the p p p short is was is that effect deal with effect from to amendment in what is the this to in award the motor vehicle sacrament in
Accuracy is not good. I have added some words to the dictionary that are in the language model.
For example: 8 EY T
19 W AH N AY N
Some words are already in both the dictionary and the language model, but they are still not recognized. I don't know where I am going wrong. Can you please help?
Thank you
To get help on the accuracy you need to provide the data files that you used.
I created the language model with the SRILM tool from a text file, data.txt.
Please find the file attached.
It is better to provide everything as a single archive. You can upload it to Dropbox or Google Drive and post the link here.
Hi Nickolay,
Thanks for the reply. I have shared a link; kindly go through it.
It contains:
- six wav files: Chapter_2_0.wav, Chapter_2_1.wav, Chapter_2_2.wav, ...
- language model: chapter2.lm, chapter2.lm.bin
- dictionary: cmudict-en-us.dict
- transcription and fileids: motor89.transcription, motor89.fileids
- files generated while adapting the acoustic model: gauden_count, mixw_counts, mllr_matrix, tmat_counts
- en-us-adapt folder
I have trained and tested on 89 wav files; here I am attaching 6 of them. If you need the other files as well, please let me know and I will upload them. Once again, thank you very much for the help.
https://drive.google.com/drive/folders/0B_IHphmLx3m7SFB5MUV3YTBXbkk?usp=sharing
MLLR adaptation is not effective for a PTM model; you need to use MAP adaptation. You also need to test the adaptation accuracy as described in the adaptation tutorial.
Ideally you also need to use much more adaptation data. You would even have to train an Indian English model to get good accuracy.
A special dictionary has to be designed for Indian English for training too. For example, if you replace
8 EY T
with
8 IY T
it will recognize your sample better.
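Accuracy on a test set is usually reported as word error rate (WER). The sketch below is a generic word-level edit distance, not the script from the adaptation tutorial; the function name is an assumption, and it lowercases both strings before comparing.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = cur[j - 1] + 1  # extra word in hypothesis
            cur.append(min(substitution, deletion, insertion))
        prev = cur
    return prev[-1] / len(ref)


print(word_error_rate("the provisions of chapter eight",
                      "the provisions of chapter the"))
```

Running it over the same test transcriptions before and after adaptation shows whether the adapted model actually helps.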
I am so sorry, I forgot to mention that I am doing this for an Indian English accent. I have used both MLLR and MAP adaptation, and created the 'en-us-adapt' folder using MAP.
I could not find an Indian English dictionary, which is why I am using UK English pronunciations and trying to append them to the US English dictionary. Can you tell me how to train an Indian English model, and where to get one? I am not able to find it. I will add more data for adaptation.
Thank you for your response.
Acoustic model training tutorial is here:
http://cmusphinx.sourceforge.net/wiki/tutorialam
You can also provide a sufficient amount of transcribed Indian English data, and I'll train the model for you.
Thank you so much for your response. Due to time constraints, I will send you the transcribed Indian English data soon. Can you share all the steps you will use to train the model? I need to know them for future use. How can I thank you? I appreciate all of your help.
Hi Nickolay,
For now I am attaching a sample data file of Indian English accent. How much data do you need to train the model for better accuracy? Let me know and I will provide more data as soon as possible.
Kindly go through the link.
https://drive.google.com/drive/folders/0B_IHphmLx3m7Zk9ZdXpkVXpnUWM?usp=sharing
There are two files with the same data: one in plain text format and the other in transcription file format, i.e.
I have also attached some wav files for testing purposes. If you need more files, please let me know.
Thank you.
The total duration of the data you provided is just 3 minutes. You need to provide 100-200 hours of transcribed data to train the model.
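The total duration of a data set can be checked before sending it. A minimal sketch using Python's standard `wave` module, assuming uncompressed PCM `.wav` files in one folder; the function name and the `wav` folder in the usage note are illustrative.

```python
import wave
from pathlib import Path


def total_wav_seconds(folder) -> float:
    """Sum the durations, in seconds, of all .wav files directly in folder."""
    total = 0.0
    for path in sorted(Path(folder).glob("*.wav")):
        with wave.open(str(path), "rb") as wav:
            # duration = number of audio frames / frames per second
            total += wav.getnframes() / wav.getframerate()
    return total
```

For example, `total_wav_seconds("wav") / 3600` gives the amount of audio in hours, which can be compared with the 100-200 hours mentioned above.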
Thank you for the help. I have read this tutorial: http://cmusphinx.sourceforge.net/wiki/tutorialam
It defines the number of hours and the number of speakers needed to train a model, and the maximum given there is 50 hours. Just to clear my doubt: once you train the model on 100-200 hours of transcribed data, will it be a generic model? I am sorry to bother you; I just want to learn this from you and clear all my doubts. At present I don't have that much data, but I will provide it soon.
Thank you very much.
Generic models are trained with 1000-10000 hours of data. Google trains with 6 lakh (600,000) hours; those are generic.