I was going to train an acoustic model using the Librispeech data and was wondering if there are any sample scripts that anyone already has anywhere.
I also wanted to know if it's possible to use Kaldi-generated models in Sphinx, and if so, how? Can they be used even on a mobile device using Pocketsphinx, or are they too large for that purpose?
Thanks,
Hitesh
Hi Nickolay,
Thanks for the info.
I was able to set up most of the files required to run training, but I ran into problems with the dictionary. During the verification step after feature extraction, I get warnings such as the following:
WARNING: This word: BOZZLE was in the transcript file, but is not in the dictionary ( MISSUS BOZZLE WHO WELL UNDERSTOOD THAT BUSINESS WAS BUSINESS AND THAT WIVES WERE NOT BUSINESS FELT NO ANGER AT THIS AND HANDED HER HUSBAND HIS BEST COAT ). Do cases match?
I tried using the librispeech-lexicon from http://www.openslr.org/11/, but noticed that it doesn't contain a lot of the words that are actually there in the transcriptions. This seems to not allow the training to proceed.
Also, this contains phones such as AH0, AH1, etc. which are not there in the default phone set. Would it be a better idea to add these to the phoneset or could I convert all of them to their root phones, such as AH here, in the dictionary itself?
I also tried using the dictionary and phoneset from http://svn.code.sf.net/p/cmusphinx/code/trunk/cmudict/sphinxdict/ but that errors out as well.
Thanks.
I tried using the librispeech-lexicon from http://www.openslr.org/11/, but noticed that it doesn't contain a lot of the words that are actually there in the transcriptions. This seems to not allow the training to proceed.
You can create the missing pronunciations with a g2p (grapheme-to-phoneme) tool.
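Before running g2p it helps to know exactly which words are missing. Here is a minimal sketch for that, not something taken from sphinxtrain itself. It assumes the transcript lines look like `<s> WORDS </s> (utterance_id)` and the dictionary lines look like `WORD PH1 PH2 ...` with alternate pronunciations marked as `WORD(2)`. The function names are made up for illustration, and the comparison is case-sensitive, which matches the "Do cases match?" hint in the warning.

```python
import re


def dictionary_words(dict_lines):
    """Collect head words from 'WORD PH1 PH2 ...' lines, dropping '(2)'-style variant suffixes."""
    words = set()
    for line in dict_lines:
        parts = line.split()
        if parts:
            words.add(re.sub(r"\(\d+\)$", "", parts[0]))
    return words


def missing_words(transcript_lines, dict_lines):
    """Return the sorted transcript words that have no dictionary entry (case-sensitive)."""
    known = dictionary_words(dict_lines)
    missing = set()
    for line in transcript_lines:
        # Assumed format: drop a trailing "(utterance_id)" if there is one.
        text = re.sub(r"\([^()]*\)\s*$", "", line)
        for token in text.split():
            if token in ("<s>", "</s>"):
                continue  # sentence markers are not dictionary words
            if token not in known:
                missing.add(token)
    return sorted(missing)


if __name__ == "__main__":
    transcript = ["<s> MISSUS BOZZLE WHO WELL UNDERSTOOD </s> (utt_0001)"]
    lexicon = ["MISSUS M IH S IH Z", "WHO HH UW", "WELL W EH L", "UNDERSTOOD AH N D ER S T UH D"]
    print(missing_words(transcript, lexicon))
```

The resulting list can be written one word per line and passed to whichever g2p tool you use; the generated pronunciations then get appended to the dictionary.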
Also, this contains phones such as AH0, AH1, etc. which are not there in the default phone set. Would it be a better idea to add these to the phoneset or could I convert all of them to their root phones, such as AH here, in the dictionary itself?
There is no such thing as a "default phone set". You create the phone set based on the dictionary you decide to use.
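Both options from the question can be sketched in a few lines. This is only an illustration under assumptions: the dictionary is taken to be `WORD PH1 PH2 ...` per line, the function names are hypothetical, and `SIL` is added to the phone list because, as far as I know, sphinxtrain expects the silence phone there. `strip_stress` maps stress-marked phones such as AH0 and AH1 to AH; `phone_set` derives the phone list from whichever dictionary you end up using.

```python
import re


def strip_stress(dict_lines):
    """Yield dictionary lines with trailing stress digits removed from each phone (AH0 -> AH)."""
    for line in dict_lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        word, phones = parts[0], parts[1:]
        phones = [re.sub(r"\d+$", "", p) for p in phones]
        yield " ".join([word] + phones)


def phone_set(dict_lines):
    """Return the sorted set of phones used in the dictionary, plus SIL (assumed to be required)."""
    phones = {"SIL"}
    for line in dict_lines:
        phones.update(line.split()[1:])
    return sorted(phones)


if __name__ == "__main__":
    lexicon = ["ABOUT AH0 B AW1 T", "BUSINESS B IH1 Z N AH0 S"]
    stripped = list(strip_stress(lexicon))
    print("\n".join(stripped))
    print("\n".join(phone_set(stripped)))  # one phone per line, as in a .phone file
```

Note that after stripping stress, pronunciation variants that differed only in stress become identical, so you may want to deduplicate the dictionary afterwards.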
I was going to train an acoustic model using the Librispeech data and was wondering if there are any sample scripts that anyone already has anywhere.
sphinxtrain uses the same scripts for everything; you just need to prepare the data in the proper format.
I also wanted to know if it's possible to use Kaldi-generated models in Sphinx, and if so, how?
No.
Can they be used even on a mobile device using Pocketsphinx, or are they too large for that purpose?
Not now.
Is there a way I can ignore words not in the dictionary or treat them as OOV?
No