First of all, thanks for your developments; they are awesome.
I'm planning to implement an ASR application that does the following:
- Control with a button when to start and when to stop listening.
- Show a sentence to say and check whether it is repeated correctly. If it is, show it (in green, for example); if it is not (how to make this decision is still to be determined), show it in red and let the user repeat the operation.
I'll need a DLL (built with pocketsphinx and sphinxbase, I guess) to build a C# plugin
where the GUI is handled.
I want the application to work with both Spanish and English, and there will
initially be around 100 sentences.
I have read how to build a language model and I tried it with a few sentences and the
online tool. That's a very easy way to create the language model and I think
it may be enough for my purpose. Does this work if I upload a corpus in
Spanish (or any other language)?
About the acoustic model: can I use a generic acoustic model? I know I can
adapt it later, but can I use a good acoustic model to test my language
models? Where can I find one for Spanish and one for English? Can I use those at
https://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/
directly?
And as for the dictionary, the best option is to use the one that corresponds to the
language model, right?
Thanks again for your great work,
Regards.
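A minimal sketch of the green/red check described in this post, assuming the recognizer returns its hypothesis as a plain string. The function names `normalize` and `check_sentence` are hypothetical and not part of pocketsphinx; the comparison is a simple word-level match, which is only one possible way to make the decision.

```python
import difflib
import re


def normalize(text):
    """Lowercase and split into words, dropping punctuation (accented letters are kept)."""
    return re.sub(r"[^\w\s']", " ", text.lower()).split()


def check_sentence(expected, recognized):
    """Return ("green", []) when the word sequences match exactly,
    otherwise ("red", expected_words_that_were_not_matched)."""
    exp, rec = normalize(expected), normalize(recognized)
    if exp == rec:
        return "green", []
    matcher = difflib.SequenceMatcher(None, exp, rec)
    missing = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        # "replace" and "delete" cover expected words absent from the hypothesis
        if tag in ("replace", "delete"):
            missing.extend(exp[i1:i2])
    return "red", missing


if __name__ == "__main__":
    print(check_sentence("¿Dónde está el camión?", "dónde está el camión"))
    print(check_sentence("The weather is nice today", "the weather nice today"))
```

The list of unmatched words could be used by the GUI to highlight what went wrong before the user repeats the sentence.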
First of all, thanks for your developments; they are awesome.
You are welcome
I'm planning to implement an ASR application that does the following:
- Control with a button when to start and when to stop listening.
- Show a sentence to say and check whether it is repeated correctly. If it is, show it (in green, for example); if it is not (how to make this decision is still to be determined), show it in red and let the user repeat the operation.
Sounds like a great application; however, you might want to check the FAQ
section before you start on the design.
About the acoustic model, can I use some generic acoustic model? I know I
can adapt that later, but can I use some good acoustic model to test my
language models?
And as for the dictionary, the best option is to use the one that corresponds to the
language model, right?
There is no dependency between the dictionary and the language model. For most models
the dictionary is provided with the acoustic model. That is the case for Spanish.
So, are the LM and the dictionary independent of the acoustic model or not?
No, they are not independent. The set of words in the language model must match the set of
words in the dictionary. The set of phones in the dictionary must match the set of phones in
the acoustic model.
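The matching rule in this reply can be checked mechanically. The following is a minimal sketch, assuming a pronunciation dictionary with one `WORD PH1 PH2 ...` entry per line and alternate pronunciations written as `WORD(2)`. The function names, the sample words and the phone sets are hypothetical illustrations, not taken from any real model.

```python
import re


def parse_dictionary(text):
    """Parse 'WORD PH1 PH2 ...' lines into {word: [phones]}.
    Alternate entries such as 'WORD(2)' are merged into 'WORD'."""
    entries = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        word = re.sub(r"\(\d+\)$", "", parts[0])
        entries.setdefault(word, []).extend(parts[1:])
    return entries


def check_consistency(lm_words, dictionary, model_phones):
    """Return (LM words missing from the dictionary,
    dictionary phones unknown to the acoustic model)."""
    missing_words = sorted(w for w in set(lm_words) if w not in dictionary)
    used_phones = {p for phones in dictionary.values() for p in phones}
    unknown_phones = sorted(used_phones - set(model_phones))
    return missing_words, unknown_phones


if __name__ == "__main__":
    spanish_phones = {"a", "i", "k", "l", "m", "n", "o"}  # hypothetical phone set
    good = parse_dictionary("hola o l a\ncamión k a m i o n\n")
    print(check_consistency(["hola", "camión", "pingüino"], good, spanish_phones))
    # A dictionary written with a different phone set fails the second check
    mismatched = parse_dictionary("hola HH OW L AH\n")
    print(check_consistency(["hola"], mismatched, spanish_phones))
```

A non-empty second list is the situation to look for when a dictionary generated for one language is paired with an acoustic model for another.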
What am I doing wrong?
You are doing almost everything correctly. The error message says that it fails to
record audio from the microphone. Your input is probably muted in the volume
settings.
Anonymous - 2011-07-29
Yes, it was a microphone problem. Thanks :D
If I want to build an LM in Spanish, I guess the online LMTool is not valid, is
it? Do I have to use text2wfreq, text2idngram, idngram2lm and
sphinx_lm_convert? Can I use them on Windows? I'll try tomorrow.
I downloaded the Voxforge Spanish model but I cannot see the dictionary you told
me was included. Is it that noisedict file?
Thanks once again and best regards.
Do I have to use text2wfreq, text2idngram, idngram2lm and sphinx_lm_convert?
Yes
Can I use them in Windows? I'll try tomorrow.
You can.
but I cannot see the dictionary you told me was included.
etc/voxforge_es_sphinx.dic
Anonymous - 2011-08-02
I found the Spanish dictionary and also the Voxforge English dictionary
(cmudict.07a). I tried the previous example with the English dictionary and it works, but
when I try the following example:
-agc none none
-agcthresh 2.0 2.000000e+000
-alpha 0.97 9.700000e-001
-ceplen 13 13
-cmn current current
-cmninit 8.0 8.0
-dither no yes
-doublebw no no
-feat 1s_c_d_dd 1s_c_d_dd
-frate 100 100
-input_endian little little
-lda ....\model\hmm\es\voxforge-es-0.1.1\model_parameters\voxforge_es_sphinx.cd_cont_1500/feature_transform
-ldadim 0 0
-lifter 0 0
-logspec no no
-lowerf 133.33334 2.000000e+002
-ncep 13 13
-nfft 512 256
-nfilt 40 32
-remove_dc no no
-round_filters yes yes
-samprate 16000 1.600000e+004
-seed -1 -1
-smoothspec no no
-svspec
-transform legacy legacy
-unit_area yes yes
-upperf 6855.4976 3.500000e+003
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wlen 0.025625 2.560000e-002
INFO: acmod.c(238): Parsed model-specific feature parameters from ....\model\hmm\es\voxforge-es-0.1.1\model_parameters\voxforge_es_sphinx.cd_cont_1500/feat.params
ERROR: "fe_interface.c", line 100: FFT: Number of points must be greater or equal to frame size (409 samples)
On the other hand, I'm trying to build a statistical language model with
text2wfreq, text2idngram, idngram2lm and sphinx_lm_convert. I downloaded
pocketsphinx 0.7 (Windows) and I found sphinx_lm_convert in the sphinxbase
project, but I cannot find the others. Do I have to download a different
package? Where can I find them?
Anonymous - 2011-08-02
OK, samprate solved the problem :)
I downloaded cmuclmtk 0.7 but it doesn't work (Win XP). I couldn't open
pocketsphinx 0.7 (Visual Studio 2008), so I compiled the previous snapshot (0.6).
The error is that it cannot find MSVCR100.dll :?
So I decided to download the previous version, and:
text2wfreq < weather.txt | wfreq2vocab > weather.tmp.vocab
text2wfreq : Reading text from standard input...
wfreq2vocab : Will generate a vocabulary containing the most frequent 20000 words. Reading wfreq stream from stdin...
text2wfreq : Done.
wfreq2vocab : Done.
But then I cannot generate the ARPA-format LM because "text2idngram -vocab
weather.vocab -idngram weather.idngram < weather.closed.txt" doesn't find some
file (I think).
If I look at what was generated by text2wfreq, I can only see the
weather.tmp.vocab file. I also changed that name to weather.vocab, but
the idngram and closed files are missing.
Last question: I tried LMTool with Spanish, and I think that the .dic it
generated cannot be used with that LM. Or at least it doesn't work for me,
maybe because the words are pronounced as if they were English? Using the Voxforge
dictionary works well :)
The error is that it doesn't find MSVCR100.dll ... :?
This dll is part of VS 2010
But then I cannot generate the arpa format LM because "text2idngram -vocab
weather.vocab -idngram weather.idngram < weather.closed.txt" doesn't find some
file (i think).
text2idngram.exe -vocab weather.tmp.vocab < weather.txt > weather.idngram
text2idngram
Vocab : weather.tmp.vocab
N-gram buffer size : 100
Hash table size : 2000000
Temp directory : /usr/tmp/
Max open files : 20
FOF size : 10
n : 3
Initialising hash table...
Reading vocabulary...
Allocating memory for the n-gram buffer...
Reading text into the n-gram buffer...
20,000 n-grams processed for each ".", 1,000,000 for each line.
Sorting n-grams...
Writing sorted n-grams to temporary file C:\DOCUME~1\enne\CONFIG~1\Temp\text2idngram.temp.21
Merging 1 temporary files...
2-grams occurring:   N times   > N times   Sug. -spec_num value
        0                          75          85
        1               69          6          16
        2                5          1          11
        3                1          0          10
        4                0          0          10
        5                0          0          10
        6                0          0          10
        7                0          0          10
        8                0          0          10
        9                0          0          10
       10                0          0          10
3-grams occurring:   N times   > N times   Sug. -spec_num value
        0                          80          90
        1               78          2          12
        2                2          0          10
        3                0          0          10
        4                0          0          10
        5                0          0          10
        6                0          0          10
        7                0          0          10
        8                0          0          10
        9                0          0          10
       10                0          0          10
text2idngram : Done.
That looks better, but when I try to create the LM:
idngram2lm.exe -vocab_type 0 -idngram weather.idngram -vocab weather.tmp.vocab -arpa weather.arpa
n : 3
Input file : weather.idngram (binary format)
Output files :
ARPA format : weather.arpa
Vocabulary file : weather.tmp.vocab
Cutoffs :
2-gram : 0 3-gram : 0
Vocabulary type : Closed
Minimum unigram count : 0
Zeroton fraction : 1
Counts will be stored in two bytes.
Count table size : 65535
Discounting method : Good-Turing
Discounting ranges :
1-gram : 1 2-gram : 7 3-gram : 7
Memory allocation for tree structure :
Allocate 100 MB of memory, shared equally between all n-gram tables.
Back-off weight storage :
Back-off weights will be stored in four bytes.
Reading vocabulary.
read_wlist_into_siht: a list of 58 words was read from "weather.tmp.vocab".
read_wlist_into_array: a list of 58 words was read from "weather.tmp.vocab".
WARNING: <s> appears as a vocabulary item, but is not labelled as a context cue.
Allocated space for 5000000 2-grams.
Allocated space for 12500000 3-grams.
table_size 59
Allocated 60000000 bytes to table for 2-grams.
Allocated (2+25000000) bytes to table for 3-grams.
Processing id n-gram file.
20,000 n-grams processed for each ".", 1,000,000 for each line.
Error : n-grams are not correctly ordered. Error occurred at ngram 19.
Maybe text2idngram is wrong :?
By the way, will I have problems when I repeat the process in Spanish? Do
you know if accents cause problems? I mean symbols like ´, `, ¨ (e.g.
camión, pingüino)
Regards and thanks :)
Unknown (or unprocessed) command line options:
-idngram weather.idngra
Looks like you are using an obsolete version. Maybe you want to try the latest one.
By the way, will I have some problems when I repeat the process in spanish?
Who knows
Do you know if accents have problems?
Accents are supported
Anonymous - 2011-08-04
Well, as I told you, I'm using the previous version because the latest one needs
MSVCR100.dll
idngram2lm.exe -version
idngram2lm.exe from the CMU-Cambridge SLM Toolkit, Version 3 alpha
Anyway, I think it should be possible to continue with the previous version,
without installing VS2010. It just fails at the last step (idngram2lm).
Or maybe the problem is in text2idngram, as I indicated. I don't know.
Any idea?
Thanks
Anonymous - 2011-08-08
Should I open a new thread with this last question? If so, I'll do it, but
before that I prefer to ask it here...
Thanks :)
Anonymous - 2011-08-11
I finally installed VS 2010 and the tools work.
For sentence recognition, which is the best way to build an LM? I tried the CMU
tools with some sentences (I also tried the weather example with a few sentences) and
the result is that it doesn't recognize any word.
Maybe that is because the corpus is too small to build an LM that way? I tried
a similar LM (a few sentences) built with the online LMTool and it seems to work better.
If I build an LM, which level of recognition can I achieve? I have seen that it is
word-level recognition. I mean, if I build an LM from sentences and I say a
sentence with the words in a different order, it is still recognized. I thought
that sentence-level recognition was possible.
So, what do the <s> and </s> tags do? I read the tutorial, the FAQ and some online
information from different websites, and I think I am missing some basic
information.
Any guidelines? Which kind of LM do I need for my purpose?
Many thanks and best regards.
Hello
You are welcome
Sounds like a great application; however, you might want to check the FAQ
section before you start on the design
http://cmusphinx.sourceforge.net/wiki/faq#qhow_to_implement_pronunciation_evaluation
Please also read the tutorial first
http://cmusphinx.sourceforge.net/wiki/tutorial
You can use other ways to create a language model. See
http://cmusphinx.sourceforge.net/wiki/tutoriallm
Yes
https://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/
Yes
There is no dependency between the dictionary and the language model. For most models
the dictionary is provided with the acoustic model. That is the case for Spanish.
Thanks nshmyrev, very useful information :)
I built an LM with LMTool following the example from
http://cmusphinx.sourceforge.net/wiki/tutoriallm
but if I write:
pocketsphinx_continuous -lm 8521.lm -dict 8521.dic (sic from manual)
I get the error that the hmm is not specified. I tried with:
pocketsphinx_continuous -lm ....\4363\4363.lm -dict ....\4363\4363.dic -hmm ....\model\hmm\en\voxforge-en-0.4\model_parameters\voxforge_en_sphinx.cd_cont_5000
But I get this output:
Current configuration:
-adcdev
-agc none none
-agcthresh 2.0 2.000000e+000
-alpha 0.97 9.700000e-001
-argfile
-ascale 20.0 2.000000e+001
-backtrace no no
-beam 1e-48 1.000000e-048
-bestpath yes yes
-bestpathlw 9.5 9.500000e+000
-bghist no no
-ceplen 13 13
-cmn current current
-cmninit 8.0 8.0
-compallsen no no
-debug 0
-dict ....\4363\4363.dic
-dictcase no no
-dither no no
-doublebw no no
-ds 1 1
-fdict
-feat 1s_c_d_dd 1s_c_d_dd
-featparams
-fillprob 1e-8 1.000000e-008
-frate 100 100
-fsg
-fsgusealtpron yes yes
-fsgusefiller yes yes
-fwdflat yes yes
-fwdflatbeam 1e-64 1.000000e-064
-fwdflatefwid 4 4
-fwdflatlw 8.5 8.500000e+000
-fwdflatsfwin 25 25
-fwdflatwbeam 7e-29 7.000000e-029
-fwdtree yes yes
-hmm ....\model\hmm\en\voxforge-en-0.4\model_parameters\voxforge_en_sphinx.cd_cont_5000
-input_endian little little
-jsgf
-kdmaxbbi -1 -1
-kdmaxdepth 0 0
-kdtree
-latsize 5000 5000
-lda
-ldadim 0 0
-lextreedump 0 0
-lifter 0 0
-lm ....\4363\4363.lm
-lmctl
-lmname default default
-logbase 1.0001 1.000100e+000
-logfn
-logspec no no
-lowerf 133.33334 1.333333e+002
-lpbeam 1e-40 1.000000e-040
-lponlybeam 7e-29 7.000000e-029
-lw 6.5 6.500000e+000
-maxhmmpf -1 -1
-maxnewoov 20 20
-maxwpf -1 -1
-mdef
-mean
-mfclogdir
-mixw
-mixwfloor 0.0000001 1.000000e-007
-mllr
-mmap yes yes
-ncep 13 13
-nfft 512 512
-nfilt 40 40
-nwpen 1.0 1.000000e+000
-pbeam 1e-48 1.000000e-048
-pip 1.0 1.000000e+000
-pl_beam 1e-10 1.000000e-010
-pl_pbeam 1e-5 1.000000e-005
-pl_window 0 0
-rawlogdir
-remove_dc no no
-round_filters yes yes
-samprate 16000 1.600000e+004
-seed -1 -1
-sendump
-senmgau
-silprob 0.005 5.000000e-003
-smoothspec no no
-svspec
-tmat
-tmatfloor 0.0001 1.000000e-004
-topn 4 4
-topn_beam 0 0
-toprule
-transform legacy legacy
-unit_area yes yes
-upperf 6855.4976 6.855498e+003
-usewdphones no no
-uw 1.0 1.000000e+000
-var
-varfloor 0.0001 1.000000e-004
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wbeam 7e-29 7.000000e-029
-wip 0.65 6.500000e-001
-wlen 0.025625 2.562500e-002
INFO: cmd_ln.c(512): Parsing command line:
\
-alpha 0.97 \
-dither yes \
-doublebw no \
-nfilt 40 \
-ncep 13 \
-lowerf 133.333334 \
-upperf 6855.4976 \
-nfft 512 \
-wlen 0.025625 \
-transform legacy \
-feat 1s_c_d_dd \
-agc none \
-cmn current \
-varnorm no
Current configuration:
-agc none none
-agcthresh 2.0 2.000000e+000
-alpha 0.97 9.700000e-001
-ceplen 13 13
-cmn current current
-cmninit 8.0 8.0
-dither no yes
-doublebw no no
-feat 1s_c_d_dd 1s_c_d_dd
-frate 100 100
-input_endian little little
-lda ....\model\hmm\en\voxforge-en-0.4\model_parameters\voxforge_en_sphinx.cd_cont_5000/feature_transform
-ldadim 0 0
-lifter 0 0
-logspec no no
-lowerf 133.33334 1.333333e+002
-ncep 13 13
-nfft 512 512
-nfilt 40 40
-remove_dc no no
-round_filters yes yes
-samprate 16000 1.600000e+004
-seed -1 -1
-smoothspec no no
-svspec
-transform legacy legacy
-unit_area yes yes
-upperf 6855.4976 6.855498e+003
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wlen 0.025625 2.562500e-002
INFO: acmod.c(238): Parsed model-specific feature parameters from ....\model\hmm\en\voxforge-en-0.4\model_parameters\voxforge_en_sphinx.cd_cont_5000/feat.params
INFO: fe_interface.c(288): You are using the internal mechanism to generate the seed.
INFO: feat.c(848): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(142): mean= 12.00, mean= 0.0
INFO: acmod.c(153): Reading linear feature transformation from ....\model\hmm\en\voxforge-en-0.4\model_parameters\voxforge_en_sphinx.cd_cont_5000/feature_transform
INFO: mdef.c(520): Reading model definition: ....\model\hmm\en\voxforge-en-0.4\model_parameters\voxforge_en_sphinx.cd_cont_5000/mdef
INFO: bin_mdef.c(173): Allocating 104810 * 8 bytes (818 KiB) for CD tree
INFO: tmat.c(205): Reading HMM transition probability matrices: ....\model\hmm\en\voxforge-en-0.4\model_parameters\voxforge_en_sphinx.cd_cont_5000/transition_matrices
INFO: acmod.c(117): Attempting to use SCHMM computation module
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: ....\model\hmm\en\voxforge-en-0.4\model_parameters\voxforge_en_sphinx.cd_cont_5000/means
INFO: ms_gauden.c(292): 5120 codebook, 1 feature, size 16x29
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: ....\model\hmm\en\voxforge-en-0.4\model_parameters\voxforge_en_sphinx.cd_cont_5000/variances
INFO: ms_gauden.c(292): 5120 codebook, 1 feature, size 16x29
INFO: ms_gauden.c(356): 175 variance values floored
INFO: acmod.c(119): Attempting to use PTHMM computation module
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: ....\model\hmm\en\voxforge-en-0.4\model_parameters\voxforge_en_sphinx.cd_cont_5000/means
INFO: ms_gauden.c(292): 5120 codebook, 1 feature, size 16x29
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: ....\model\hmm\en\voxforge-en-0.4\model_parameters\voxforge_en_sphinx.cd_cont_5000/variances
INFO: ms_gauden.c(292): 5120 codebook, 1 feature, size 16x29
INFO: ms_gauden.c(356): 175 variance values floored
ERROR: "ptm_mgau.c", line 801: Number of codebooks exceeds 256: 5120
INFO: acmod.c(121): Falling back to general multi-stream GMM computation
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: ....\model\hmm\en\voxforge-en-0.4\model_parameters\voxforge_en_sphinx.cd_cont_5000/means
INFO: ms_gauden.c(292): 5120 codebook, 1 feature, size 16x29
INFO: ms_gauden.c(198): Reading mixture gaussian parameter: ....\model\hmm\en\voxforge-en-0.4\model_parameters\voxforge_en_sphinx.cd_cont_5000/variances
INFO: ms_gauden.c(292): 5120 codebook, 1 feature, size 16x29
INFO: ms_gauden.c(356): 175 variance values floored
INFO: ms_senone.c(160): Reading senone mixture weights: ....\model\hmm\en\voxforge-en-0.4\model_parameters\voxforge_en_sphinx.cd_cont_5000/mixture_weights
INFO: ms_senone.c(211): Truncating senone logs3(pdf) values by 10 bits
INFO: ms_senone.c(218): Not transposing mixture weights in memory
INFO: ms_senone.c(277): Read mixture weights for 5120 senones: 1 features x 16 codewords
INFO: ms_senone.c(331): Mapping senones to individual codebooks
INFO: ms_mgau.c(123): The value of topn: 4
INFO: dict.c(294): Allocating 4114 * 20 bytes (80 KiB) for word entries
INFO: dict.c(306): Reading main dictionary: ....\4363\4363.dic
INFO: dict.c(206): Allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(309): 15 words read
INFO: dict.c(314): Reading filler dictionary: ....\model\hmm\en\voxforge-en-0.4\model_parameters\voxforge_en_sphinx.cd_cont_5000/noisedict
INFO: dict.c(206): Allocated 0 KiB for strings, 0 KiB for phones
INFO: dict.c(317): 3 words read
INFO: dict2pid.c(396): Building PID tables for dictionary
INFO: dict2pid.c(405): Allocating 40^3 * 2 bytes (125 KiB) for word-initial triphones
INFO: dict2pid.c(131): Allocated 19360 bytes (18 KiB) for word-final triphones
INFO: dict2pid.c(195): Allocated 19360 bytes (18 KiB) for single-phone word triphones
INFO: ngram_model_arpa.c(476): ngrams 1=13, 2=18, 3=13
INFO: ngram_model_arpa.c(135): Reading unigrams
INFO: ngram_model_arpa.c(515): 13 = #unigrams created
INFO: ngram_model_arpa.c(194): Reading bigrams
INFO: ngram_model_arpa.c(531): 18 = #bigrams created
INFO: ngram_model_arpa.c(532): 5 = #prob2 entries
INFO: ngram_model_arpa.c(539): 3 = #bo_wt2 entries
INFO: ngram_model_arpa.c(291): Reading trigrams
INFO: ngram_model_arpa.c(552): 13 = #trigrams created
INFO: ngram_model_arpa.c(553): 3 = #prob3 entries
INFO: ngram_search_fwdtree.c(99): 13 unique initial diphones
INFO: ngram_search_fwdtree.c(147): 0 root, 0 non-root channels, 4 single-phone words
INFO: ngram_search_fwdtree.c(186): Creating search tree
INFO: ngram_search_fwdtree.c(191): before: 0 root, 0 non-root channels, 4 single-phone words
INFO: ngram_search_fwdtree.c(324): after: max nonroot chan increased to 160
INFO: ngram_search_fwdtree.c(333): after: 13 root, 32 non-root channels, 3 single-phone words
INFO: ngram_search_fwdflat.c(153): fwdflat: min_ef_width = 4, max_sf_win = 25
Allocating 32 buffers of 2500 samples each
INFO: continuous.c(261): pocketsphinx_continuous COMPILED ON: Jul 26 2011, AT: 09:20:22
FATAL_ERROR: "continuous.c", line 135: cont_ad_calib failed
So, are the LM and the dictionary independent of the acoustic model or not? What am I doing
wrong?
If I build my own LM and dictionary, can't I use those from SourceForge?
This is my corpus (tutorial):
Thanks again.
Regards
No, they are not independent. The set of words in the language model must match the set of
words in the dictionary. The set of phones in the dictionary must match the set of phones in
the acoustic model.
You are doing almost everything correctly. The error message says that it fails to
record audio from the microphone. Your input is probably muted in the volume
settings.
Yes, it was a microphone problem. Thanks :D
If I want to build an LM in Spanish, I guess the online LMTool is not valid, is
it? Do I have to use text2wfreq, text2idngram, idngram2lm and
sphinx_lm_convert? Can I use them on Windows? I'll try tomorrow.
I downloaded the Voxforge Spanish model but I cannot see the dictionary you told
me was included. Is it that noisedict file?
Thanks once again and best regards.
Yes
Yes
You can.
etc/voxforge_es_sphinx.dic
I found the Spanish dictionary and also the Voxforge English dictionary
(cmudict.07a). I tried the previous example with the English dictionary and it works, but
when I try the following example:
where the dict and lm are some basic Spanish commands created with LMTool.
Here is the output:
Current configuration:
-adcdev
-agc none none
-agcthresh 2.0 2.000000e+000
-alpha 0.97 9.700000e-001
-argfile
-ascale 20.0 2.000000e+001
-backtrace no no
-beam 1e-48 1.000000e-048
-bestpath yes yes
-bestpathlw 9.5 9.500000e+000
-bghist no no
-ceplen 13 13
-cmn current current
-cmninit 8.0 8.0
-compallsen no no
-debug 0
-dict ....\8237ES\8237.dic
-dictcase no no
-dither no no
-doublebw no no
-ds 1 1
-fdict
-feat 1s_c_d_dd 1s_c_d_dd
-featparams
-fillprob 1e-8 1.000000e-008
-frate 100 100
-fsg
-fsgusealtpron yes yes
-fsgusefiller yes yes
-fwdflat yes yes
-fwdflatbeam 1e-64 1.000000e-064
-fwdflatefwid 4 4
-fwdflatlw 8.5 8.500000e+000
-fwdflatsfwin 25 25
-fwdflatwbeam 7e-29 7.000000e-029
-fwdtree yes yes
-hmm ....\model\hmm\es\voxforge-es-0.1.1\model_parameters\voxforge_es_sphinx.cd_cont_1500
-input_endian little little
-jsgf
-kdmaxbbi -1 -1
-kdmaxdepth 0 0
-kdtree
-latsize 5000 5000
-lda
-ldadim 0 0
-lextreedump 0 0
-lifter 0 0
-lm ....\8237ES\8237.lm
-lmctl
-lmname default default
-logbase 1.0001 1.000100e+000
-logfn
-logspec no no
-lowerf 133.33334 1.333333e+002
-lpbeam 1e-40 1.000000e-040
-lponlybeam 7e-29 7.000000e-029
-lw 6.5 6.500000e+000
-maxhmmpf -1 -1
-maxnewoov 20 20
-maxwpf -1 -1
-mdef
-mean
-mfclogdir
-mixw
-mixwfloor 0.0000001 1.000000e-007
-mllr
-mmap yes yes
-ncep 13 13
-nfft 512 512
-nfilt 40 40
-nwpen 1.0 1.000000e+000
-pbeam 1e-48 1.000000e-048
-pip 1.0 1.000000e+000
-pl_beam 1e-10 1.000000e-010
-pl_pbeam 1e-5 1.000000e-005
-pl_window 0 0
-rawlogdir
-remove_dc no no
-round_filters yes yes
-samprate 16000 1.600000e+004
-seed -1 -1
-sendump
-senmgau
-silprob 0.005 5.000000e-003
-smoothspec no no
-svspec
-tmat
-tmatfloor 0.0001 1.000000e-004
-topn 4 4
-topn_beam 0 0
-toprule
-transform legacy legacy
-unit_area yes yes
-upperf 6855.4976 6.855498e+003
-usewdphones no no
-uw 1.0 1.000000e+000
-var
-varfloor 0.0001 1.000000e-004
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wbeam 7e-29 7.000000e-029
-wip 0.65 6.500000e-001
-wlen 0.025625 2.562500e-002
INFO: cmd_ln.c(512): Parsing command line:
\
-alpha 0.97 \
-dither yes \
-doublebw no \
-nfilt 32 \
-ncep 13 \
-lowerf 200 \
-upperf 3500 \
-nfft 256 \
-wlen 0.0256 \
-transform legacy \
-feat 1s_c_d_dd \
-agc none \
-cmn current \
-varnorm no
Current configuration:
-agc none none
-agcthresh 2.0 2.000000e+000
-alpha 0.97 9.700000e-001
-ceplen 13 13
-cmn current current
-cmninit 8.0 8.0
-dither no yes
-doublebw no no
-feat 1s_c_d_dd 1s_c_d_dd
-frate 100 100
-input_endian little little
-lda ....\model\hmm\es\voxforge-es-0.1.1\model_parameters\voxforge_es_sphinx.cd_cont_1500/feature_transform
-ldadim 0 0
-lifter 0 0
-logspec no no
-lowerf 133.33334 2.000000e+002
-ncep 13 13
-nfft 512 256
-nfilt 40 32
-remove_dc no no
-round_filters yes yes
-samprate 16000 1.600000e+004
-seed -1 -1
-smoothspec no no
-svspec
-transform legacy legacy
-unit_area yes yes
-upperf 6855.4976 3.500000e+003
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wlen 0.025625 2.560000e-002
INFO: acmod.c(238): Parsed model-specific feature parameters from ....\model\hmm\es\voxforge-es-0.1.1\model_parameters\voxforge_es_sphinx.cd_cont_1500/feat.params
ERROR: "fe_interface.c", line 100: FFT: Number of points must be greater or equal to frame size (409 samples)
The same happens with:
What could the problem be?
On the other hand, I'm trying to build a statistical language model with
text2wfreq, text2idngram, idngram2lm and sphinx_lm_convert. I downloaded
pocketsphinx 0.7 (Windows) and I found sphinx_lm_convert in the sphinxbase
project, but I cannot find the others. Do I have to download a different
package? Where can I find them?
Many thanks and best regards :)
Add "-samprate 8000" to the pocketsphinx command line.
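The arithmetic behind this fix can be sketched as follows, using the values from the log above: the Spanish model's feat.params set -wlen 0.0256 and -nfft 256, while -samprate was still at its default of 16000. Truncating the product to an integer is an assumption chosen because it reproduces the "409 samples" in the error message; the function names are hypothetical.

```python
def frame_size(wlen_seconds, samprate_hz):
    """Number of samples in one analysis window (truncated, matching the logged 409)."""
    return int(wlen_seconds * samprate_hz)


def fits_fft(wlen_seconds, samprate_hz, nfft):
    """True when the window fits in the FFT, i.e. nfft >= frame size."""
    return frame_size(wlen_seconds, samprate_hz) <= nfft


if __name__ == "__main__":
    for rate in (16000, 8000):
        size = frame_size(0.0256, rate)
        print(rate, size, "ok" if fits_fft(0.0256, rate, 256) else "too large for nfft=256")
```

At 16000 Hz the window needs 409 samples and does not fit into 256 FFT points; at 8000 Hz it needs 204 and does.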
http://sourceforge.net/projects/cmusphinx/files/cmuclmtk/0.7/cmuclmtk-0.7-win32.zip/download
OK, samprate solved the problem :)
I downloaded cmuclmtk 0.7 but it doesn't work (Win XP). I couldn't open
pocketsphinx 0.7 (Visual Studio 2008), so I compiled the previous snapshot (0.6).
The error is that it cannot find MSVCR100.dll :?
So I decided to download the previous version, and:
But then I cannot generate the ARPA-format LM because "text2idngram -vocab
weather.vocab -idngram weather.idngram < weather.closed.txt" doesn't find some
file (I think).
If I look at what was generated by text2wfreq, I can only see the
weather.tmp.vocab file. I also changed that name to weather.vocab, but
the idngram and closed files are missing.
Maybe I'm not interpreting the manual
correctly.
Last question: I tried LMTool with Spanish, and I think that the .dic it
generated cannot be used with that LM. Or at least it doesn't work for me,
maybe because the words are pronounced as if they were English? Using the Voxforge
dictionary works well :)
This DLL is part of VS 2010.
In your case the command would be
That command will create weather.idngram.
This is correct.
Hi again,
I think I have to change something else.
I used the CMU-Cambridge Statistical Language Modeling Toolkit v2 documentation
and changed your command to:
That looks better, but when I try to create the LM:
read_wlist_into_siht: a list of 58 words was read from "weather.tmp.vocab".
read_wlist_into_array: a list of 58 words was read from "weather.tmp.vocab".
WARNING: <s> appears as a vocabulary item, but is not labelled as a context cue.
Allocated space for 5000000 2-grams.
Allocated space for 12500000 3-grams.
table_size 59
Allocated 60000000 bytes to table for 2-grams.
Allocated (2+25000000) bytes to table for 3-grams.
Processing id n-gram file.
20,000 n-grams processed for each ".", 1,000,000 for each line.
Error : n-grams are not correctly ordered. Error occurred at ngram 19.
Maybe text2idngram is wrong :?
By the way, will I have problems when I repeat the process in Spanish? Do
you know if accents cause problems? I mean symbols like ´, `, ¨ (e.g.
camión, pingüino)
Regards and thanks :)
Looks like you are using an obsolete version. Maybe you want to try the latest one.
Who knows
Accents are supported
I finally installed VS 2010 and the tools work.
For sentence recognition, which is the best way to build an LM? I tried the CMU
tools with some sentences (I also tried the weather example with a few sentences) and
the result is that it doesn't recognize any word.
Maybe that is because the corpus is too small to build an LM that way? I tried
a similar LM (a few sentences) built with the online LMTool and it seems to work better.
If I build an LM, which level of recognition can I achieve? I have seen that it is
word-level recognition. I mean, if I build an LM from sentences and I say a
sentence with the words in a different order, it is still recognized. I thought
that sentence-level recognition was possible.
So, what do the <s> and </s> tags do? I read the tutorial, the FAQ and some online
information from different websites, and I think I am missing some basic
information.
Any guidelines? Which kind of LM do I need for my purpose?
Many thanks and best regards.
If you need a fixed word order, you can use a finite-state grammar in JSGF
format. The tutorial describes that.
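As an illustration of the grammar approach, here is a minimal sketch that turns a list of sentences into a JSGF grammar in which each sentence is one fixed-order alternative. The function name, grammar name and rule name are hypothetical, and lowercasing the words is an assumption; the words have to match the dictionary entries exactly.

```python
import re


def sentences_to_jsgf(sentences, grammar_name="sentences", rule_name="sentence"):
    """Build a JSGF grammar whose public rule accepts exactly the given sentences."""
    alternatives = []
    for sentence in sentences:
        # Drop punctuation, keep accented letters; lowercase is an assumption
        words = re.sub(r"[^\w\s']", " ", sentence.lower()).split()
        if words:
            alternatives.append(" ".join(words))
    body = " | ".join(alternatives)
    return f"#JSGF V1.0;\ngrammar {grammar_name};\npublic <{rule_name}> = {body};\n"


if __name__ == "__main__":
    print(sentences_to_jsgf(["Hello world.", "Good morning"]))
```

The resulting text could be saved to a file and given to the decoder through the -jsgf option that appears in the configuration dump earlier in this thread, in place of -lm.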
The <s> and </s> tags mark the start and the end of a sentence.
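For a statistical LM, the training corpus is written with these markers around every sentence, one sentence per line. A minimal sketch of preparing such a corpus from plain sentences follows; the function name is hypothetical, and stripping punctuation and lowercasing are assumptions that should be aligned with the spelling used in the dictionary.

```python
import re


def to_corpus_lines(sentences):
    """Wrap each non-empty sentence in <s> ... </s> markers, one per line."""
    lines = []
    for sentence in sentences:
        words = re.sub(r"[^\w\s']", " ", sentence.lower()).split()
        if words:
            lines.append("<s> " + " ".join(words) + " </s>")
    return lines


if __name__ == "__main__":
    for line in to_corpus_lines(["It is sunny today.", "El camión es rojo."]):
        print(line)
```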