Hi!
I am new to pocketsphinx. I am trying to train on logspec features instead of
MFCC. Can you please tell me which files I have to change? I am
getting a lot of errors at each step of the training.
Thanks in advance.
You need to change the file:
And add the argument to sphinx_fe invocation script:
Please describe the errors in more detail; they should be easy to solve.
I have generated the logspec features with sphinx_fe and put them in the feat
folder.
I am at the 05.Vector_quantize module. I am getting the following error, shown
in the logdir:
INFO: cmd_ln.c(691): Parsing command line:
/CLUSTERHOMES/LMS_AUDIO/maas/vij/REMOS_HTK341_project/ASR_systems/pocketSphinx/train_clean_log/bin/agg_seg \
-segdmpdirs /CLUSTERHOMES/LMS_AUDIO/maas/vij/REMOS_HTK341_project/ASR_systems/pocketSphinx/train_clean_log/bwaccumdir/train_clean_log_buff_1 \
-segdmpfn /CLUSTERHOMES/LMS_AUDIO/maas/vij/REMOS_HTK341_project/ASR_systems/pocketSphinx/train_clean_log/bwaccumdir/train_clean_log_buff_1/train_clean_log.dmp \
-segtype all \
-ctlfn /CLUSTERHOMES/LMS_AUDIO/maas/vij/REMOS_HTK341_project/ASR_systems/pocketSphinx/train_clean_log/etc/train_clean_log_train.fileids \
-cepdir /CLUSTERHOMES/LMS_AUDIO/maas/vij/REMOS_HTK341_project/ASR_systems/pocketSphinx/train_clean_log/feat \
-cepext lmc \
-ceplen 24 \
-agc none \
-cmn current \
-varnorm no \
-feat 6,6,6,6:4 \
-stride 1.8316
Current configuration:
-agc none none
-agcthresh 2.0 2.000000e+00
-cachesz 200 200
-cb2mllrfn .1cls. .1cls.
-cepdir /CLUSTERHOMES/LMS_AUDIO/maas/vij/REMOS_HTK341_project/ASR_systems/pocketSphinx/train_clean_log/feat
-cepext .mfc lmc
-ceplen 13 24
-cmn current current
-cmninit 8.0 8.0
-cntfn
-ctlfn /CLUSTERHOMES/LMS_AUDIO/maas/vij/REMOS_HTK341_project/ASR_systems/pocketSphinx/train_clean_log/etc/train_clean_log_train.fileids
-dictfn
-example no no
-fdictfn
-feat 1s_c_d_dd 6,6,6,6:4
-help no no
-lda
-ldadim 0 0
-lsnfn
-mllrctlfn
-mllrdir
-moddeffn
-npart 0
-nskip 0 0
-part 0
-runlen -1 -1
-segdir
-segdmpdirs /CLUSTERHOMES/LMS_AUDIO/maas/vij/REMOS_HTK341_project/ASR_systems/pocketSphinx/train_clean_log/bwaccumdir/train_clean_log_buff_1,
-segdmpfn /CLUSTERHOMES/LMS_AUDIO/maas/vij/REMOS_HTK341_project/ASR_systems/pocketSphinx/train_clean_log/bwaccumdir/train_clean_log_buff_1/train_clean_log.dmp
-segext v8_seg v8_seg
-segidxfn
-segtype st all
-sentdir
-sentext
-stride 1 1
-svspec
-ts2cbfn
-varnorm no no
INFO: main.c(171): No lexical transcripts provided
INFO: feat.c(684): Initializing feature stream to type: '6,6,6,6:4',
ceplen=24, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(142): mean= 12.00, mean= 0.0
INFO: corpus.c(1281): Will process all remaining utts starting at 0
INFO: main.c(290): Will produce feature dump
INFO: main.c(429): Writing frames to one file
FATAL_ERROR: "corpus.c", line 1571: Expected mfcc vector len of 24, got 8 (13400)
Mon Jun 25 13:14:40 2012
INFO: feat.c(684): Initializing feature stream to type: '6,6,6,6:4',
ceplen=24, CMN='current', VARNORM='no', AGC='none'
INFO: cmn.c(142): mean= 12.00, mean= 0.0
INFO: main.c(520): No mdef files. Assuming 1-class init
INFO: main.c(1352): 1-class dump file
INFO: main.c(1390): Corpus 0: sz==4 frames
INFO: main.c(1399): Convergence ratios are abs(cur - prior) / abs(prior)
INFO: main.c(231): alloc'ing 0Mb obs buf
ERROR: "main.c", line 258: Can't read dump file
; Success
INFO: main.c(577): Initializing means using random k-means
INFO: main.c(580): Trial 0: 256 means
kmeans_init: main.c:588: random_kmeans: Assertion `(cc >= 0) && (cc < n_obs)'
failed.
This is the error in the kmeans logdir.
With logspec, ceplen must be the same as nfilt (40 by default), not 24. It is
the size of the feature stream, not the size of all streams.
I'm also not sure what you mean by 6,6,6,6:4.
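The numbers in the FATAL_ERROR above are consistent with this answer. A minimal sketch in Python, assuming (from the wording of the message, not from the SphinxTrain source) that "got 8 (13400)" reports the remainder and the total number of values found in the feature file:

```python
# Assumption: "got 8 (13400)" means the file holds 13400 values and that
# dividing them into frames of -ceplen values left a remainder of 8.
# This reading is inferred from the log message, not from corpus.c.
def leftover_values(total_values: int, ceplen: int) -> int:
    """Values left over when the file is split into ceplen-sized frames."""
    return total_values % ceplen


total = 13400  # value count reported in the error message

print(leftover_values(total, 24))  # 8, the mismatch reported with -ceplen 24
print(leftover_values(total, 40))  # 0, so 40 values per frame fits exactly
print(total // 40)                 # 335 frames of 40 values each
```

Under that reading, the file was written with 40 values per frame (the default nfilt), which is why -ceplen 24 cannot divide it evenly.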
Actually, with 6,6,6,6:4 I was trying to use the else branch of feat_init in
feat.c. I have to change it to a single stream (24).
What is nfilt?
Also, is there any difference between cepsize and ceplen?
If I already have the features, then I guess I should use 1s_c instead?
nfilt is the number of filter banks you are using to calculate logspec.
I have no idea what cepsize is. Ceplen is a common option for the sphinxtrain
tools; it sets the number of components in the input feature vector.
As for 1s_c, no idea; it depends on what you want to achieve. Sometimes you
want to train the HMM with second-order features, sometimes you don't.
Where is the -nfilt argument given? Also, do you know where in the code
changes have to be made for logspec training?
Actually, I just checked my ./sphinx_fe call: I had given the arguments
-logspec yes -ncep 24 (instead of 13, which is the default value).
Thanks for your help. I trained the logspec features successfully.
While decoding I got the following error,
vij@lnt74:/CLUSTERHOMES/LMS_AUDIO/maas/vij/REMOS_HTK341_project/ASR_systems/pocketSphinx/train_clean_log> perl scripts_pl/decode/slave.pl
MODULE: DECODE Decoding using models previously trained
Decoding 513 segments starting at 0 (part 1 of 1)
0%
Aligning results to find error rate
Can't open /CLUSTERHOMES/LMS_AUDIO/maas/vij/REMOS_HTK341_project/ASR_systems/pocketSphinx/train_clean_log/result/train_clean_log-1-1.match
word_align.pl failed with error code 65280 at scripts_pl/decode/slave.pl line 173.
vij@lnt74:/CLUSTERHOMES/LMS_AUDIO/maas/vij/REMOS_HTK341_project/ASR_systems/po
cketSphinx/train_clean_log> ^C
vij@lnt74:/CLUSTERHOMES/LMS_AUDIO/maas/vij/REMOS_HTK341_project/ASR_systems/po
cketSphinx/train_clean_log>
Can you please tell me how to solve this? I can't figure it out.
Thanks in advance.
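A note on the number 65280 in the message above. This is a minimal sketch, assuming slave.pl prints the raw wait status that Perl's system() returns (the usual POSIX layout: exit code in the high byte, terminating signal in the low bits). The helper names are illustrative only.

```python
def exit_code_from_wait_status(status: int) -> int:
    """Exit code stored in the high byte of a 16-bit wait status."""
    return (status >> 8) & 0xFF


def signal_from_wait_status(status: int) -> int:
    """Terminating signal stored in the low 7 bits (0 means no signal)."""
    return status & 0x7F


status = 65280  # value reported for word_align.pl

print(exit_code_from_wait_status(status))  # 255
print(signal_from_wait_status(status))     # 0
```

Under that assumption word_align.pl exited with code 255 and was not killed by a signal, which fits a script that stopped because it could not open the .match file. The missing .match file points back at the decoding step, so the decoder log is the place to look.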
/CLUSTERHOMES/LMS_AUDIO/maas/vij/REMOS_HTK341_project/ASR_systems/pocketSphinx/train_clean_log/bin/pocketsphinx_batch: error: `/CLUSTERHOMES/LMS_AUDIO/maas/vij/REMOS_HTK341_project/ASR_systems/pocketSphinx/train_clean_log/bin/.libs/pocketsphinx_batch' does not exist
This script is just a wrapper for pocketsphinx_batch.
This is the error being shown in the file train_clean_log-1-1.log
You copied the wrapper script instead of the pocketsphinx_batch binary into the
bin folder. You need to install pocketsphinx first. Then copy the
pocketsphinx_batch binary from the installed location to the bin subfolder of
your training folder.
It seems you are not using a recent Sphinxtrain version, which is why you are
running into many issues that were fixed a long time ago. It's recommended to
use the latest version.
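One way to tell a wrapper script from a compiled binary is to look at the first bytes of the file. The following is a rough sketch, assuming a Linux system where compiled executables are ELF files and wrapper scripts start with "#!". The function name and the stand-in files are made up for illustration; nothing here comes from pocketsphinx itself.

```python
import os
import tempfile


def classify_executable(path: str) -> str:
    """Classify a file by its leading bytes: 'script', 'elf' or 'other'."""
    with open(path, "rb") as handle:
        head = handle.read(4)
    if head.startswith(b"#!"):
        return "script"  # e.g. a wrapper, which is a shell script
    if head == b"\x7fELF":
        return "elf"     # a compiled Linux executable
    return "other"


def demo() -> dict:
    """Build two stand-in files and classify them (no real binaries needed)."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        samples = {
            "wrapper": b"#! /bin/sh\n# stand-in for a wrapper script\n",
            "binary": b"\x7fELF" + b"\x00" * 12,
        }
        for name, content in samples.items():
            path = os.path.join(tmp, name)
            with open(path, "wb") as handle:
                handle.write(content)
            results[name] = classify_executable(path)
    return results


print(demo())
```

Calling classify_executable on the pocketsphinx_batch file in the bin subfolder of the training folder should report 'elf' once the installed binary, rather than the wrapper, has been copied there.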
Thanks for your help, but I am using the latest version of sphinxtrain (1.0.7).
By latest I mean the development version. For details, see
http://cmusphinx.sourceforge.net/wiki/download
But the -ncep given to sphinx_fe has to be 24 for logspec, right? Because
I am getting the following error while testing now:
INFO: acmod.c(242): Parsed model-specific feature parameters from /CLUSTERHOMES/LMS_AUDIO/maas/vij/REMOS_HTK341_project/ASR_systems/pocketSphinx/train_clean_log/model_parameters/train_clean_log.cd_semi_1000/feat.params
ERROR: "acmod.c", line 205: Configured feature length 40 doesn't match feature extraction output size 24
FATAL_ERROR: "batch.c", line 819: PocketSphinx decoder init failed
-alpha 0.97
-doublebw no
-nfilt 40
-ncep 24
-lowerf 133.33334
-upperf 6855.4976
-nfft 8192
-ceplen 40
-wlen 0.0256
-transform legacy
-feat 1s_c
-agc none
-cmn current
-varnorm no
These are the feature parameters that I am giving for testing in the config file.
No, I never told you to modify ncep. Why did you add it? The ncep parameter
truncates the upper part of the log spectrum, so you are basically just
dropping the upper frequencies. If you want 24 numbers you need to set nfilt to
24, not ncep. Ncep is only useful when you apply a DCT on top of the features,
as in MFCC; that way you keep the high-energy part of the filters and drop the
low-energy one.
I also do not understand why you want to use just the features without
first-order derivatives, which are often useful, but it's up to you.
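The decoder error quoted above comes down to one comparison: the configured -ceplen against the number of values the front end emits per frame. A small sketch of that comparison follows; the function name is hypothetical and the message text is copied from the log. In the failing run, -ceplen was 40 while the front end produced 24 values, which happens to equal the -ncep setting used there.

```python
def check_feature_length(ceplen: int, frontend_output_size: int) -> str:
    """Mimic the consistency check reported by the decoder log.

    Returns "ok" when the sizes agree, otherwise a message modelled on the
    one in the log. This is an illustration, not the pocketsphinx source.
    """
    if ceplen == frontend_output_size:
        return "ok"
    return (
        f"Configured feature length {ceplen} doesn't match "
        f"feature extraction output size {frontend_output_size}"
    )


# Values from the failing run: -ceplen 40, front end emitting 24 values.
print(check_feature_length(40, 24))
# A consistent setup: both sides agree on 24 values per frame.
print(check_feature_length(24, 24))
```

Whatever combination of -nfilt and -ncep is chosen, the same settings have to be used for feature extraction, training and decoding so that this comparison succeeds.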
Oh OK, so just to confirm once. For logspec:
-ncep (default 13)
-nfilt 24
-ceplen 24
-cepext (I made my own, .lmc; does one already exist?)
Any other changes in the arguments?
And yes, -nfft according to the number of .wav files.
Please try first, then ask if you run into trouble.
nfft depends on the sample rate. It's 256 for 8 kHz and 512 for wideband.
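Both values follow from the window length. As a rough sketch, assuming the common rule that the FFT size is the smallest power of two that holds one analysis window (wlen 0.025625 s, the default shown in the configuration dump in this thread); the function name is illustrative:

```python
def min_nfft(samprate: float, wlen: float) -> int:
    """Smallest power of two that is at least one window of samples long."""
    window_samples = round(samprate * wlen)
    nfft = 1
    while nfft < window_samples:
        nfft *= 2
    return nfft


print(min_nfft(16000, 0.025625))  # 512 (window of 410 samples)
print(min_nfft(8000, 0.025625))   # 256 (window of 205 samples)
```

By this rule a 16 kHz recording needs 512, so 8192 is far larger than the window requires.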
But even with the above parameters, I am getting the same error:
INFO: acmod.c(242): Parsed model-specific feature parameters from /CLUSTERHOMES/LMS_AUDIO/maas/vij/REMOS_HTK341_project/ASR_systems/pocketSphinx/train_clean_log/model_parameters/train_clean_log.cd_semi_1000/feat.params
ERROR: "acmod.c", line 205: Configured feature length 24 doesn't match feature extraction output size 13
FATAL_ERROR: "batch.c", line 819: PocketSphinx decoder init failed
Current configuration:
-agc none none
-agcthresh 2.0 2.000000e+00
-alpha 0.97 9.700000e-01
-ceplen 13 24
-cmn current current
-cmninit 8.0 8.0
-dither no no
-doublebw no no
-feat 1s_c_d_dd 1s_c
-frate 100 100
-input_endian little little
-lda
-ldadim 0 0
-lifter 0 0
-logspec no no
-lowerf 133.33334 1.333333e+02
-ncep 13 13
-nfft 512 8192
-nfilt 40 24
-remove_dc no no
-round_filters yes yes
-samprate 16000 1.600000e+04
-seed -1 -1
-smoothspec no no
-svspec
-transform legacy legacy
-unit_area yes yes
-upperf 6855.4976 6.855498e+03
-varnorm no no
-verbose no no
-warp_params
-warp_type inverse_linear inverse_linear
-wlen 0.025625 2.560000e-02
Try adding -ncep 24.
nfft must be 512, not 8192.
Sorry, this is the error after setting logspec to yes in the above config:
word_align.pl failed with error code 65280 at ./scripts_pl/decode/slave.pl line 173.