From: Daniel P. <dp...@gm...> - 2015-03-11 23:55:43
Hm. It looks to me from http://kaldi.sourceforge.net/data_prep.html like the decision was made not to support that option in the data-directory definition, for the segments file, in order to avoid having to support that option in too many scripts, and for other reasons too. So that is supported only at the command-line level, not the script level.

The easiest way to accomplish what you want to do is to define your wav.scp in such a way that instead of just having a filename for each utterance-id, you have a command (e.g. a sox command) that selects the channel you want, followed by a pipe symbol. Or if it doesn't matter for your application, you could just ignore the data-validation error.

Dan

On Wed, Mar 11, 2015 at 7:23 PM, <Dan...@pa...> wrote:
> Thanks for pointing out my error! I appended the channel number, 0 or
> 1, to the end of each line in segments. I still get this as a result of
> running steps/make_mfcc.sh:
>
> steps/make_mfcc.sh --nj 4 --cmd "$train_cmd" data/train exp/make_mfcc/train $mfccdir
>
> steps/make_mfcc.sh --nj 4 --cmd run.pl data/train exp/make_mfcc/train /usr/share/capa/data/kaldi-scripts/TESCO/data/feats
>
> Bad line in segments file S00000R00001-000005-000049 R00001 0.5 4.9 0
>
> utils/validate_data_dir.sh: badly formatted segments file
>
> I think the problem is in lines 129 – 135 of
> kaldi-trunk/egs/wsj/s5/utils/validate_data_dir.sh
>
> if [ -f $data/segments ]; then
>   check_sorted_and_uniq $data/segments
>   # We have a segments file -> interpret wav file as "recording-ids" not utterance-ids.
>   ! cat $data/segments | \
>     awk '{if (NF != 4 || ($4 <= $3 && $4 != -1)) { print "Bad line in segments file", $0; exit(1); }}' && \
>     echo "$0: badly formatted segments file" && exit 1;
>
> The problem seems to be that appending the channel number to each segments
> line boosts the number of fields to 5. Is there another way I should have
> added the channel number?
>
> Thanks again,
>
> Dan
>
> *From:* Daniel Povey [mailto:dp...@gm...]
> *Sent:* Tuesday, March 10, 2015 9:14 PM
> *To:* Davies, Dan <Dan...@pa...>
> *Cc:* kal...@li...
> *Subject:* Re: [Kaldi-developers] Issue with dual channel recordings
>
> I see that the documentation in extract-segments is not quite right. The
> channel is supposed to be 0 for left and 1 for right.
> Dan
>
> On Wed, Mar 11, 2015 at 12:10 AM, Daniel Povey <dp...@gm...> wrote:
>
> It's not the number of channels you need in the segments file, but the
> identity of the channel-- I think it's probably supposed to be 0 or 1,
> depending which channel you want. Maybe that's why validate_data_dir.sh
> is failing, because 2 is not an expected channel id. If you want to sum
> the channels, then do that manually by having a command ending with a pipe
> symbol in the wav.scp file.
>
> Dan
>
> On Wed, Mar 11, 2015 at 12:07 AM, <Dan...@pa...> wrote:
>
> Hi,
>
> Our .wav files have two channels. If I don’t do anything special,
> src/featbin/extract-segments says I need to put the number of channels into
> the segments file. So far as I can tell, this means appending a “2” after
> the stop time in each line in segments. When I append the “2”,
> validate_data_dir.sh complains that the segments file is malformed because
> lines in segments have 5 fields instead of 4. All this happens as a result
> of calling steps/make_mfcc.sh.
>
> Am I doing something screwy?
>
> Dan
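
The wav.scp approach suggested in the reply at the top of this thread might look roughly like the sketch below. It is only an illustration under stated assumptions: sox is assumed to be installed, and the helper name wav_scp_entry, the "-A"/"-B" recording-id suffixes and the /path/to/R00001.wav path are all hypothetical, not taken from the thread. Note that sox's remix effect numbers channels from 1, unlike the 0/1 convention discussed for extract-segments.

```shell
# Sketch: print one wav.scp entry per channel of a stereo recording.
# Each entry has the form "<recording-id> <command> |"; the trailing pipe
# symbol means the audio is read from the command's standard output.
# The helper name, the "-A"/"-B" suffixes and the path are hypothetical.

# Usage: wav_scp_entry RECORDING_ID WAV_PATH SOX_CHANNEL
# SOX_CHANNEL is 1-based (1 = left, 2 = right), as sox's "remix" effect expects.
wav_scp_entry() {
  local rec_id="$1" wav="$2" chan="$3"
  printf '%s sox %s -t wav - remix %s |\n' "$rec_id" "$wav" "$chan"
}

# One entry per channel; redirect the output into data/train/wav.scp as needed.
wav_scp_entry R00001-A /path/to/R00001.wav 1
wav_scp_entry R00001-B /path/to/R00001.wav 2
```

With entries like these, each segments line would keep its four fields (utterance-id, recording-id, start, end) and name R00001-A or R00001-B as the recording-id, so the NF != 4 check quoted above should no longer trigger. To sum the channels instead of selecting one, "remix 1,2" should mix both into a single channel.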