From: Zibo M. <mzb...@gm...> - 2014-07-10 19:37:55
Hi,

I am preparing data for DNN training using my own data set. I followed the instructions at http://kaldi.sourceforge.net/data_prep.html.

I created the file "text"; its first 3 lines are:

    S002-U-000300-000470 OH
    S002-U-000470-000630 I'D
    S002-U-000630-000870 LIKE

the wav.scp file:

    S002-U <path to the corresponding wav file>
    S002-O <path to the corresponding wav file>
    S003-U <path to the corresponding wav file>

and the utt2spk file:

    S002-U-000300-000470 002-U
    S002-U-000470-000630 002-U
    S002-U-000630-000870 002-U

Then I used utt2spk_to_spk2utt.pl to create the spk2utt file. Everything went well until I tried to use make_mfcc.sh to create the feats.scp file, where I got an error message like:

    utils/validate_data_dir.sh: file data/utt2spk is not in sorted order or has duplicates

It seems my utt2spk file could not pass validation. Can anybody help me out here? Thank you so much.

Best,
Zibo
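For reference, the condition named in the error message can be checked by hand. The following is a minimal sketch in Python, not the actual logic of validate_data_dir.sh. It assumes the check is on the first column (the utterance ID) and that "sorted order" means byte-wise ordering as in the C locale. The function name check_sorted_unique is hypothetical, and the sample lines are the three utt2spk lines quoted above.

```python
def check_sorted_unique(lines):
    """Return a list of problems found in the first-column keys of `lines`.

    Assumption: keys must be strictly increasing in byte-wise (C locale) order.
    """
    # Take the first whitespace-separated field of each non-empty line.
    keys = [line.split(maxsplit=1)[0] for line in lines if line.strip()]
    problems = []
    for prev, cur in zip(keys, keys[1:]):
        prev_b, cur_b = prev.encode("utf-8"), cur.encode("utf-8")
        if cur_b == prev_b:
            problems.append(f"duplicate key: {cur}")
        elif cur_b < prev_b:
            problems.append(f"out of order: {cur} after {prev}")
    return problems


# The three utt2spk lines quoted in the post.
sample = [
    "S002-U-000300-000470 002-U",
    "S002-U-000470-000630 002-U",
    "S002-U-000630-000870 002-U",
]
print(check_sorted_unique(sample))
```

Running the same function over every line of the real data/utt2spk (for example, the list returned by open("data/utt2spk").readlines()) would point at the first line that breaks the ordering, if this assumption about the check holds.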