|
From: Daniel P. <dp...@gm...> - 2014-03-06 19:15:17
|
It's actually trivial when you know how. The text version of the archive format is just the utterance-id, then, starting on the same line, the Matlab form of the matrix, then a newline. For instance:

  utt1 [ 0 2 3 1 3 4 ]
  utt2 [ 9 8 7 6 4 2 ]

etc. So just put them in a file foo and read them with ark:foo

You can then put them in binary format with an associated scp by doing

  copy-feats ark:foo ark,scp:/some/dir/my_features.ark,/some/dir/my_features.scp

and you can copy /some/dir/my_features.scp as data/<something>/feats.scp and use them. Or, as a pipe, you can do

  <matlab script> | copy-feats ark:- ark,scp:/some/dir/my_features.ark,/some/dir/my_features.scp

Dan

On Thu, Mar 6, 2014 at 2:11 PM, Simon Klüpfel <sim...@gm...> wrote:
> Hi,
>
> A colleague of mine experimented with some 'exotic' feature vectors
> using Matlab, and now we would like to see how the pretty great Kaldi
> tools might be used to train some model using them.
>
> I believe, the clean way to do it, would be to write a routine that
> creates these features using the Kaldi libraries, and then writing them
> to an archive. However, I fear this will involve quite some work, and as
> we do not know if it will be an endeavor worth the effort, we would like
> to start off to export the features in a Kaldi readable format from
> Matlab. This so far seemed the smaller effort.
>
> I tried to find out about the way those files are structured, but got
> lost somewhere on the way.
>
> Looking into compute-mfcc-feats.cc, I saw that there is:
>
>   BaseFloatMatrixWriter kaldi_writer;
>
> which is later used to write the archive:
>
>   kaldi_writer.Write(utt, features);
>
> Trying to find what this call actually does, I got lost.
>
> I found this:
>
> http://kaldi.sourceforge.net/group__table__types.html#gaa9b0c000a2d8bbf1a7df386024110883
>
> and from there this:
>
> http://kaldi.sourceforge.net/table-types_8h_source.html#l00036
>
> and then eventually this:
>
> http://kaldi.sourceforge.net/classkaldi_1_1TableWriter.html
>
> I however could not yet find anything I could use to understand the
> particular format of the archive file of feature vectors.
>
> The scp file should be straightforward, but I hope someone of you could
> point me to the right resource to learn how to write the matrices of a
> set of features in the correct archive format.
>
> Perhaps doing a detour through non-binary files might be a way to get
> there, but this surely would be very unfavorable.
>
> Thanks a lot,
>
> Simon
>
> _______________________________________________
> Kaldi-users mailing list
> Kal...@li...
> https://lists.sourceforge.net/lists/listinfo/kaldi-users
>
|
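
As a rough illustration of the text archive format described in the reply (utterance-id, then the bracketed matrix starting on the same line, then a newline), here is a minimal Python sketch that writes a set of matrices in that form. The names format_text_ark_entry and write_text_ark and the sample data are hypothetical and not part of Kaldi. Putting each matrix row on its own line is an assumption based on how Kaldi's own text-mode output looks; the email itself does not spell out the row layout.

```python
import sys


def format_text_ark_entry(utt_id, rows):
    """Return one text-archive entry: '<utt_id> [', the rows, then ' ]' and a newline.

    `rows` is a non-empty list of equal-length lists of numbers
    (one inner list per frame).
    """
    # The utterance-id is a single whitespace-free token.
    if not utt_id or any(ch.isspace() for ch in utt_id):
        raise ValueError("utterance-id must be non-empty and contain no whitespace")
    # Every row must have the same number of columns.
    if not rows or not rows[0] or any(len(r) != len(rows[0]) for r in rows):
        raise ValueError("matrix must be non-empty and rectangular")
    # Assumed layout: one matrix row per line, closing bracket after the last row.
    body = "\n".join(
        "  " + " ".join(repr(float(v)) for v in row) for row in rows
    )
    return f"{utt_id} [\n{body} ]\n"


def write_text_ark(features, stream):
    """Write a {utt_id: rows} mapping to `stream`, one entry per utterance."""
    for utt_id, rows in features.items():
        stream.write(format_text_ark_entry(utt_id, rows))


if __name__ == "__main__":
    # Made-up sample data, reusing the numbers from the example in the reply.
    demo = {
        "utt1": [[0, 2, 3], [1, 3, 4]],
        "utt2": [[9, 8, 7], [6, 4, 2]],
    }
    write_text_ark(demo, sys.stdout)
```

If this were saved as, say, make_ark.py (a hypothetical file name), its output could take the place of <matlab script> in the pipe from the reply, feeding copy-feats ark:- directly.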