Sphinx 2 models in Sphinx 3.6

Help
Wout
2007-06-28
2012-09-22
  • Wout

    Wout - 2007-06-28

    Hi,

    I used the HTK-to-Sphinx 2 feature file converter in sphinxtrain/python/sphinx/htkmfc.py to convert my HTK feature files. I would like to train models with Sphinx 3.6, but the trainer gives the following error when it tries to load the feature files:

    Header size field: -1591410688(a1250000); filesize: 38544(00009690)
    ERROR: "corpus.c", line 1643: MFCC read failed. Retrying after sleep...

    I guess this indicates that Sphinx 3.6 cannot read Sphinx 2 feature files, am I correct? Is it somehow possible to convert Sphinx 2 feature files to Sphinx 3 format (or to directly convert HTK feature files to Sphinx 3 format)?

    Best regards,

    Wout

     
    • Wout

      Wout - 2007-06-28

      I'm sorry, I meant that I want to use S2 models in SphinxTrain.

      Wout

       
    • David Huggins-Daines

      The terminology is somewhat confusing here. Sphinx 2 feature files should really just be called "Sphinx feature files". There isn't a separate file format for Sphinx 3.

      I'm not sure what is going on here. Are you sure that you're not mistakenly trying to read the HTK feature file?

       
    • Wout

      Wout - 2007-06-28

      Before trying your converter, I had already created my own based on the description you gave me[1]. After comparing the converted files, it seems that SphinxTrain can only handle one byte order (which would make sense, as there is no byte order marker in the feature file).

      My script isn't very robust and just uses the native byte order, which in my case is little-endian (i.e. 9633 is stored as "a1 25 00 00"). The resulting feature file is accepted by SphinxTrain. So I guess SphinxTrain assumes the first float to be little-endian. On my PC (64-bit dual core P4), your converter outputs the first float as big-endian.
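The byte-order mismatch described above can be reproduced with Python's struct module. This is only an illustrative sketch: it assumes the header is a 32-bit integer, as the trainer's log output suggests, and reuses the value 9633 quoted in this post.

```python
import struct

count = 9633  # header value quoted in this post

little = struct.pack("<I", count)  # bytes a1 25 00 00
big = struct.pack(">I", count)     # bytes 00 00 25 a1

# Reading the big-endian bytes as a signed little-endian integer
# reproduces the "header size" printed in the trainer's error message.
misread = struct.unpack("<i", big)[0]

print(little.hex(), big.hex(), misread)
```

The misread value is -1591410688 (hex a1250000), which matches the number in the "Header size field" message from the first post.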

      We[2] also believe that your script shouldn't write self.sampPeriod, self.sampSize, and self.paramKind to the Sphinx feature file. If you do, SphinxTrain won't accept the file.

      Summarizing, I think you should replace this:

          self.fh.write(pack(">IIHH", self.filesize,
                             self.sampPeriod,
                             self.sampSize,
                             self.paramKind))

      with this:

          self.fh.write(pack("<I", self.filesize))

      After these modifications, the files are almost identical, although I used an entirely different technique. All differences occur in the least significant bytes of the integers and are probably due to the fact that I used a textual export of the HTK feature files instead of reading the binary form.
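The sizes reported in the original error message appear consistent with the extra header fields being present. The sketch below assumes, based on this thread rather than on the SphinxTrain source, that a Sphinx feature file is a 32-bit count of floats followed by that many 32-bit floats. The names `htk_style_header` and `sphinx_header` are hypothetical, and the three extra field values are placeholders.

```python
import struct

n_floats = 9633     # header value quoted earlier in this post
file_size = 38544   # file size from the trainer's error message

# The two header layouts discussed in this post (extra values are placeholders).
htk_style_header = struct.pack(">IIHH", n_floats, 100000, 52, 6)
sphinx_header = struct.pack("<I", n_floats)

# Assumed layout: header followed by n_floats 4-byte floats.
size_with_htk_header = len(htk_style_header) + 4 * n_floats
size_with_sphinx_header = len(sphinx_header) + 4 * n_floats

print(size_with_htk_header, size_with_sphinx_header, file_size)
```

Under these assumptions the 12-byte header gives 38544 bytes, the size the trainer reported, while the 4-byte header gives 38536. That suggests the rejected file still carried the three extra fields.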

      Best regards,

      Wout

      [1] https://sourceforge.net/forum/message.php?msg_id=4362604
      [2] Like you said yourself in [1] ;-).

       
      • David Huggins-Daines

        Oh, no, htkmfc.py is not a converter, it is for reading and writing HTK format files. You need to use it in conjunction with s2mfc.py to make a converter. Here is a simple one, for instance:

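        #!/usr/bin/env python
        # Convert HTK feature files to Sphinx feature files.

        import htkmfc, s2mfc
        import os
        import sys

        for fname in sys.argv[1:]:
            root, ext = os.path.splitext(fname)
            # Read all frames from the HTK file.
            hfeat = htkmfc.open(fname).getall()
            # Keep the first 13 coefficients and write them in Sphinx format.
            s2mfc.open(root + ".s2mfc", "wb").writeall(hfeat[:,0:13])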
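For readers without the sphinxtrain Python modules at hand, the file layout discussed in this thread can be illustrated with the standard library alone. This is a rough sketch, not a replacement for s2mfc.py: it assumes the file is a 32-bit float count followed by 32-bit floats in little-endian order, and `write_feature_file` and `read_feature_file` are hypothetical helper names.

```python
import os
import struct
import tempfile


def write_feature_file(path, frames, byte_order="<"):
    """Write frames as a float count followed by 32-bit floats (assumed layout)."""
    values = [value for frame in frames for value in frame]
    with open(path, "wb") as handle:
        handle.write(struct.pack(byte_order + "I", len(values)))
        handle.write(struct.pack(byte_order + "%df" % len(values), *values))


def read_feature_file(path, n_coeffs, byte_order="<"):
    """Read the file back, checking the header against the file size."""
    with open(path, "rb") as handle:
        data = handle.read()
    (count,) = struct.unpack(byte_order + "I", data[:4])
    if 4 + 4 * count != len(data):
        raise ValueError("header does not match file size")
    values = struct.unpack(byte_order + "%df" % count, data[4:])
    return [list(values[i:i + n_coeffs]) for i in range(0, count, n_coeffs)]


# Round trip with two 13-coefficient frames of exactly representable values.
frames = [[0.5] * 13, [1.5] * 13]
path = os.path.join(tempfile.mkdtemp(), "example.mfc")
write_feature_file(path, frames)
restored = read_feature_file(path, 13)
print(restored == frames)
```

The size check in the reader mirrors the kind of header/file-size comparison that the trainer's error message points to.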

         
    • Wout

      Wout - 2007-06-28

      Hi,

      I noticed another error in the code, which may be a problem in all the Python files. You try to determine whether byte swapping is necessary by comparing the result of unpack with an integer:

      > self.swap = (unpack('=i', pack('<i', 42)) != 42)

      The problem, however, is that unpack() always returns a tuple, and a tuple never equals an integer, so the expression always evaluates to True regardless of the machine's byte order. Instead, you might use this:

      > self.swap = (unpack('=i', pack('<i', 42)) != (42,))

      Which will give the intended result.
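The difference between the two comparisons can be checked directly. The sketch below uses only the standard struct and sys modules; the variable names are illustrative and do not come from the SphinxTrain scripts.

```python
import struct
import sys

raw = struct.pack("<i", 42)
result = struct.unpack("=i", raw)  # always a 1-tuple

buggy_swap = (result != 42)        # tuple vs. int: never equal, so always True
fixed_swap = (result != (42,))     # tuple vs. tuple: True only on big-endian hosts
indexed_swap = (result[0] != 42)   # equivalent form using the first element

print(buggy_swap, fixed_swap, indexed_swap, sys.byteorder)
```

Because the original comparison gives the same answer on every machine, it cannot detect the host byte order. The corrected forms agree with sys.byteorder.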

      Best regards,

      Wout

       
      • David Huggins-Daines

        Whoops! You're right. I've fixed the other files affected by this...

         
    • Wout

      Wout - 2007-06-28

      I'm sorry for all the separate messages and for breaking the thread a number of times. I haven't been paying enough attention... :-S

      Am I correct in assuming that the HTK feature kind MFCC_E_D_A_Z is equivalent to 1s_12c_12d_3p_12dd in SphinxTrain (as the -feat parameter)?

      Thanks,

      Wout

       
      • David Huggins-Daines

        It is not really exactly equivalent to any Sphinx feature specification.

        Sphinx calculates the dynamic features in the decoder or the trainer, rather than storing them in the feature files. Cepstral mean subtraction is also done in the decoder/trainer, so the feature files are stored un-normalized. Of course, in the Gaussian parameter files, dynamic feature computation and CMN have been applied.
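As a rough illustration of what mean subtraction does to the stored, un-normalized features, here is a minimal sketch of per-utterance cepstral mean subtraction. It shows the general idea only and is not the SphinxTrain implementation; `subtract_cepstral_mean` is a hypothetical name.

```python
def subtract_cepstral_mean(frames):
    """Return frames with the per-utterance mean of each coefficient removed."""
    if not frames:
        return []
    n_frames = len(frames)
    dim = len(frames[0])
    # Mean of each coefficient over all frames of the utterance.
    means = [sum(frame[i] for frame in frames) / n_frames for i in range(dim)]
    return [[frame[i] - means[i] for i in range(dim)] for frame in frames]


# Toy utterance with two frames of two coefficients each.
utterance = [[1.0, 10.0], [3.0, 14.0]]
normalized = subtract_cepstral_mean(utterance)
print(normalized)
```

After this step each coefficient has zero mean over the utterance, which is why the feature files themselves can stay un-normalized.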

        So, as far as the model files are concerned, MFCC_E_D_A_Z is roughly equivalent to -feat 1s_c_d_dd -cmn current, which is a single 39-dimensional vector consisting of c0..c12, dc0..dc12, ddc0..ddc12. However, I think that HTK puts the energy (c0) coefficient at the end of the feature vector rather than the beginning. Also, I think the dynamic feature computation is different in HTK.
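If the energy term really is stored last, as suspected above, the ordering difference could be handled by a small reordering step. The sketch below rests on that unconfirmed assumption and on the further assumption that each 13-coefficient block (static, delta, acceleration) has the same layout; `energy_last_to_first` is a hypothetical helper. It does not address the difference in dynamic feature computation.

```python
def energy_last_to_first(vector, block=13):
    """Move the last element of each block to the front of that block."""
    if len(vector) % block != 0:
        raise ValueError("vector length must be a multiple of the block size")
    reordered = []
    for start in range(0, len(vector), block):
        chunk = vector[start:start + block]
        # Assumed energy-last layout: rotate the final element to the front.
        reordered.extend(chunk[-1:] + chunk[:-1])
    return reordered


# Toy example with blocks of 3 instead of 13 to keep it readable.
toy = [1, 2, 0, 11, 12, 10, 21, 22, 20]
print(energy_last_to_first(toy, block=3))
```

With the toy input, each block's last value moves to the front, giving 0, 1, 2, 10, 11, 12, 20, 21, 22.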

         
    • Wout

      Wout - 2007-06-28

      Ah, I see. It may be wise to change the comment in HTKFeat_write() then.

      Wout

       
