MFC file format?

Help
2009-03-31
2016-11-04
1 2 > >> (Page 1 of 2)
  • Hadaiq Rolis

    Hadaiq Rolis - 2009-03-31

    I have some questions about the MFC file format in Sphinx.

    I want to modify the feature extraction algorithm (MFCC), using a modified MFCC to reduce noise. I would therefore like to know the format of the .mfc file generated by MFCC feature extraction.

    How is the .mfc file generated? What information is saved in it? Why is a binary format used?

    Thank you for your help in advance.

     
    • Hadaiq Rolis

      Hadaiq Rolis - 2009-04-03

      Thank you for your response, but I have some more questions.

      Can you explain the header and the data in more detail?
      Is the header saved in the first bytes, followed by the data?
      Are the data elements saved as a matrix or as a flat sequence?

       
      • Nickolay V. Shmyrev

        > Was the header save in the first byte then followed by the data?

        Do you understand what you are asking? "header" is called "header" because it goes first.

         
    • Nickolay V. Shmyrev

      It's just a 4-byte length (the total number of values, i.e. frames times coefficients per frame) followed by binary data.

      /***Header**/
      /* compute number of frames and write cepfile header */
      numframes = fe_count_frames(FE,len,COUNT_PARTIAL);
      if (P->logspec != ON)
          outlen = numframes*FE->NUM_CEPSTRA;
      else
          outlen = numframes*FE->MEL_FB->num_filters;
      if (P->output_endian != P->machine_endian)
          SWAPL(&outlen);
      if (write(fp, &outlen, 4) != 4) {
          E_ERROR("Data write error on %s\n",outfile);
          close(fp);
          return(FE_OUTPUT_FILE_WRITE_ERROR);
      }
      if (P->output_endian != P->machine_endian)
          SWAPL(&outlen);

      /**Data**/
      int32 fe_writeblock_feat(param_t *P, fe_t *FE, int32 fp, int32 nframes, float32 **feat)
      {
          int32 i, length, nwritebytes;

          if (P->logspec == ON)
              length = nframes*FE->MEL_FB->num_filters;
          else
              length = nframes*FE->NUM_CEPSTRA;

          if (P->output_endian != P->machine_endian){
              for (i=0;i<length;++i) SWAPF(feat[0]+i);
          }

          nwritebytes = length*sizeof(float32);
          if  (write(fp, feat[0], nwritebytes) != nwritebytes) {
              close(fp);
              E_FATAL("Error writing block of features\n");
          }

          if (P->output_endian != P->machine_endian){
              for (i=0;i<length;++i) SWAPF(feat[0]+i);
          }

          return(length);
      }
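
The layout written by the C code above (a 4-byte integer count, then that many 32-bit floats) can be read back with a few lines of Python. This is only a sketch, not part of sphinxbase: the function name `read_mfc` and the `ncep=13` default are assumptions for illustration, and because the file does not record its byte order, the sketch simply tries both and keeps the one whose header matches the file size.

```python
import struct


def read_mfc(path, ncep=13):
    """Read a Sphinx-style feature file into a list of frames.

    Assumed layout: int32 count of float values, then `count` float32 values,
    stored frame by frame with `ncep` values per frame.
    """
    with open(path, "rb") as f:
        raw = f.read()
    if len(raw) < 4:
        raise ValueError("file too short to contain the 4-byte header")

    n_payload = (len(raw) - 4) // 4  # number of float32 values actually present

    # Try little-endian, then big-endian, and keep the order whose
    # header value agrees with the payload size.
    for order in ("<", ">"):
        (count,) = struct.unpack(order + "i", raw[:4])
        if count == n_payload:
            break
    else:
        raise ValueError("header does not match file size in either byte order")

    if count % ncep != 0:
        raise ValueError("value count is not a multiple of ncep")

    values = struct.unpack(f"{order}{count}f", raw[4:4 + 4 * count])
    # Split the flat sequence into frames of ncep coefficients each.
    return [list(values[i:i + ncep]) for i in range(0, count, ncep)]
```

Calling `read_mfc("file.mfc")` would return one list of 13 coefficients per frame for a file produced with the default `-ncep 13`; for log-spectral output the per-frame size would be the number of filters instead.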

       
  • Diwakar.G

    Diwakar.G - 2016-10-25

    I have some questions about the MFC file format in Sphinx3.
    Sir, I am currently working on dysarthric speech-to-text alignment.
    I want to modify the feature extraction algorithm (MFCC), so I would like to know the format of the .mfc file generated by MFCC feature extraction.
    How is the .mfc file generated? What information is saved in it? Why is a binary format used?
    Also, I want to know how to write the triphones for words in the dictionary, and whether I need to write triphones for all the words in the dictionary.
    Is there any standard dictionary available with phonetic descriptions?
    I need your help.
    Thank you.

     
    • Nickolay V. Shmyrev

      How to generate the .mfc file?

      From the command line with the sphinx_fe binary, or from the API with the sphinxbase/fe.h header.

      What was the information that saved in .mfc file?

      Feature matrix

      why using binary format file?

      Binary representation is more efficient

      Also, I want to know how to write the triphones for words in dictionary and whether I want to write a triphones for all the words in dictionary.

      You can write triphones by hand in editor or with a script.

      Is there any standard dictionary available with their phonetic description.

      Sure, search for cmudict (the CMU Pronouncing Dictionary).
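
Purely as an illustration of what "with a script" could mean, the sketch below expands a word's phone sequence into (left, base, right) triphone contexts. Everything here is an assumption for illustration: the function name `word_triphones`, the use of `SIL` as the context at word boundaries, and the tuple output format. It does not reproduce any file format used by Sphinx or HTK.

```python
def word_triphones(phones, boundary="SIL"):
    """Return (left, base, right) contexts for each phone of one word.

    `boundary` is a hypothetical placeholder used as the neighbour of the
    first and last phone.
    """
    padded = [boundary] + list(phones) + [boundary]
    return [
        (padded[i - 1], padded[i], padded[i + 1])
        for i in range(1, len(padded) - 1)
    ]


# Illustrative pronunciation (stress markers omitted).
for left, base, right in word_triphones(["W", "AH", "N"]):
    print(left, base, right)
```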

       
  • Diwakar.G

    Diwakar.G - 2016-10-27

    Thank you for your quick response.
    I tried to run sphinx_fe from the command line, but the shell told me to install it. After installing sphinx_fe and running it, I get the following error saying that no arguments were given.

    sitecsp@acl-pg-06:~/sphinxbase/src/sphinx_fe$ ls
    an251-fash-b.sph  fe.h      Makefile.am  sphinx_fe    wave2feat.h
    cmd_ln_defn.h     Makefile  Makefile.in  wave2feat.c  wave2feat.o
    sitecsp@acl-pg-06:~/sphinxbase/src/sphinx_fe$ sphinx_fe
    ERROR: "cmd_ln.c", line 675: No arguments given, available options are:
    Arguments list definition:
    [NAME]      [DEFLT]     [DESCR]
    -alpha      0.97        Preemphasis parameter
    -argfile            Argument file (e.g. feat.params from an acoustic model) to read parameters from.  This will override anything set in other command line arguments.
    -blocksize  2048        Number of samples to read at a time.
    -build_outdirs  yes     Create missing subdirectories in output directory
    -c              Control file for batch processing
    -cep2spec   no      Input is cepstral files, output is log spectral files
    -di             Input directory, input file names are relative to this, if defined
    -dither     no      Add 1/2-bit noise
    -do             Output directory, output files are relative to this
    -doublebw   no      Use double bandwidth filters (same center freq)
    -ei             Input extension to be applied to all input files
    -eo             Output extension to be applied to all output files
    -example    no      Shows example of how to use the tool
    -frate      100     Frame rate
    -help       no      Shows the usage of the tool
    -i              Single audio input file
    -input_endian   little      Endianness of input data, big or little, ignored if NIST or MS Wav
    -lifter     0       Length of sin-curve for liftering, or 0 for no liftering.
    -logspec    no      Write out logspectral files instead of cepstra
    -lowerf     133.33334   Lower edge of filters
    -mach_endian    little      Endianness of machine, big or little
    -mswav      no      Defines input format as Microsoft Wav (RIFF)
    -ncep       13      Number of cep coefficients
    -nchans     1       Number of channels of data (interlaced samples assumed)
    -nfft       512     Size of FFT
    -nfilt      40      Number of filter banks
    -nist       no      Defines input format as NIST sphere
    -npart      0       Number of parts to run in (supersedes -nskip and -runlen if non-zero)
    -nskip      0       If a control file was specified, the number of utterances to skip at the head of the file
    -o              Single cepstral output file
    -ofmt       sphinx      Format of output files - one of sphinx, htk, text.
    -part       0       Index of the part to run (supersedes -nskip and -runlen if non-zero)
    -raw        no      Defines input format as raw binary data
    -remove_dc  no      Remove DC offset from each frame
    -round_filters  yes     Round mel filter frequencies to DFT points
    -runlen     -1      If a control file was specified, the number of utterances to process, or -1 for all
    -samprate   16000       Sampling rate
    -seed       -1      Seed for random number generator; if less than zero, pick our own
    -smoothspec no      Write out cepstral-smoothed logspectral files
    -spec2cep   no      Input is log spectral files, output is cepstral files
    -sph2pipe   no      Input is NIST sphere (possibly with Shorten), use sph2pipe to convert
    -transform  legacy      Which type of transform to use to calculate cepstra (legacy, dct, or htk)
    -unit_area  yes     Normalize mel filters to unit area
    -upperf     6855.4976   Upper edge of filters
    -verbose    no      Show input filenames
    -warp_params            Parameters defining the warping function
    -warp_type  inverse_linear  Warping function type (or shape)
    -whichchan  0       Channel to process (numbered from 1), or 0 to mix all channels
    -wlen       0.025625    Hamming window length
    

    Sir, please help: how do I convert a .sph file to a .mfc file, and which arguments does sphinx_fe need for that?
    Thank you in advance for your help.

     
    • Nickolay V. Shmyrev

      sphinx_fe -nist yes -i file.sph -o file.mfc
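
For more than a handful of files, the same command line can be assembled in a script. The sketch below only builds the argument list shown above (`-nist yes -i ... -o ...`, all taken from the help output earlier in this thread); the helper name `sphinx_fe_command` and the choice to place the .mfc file next to the .sph file are assumptions for illustration.

```python
from pathlib import Path


def sphinx_fe_command(sph_path):
    """Build the sphinx_fe argument list for one NIST sphere file."""
    sph = Path(sph_path)
    mfc = sph.with_suffix(".mfc")  # same name, .mfc extension
    return ["sphinx_fe", "-nist", "yes", "-i", str(sph), "-o", str(mfc)]


if __name__ == "__main__":
    # Print the command instead of running it; pass the list to
    # subprocess.run(cmd, check=True) to execute it for real.
    print(" ".join(sphinx_fe_command("file.sph")))
```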

       
  • Pradeep S V

    Pradeep S V - 2016-10-27

    Sir, I want to implement the feature extraction process in MATLAB instead of using sphinx_fe. For that I need to know what format the extracted features are in before they are written to the binary file.

    If I extract features with a MATLAB script, how can I convert them into a .mfc binary file so that they can be used with Sphinx3?

     
    • Nickolay V. Shmyrev

      If I extract features using matlab script how can I convert into .mfc binary file so that it can used with sphinx3.

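      You have to write some simple code for that, something like

          % mfcc is assumed to be ncep x nframes (one frame per column), so that
          % MATLAB's column-major fwrite emits the coefficients frame by frame.
          [iM, iN] = size(mfcc);
          iNumData = iM * iN;               % header: total number of float values
          fid = fopen('out.mfc', 'wb');
          fwrite(fid, iNumData, 'int32');   % 4-byte length header
          fwrite(fid, mfcc, 'float32');     % feature data
          fclose(fid);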
      

      MFCC file format is described here http://cmusphinx.sourceforge.net/wiki/mfcformat
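
The same idea outside MATLAB, as a rough Python sketch: write the total value count as a 4-byte integer, then the values as 32-bit floats, frame by frame. The function name `write_mfc` and the `byteorder` parameter are assumptions for illustration; the default `"="` uses the machine's native byte order, which mirrors what the MATLAB snippet above does, and whether that matches what the recognizer expects has to be checked against the format description.

```python
import struct


def write_mfc(path, frames, byteorder="="):
    """Write frames (a sequence of equal-length coefficient lists) to `path`.

    Returns the number of float values written, i.e. the header value.
    """
    frames = [[float(v) for v in frame] for frame in frames]
    if frames and any(len(fr) != len(frames[0]) for fr in frames):
        raise ValueError("all frames must have the same number of coefficients")

    # Flatten frame by frame: all coefficients of frame 0, then frame 1, ...
    flat = [v for frame in frames for v in frame]

    with open(path, "wb") as f:
        f.write(struct.pack(f"{byteorder}i", len(flat)))          # header
        f.write(struct.pack(f"{byteorder}{len(flat)}f", *flat))   # data
    return len(flat)
```

For two frames of 13 coefficients the file would be 4 + 2 * 13 * 4 = 108 bytes long.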

       
      • Pradeep S V

        Pradeep S V - 2016-10-28

        Thank you sir

         
  • Diwakar.G

    Diwakar.G - 2016-10-28

    Thank you, sir, for your kind response.

     
  • Diwakar.G

    Diwakar.G - 2016-11-01

    Sir, I want to know how to write triphones for words; please show me an example. Is it compulsory to write triphones for all the words in the dictionary? Is phone-level alignment possible, and if so, how can it be done? I have studied a paper that uses the HTK toolkit, and by changing the language model weight they get very high alignment rates. What is the language model weight, and can it be changed in CMU Sphinx 3?

     
    • Nickolay V. Shmyrev

      cmusphinx is quite different from htk, you can't easily transfer htk work without deep understanding of all internals.

       
  • Pradeep S V

    Pradeep S V - 2016-11-01

    Sir, I am working on lyrics-to-song alignment. I have the same doubt: is it compulsory to write triphones for all the words in the dictionary?

     
    • Nickolay V. Shmyrev

      No. Unlike htk, cmusphinx deals with triphones internally; they are not used as input.

       
  • Diwakar.G

    Diwakar.G - 2016-11-01

    Sir, I am really sorry if I am disturbing you. How do I start with phoneme-level alignment? What changes do I need to make? Can I get any timing information from the alignment results? Sir, I am an M.Tech student and no one here can assist me. I don't know how to use the HTK toolkit; I know some basics of hidden Markov models in theory, and our guide told me to use CMU Sphinx3 directly. As I am new to this, I need your help, sir. Please help me. Thank you.

     

    Last edit: Diwakar.G 2016-11-14
    • Nickolay V. Shmyrev

      our guide tell me directly to use cmu sphinx3

      You can ask him for further details then.

       
  • Diwakar.G

    Diwakar.G - 2016-11-01

    Sir, I am currently working on dysarthric speech-to-text alignment. The speech of people with dysarthria usually contains long pauses between words. So I first need to locate those pauses and remove them, and then pass the result to CMU Sphinx3. For this I have written a dictionary for the words they uttered. Now I have been told to do phone-level alignment first. Here I am confused: is it mandatory to write triphones for all the words in the dictionary? Do I need to modify the code for phone-level alignment? Sir, please help me.
    Thank you.

     
  • Pradeep S V

    Pradeep S V - 2016-11-03

    Sir, I don't understand what the problem is. Can you please tell me? I am stuck on this error. I am using my own dictionary and data for training the model.

    sitecsp@acl-pg-06:~/Documents/an4$ perl scripts_pl/RunAll.pl
    MODULE: 00 verify training files
    O.S. is case sensitive ("A" != "a").
    Phones will be treated as case sensitive.
        Phase 1: DICT - Checking to see if the dict and filler dict agrees with the phonelist file.
            Found 223 words using 40 phones
        Phase 2: DICT - Checking to make sure there are not duplicate entries in the dictionary
        Phase 3: CTL - Check general format; utterance length (must be positive); files exist
    WARNING: CTL line does not parse correctly:
    
        Phase 4: CTL - Checking number of lines in the transcript should match lines in control file
        Phase 5: CTL - Determine amount of training data, see if n_tied_states seems reasonable.
            Total Hours Training: 0.131032692307692
            This is a small amount of data, no comment at this time
        Phase 6: TRANSCRIPT - Checking that all the words in the transcript are in the dictionary
            Words in dictionary: 220
            Words in filler dictionary: 3
    WARNING: Bad line in transcript:
    
        Phase 7: TRANSCRIPT - Checking that all the phones in the transcript are in the phonelist, and all phones in the phonelist appear at least once
    Something failed: (/home/sitecsp/Documents/an4/scripts_pl/00.verify/verify_all.pl)
    

    I have checked that all the phones are used at least once in the dictionary.

     
    • Nickolay V. Shmyrev

      You are using outdated sphinxtrain.

      You have bad empty lines in transcript and ctl files.

       
  • Pradeep S V

    Pradeep S V - 2016-11-03

    Sir, is there a newer version of sphinxtrain available for CMU Sphinx 3? If yes, how do I install it?

    I have attached the transcription file below. Sir, please tell me how to remove the bad empty lines.
    Thank you.

     
    • Nickolay V. Shmyrev

      Sir, is there any newer version sphinxtrain available for cmu sphinx 3. If yes, how to install it.

      In downloads.

      I have attached the transcription file below sir please tell me how to remove bad empty lines.

      With a text editor.
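
For larger transcript or control files, the blank lines can also be dropped with a short script. This is a minimal sketch under stated assumptions: the function name `drop_blank_lines` is made up for illustration, the files are assumed to be UTF-8 text, and a line counts as "empty" if it contains only whitespace. The output goes to a separate file so the original stays untouched.

```python
def drop_blank_lines(src_path, dst_path):
    """Copy src to dst without empty or whitespace-only lines.

    Returns (kept, dropped) line counts.
    """
    kept = dropped = 0
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            if line.strip():
                # Normalise the line ending while copying.
                dst.write(line.rstrip("\r\n") + "\n")
                kept += 1
            else:
                dropped += 1
    return kept, dropped
```

Running it on both the transcript and the control file and comparing the `kept` counts gives a quick check that the two files still have the same number of lines.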

       
  • Pradeep S V

    Pradeep S V - 2016-11-03

    Sir, I have removed the empty lines. Now, while running
    perl scripts_pl/20.ci_hmm/slave_convg.pl
    I am getting the following error.

    sitecsp@acl-pg-06:~/Documents/an4$ perl scripts_pl/05.vector_quantize/slave.VQ.pl
    MODULE: 05 Vector Quantization
    Skipped for continuous models
    sitecsp@acl-pg-06:~/Documents/an4$ perl scripts_pl/20.ci_hmm/slave_convg.pl
    MODULE: 20 Training Context Independent models
        Phase 1: Cleaning up directories:
        accumulator...logs...qmanager...models...
        Phase 2: Flat initialize
        Phase 3: Forward-Backward
            Baum welch starting for 1 Gaussian(s), iteration: 1 (1 of 1)
            0% *** Error in `/home/sitecsp/Documents/an4/bin/bw': free(): invalid next size (fast): 0x00000000024ad0a0 ***
    
    This step had 14 ERROR messages and 0 WARNING messages.  Please check the log file for details.
    Only 0 parts of 1 of Baum Welch were successfully completed
    Parts 1 failed to run!
    Training failed in iteration 1
    

    Can you please help me?
    Thank you.

     
    • Nickolay V. Shmyrev

      Use the latest sphinxtrain and follow the tutorial:

      http://cmusphinx.sourceforge.net/wiki/tutorialam

      It gives correct and up-to-date information about acoustic model training.

       