CMU Sphinx / Forums / Help: MFC file format?

Hadaiq Rolis - 2009-03-31

I have some questions about the MFC file format in Sphinx.

I want to replace the feature extraction algorithm (MFCC) with a modified MFCC in order to reduce noise. Therefore I want to know the format of the .mfc file generated by MFCC feature extraction.

How is the .mfc file generated? What information is saved in the .mfc file? Why is a binary file format used?

Thank you for your help in advance.

- Hadaiq Rolis - 2009-04-03
  
  Thank you for your response, but I have some more questions.
  
  Can you explain the header and the data in more detail?
  Is the header saved in the first bytes, followed by the data?
  Are the data elements saved as a matrix or as a sequence?
  
  - Nickolay V. Shmyrev - 2009-04-03
    
    > Is the header saved in the first bytes, followed by the data?
    
    Do you understand what you are asking? A "header" is called a "header" because it goes first.
    
- Nickolay V. Shmyrev - 2009-03-31
  
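  It's just a length header followed by binary data. As the code below shows, the header is a 4-byte integer holding the total number of values (frames × coefficients per frame), and the data are 32-bit floats.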
  
  /*** Header ***/
  /* compute number of frames and write cepfile header */
  numframes = fe_count_frames(FE, len, COUNT_PARTIAL);
  if (P->logspec != ON)
      outlen = numframes * FE->NUM_CEPSTRA;
  else
      outlen = numframes * FE->MEL_FB->num_filters;
  if (P->output_endian != P->machine_endian)
      SWAPL(&outlen);
  if (write(fp, &outlen, 4) != 4) {
      E_ERROR("Data write error on %s\n", outfile);
      close(fp);
      return (FE_OUTPUT_FILE_WRITE_ERROR);
  }
  if (P->output_endian != P->machine_endian)
      SWAPL(&outlen);
  
  /*** Data ***/
  int32 fe_writeblock_feat(param_t *P, fe_t *FE, int32 fp, int32 nframes, float32 **feat)
  {
      int32 i, length, nwritebytes;
  
      if (P->logspec == ON)
          length = nframes * FE->MEL_FB->num_filters;
      else
          length = nframes * FE->NUM_CEPSTRA;
  
      if (P->output_endian != P->machine_endian) {
          for (i = 0; i < length; ++i)
              SWAPF(feat[0] + i);
      }
  
      nwritebytes = length * sizeof(float32);
      if (write(fp, feat[0], nwritebytes) != nwritebytes) {
          close(fp);
          E_FATAL("Error writing block of features\n");
      }
  
      if (P->output_endian != P->machine_endian) {
          for (i = 0; i < length; ++i)
              SWAPF(feat[0] + i);
      }
  
      return (length);
  }
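
For anyone who wants to inspect such a file outside Sphinx, the layout in the code above (a 4-byte integer count, then that many 32-bit floats) can be read back with a few lines of Python. This is only a minimal sketch under some assumptions: the function name `read_mfc` is made up for illustration, the frame width `ncep=13` is the sphinx_fe default and must match whatever was used at extraction (or the number of filters for log-spectral files), and the byte order is guessed by checking which interpretation of the header agrees with the file size.

```python
import struct


def read_mfc(path, ncep=13):
    """Read a Sphinx-style feature file into a list of frames.

    Assumes the layout quoted above: int32 count of floats, then the floats.
    ncep is the number of values per frame (an assumption; 13 is the default).
    """
    with open(path, "rb") as f:
        raw = f.read()
    if len(raw) < 4:
        raise ValueError("file too short to contain a header")

    # Number of floats the file can actually hold after the 4-byte header.
    n_expected = (len(raw) - 4) // 4

    # Try little-endian first, then big-endian; keep the order whose
    # header value agrees with the file size.
    for order in ("<", ">"):
        (n_floats,) = struct.unpack(order + "i", raw[:4])
        if n_floats == n_expected:
            break
    else:
        raise ValueError("header does not match file size")

    if n_floats % ncep != 0:
        raise ValueError("float count is not a multiple of ncep")

    values = struct.unpack(f"{order}{n_floats}f", raw[4:4 + 4 * n_floats])
    # Frames are stored one after another, ncep values each.
    return [list(values[i:i + ncep]) for i in range(0, n_floats, ncep)]
```

Calling `read_mfc("file.mfc")` would then give one list of coefficients per frame, so the data are a plain sequence on disk that is interpreted as a frames × coefficients matrix.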
  

Diwakar.G - 2016-10-25

I have some questions about the MFC file format in Sphinx3.
Sir, I am currently working on dysarthric speech-to-text alignment.
I want to modify the feature extraction algorithm (MFCC). Therefore I want to know the format of the .mfc file generated by MFCC feature extraction.
How is the .mfc file generated? What information is saved in the .mfc file? Why is a binary file format used?
Also, I want to know how to write the triphones for words in the dictionary, and whether I need to write triphones for all the words in the dictionary.
Is there any standard dictionary available with phonetic descriptions?
I need your help.
Thank you.

- Nickolay V. Shmyrev - 2016-10-25
  
  > How is the .mfc file generated?
  
  From the command line with the sphinx_fe binary, or from the API with the sphinxbase/fe.h header.
  
  > What information is saved in the .mfc file?
  
  The feature matrix.
  
  > Why is a binary file format used?
  
  A binary representation is more efficient.
  
  > Also, I want to know how to write the triphones for words in the dictionary, and whether I need to write triphones for all the words in the dictionary.
  
  You can write triphones by hand in an editor or with a script.
  
  > Is there any standard dictionary available with phonetic descriptions?
  
  Sure, google for cmudict.
  

Thank you for your quick response.
I tried to run sphinx_fe from the command line, but the shell told me to install it first. I installed sphinx_fe and ran it again, and now it shows the following error saying that no arguments were given.

sitecsp@acl-pg-06:~/sphinxbase/src/sphinx_fe$ ls
an251-fash-b.sph  fe.h      Makefile.am  sphinx_fe    wave2feat.h
cmd_ln_defn.h     Makefile  Makefile.in  wave2feat.c  wave2feat.o
sitecsp@acl-pg-06:~/sphinxbase/src/sphinx_fe$ sphinx_fe
ERROR: "cmd_ln.c", line 675: No arguments given, available options are:
Arguments list definition:
[NAME]      [DEFLT]     [DESCR]
-alpha      0.97        Preemphasis parameter
-argfile            Argument file (e.g. feat.params from an acoustic model) to read parameters from.  This will override anything set in other command line arguments.
-blocksize  2048        Number of samples to read at a time.
-build_outdirs  yes     Create missing subdirectories in output directory
-c              Control file for batch processing
-cep2spec   no      Input is cepstral files, output is log spectral files
-di             Input directory, input file names are relative to this, if defined
-dither     no      Add 1/2-bit noise
-do             Output directory, output files are relative to this
-doublebw   no      Use double bandwidth filters (same center freq)
-ei             Input extension to be applied to all input files
-eo             Output extension to be applied to all output files
-example    no      Shows example of how to use the tool
-frate      100     Frame rate
-help       no      Shows the usage of the tool
-i              Single audio input file
-input_endian   little      Endianness of input data, big or little, ignored if NIST or MS Wav
-lifter     0       Length of sin-curve for liftering, or 0 for no liftering.
-logspec    no      Write out logspectral files instead of cepstra
-lowerf     133.33334   Lower edge of filters
-mach_endian    little      Endianness of machine, big or little
-mswav      no      Defines input format as Microsoft Wav (RIFF)
-ncep       13      Number of cep coefficients
-nchans     1       Number of channels of data (interlaced samples assumed)
-nfft       512     Size of FFT
-nfilt      40      Number of filter banks
-nist       no      Defines input format as NIST sphere
-npart      0       Number of parts to run in (supersedes -nskip and -runlen if non-zero)
-nskip      0       If a control file was specified, the number of utterances to skip at the head of the file
-o              Single cepstral output file
-ofmt       sphinx      Format of output files - one of sphinx, htk, text.
-part       0       Index of the part to run (supersedes -nskip and -runlen if non-zero)
-raw        no      Defines input format as raw binary data
-remove_dc  no      Remove DC offset from each frame
-round_filters  yes     Round mel filter frequencies to DFT points
-runlen     -1      If a control file was specified, the number of utterances to process, or -1 for all
-samprate   16000       Sampling rate
-seed       -1      Seed for random number generator; if less than zero, pick our own
-smoothspec no      Write out cepstral-smoothed logspectral files
-spec2cep   no      Input is log spectral files, output is cepstral files
-sph2pipe   no      Input is NIST sphere (possibly with Shorten), use sph2pipe to convert
-transform  legacy      Which type of transform to use to calculate cepstra (legacy, dct, or htk)
-unit_area  yes     Normalize mel filters to unit area
-upperf     6855.4976   Upper edge of filters
-verbose    no      Show input filenames
-warp_params            Parameters defining the warping function
-warp_type  inverse_linear  Warping function type (or shape)
-whichchan  0       Channel to process (numbered from 1), or 0 to mix all channels
-wlen       0.025625    Hamming window length

Sir, please explain how to convert a .sph file to a .mfc file and which arguments are needed, in detail.
Thank you for your help in advance.

Nickolay V. Shmyrev - 2016-10-27

sphinx_fe -nist yes -i file.sph -o file.mfc
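
For more than one file, the options listed in the help output above (`-c`, `-di`, `-do`, `-ei`, `-eo`) allow batch processing. As a rough sketch, the command line could be assembled like this in Python. The helper name `sphinx_fe_batch_cmd` and the paths in the example are hypothetical, and it is assumed that the control file lists one utterance name per line without extension.

```python
def sphinx_fe_batch_cmd(ctl_file, in_dir, out_dir, in_ext="sph", out_ext="mfc"):
    """Build a sphinx_fe argument list for batch conversion of NIST sphere files.

    Only options shown in the sphinx_fe help output are used.
    """
    return [
        "sphinx_fe",
        "-nist", "yes",   # input files are NIST sphere
        "-c", ctl_file,   # control file for batch processing
        "-di", in_dir,    # input directory
        "-do", out_dir,   # output directory
        "-ei", in_ext,    # extension added to input names
        "-eo", out_ext,   # extension added to output names
    ]


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    cmd = sphinx_fe_batch_cmd("etc/train.fileids", "wav", "feat")
    print(" ".join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` once sphinx_fe is installed and on the PATH.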


Pradeep S V - 2016-10-27

Sir, I want to implement the feature extraction process in MATLAB instead of using sphinx_fe. For that, I need to know what format the extracted features are in before they are written to the binary file.

If I extract features using a MATLAB script, how can I convert them into a .mfc binary file so that it can be used with Sphinx3?

- Nickolay V. Shmyrev - 2016-10-27
  
  > If I extract features using a MATLAB script, how can I convert them into a .mfc binary file so that it can be used with Sphinx3?
  
  You have to write some simple code for that, something like
  
      % fwrite stores a matrix column by column, so each column of mfcc
      % should hold one frame (mfcc is ncep x nframes)
      [iM, iN] = size(mfcc);
      iNumData = iM * iN;
      fid = fopen('out.mfc', 'wb');
      fwrite(fid, iNumData, 'int32');
      fwrite(fid, mfcc, 'float32');
      fclose(fid);
  
  The MFC file format is described here: http://cmusphinx.sourceforge.net/wiki/mfcformat
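
The same idea can be sketched in Python for readers who do not use MATLAB. This is a minimal illustration, not code from Sphinx: the name `write_mfc` is made up, frames are assumed to be a list of equal-length coefficient lists (one list per frame), and little-endian output is assumed because that is the sphinx_fe default shown in the help output.

```python
import struct


def write_mfc(path, frames, byte_order="<"):
    """Write frames as a Sphinx-style feature file.

    Layout: one int32 with the total number of floats, then the float32
    values frame after frame. byte_order is "<" (little) or ">" (big).
    """
    # Flatten frame by frame so each frame's coefficients stay contiguous.
    flat = [value for frame in frames for value in frame]
    with open(path, "wb") as f:
        f.write(struct.pack(byte_order + "i", len(flat)))
        f.write(struct.pack(f"{byte_order}{len(flat)}f", *flat))
    return len(flat)
```

For example, `write_mfc("out.mfc", frames)` with two 13-coefficient frames produces a 108-byte file: 4 bytes of header plus 26 × 4 bytes of data.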
  
  - Pradeep S V - 2016-10-28
    
    Thank you, sir.
    

Diwakar.G - 2016-10-28

Thank you, sir, for your kind response.


Diwakar.G - 2016-11-01

Sir, I want to know how to write triphones for words; please show me an example. Is it compulsory to write triphones for all the words in the dictionary? Is it possible to do phone-level alignment, and if so, how? I have read a paper in which the authors use the HTK toolkit and get very high alignment rates by changing the language model weight. What is the language model weight, and is it possible to change it in CMU Sphinx 3?

- Nickolay V. Shmyrev - 2016-11-01
  
  CMUSphinx is quite different from HTK; you can't easily transfer HTK work without a deep understanding of all the internals.
  

Pradeep S V - 2016-11-01

Sir, I am working on lyrics-to-song alignment. I have the same doubt: is it compulsory to write triphones for all the words in the dictionary?

- Nickolay V. Shmyrev - 2016-11-01
  
  No. Unlike HTK, CMUSphinx deals with triphones internally; they are not used as input.
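
To make the term concrete: a triphone is a phone together with its left and right neighbours, so the dictionary only needs the plain phone sequence for each word. The following Python sketch is purely conceptual and is not how CMUSphinx implements it. The function name, the `SIL` boundary padding, and the example pronunciation are assumptions for illustration; in real decoding the context at word edges comes from the neighbouring words.

```python
def triphones(phones, boundary="SIL"):
    """Return (base, left, right) context triples for a phone sequence.

    boundary is a placeholder context used at both ends (a simplification).
    """
    padded = [boundary] + list(phones) + [boundary]
    return [
        (padded[i], padded[i - 1], padded[i + 1])
        for i in range(1, len(padded) - 1)
    ]


if __name__ == "__main__":
    # Hypothetical dictionary entry: HELLO  HH AH L OW
    for base, left, right in triphones(["HH", "AH", "L", "OW"]):
        print(f"{base}({left},{right})")
```

So a dictionary line lists only `HH AH L OW`; the context-dependent units are derived from it automatically.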
  

Diwakar.G - 2016-11-01

Sir, I am really sorry if I am disturbing you. How do I start with phoneme-level alignment? What changes do I need to make? Can I get any timing information from the alignment results? Sir, I am an M.Tech student and there is no one here to assist me. I don't know how to use the HTK toolkit; I only know some basics of hidden Markov models in theory. Our guide told me to use CMU Sphinx3 directly. As I am new to this, I need your help, sir. Please help me. Thank you.

Last edit: Diwakar.G 2016-11-14

- Nickolay V. Shmyrev - 2016-11-01
  
  > Our guide told me to use CMU Sphinx3 directly
  
  You can ask him for further details then.
  

Diwakar.G - 2016-11-01

Sir, I am currently working on dysarthric speech-to-text alignment. People who suffer from dysarthria usually have long pauses between words and even within words. So I first need to locate the timing of those pauses and remove them, and then the speech is passed to CMU Sphinx3. For this I have written a dictionary of the words they uttered. Now I have been told to do phone-level alignment first. Here I am confused: is it mandatory to write triphones for all the words in the dictionary? Do I need to modify the code for phone-level alignment? Sir, please help me.
Thank you.


Sir, I don't understand what the problem is. Can you please tell me? I am stuck with this error. I am using my own dictionary and data for training the model.

sitecsp@acl-pg-06:~/Documents/an4$ perl scripts_pl/RunAll.pl
MODULE: 00 verify training files
O.S. is case sensitive ("A" != "a").
Phones will be treated as case sensitive.
    Phase 1: DICT - Checking to see if the dict and filler dict agrees with the phonelist file.
        Found 223 words using 40 phones
    Phase 2: DICT - Checking to make sure there are not duplicate entries in the dictionary
    Phase 3: CTL - Check general format; utterance length (must be positive); files exist
WARNING: CTL line does not parse correctly:

    Phase 4: CTL - Checking number of lines in the transcript should match lines in control file
    Phase 5: CTL - Determine amount of training data, see if n_tied_states seems reasonable.
        Total Hours Training: 0.131032692307692
        This is a small amount of data, no comment at this time
    Phase 6: TRANSCRIPT - Checking that all the words in the transcript are in the dictionary
        Words in dictionary: 220
        Words in filler dictionary: 3
WARNING: Bad line in transcript:

    Phase 7: TRANSCRIPT - Checking that all the phones in the transcript are in the phonelist, and all phones in the phonelist appear at least once
Something failed: (/home/sitecsp/Documents/an4/scripts_pl/00.verify/verify_all.pl)

I have checked that all the phones are used at least once in the dictionary.

Nickolay V. Shmyrev - 2016-11-03

You are using an outdated sphinxtrain.

You have bad empty lines in the transcript and ctl files.


Pradeep S V - 2016-11-03

Sir, is there a newer version of sphinxtrain available for CMU Sphinx 3? If yes, how do I install it?

I have attached the transcription file below. Sir, please tell me how to remove the bad empty lines.
Thank you.

an4_train.transcription

- Nickolay V. Shmyrev - 2016-11-03
  
  > Sir, is there a newer version of sphinxtrain available for CMU Sphinx 3? If yes, how do I install it?
  
  In downloads.
  
  > I have attached the transcription file below. Sir, please tell me how to remove the bad empty lines.
  
  With a text editor.
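
If the files are long, the same clean-up can be scripted. Below is a small Python sketch (the function name `drop_empty_lines` is made up, and UTF-8 text is assumed) that rewrites a file without blank or whitespace-only lines. It would need to be run on both the transcript and the ctl file, since the verification step above also checks that the two have the same number of lines.

```python
def drop_empty_lines(path):
    """Rewrite a text file in place without blank lines.

    Returns the number of lines that were dropped.
    """
    with open(path, encoding="utf-8") as f:
        lines = f.read().splitlines()

    # Keep only lines that contain something other than whitespace.
    kept = [line for line in lines if line.strip()]

    with open(path, "w", encoding="utf-8") as f:
        f.write("".join(line + "\n" for line in kept))
    return len(lines) - len(kept)
```

Keeping a backup copy of the original files before running it is a sensible precaution, because the file is overwritten.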
  

Sir, I have removed that empty line. Now, while running
perl scripts_pl/20.ci_hmm/slave_convg.pl
I am getting the following error.

sitecsp@acl-pg-06:~/Documents/an4$ perl scripts_pl/05.vector_quantize/slave.VQ.pl
MODULE: 05 Vector Quantization
Skipped for continuous models
sitecsp@acl-pg-06:~/Documents/an4$ perl scripts_pl/20.ci_hmm/slave_convg.pl
MODULE: 20 Training Context Independent models
    Phase 1: Cleaning up directories:
    accumulator...logs...qmanager...models...
    Phase 2: Flat initialize
    Phase 3: Forward-Backward
        Baum welch starting for 1 Gaussian(s), iteration: 1 (1 of 1)
        0% *** Error in `/home/sitecsp/Documents/an4/bin/bw': free(): invalid next size (fast): 0x00000000024ad0a0 ***

This step had 14 ERROR messages and 0 WARNING messages.  Please check the log file for details.
Only 0 parts of 1 of Baum Welch were successfully completed
Parts 1 failed to run!
Training failed in iteration 1

Can you please help me?
Thank you.

Nickolay V. Shmyrev - 2016-11-03

Use the latest sphinxtrain and follow the tutorial:

http://cmusphinx.sourceforge.net/wiki/tutorialam

It gives correct and up-to-date information about acoustic model training.

