MFC files during adaptation

Forum: Help

Creator: Balaji

Created: 2017-08-25

Updated: 2018-01-01

Balaji - 2017-08-25

Hello,

I am following the tutorial page "Adapting the default acoustic model". When I run the step "Generating acoustic feature files", an .mfc file is created for each of my wav files. These files are in a binary format that I cannot read directly. Do they contain the features (MFCCs) of each sound signal?

I have learnt the following steps for creating MFCCs from a .wav file (from a textbook):
* The wav file's sampling rate is 16000 Hz.
* Framing --> Windowing --> FFT --> magnitude of the FFT --> filter bank outputs --> log base 10 --> cosine transform
Implementing this procedure produces a file containing a number of floating-point values (positive and negative).
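
The textbook steps listed above can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the code that sphinx_fe runs: the frame length (400 samples), hop (160 samples), FFT size (512), 26 filters, 13 coefficients, the Hamming window, and all function names are choices made here for illustration. Details such as pre-emphasis, power versus magnitude spectrum, the log base, and the cosine transform scaling can differ between implementations, so the output should not be expected to match another tool's numbers digit for digit.

```python
import cmath
import math

def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def frames(signal, frame_len, hop):
    """Framing: split the signal into overlapping frames."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def magnitude_spectrum(frame, nfft):
    """Windowing (Hamming), then a naive DFT magnitude for bins 0..nfft/2."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    windowed += [0.0] * (nfft - n)  # zero-pad up to the FFT size
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * i / nfft)
                    for i, x in enumerate(windowed)))
            for k in range(nfft // 2 + 1)]

def mel_filterbank(n_filters, nfft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    edges = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bins = [int(math.floor((nfft + 1) * hz / sample_rate)) for hz in edges]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = bins[m - 1], bins[m], bins[m + 1]
        row = [0.0] * (nfft // 2 + 1)
        for k in range(lo, mid):   # rising edge of the triangle
            row[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):   # falling edge of the triangle
            row[k] = (hi - k) / (hi - mid)
        bank.append(row)
    return bank

def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         nfft=512, n_filters=26, n_ceps=13):
    """Return one list of n_ceps cepstral coefficients per frame."""
    bank = mel_filterbank(n_filters, nfft, sample_rate)
    out = []
    for frame in frames(signal, frame_len, hop):
        spec = magnitude_spectrum(frame, nfft)
        energies = [sum(w * s for w, s in zip(row, spec)) for row in bank]
        logs = [math.log10(max(e, 1e-10)) for e in energies]  # log base 10
        # Cosine transform (unscaled DCT-II) of the log filter bank outputs.
        ceps = [sum(v * math.cos(math.pi * c * (j + 0.5) / n_filters)
                    for j, v in enumerate(logs))
                for c in range(n_ceps)]
        out.append(ceps)
    return out

if __name__ == "__main__":
    # 50 ms of a 440 Hz tone sampled at 16000 Hz.
    tone = [math.sin(2 * math.pi * 440 * n / 16000) for n in range(800)]
    for row in mfcc(tone):
        print(" ".join("%.3f" % v for v in row))
```

The naive DFT keeps the sketch free of third-party dependencies; for real audio an FFT routine would be the practical choice.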

Is the procedure adopted in Sphinx4 the same as the above? If so, is it possible to get a readable (text) version of these .MFC files?

Thank you.

- Nickolay V. Shmyrev - 2017-08-27
  
  Is the procedure adopted in Sphinx4 the same as the above?
  
  Yes
  
  If so, is it possible to get a readable (text) version of these .MFC files?
  
  Run sphinx_cepview -f file.mfc
  

Balaji - 2017-08-30

Thank you. I got the numbers.
However, my own numbers are very different from Sphinx's output for the same wav file. I am going through my algorithm to try to reproduce the values that Sphinx produces.

- Balaji - 2017-12-31
  
  I am going through the MFC files created by running sphinx_fe on my wav files. I got the readable version using the sphinx_cepview program.
  I see that the contents of each MFC file are organized as a 10-column matrix. The number of rows increases with the wav file size.
  
  Can I get more details on the contents and structure of the MFC files?
  
  Thank you.
  
  - Nickolay V. Shmyrev - 2018-01-01
    
    https://cmusphinx.github.io/wiki/mfcformat/
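
As far as I understand the format described on that page, an MFC file is a 4-byte integer header giving the number of 4-byte floats that follow, and the floats are the cepstral vectors stored frame after frame. A minimal Python sketch for reading such a file is below. The function name `read_mfc`, the default of 13 coefficients per frame, and the byte-order check against the file size are assumptions made for this illustration; the vector size actually used depends on the feature parameters. The 10 columns printed by sphinx_cepview are, to my knowledge, its default display width rather than the stored vector size.

```python
import struct

def read_mfc(path, n_ceps=13):
    """Read an MFC file into a list of frames (each a list of floats).

    Assumes a 4-byte integer header holding the number of 4-byte floats
    that follow, and n_ceps values per frame (13 is an assumption here).
    """
    with open(path, "rb") as f:
        data = f.read()
    if len(data) < 4:
        raise ValueError("file too short to contain a header")
    payload = len(data) - 4
    # The byte order is not recorded in the file, so try both and keep
    # the one whose header agrees with the actual file size.
    for order in ("<", ">"):
        (count,) = struct.unpack(order + "i", data[:4])
        if count * 4 == payload:
            break
    else:
        raise ValueError("header does not match the file size")
    values = struct.unpack("%s%df" % (order, count), data[4:])
    return [list(values[i:i + n_ceps]) for i in range(0, count, n_ceps)]

def dump_text(frames):
    """Render frames as text, one frame per line."""
    return "\n".join(" ".join("%.3f" % v for v in row) for row in frames)
```

Calling `print(dump_text(read_mfc("file.mfc")))` would then print one row per frame, where `file.mfc` stands for any feature file produced by sphinx_fe.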
    

Balaji - 2017-11-20

Hello,

I am using Sphinx4 for "Adapting the default acoustic model".
I have attached data.zip, which contains some voice files in a directory named "notworking". When I run sphinx_fe.exe (Windows x64 release build) on these files, the program crashes. The attached Sphinx_fe error.docx shows the crash.

The command line I am using is:
"F:\CMUSphinx\sphinxbase\bin\Release\x64\sphinx_fe.exe" -argfile cmusphinx-en-us-5.2\feat.params -samprate 16000 -c VoiceFiles\UAS2.fileids -di .\VoiceFiles -do . -ei wav -eo mfc -mswav yes

I then converted the audio files using an online tool with the following parameters:

Sampling rate: 16000 Hz

Bit resolution: 16 bits

Audio channels: mono

After conversion, the error disappears and I am able to create the .MFC files successfully. However, when I analyze the two audio files (before and after conversion), the metadata output is the same. Since running the converter solved the problem, there must be some difference in the audio files' metadata, mustn't there? I have also attached MetaDataOutput.docx. Could somebody please look into this and clarify?
Thank you.

Balaji.
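
One way to compare the two files more closely than a metadata viewer does is to list the RIFF chunks and the raw `fmt ` fields directly. Two files can report the same sampling rate, bit depth, and channel count while still differing in the format tag or in extra chunks stored before the audio data. The Python sketch below is only a diagnostic aid under those assumptions; the function name `describe_wav` is made up for this example, and whether any such difference is what makes sphinx_fe crash on these files is not established in this thread.

```python
import struct

def describe_wav(path):
    """Return (chunks, fmt) for a RIFF/WAVE file.

    chunks is a list of (chunk_id, size) in file order; fmt is a dict of
    the basic fields of the 'fmt ' chunk, or None if there is none.
    """
    chunks, fmt = [], None
    with open(path, "rb") as f:
        riff, _, wave_id = struct.unpack("<4sI4s", f.read(12))
        if riff != b"RIFF" or wave_id != b"WAVE":
            raise ValueError("not a RIFF/WAVE file")
        while True:
            header = f.read(8)
            if len(header) < 8:
                break
            chunk_id, size = struct.unpack("<4sI", header)
            body_start = f.tell()
            if chunk_id == b"fmt " and size >= 16:
                tag, channels, rate, _, _, bits = struct.unpack(
                    "<HHIIHH", f.read(16))
                fmt = {"format_tag": tag, "channels": channels,
                       "sample_rate": rate, "bits_per_sample": bits}
            chunks.append((chunk_id.decode("ascii", "replace"), size))
            # Chunk bodies are padded to an even number of bytes.
            f.seek(body_start + size + (size & 1))
    return chunks, fmt
```

Running `describe_wav` on a file before and after conversion and comparing the two results shows whether the chunk layout or the format tag changed, even when the sampling rate, bit resolution, and channel count are identical.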

MetaDataOutput.docx

Sphinx_fe error.docx

data.zip
