CMU Sphinx / Forums / Help: test

Sebastian - 2008-08-08

dear all,

I am looking for speech feature values and found this test_means.1.txt file in the ...\SphinxTrain\test\cp_parm folder.

question 1: is it the right file?

I opened the file with Notepad and got this information:

param 135 1 1
mgau 0
feat 0
density 0 -4.367e-07 3.294e-08 -7.529e-09 2.243e-08 6.745e-09 3.216e-09 -1.427e-08 -2.549e-09 6.745e-09 -6.157e-09 -6.451e-09 -1.608e-09 2.667e-09 4.987e-02 2.084e-02 9.597e-04 1.033e-02 3.889e-04 -4.084e-03 4.686e-04 -9.042e-04 -1.617e-03 -5.149e-03 -5.388e-03 -3.314e-03 -8.431e-03 -7.605e-03 -1.187e-02 -7.943e-03 -1.409e-02 -7.232e-04 6.492e-03 -3.121e-03 -5.490e-03 2.059e-03 3.168e-03 -2.580e-03 -3.544e-03 6.253e-03 .. etc ..

question 2: how do I read this info? are they MFCC values? (there are 40 values in each set)

question 3: are there any tools to convert other means or variances files from binary to ASCII and vice versa?

thank you in advance

regards,
zbastian

- Sebastian - 2008-08-11
  
  first, thank you mr. Shmyrev .. (I'm still trying to build sphinxbase and sphinx3 now, since I only have sphinx4 on my machine and do not have VC++)
  
  sorry for not being very clear. what I am looking for is clean speech feature vector values for each phoneme (39 English phonemes) as output by the sphinx4 feature extractor, so I can use them to benchmark my feature extractor simulation (in MATLAB) ..
  
  Q1: is it possible to get these clean speech feature vector values? or do you have any suggestions for that?
  
  from the literature, I find that an MFCC-based speech recognizer will usually use 3x13 frequency + 1 energy coefficient = 40 values in total for 1 feature vector, and that each phoneme usually consists of 3 sets (3x20ms STFT windows)
  
  Q2: does sphinx4 also use this format? if yes, how does sphinx4 store these values in its acoustic corpus? is it a combination of the means, variances, mixture_weights, etc. files?
  
  thanks for your time ..
  
  regards,
  zbastian
  
  - Nickolay V. Shmyrev - 2008-08-11
    
    > if yes, how does sphinx4 store these values in its acoustic corpus?
    
    I think you don't understand the following concepts:
    
    Speech
    Phoneme
    Phone
    Corpus
    Waveform
    Feature
    Cepstrum
    Gaussian
    Mean
    Variance
    Gaussian Mixture
    Hidden Markov Model
    Acoustic model
    
    Until you read a book and learn the concepts above, our discussion is pointless.
    
- Nickolay V. Shmyrev - 2008-08-11
  
  > question 1: is it the right file?
  
  Right for what? As a sample feature file? No. Feature files are in binary format; you can, for example, download the an4 database in MFCC form, or create your own feature file from a wave file with wave2feat.
  
  > question 2: how do I read this info? are they MFCC values? (there are 40 values in each set)
  
  No, they are the means of Gaussians. They describe, of course, an "average" MFCC vector for each of the 135 Gaussians in the RM1 model.
  
  > question 3: are there any tools to convert other means or variances files from binary to ASCII and vice versa?
  
  yes, bin/printp.exe
  
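For question 2, the ASCII layout pasted in the first post (param, mgau, feat and density lines) can be loaded with a few lines of Python. This is only a minimal sketch and not part of SphinxTrain: the function name parse_means_text is hypothetical, reading the three numbers after param as counts of mixture Gaussians, feature streams and densities is an assumption, and the sample below keeps just the first three values of the density line quoted in the first post.

```python
def parse_means_text(text):
    """Parse text in the param/mgau/feat/density layout shown in the first post.

    Returns (header, means): header is the list of integers after 'param',
    and means maps (mgau, feat, density) index triples to lists of floats.
    """
    header = None
    means = {}
    mgau = feat = None
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        key, rest = tokens[0], tokens[1:]
        if key == "param":
            # Assumed meaning: number of Gaussian mixtures, streams, densities.
            header = [int(t) for t in rest]
        elif key == "mgau":
            mgau = int(rest[0])
        elif key == "feat":
            feat = int(rest[0])
        elif key == "density":
            density = int(rest[0])
            # Everything after the density index is one mean vector.
            means[(mgau, feat, density)] = [float(t) for t in rest[1:]]
    return header, means


# Sample truncated to the first three values of the quoted density line.
sample = """param 135 1 1
mgau 0
feat 0
density 0 -4.367e-07 3.294e-08 -7.529e-09
"""

header, means = parse_means_text(sample)
print(header)
print(means[(0, 0, 0)])
```

Each entry of means is then one mean vector of one Gaussian, which is what the reply above describes as an "average" MFCC vector, not a feature frame extracted from a recording.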
