test_means.1.txt

Sebastian
2008-08-08
2012-09-22
  • Sebastian

    Sebastian - 2008-08-08

    dear all,

    i am looking for speech feature values and i found this test_means.1.txt file in the ...\SphinxTrain\test\cp_parm folder.

    question 1: is it the right file?

    i opened the file with notepad and got this information:

    param 135 1 1
    mgau 0
    feat 0
    density 0 -4.367e-07 3.294e-08 -7.529e-09 2.243e-08 6.745e-09 3.216e-09 -1.427e-08 -2.549e-09 6.745e-09 -6.157e-09 -6.451e-09 -1.608e-09 2.667e-09 4.987e-02 2.084e-02 9.597e-04 1.033e-02 3.889e-04 -4.084e-03 4.686e-04 -9.042e-04 -1.617e-03 -5.149e-03 -5.388e-03 -3.314e-03 -8.431e-03 -7.605e-03 -1.187e-02 -7.943e-03 -1.409e-02 -7.232e-04 6.492e-03 -3.121e-03 -5.490e-03 2.059e-03 3.168e-03 -2.580e-03 -3.544e-03 6.253e-03 ... etc. ...

    question 2: how do i read this info? are they mfcc values? (there are 40 values in each set)

    question 3: are there any tools to convert other means or variances files from binary to ascii and vice versa?

    thank you in advance

    regards,
    zbastian

     
    • Sebastian

      Sebastian - 2008-08-11

      first, thank you mr. Shmyrev .. (i'm still trying to build sphinxbase and sphinx3 now, since i only have sphinx4 on my machine and do not have vc++)

      sorry for not being very clear. what i am looking for is clean speech feature vector values for each phoneme (39 english phonemes) as output by the sphinx4 feature extractor, so i can use them to benchmark my feature extractor simulation (in matlab) ..

      Q1: is it possible to get these clean speech feature vector values? or do you have any suggestions for that?

      from the literature, i find that mfcc-based speech recognizers usually use 3x13 freq + 1 energy coefficient = 40 values in total for 1 feature vector, and that each phoneme usually consists of 3 sets (3x20ms STFT windows)

      Q2: does sphinx4 also use this format? if yes, how does sphinx4 store these values in its acoustic corpus? is it a combination of the means, variances, mixture_weights, etc. files?

      thanks for your time ..

      regards,
      zbastian

       
      • Nickolay V. Shmyrev

        > if yes, how does sphinx4 store these values in its acoustic corpus?

        I think you don't understand the following concepts:

        Speech
        Phoneme
        Phone
        Corpus
        Waveform
        Feature
        Cepstrum
        Gaussian
        Mean
        Variance
        Gaussian Mixture
        Hidden Markov Model
        Acoustic model

        Until you read a book and learn the concepts above, our discussion is pointless.

         
    • Nickolay V. Shmyrev

      > question 1: is it the right file?

      Right for what? As a sample feature file? No. Feature files are in binary format; you can download the an4 database in MFCC form, for example, or create your own feature file from a wave file with wave2feat.

      > question 2: how do i read this info? are they mfcc values? (there are 40 values in each set)

      No, they are the means of Gaussians. Of course, they describe the "average" MFCC vectors for each of the 135 Gaussians in the RM1 model.

      > question 3: are there any tools to convert other means or variances files from binary to ascii and vice versa?

      yes, bin/printp.exe
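
      Once a means file is in ASCII form, it can be read with a few lines of script. The following is a minimal Python sketch, assuming the layout seen in the excerpt quoted in this thread: a `param` header line (presumably the numbers of mixtures, feature streams, and densities per mixture), followed by `mgau`, `feat`, and `density` lines. The function name `parse_means_dump` and the truncated sample text are illustrative only and are not part of SphinxTrain.

```python
from typing import Dict, List, Optional, Tuple

Key = Tuple[int, int, int]  # (mgau index, feat index, density index)


def parse_means_dump(text: str) -> Tuple[Optional[Tuple[int, ...]], Dict[Key, List[float]]]:
    """Parse an ASCII means dump laid out like the excerpt in this thread.

    Returns the integers from the 'param' line and a mapping from
    (mgau, feat, density) to the list of mean values on that density line.
    """
    header: Optional[Tuple[int, ...]] = None
    means: Dict[Key, List[float]] = {}
    mgau = feat = -1  # set by the 'mgau' and 'feat' lines that precede densities

    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        tag = tokens[0]
        if tag == "param":
            header = tuple(int(t) for t in tokens[1:])
        elif tag == "mgau":
            mgau = int(tokens[1])
        elif tag == "feat":
            feat = int(tokens[1])
        elif tag == "density":
            density = int(tokens[1])
            means[(mgau, feat, density)] = [float(t) for t in tokens[2:]]
    return header, means


# Truncated sample: only the first three values of the quoted density line.
SAMPLE = """\
param 135 1 1
mgau 0
feat 0
density 0 -4.367e-07 3.294e-08 -7.529e-09
"""

header, means = parse_means_dump(SAMPLE)
print(header)
print(means[(0, 0, 0)])
```

      Each parsed list is one Gaussian mean vector, so it can be compared dimension by dimension with feature vectors produced elsewhere, provided both use the same feature layout.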

       
