
sphinx_fe warp_type parameter

Help
2011-08-31
2012-09-22
  • Rafael Oliveira

    Rafael Oliveira - 2011-08-31

    Hi,

    Do you know which warp type in sphinx_fe (inverse_linear, linear or piecewise)
    corresponds to the warp function HCopy uses for vocal tract length
    normalization?

    I'm trying to extract features with sphinx_fe that are the same as (or as
    close as possible to) the .mfc files I get from HCopy, so if you have any
    information that could help, please let me know.

     
  • Nickolay V. Shmyrev

    http://nsh.nexiwave.com/2009/09/using-htk-models-in-sphinx4.html?showComment=1254025513274#c8154075213468076053

    Hey, the C front-end in sphinxbase can generate HTK features although they
    need to be rearranged a bit... The options corresponding to the HTK config
    above are:

    -round_filters no
    -unit_area no
    -remove_dc yes
    -transform htk
    -lifter 22
    -nfilt 26
    -lowerf 1
    -upperf 8000
    

    Note, however, that liftering does absolutely nothing if you use CMN.
    Likewise, -unit_area makes no difference either in this case.
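
One way to read the remark about liftering and CMN: liftering multiplies each cepstral coefficient by a fixed weight, so it commutes with mean subtraction and only rescales each dimension. The sketch below illustrates this, assuming the HTK-style lifter weights 1 + (L/2) sin(pi n / L) for L = 22. The frame values are toy numbers and the helper names are hypothetical, not part of sphinxbase or HTK.

```python
import math

def htk_lifter_weights(num_ceps, lifter):
    """HTK-style lifter weights 1 + (L/2) * sin(pi * n / L) for n = 1..num_ceps."""
    return [1.0 + (lifter / 2.0) * math.sin(math.pi * n / lifter)
            for n in range(1, num_ceps + 1)]

def lift(frames, weights):
    """Scale every coefficient of every frame by its fixed lifter weight."""
    return [[c * w for c, w in zip(frame, weights)] for frame in frames]

def cmn(frames):
    """Cepstral mean normalization: subtract the per-coefficient mean over all frames."""
    means = [sum(col) / len(frames) for col in zip(*frames)]
    return [[c - m for c, m in zip(frame, means)] for frame in frames]

# Toy cepstra (3 frames x 4 coefficients), purely illustrative.
frames = [
    [-19.6, -14.5, -12.1, -1.4],
    [-16.7, -10.8, -8.4, -3.3],
    [-12.2, -9.9, -7.0, -2.1],
]
weights = htk_lifter_weights(num_ceps=4, lifter=22)

lift_then_cmn = cmn(lift(frames, weights))
cmn_then_lift = lift(cmn(frames), weights)

# Both orders agree, so liftering is just a constant per-dimension scale after CMN.
commutes = all(
    abs(a - b) < 1e-9
    for row_a, row_b in zip(lift_then_cmn, cmn_then_lift)
    for a, b in zip(row_a, row_b)
)
print("liftering commutes with CMN:", commutes)
```

A constant per-dimension scale does still change the raw numbers, so when comparing unnormalized sphinx_fe and HCopy output the lifter setting has to match on both sides.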

     
  • Rafael Oliveira

    Rafael Oliveira - 2011-09-05

    Thank you for your reply, it helped a lot!

    However, I'm still having problems with c0, c1 and c3. As you can see in the
    HTK and Sphinx output below, the values of these coefficients remain far from
    similar.

    HTK output:

    x: MFCC-1 MFCC-2 MFCC-3 MFCC-4 MFCC-5 MFCC-6 MFCC-7 MFCC-8 MFCC-9 MFCC-10 MFCC-11 MFCC-12 C0
    ------------------------------------------------ Samples: 0->-1 ------------------------------------------------
    0: -19.628 -14.573 -12.145 -1.408 -0.009 -13.456 -27.977 -5.427 3.278 -2.887 -8.363 3.391 94.208
    1: -16.742 -10.796 -8.409 -3.281 -6.843 -20.527 -22.235 -7.657 9.613 -1.215 -1.462 5.480 95.465

    Sphinx output:

    frame#: c[ 0] c[ 1] c[ 2] c[ 3] c[ 4] c[ 5] c[ 6] c[ 7] c[ 8] c[ 9] c[10] c[11] c[12]
    0: 55.125 -10.512 -14.357 -9.635 -1.656 0.207 -13.878 -26.570 -4.790 3.916 -3.998 -8.733 4.924
    1: 56.364 -7.729 -10.867 -6.368 -2.854 -5.398 -20.044 -20.778 -7.683 10.223 -0.931 -0.569 7.168

    Here are my sphinx_fe options and my HCopy config file, in that order. Note
    that in order to compare just the 13 cepstral coefficients, I'm not using
    the D, A and Z options in HCopy.

    -alpha 0.97
    -nfilt 26
    -ncep 13
    -lowerf 1
    -upperf 8000
    -transform htk
    -mswav yes
    -lifter 22
    -remove_dc yes

    USESILDET = FALSE
    ENORMALISE = TRUE
    NUMCEPS = 12
    CEPLIFTER = 22
    NUMCHANS = 26
    USEPOWER = TRUE
    PREEMCOEF = 0.97
    USEHAMMING = T
    WINDOWSIZE = 250000.0
    TARGETRATE = 100000.0
    TARGETKIND = MFCC_0
    ZMEANSOURCE = T
    SOURCEFORMAT = WAV
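
HTK expresses WINDOWSIZE and TARGETRATE in units of 100 ns, while sphinx_fe takes a window length in seconds (-wlen) and a frame rate in frames per second (-frate). The sphinx_fe option list above sets neither, so the tool's defaults apply, and it may be worth checking that they match the HTK values. A minimal sketch of the unit conversion, with a hypothetical helper name:

```python
HTK_UNITS_PER_SECOND = 10_000_000  # HTK time values are given in units of 100 ns

def htk_time_to_seconds(value):
    """Convert an HTK time value (100 ns units) to seconds."""
    return value / HTK_UNITS_PER_SECOND

# Values taken from the HCopy config above.
window_size = 250000.0   # WINDOWSIZE
target_rate = 100000.0   # TARGETRATE (frame shift)

window_len_s = htk_time_to_seconds(window_size)
frame_shift_s = htk_time_to_seconds(target_rate)
frames_per_second = round(1.0 / frame_shift_s)

# Candidate sphinx_fe options that mirror the HTK framing.
print(f"-wlen {window_len_s:g} -frate {frames_per_second}")
```

If the sphinx_fe window length differs from the converted WINDOWSIZE, every coefficient is computed over a slightly different stretch of audio, so small mismatches in all columns would be expected.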

    Am I doing something wrong?

     
  • Nickolay V. Shmyrev

    Sorry, do you mean you see the difference in the first 3 frames and the other
    frames are correct?

     
  • Rafael Oliveira

    Rafael Oliveira - 2011-09-06

    No, I mean c0, c1 and c3 (in HTK: C0, MFCC-1 and MFCC-3). The others are
    okay in my opinion.

    Sphinx

     c[ 0]     c[ 1]     c[ 3]
    55.125   -10.512    -9.635


    HTK

        C0    MFCC-1    MFCC-3
    94.208   -19.628   -12.145
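
The comparison above can be reproduced mechanically. The sketch below takes frame 0 from both outputs posted earlier in the thread, moves the first Sphinx coefficient to the end (assuming, as the posted values suggest, that Sphinx writes C0 first while HTK's MFCC_0 writes it last), and ranks the coefficients by absolute difference. The function and variable names are hypothetical.

```python
# Frame 0 as posted in the thread.
SPHINX_FRAME0 = [55.125, -10.512, -14.357, -9.635, -1.656, 0.207, -13.878,
                 -26.570, -4.790, 3.916, -3.998, -8.733, 4.924]
HTK_FRAME0 = [-19.628, -14.573, -12.145, -1.408, -0.009, -13.456, -27.977,
              -5.427, 3.278, -2.887, -8.363, 3.391, 94.208]

# HTK column labels for TARGETKIND = MFCC_0: twelve cepstra followed by C0.
HTK_LABELS = [f"MFCC-{i}" for i in range(1, 13)] + ["C0"]

def sphinx_to_htk_order(frame):
    """Move the leading C0 to the end (assumed layout difference)."""
    return frame[1:] + frame[:1]

def abs_differences(sphinx_frame, htk_frame):
    """Per-coefficient absolute difference, keyed by HTK label."""
    reordered = sphinx_to_htk_order(sphinx_frame)
    return {label: abs(s - h)
            for label, s, h in zip(HTK_LABELS, reordered, htk_frame)}

diffs = abs_differences(SPHINX_FRAME0, HTK_FRAME0)
worst = sorted(diffs, key=diffs.get, reverse=True)[:3]

for label in worst:
    print(f"{label}: |difference| = {diffs[label]:.3f}")
```

On the posted frame the three largest gaps are C0, MFCC-1 and MFCC-3, which matches the coefficients singled out in this post.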
    
     
