Help with Sphinx3 Decoding (Syntax)

Help
2011-09-28
2012-09-22
  • Siddharth Sigtia

    I've been trying to use Sphinx3 for the last two months. I've managed to learn
    quite a bit of it, but am still very stuck in some places. I'm a novice and my
    doubts are probably very trivial but I would really appreciate it if some of
    the more experienced users could help me out.
    1. I successfully used sphinx_livedecode and sphinx_livepretend. However, I cannot figure out what the command line syntax for s3decode is. The script says usage is <part> <npart>. However, I cannot figure out what exactly part, npart and exptid stand for. Control, I'm guessing, is the ctl file.

    2. I built a system using the TIMIT database. However, the accuracy wasn't very good, so I tried to increase it by extracting more features. I extracted 39 features and trained the system. However, on running slave.pl I get the error:
      FATAL_ERROR: "kbcore.c", line 633: Feature streamlen(39) != mgau streamlen(117)

    I know I'm probably missing some field in the sphinx_decode.cfg file.

    I really hope someone can help out with this; I'm really stuck.
    Thank you so much in advance.

     
  • Nickolay V. Shmyrev

    what the command line syntax for s3decode is. The script says usage is
    <part> <npart> . However i cannot figure out what exactly part, npart and
    exptid stand for. Control im guessing is the ctl file.

    Hello

    The s3decode.pl script is designed to run in a parallel environment, so it
    can process a decoding task in parts. It can split the whole list into n
    parts and process one specific part, dumping the output for a later merge.
    If you just want to run it on the whole set, use part 1 and npart 1. Other
    arguments are optional.
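
A minimal sketch of the part/npart idea described above, assuming the list is cut into contiguous, near-equal chunks. The function name select_part and the utterance ids are hypothetical, and the actual s3decode.pl script may divide the list differently.

```python
def select_part(utterances, part, npart):
    """Return the utterances handled by job `part` out of `npart` jobs (1-based)."""
    if not 1 <= part <= npart:
        raise ValueError("part must be between 1 and npart")
    total = len(utterances)
    # Integer arithmetic keeps the chunks contiguous and non-overlapping.
    start = (part - 1) * total // npart
    end = part * total // npart
    return utterances[start:end]


# Hypothetical control list with ten utterance ids.
ctl = [f"utt{i:03d}" for i in range(10)]

# part 1 of npart 1 covers the whole set.
whole = select_part(ctl, 1, 1)

# Three jobs together cover every utterance exactly once.
thirds = [select_part(ctl, p, 3) for p in (1, 2, 3)]
```

With part 1 and npart 1 the slice is the full list, which is why those values decode the whole set in a single run.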

    If you want to run sphinx3_decode binary, you need to run it as:

    sphinx3_decode -hmm <hmm> -lm <lm> -dict <dict> -ctl <control_file with one name per line> -cepdir <directory for feature files> -hyp <output file>

    Or to process wavs

    sphinx3_decode -hmm <hmm> -lm <lm> -dict <dict> -ctl <control_file with one name per line> -cepdir <directory for feature files> -hyp <output file> -adcin yes -cepext .wav
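
If the decoder is driven from a script, the two command lines above could be assembled roughly as follows. This is only a sketch: build_decode_args and every path are hypothetical placeholders, and only the flags quoted in this post are used.

```python
def build_decode_args(hmm, lm, dictionary, ctl, cepdir, hyp, from_wav=False):
    """Assemble the sphinx3_decode argument list from the flags quoted above."""
    args = [
        "sphinx3_decode",
        "-hmm", hmm,          # acoustic model
        "-lm", lm,            # language model
        "-dict", dictionary,  # pronunciation dictionary
        "-ctl", ctl,          # control file, one name per line
        "-cepdir", cepdir,    # directory holding the input files
        "-hyp", hyp,          # output hypothesis file
    ]
    if from_wav:
        # Extra flags from the second command, for decoding wavs directly.
        args += ["-adcin", "yes", "-cepext", ".wav"]
    return args


# All paths below are made-up placeholders.
feat_args = build_decode_args("model", "timit.lm", "timit.dic",
                              "test.ctl", "feat", "result.hyp")
wav_args = build_decode_args("model", "timit.lm", "timit.dic",
                             "test.ctl", "wav", "result.hyp", from_wav=True)
```

The resulting list can be passed to subprocess.run(args, check=True) once real paths are filled in.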

    1. I built a system using the TIMIT database. However the accuracy wasn't
      very good. So i tried to increase the accuracy by extracting more features. I
      extracted 39 features and trained the system. However on running slave.pl i
      get the error: FATAL_ERROR: "kbcore.c", line 633: Feature streamlen(39) !=
      mgau streamlen(117)

    It looks like you enabled cepwin features with an LDA transform but didn't
    pass the transform file to the decoder. You need to pass it properly. If you
    don't understand this part yet, I recommend reverting to the default
    settings instead.

     
  • Siddharth Sigtia

    Thank you so much. I'll give these a shot.
    You're right, I do not understand the second part yet. I wanted to extract more
    features and changed CFG_VECTOR_LENGTH to 39 in sphinx_decode.cfg.
    I also updated ncep to 39 in feat.params.
    However, these changes did not give me a 39-dimensional feature vector. So I
    checked the make_feats.pl file, and it appeared that the default parameters
    were being used to create the features. So I updated -ncep to 39 in
    make_feats.pl, which gave me a 39-dimensional feature vector.
    I trained the system and got the error that I stated above. Is there some
    fundamental error in what I did?

     
  • Nickolay V. Shmyrev

    Thank you so much. I'll give these a shot.
    You're right i do not understand the second part yet. I wanted to extract more
    features and changed the CFG_VECTOR_LENGTH to 39 in sphinx_decode.cfg.
    I also updated ncep to 39 in feat.params.
    However these changes did not give me a 39 dim feature vector. So i checked
    the make_feats.pl file and it appeared that the default parameters were being
    used to create the features. So i updated -ncep to 39 in make_feats.pl, which
    gave me a 39 dim feature vector.

    You are confusing feature vector length with cepstrum length. The cepstrum
    is stored in MFC files and contains only the mel-frequency cepstral
    coefficients. Its typical dimension is 13. The feature vector combines the
    cepstrum, the first cepstrum derivatives and the second cepstrum
    derivatives, as specified by the feature vector type 1s_c_d_dd. The size of
    the feature vector is 39 (13 * 3 with derivatives).

    If you change the vector length with -ceplen, you change the length of the
    final feature vector.
    If you change the cepstrum length with -ncep, you change the length of the
    cepstrum.

    If you want a cepstrum length of 39 (pretty useless; it should be less than
    20), you also need to set the vector length to 39 with -ceplen. You need to
    edit the decoder script to do both.

    -ncep configures cepstrum length
    -ceplen together with -feat control feature vector length
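
The arithmetic can be checked with a small sketch. It covers only the 1s_c_d_dd type discussed here; the helper name and the lookup table are illustrative and are not part of Sphinx.

```python
# Number of equal-sized blocks stacked into the final feature vector.
# 1s_c_d_dd = cepstrum + delta + delta-delta, as described in this thread.
BLOCKS_PER_FEAT = {"1s_c_d_dd": 3}


def feature_vector_length(ceplen, feat="1s_c_d_dd"):
    """Final feature vector length for a given cepstrum vector length."""
    return ceplen * BLOCKS_PER_FEAT[feat]


default_len = feature_vector_length(13)   # default 13-coefficient cepstrum
inflated_len = feature_vector_length(39)  # cepstrum length raised to 39
```

With the default 13 coefficients the result is 39. With 39 coefficients it is 117, which matches the mgau streamlen(117) in the error quoted earlier and would be consistent with models trained on 39-coefficient cepstra being decoded with default-length features.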

     
  • Siddharth Sigtia

    That makes so much more sense. Thank you so much. I'm sure this will clear
    everything up. Thank you :D

     
  • Siddharth Sigtia

    Hey,

    I just tried the decoder and it works perfectly.
    I have one last question. So -ceplen and -feat together control the length of
    the feature vector.
    So if I want a feature vector length of 39, what changes should I make to
    sphinx_train.cfg?
    If -feat is "1s_c_d_dd", is that all that needs to be updated? Or should I
    update CFG_VECTOR_LENGTH also?

    I am using RunAll.pl to train, so I guess the arguments to the trainer have to
    be passed through the config file. I was wondering if CFG_VECTOR_LENGTH also
    needs to be updated.
    I'm a little confused, because I updated -feat to 1s_c_d_dd and
    CFG_VECTOR_LENGTH to 39 and got:
    FATAL_ERROR: "corpus.c", line 1754: Expected mfcc vector len of 39, got 26 (1157)

    So what fields should I update to get a feature vector of length 39?
    I'm really sorry if you're having to repeat yourself, but this would be a lot
    of help.

     
  • Nickolay V. Shmyrev

    So if i want a feature vector length of 39, what changes should i make to
    sphinx_train.cfg.

    You shouldn't change anything. Default values are:

    ncep (cepstrum length) 13
    ceplen (vector length) 13
    feature 1s_c_d_dd (cepstrum, delta and delta-delta)
    full feature vector length 39 (3 * 13)

    i updated -feat to 1s_c_d_dd and CFG_VECTOR_LENGTH to 39 and got:
    FATAL_ERROR: "corpus.c", line 1754: Expected mfcc vector len of 39, got 26
    (1157)

    If you change the vector length (you should not do that), you also need to
    change the cepstrum length (the -ncep option in make_feats.pl) and
    re-extract the MFC files (cepstrum).

     
  • Siddharth Sigtia

    Got it. Thanks :)

     
