Regarding tuning and states per hmm

Help
Karthik
2011-05-05
2012-09-22
  • Karthik

    Karthik - 2011-05-05

    Hey,

    I am trying to train models in another language. Unfortunately, the training
    and testing data is a bit poor. Nevertheless, I want to tune the system as
    much as I can. Which parameters can I fiddle around with? Is there a procedure
    for the tuning or is it mostly trial and error?

    Also, can we model different phones differently? Model shorter phones with
    fewer states and longer ones with more?

    Any help is appreciated. Thanks :D.

     
  • Nickolay V. Shmyrev

    Unfortunately, the training and testing data is a bit poor.

    What do you mean by poor?

    Nevertheless, I want to tune the system as much as I can.

    The perfect is the enemy of the good.

    Which parameters can I fiddle around with?

    During training: the number of senones, the number of mixtures, and maybe
    the phoneset. You also need to write the phone tree questions manually.

    Is there a procedure for the tuning or is it mostly trial and error?

    Yes

    Also, can we model different phones differently? Model shorter phones with
    fewer states and longer ones with more?

    No, you can't do that. There is not much need to do that either. In most
    cases it's not helpful.

     
  • Karthik

    Karthik - 2011-05-06

    Cool, thanks a lot :).

    By poor, I mean there is ambient noise and quite a few fillers which I'm sure
    are heavily weighing down the system's ability to recognise accurately.
    Needless to say, the test data is also of a similar nature :|. Some other
    people are working on getting better data. In the meantime, I'm not aiming for
    perfect, only for an increase in accuracy that perhaps I can implement once I
    get the good data.

    "Is there a procedure for the tuning or is it mostly trial and error?"

    "Yes"

    I'm assuming by this you meant there is a procedure. Where can I read up on
    this? And also about writing phone tree questions?

    What about trying to optimize values of wip, lw, uw and beams? Do they affect
    the accuracy significantly?

     
  • Nickolay V. Shmyrev

    By poor, I mean there is ambient noise and quite a few fillers which I'm
    sure are heavily weighing down the system's ability to recognise accurately.
    Needless to say, the test data is also of a similar nature :|. Some other
    people are working on getting better data. In the meantime, I'm not aiming for
    perfect, only for an increase in accuracy that perhaps I can implement once I
    get the good data.

    In my experience it's usually more helpful to identify the source of the
    issues than just to tune the parameters. I don't believe the fillers cause
    any problem, but if you have ambient noise it's far more helpful to look for
    ways to remove it than to optimize the number of senones. Senones can give
    you a 2% improvement over a wild guess. If you fix the issue you actually
    have (not necessarily noise; it might be too tight a decoding beam or a bad
    language model), it can give you a much bigger improvement.

    I'm assuming by this you meant there is a procedure. Where can I read up on
    this? And also about writing phone tree questions?

    By "yes" I meant it's mostly trial and error. Basically, you need to try all
    reasonable values and select the ones which work best.
    Feature extraction parameters, for example, are something to try first; then
    there are other things.

    What about trying to optimize values of wip, lw, uw and beams? Do they
    affect the accuracy significantly?

    Those are decoder optimizations, not training. Yes, they can have a
    significant effect. You can find an optimization guide on the wiki:

    http://cmusphinx.sourceforge.net/wiki/decodertuning

    There are some topics it doesn't cover, though. There is also some
    interesting ongoing research on this subject; see

    http://www.isca-speech.org/archive/interspeech_2010/i10_1497.html

    Overall, I suggest you share your accuracy results first, just to check
    that everything is OK and nothing is completely wrong in your setup:
    database type, vocabulary size and accuracy.
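
    If it helps to have a common number when reporting, below is a minimal
    sketch of word error rate computed as word-level edit distance divided by
    the reference length. The function name word_error_rate is my own, and the
    scoring tools shipped with the toolkit may normalize text or count errors
    differently, so treat this as an illustration only. Word accuracy is
    commonly reported as 1 minus this value.

```python
def word_error_rate(reference, hypothesis):
    """Return (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


# One word dropped out of six reference words.
print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```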

     
