Reducing the size of the models

Help
2001-11-08
2012-09-22
  • anselme dewavrin

    Hi everyone,

    I am trying to find ways to reduce the size
    of the model files ("sendump", "map", and "phone") generated by SphinxTrain,
    in order to adapt Sphinx2 to
    embedded systems. I only have a small vocabulary (200 words).

    Has anybody tried to reduce these files?

    My ideas:
    - use 3 states instead of 5, but this is
    impossible with sphinx2 :(
    - group the phonemes by similarity (for instance,
    k, t, and p), to have only 20 phonemes instead of 40 before training. But I do not have enough training data to measure the impact.
    - throw away some data from those files. It might
    seem like a joke, but with a small vocabulary, not all of the senones are useful...
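
    A minimal sketch of the phoneme-grouping idea above, written before any training is run. The merge table, class names, and the three-word lexicon are hypothetical illustrations, not a tested phone set or a SphinxTrain format. It rewrites each pronunciation with the merged classes and lists the words that become indistinguishable, which gives a rough feel for the impact without training data.

```python
# Hypothetical merge table: each phone maps to a broader class label.
# The groupings are illustrative only, not a tested phone set.
MERGE = {
    "K": "STOP_U", "T": "STOP_U", "P": "STOP_U",  # unvoiced stops
    "G": "STOP_V", "D": "STOP_V", "B": "STOP_V",  # voiced stops
}

def merge_phones(pron):
    """Replace each phone by its class; phones without an entry are kept."""
    return [MERGE.get(p, p) for p in pron]

def merge_dictionary(lexicon):
    """Apply the merge to every pronunciation and report new collisions."""
    merged = {word: merge_phones(pron) for word, pron in lexicon.items()}
    by_pron = {}
    for word, pron in merged.items():
        by_pron.setdefault(tuple(pron), []).append(word)
    # Words sharing one merged pronunciation can no longer be told apart.
    collisions = [sorted(words) for words in by_pron.values() if len(words) > 1]
    return merged, collisions

# Toy lexicon, made up for this example.
lexicon = {
    "CAT": ["K", "AE", "T"],
    "PAT": ["P", "AE", "T"],
    "DOG": ["D", "AO", "G"],
}
merged, collisions = merge_dictionary(lexicon)
print(merged["CAT"])
print(collisions)
```

    With a 200-word vocabulary, running this kind of check over the real dictionary would show how many word pairs a given grouping collapses.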

    Any experience to share? Or ideas?

     
    • LEI YANG

      LEI YANG - 2001-12-14

      1. Using 3 states can definitely reduce the size, but for that you can use decoder III.
      2. If you use a small vocabulary, I think the training process already discards unnecessary phonemes/senones.

       
    • anselme dewavrin

      Thank you Lei,

      Kevin, any ideas?

      I was wondering if there is a way to re-cluster
      the sendump file, by grouping vectors of the
      same kind.
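
      One way to picture the re-clustering idea is a greedy grouping of near-identical vectors. This is only a sketch under assumptions: the vectors are toy data, not read from a real sendump file, the L1 distance and the threshold are arbitrary choices, and it presumes a decoder that can look vectors up through an index table, which nothing in this thread confirms.

```python
def l1(a, b):
    """L1 (sum of absolute differences) distance between two vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def group_vectors(vectors, threshold):
    """Greedy grouping: assign each vector to the first representative
    within `threshold`, otherwise make it a new representative."""
    reps = []   # one representative vector per group
    index = []  # index[i] is the group of vectors[i]
    for v in vectors:
        for g, r in enumerate(reps):
            if l1(v, r) <= threshold:
                index.append(g)
                break
        else:
            reps.append(v)
            index.append(len(reps) - 1)
    return reps, index

# Toy vectors, made up for this example.
vectors = [(10, 200, 30), (11, 199, 30), (250, 3, 90), (10, 201, 31)]
reps, index = group_vectors(vectors, threshold=4)
print(len(reps), index)
```

      Only the representatives plus the index would need to be stored, so the saving depends on how many vectors fall within the threshold of each other.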

       
    • Kevin A. Lenzo

      Kevin A. Lenzo - 2001-12-17

      One thing to do is to use, say, 4000-state models instead of 6000-state ones. That would get you to about 2/3 the size. Sphinx2 does have the limitation that it's hard-wired to 5 states per phone HMM, and sphinx3 does not have that topology constraint, but sphinx2 is still the winner for speed. The default models have 6K states. The 4K version will fit on e.g. an iPaq, though some folks prefer to just do feature extraction on the device (get the cepstral coefficients) and pass off the much smaller feature stream to other networked machines. That's an aside, though :)
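
      The 2/3 figure follows if the file size grows roughly linearly with the number of states. A quick sketch of that arithmetic, where the 6 MB starting size is a made-up placeholder and the linear scaling is an assumption rather than a measured property of the files:

```python
def scaled_size(size_bytes, old_states, new_states):
    """Estimate file size assuming it grows linearly with the state count."""
    return size_bytes * new_states / old_states

# Hypothetical size of the file for the 6000-state model.
full = 6_000_000
small = scaled_size(full, old_states=6000, new_states=4000)
print(small / full)
```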

      I think your best bet is to reduce the number of states. I'll see if we can produce a 4K-state model; we had one in the past, before the conversion to the new front-end/phoneset/training, and this would be useful for folks trying to get down to a small footprint.

       
