What does p specify in mdef.txt file

creative64
2010-09-16
2012-09-22
  • creative64

    creative64 - 2010-09-16

    Hi,

    What does the field "p" specify in the model definition file mdef.txt
    (hub4wsj_sc_8k acoustic model)? It seems to apply only to triphones
    and takes values such as s, b, i and e.

    Thanks and regards,

     
  • Nickolay V. Shmyrev

    P is the position of the triphone in a word. CMUSphinx distinguishes triphones
    by position and tries to use the ones that are suitable for the target
    word. This is a long tradition, but to be honest I don't have a proper
    justification for it.

    Positions:

    i - word-internal triphone
    b - word-beginning triphone (central phone is the first phone of the word)
    e - word-end triphone
    s - single-word triphone
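
The position codes listed above can be captured in a small lookup. This is only an illustrative sketch: `word_posn_name` is a hypothetical helper, not part of PocketSphinx, and treating `-` as "no position" is an assumption based on how context-independent entries such as `+BREATH+ - - - filler ...` appear in mdef.txt.

```c
#include <stddef.h>

/* Hypothetical helper: map the one-letter "p" code from mdef.txt
 * to a readable description. Returns NULL for an unknown code. */
const char *word_posn_name(char p)
{
    switch (p) {
    case 'b': return "word-beginning";
    case 'i': return "word-internal";
    case 'e': return "word-end";
    case 's': return "single-word";
    case '-': return "none (context-independent phone)"; /* assumption */
    default:  return NULL;
    }
}
```

For example, `word_posn_name('s')` yields "single-word", which is the position shown in an entry like `AA AA AA s ...`.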

     
  • creative64

    creative64 - 2010-09-16

    Thanks NS,

    1. So in the code below, p holds the ID of the closest matching triphone for the given position in the word (file: dict2pid.c):

    p = bin_mdef_phone_id_nearest(mdef, (s3cipid_t) b,
                                  (s3cipid_t) l, (s3cipid_t) r,
                                  WORD_POSN_BEGIN);

    2. And eventually the corresponding senone sequence ID is obtained below:

    dict2pid->ldiph_lc[b][r][l] = bin_mdef_pid2ssid(mdef, p);

    Is my understanding correct?

    Regards,

     
  • Nickolay V. Shmyrev

    Yes, it's correct

     
  • creative64

    creative64 - 2010-09-16

    Thanks NS,

    Let's say I have a value of 134747 for p.
    In the mdef.txt file, how do I locate the corresponding triphone ?

    Regards,

     
  • Nickolay V. Shmyrev

    "grep 134747 mdef" will give you all triphones, won't it?

     
  • creative64

    creative64 - 2010-09-16

    The mdef.txt file (at least the version I have) doesn't have entries for
    triphone IDs. A typical entry in the file looks like:
    AA AA AA s n/a 10 187 199 241 N
    AA AA AE s n/a 10 187 199 241 N
    So a particular triphone has to be located by going to the appropriate line
    number.
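
To make the layout of such an entry concrete, here is a minimal sketch that splits one line into fields. It assumes a 3-state model and the column order base, left context, right context, p, attribute, transition matrix ID, then three state IDs followed by `N`; `mdef_row_t` and `parse_mdef_row` are hypothetical names used only for illustration.

```c
#include <stdio.h>

#define MDEF_NAME_LEN 16

/* Hypothetical struct for one mdef.txt entry of a 3-state model. */
typedef struct {
    char base[MDEF_NAME_LEN];   /* central phone */
    char left[MDEF_NAME_LEN];   /* left context, "-" for CI phones */
    char right[MDEF_NAME_LEN];  /* right context, "-" for CI phones */
    char posn;                  /* the "p" column: b, i, e, s or - */
    char attrib[MDEF_NAME_LEN]; /* e.g. "filler" or "n/a" */
    int tmat;                   /* transition matrix ID (assumed meaning) */
    int state[3];               /* state (senone) IDs */
} mdef_row_t;

/* Returns 1 if the line has all nine expected fields, 0 otherwise
 * (for example on header or comment lines). */
int parse_mdef_row(const char *line, mdef_row_t *row)
{
    int n = sscanf(line, "%15s %15s %15s %c %15s %d %d %d %d",
                   row->base, row->left, row->right, &row->posn,
                   row->attrib, &row->tmat,
                   &row->state[0], &row->state[1], &row->state[2]);
    return n == 9;
}
```

Applied to `AA AA AA s n/a 10 187 199 241 N`, this should fill `posn` with `s` and the state IDs with 187, 199 and 241.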

    The first entry
    +BREATH+ - - - filler 0 0 1 2 N is at line number 11

    and the first triphone entry
    AA AA AA s n/a 10 187 199 241 N is at line number 61

    So adding one of these offsets to the value of p should give me the
    corresponding line number. I was adding 61 (assuming ID 0 for the first
    triphone entry) and wasn't getting the correct triphone. I have just realized
    (at least in the few cases I tried) that adding 11 takes me to the correct
    line, meaning the IDs are counted starting from the first entry and not from
    the first triphone.
    I hope this understanding is correct!
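
The lookup described above can be sketched as follows. It assumes, as observed in this post, that phone IDs start at 0 on the first entry of the file (line 11 in this particular mdef.txt) and increase by one per line; `mdef_line_for_id` and `read_line_at` are hypothetical helpers, and the first-entry line number must be checked for each file.

```c
#include <stdio.h>

/* 1-based line number of phone ID p, given the 1-based line number
 * of the first entry in mdef.txt (the entry with ID 0). */
long mdef_line_for_id(long p, long first_entry_line)
{
    return first_entry_line + p;
}

/* Copy line `lineno` (1-based) of the file at `path` into buf, without
 * the trailing newline. Returns 1 on success, 0 if the file cannot be
 * opened or has fewer lines. Overlong lines are truncated to fit buf. */
int read_line_at(const char *path, long lineno, char *buf, size_t size)
{
    FILE *fp;
    long cur = 1;
    size_t len = 0;
    int c;

    if (lineno < 1 || size == 0)
        return 0;
    fp = fopen(path, "r");
    if (fp == NULL)
        return 0;

    while ((c = fgetc(fp)) != EOF) {
        if (c == '\n') {
            if (cur == lineno)
                break;          /* finished the requested line */
            cur++;
        } else if (cur == lineno && len + 1 < size) {
            buf[len++] = (char)c;
        }
    }
    fclose(fp);
    buf[len] = '\0';
    return cur == lineno && (c == '\n' || len > 0);
}
```

With the first entry at line 11, `mdef_line_for_id(134747, 11)` gives 134758, and `read_line_at` can then fetch that line for inspection.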

    Regards,

     
