PS lattice file format

Help
2011-02-01
2012-09-22
  • Vassil Panayotov

    Hi,

    What is the exact meaning of the first-endframe and last-endframe fields in
    .lat files (for example, see
    http://www.pasteall.org/18753)? As a related
    question: over which frames are the acoustic scores (in the edge table section
    of the file) calculated? Is it, for example, the frames between the start frame
    and first-endframe, between the start frame and last-endframe, or something in
    between?

    Thank you!

     
  • Vassil Panayotov

    Thinking again about my second question (about the acoustic scores), it seems
    logical that the scores are calculated between the start frame of the source
    node and the start frame of the destination node. If this is the case, then
    for the lattice linked above there is one (and only one) hypothesis for the
    word "go", spanning the frames from 46 to 64, the latter being the first frame
    of the word "forward". For this hypothesis the acoustic score (log base
    1.0001) is -1653760:

    21 19 -1653760
    

    Can someone (e.g. Nickolay) confirm this?

     
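To put the quoted score in more familiar units, here is a minimal sketch that converts a score in log base 1.0001 to a natural log and averages it over the frame span proposed in the post. It assumes the edge line is laid out as source node, destination node, acoustic score, and that the 46 to 64 span is correct; neither is confirmed in this thread. The helper names are hypothetical.

```python
import math

LOG_BASE = 1.0001  # log base quoted in the post for the acoustic scores


def parse_edge(line):
    """Split an edge line into (from_node, to_node, ascr).

    Assumption: the three fields are source node id, destination node id
    and acoustic score, in that order.
    """
    src, dst, ascr = line.split()
    return int(src), int(dst), int(ascr)


def to_natural_log(score, base=LOG_BASE):
    """Convert a log-base-`base` score to a natural-log score."""
    return score * math.log(base)


src, dst, ascr = parse_edge("21 19 -1653760")
ln_score = to_natural_log(ascr)

# Frame span hypothesised in the post: "go" starts at 46, "forward" at 64.
n_frames = 64 - 46
per_frame = ln_score / n_frames

print(f"edge {src} -> {dst}: ln-likelihood {ln_score:.2f}, "
      f"{per_frame:.2f} per frame over {n_frames} frames")
```

The conversion itself is only a change of base (multiplying by ln 1.0001); whether the per-frame average is meaningful depends on the frame span actually used by the decoder.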
  • Vassil Panayotov

    Here is the GraphViz picture corresponding to the lattice in my first post:
    http://www.pasteall.org/pic/8711 .
    This picture was produced using the lattice.py module from SphinxTrain (not
    sure this is the proper way to calculate posteriors, but it is irrelevant for
    now):

    import cmusphinx.lattice
    
    # Load the .lat file
    dag = cmusphinx.lattice.Dag('goforward.lat')
    # Compute the posteriors
    dag.posterior()
    # Write the lattice out as a GraphViz .dot file
    dag.dag2dot('goforward.dot')
    

    What I find puzzling is that some of the nodes are missing arcs. For example,
    I think there should be an arc between the "s/0" and "s/46" nodes, and between
    ten/117 and leaders/152.
    Is it a bug?

     
  • Nickolay V. Shmyrev

    Hello

    This is not really a bug; the algorithm just works this way. It builds the
    lattice from the end, trying to find predecessor candidates and link to them.
    Sometimes no good predecessor can be found and the node stays unlinked. For
    details, see the function

    ps_lattice_t * ngram_search_lattice(ps_search_t *search)
    

    Of course there could be a step to clean up such nodes, but it is just not
    implemented. As far as I understand, it doesn't affect any further decoding
    steps.

     
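The clean-up step mentioned in the last reply could look roughly like the sketch below. It is not PocketSphinx code and not how `ngram_search_lattice` works; it is a generic graph pruning pass, written under the assumption that a node is worth keeping only if it lies on some path from the lattice's start node to its end node. The function names and the toy graph are hypothetical.

```python
from collections import defaultdict


def reachable(origin, adjacency):
    """Return every node reachable from `origin` via `adjacency` (iterative DFS)."""
    seen = {origin}
    stack = [origin]
    while stack:
        node = stack.pop()
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def prune_unlinked(edges, start, end):
    """Keep only nodes and edges that lie on some start-to-end path."""
    succ = defaultdict(list)  # node -> successors
    pred = defaultdict(list)  # node -> predecessors
    for src, dst in edges:
        succ[src].append(dst)
        pred[dst].append(src)
    # A node survives if it is reachable from the start AND can reach the end.
    keep = reachable(start, succ) & reachable(end, pred)
    kept_edges = [(s, d) for s, d in edges if s in keep and d in keep]
    return keep, kept_edges


# Toy graph (made up for illustration): "b" is a dead end and "c" has no
# predecessor, which is the kind of unlinked node described in the reply.
edges = [("start", "a"), ("a", "end"), ("start", "b"), ("c", "end")]
keep, kept_edges = prune_unlinked(edges, "start", "end")
print(sorted(keep), kept_edges)
```

Running two traversals, one forward from the start and one backward from the end, keeps the pass linear in the size of the lattice.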