sphinx lattice to WCN conversion

2011-08-05
2012-09-22
  • shamsa abid

    shamsa abid - 2011-08-05

    Hi,
    I'm working on converting my pocketsphinx lattice into a WCN (word confusion
    network, or "sausage"). Ultimately, I want all possible candidates for a
    single-word utterance to be available to choose from.

    Once I get the lattice object, what should my next steps be?
    I was planning to use the SRILM lattice-tool for this; however, it accepts
    only PFSG and HTK-style lattices.
    How do I transform a sphinx lattice into an SRILM-compatible lattice?

    Regards,
    Shamsa.

     
  • Nickolay V. Shmyrev

    pocketsphinx can output lattices in HTK format

     
  • shamsa abid

    shamsa abid - 2011-08-21

    I am using pocketsphinx as the decoder for my Android application,
    and I intend to use SRILM for sausage generation.

    I was able to generate the HTK-format lattice using the -outlatfmt htk
    parameter of the pocketsphinx_batch command.

    I then gave that file as input to the SRILM lattice-tool (at a Cygwin prompt)
    and got a mesh file.

    Now I want this mesh available in my Android application. In my Android app I
    have a lattice object and I am able to access the words on the nodes of
    the lattice.

    I need to know which SRILM file has the method that would take my lattice
    object as input and return a lattice in sausage form (or whether that is
    possible at all, because when I studied the code I could only see the lattice
    being manipulated as a file).

    Your comments and guidance are greatly appreciated.

    Regards

     
  • Nickolay V. Shmyrev

    Answer from Andreas Stolcke on the srilm-users mailing list:

    You would have to construct the lattice data structure in memory. The best way
    to learn how to do that (I admit it is somewhat involved) is to read the code
    that reads an HTK lattice file into memory. You can find this in
    $SRILM/lattice/src/HTKLattice.cc, function Lattice::readHTK().

    A more straightforward, if somewhat less efficient way would be to write the
    textual HTK lattice representation to a string object, and then "read" from
    that string using Lattice::readHTK(). For this purpose the File object can be
    constructed from a string, see the comment in $SRILM/misc/src/File.h . I
    believe you need to get the very latest (beta) version of SRILM for this to
    work.

    Andreas
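The second suggestion above (build the textual HTK lattice in a string rather than a file) could look roughly like the following on the Java side. This is only a sketch under stated assumptions: the class `HtkLatticeText` and its `Node` and `Link` types are hypothetical and belong to neither pocketsphinx nor SRILM. The field names (`VERSION`, `N=`, `L=`, `I=`, `J=`, `S=`, `E=`, `a=`) follow the HTK Standard Lattice Format, with words attached to nodes. Handing the resulting string to `Lattice::readHTK()` is not shown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Illustrative only: renders a tiny in-memory lattice as HTK SLF text. */
final class HtkLatticeText {
    /** Hypothetical node: a word and its start time in seconds. */
    static final class Node {
        final String word;
        final double time;
        Node(String word, double time) { this.word = word; this.time = time; }
    }

    /** Hypothetical link between two node indices with an acoustic log score. */
    static final class Link {
        final int from;
        final int to;
        final double acoustic;
        Link(int from, int to, double acoustic) {
            this.from = from; this.to = to; this.acoustic = acoustic;
        }
    }

    /** Builds the SLF text in a string instead of writing it to a file. */
    static String render(List<Node> nodes, List<Link> links) {
        StringBuilder sb = new StringBuilder();
        sb.append("VERSION=1.0\n");
        sb.append("N=").append(nodes.size())
          .append(" L=").append(links.size()).append('\n');
        for (int i = 0; i < nodes.size(); i++) {
            Node n = nodes.get(i);
            sb.append(String.format(Locale.ROOT, "I=%d t=%.2f W=%s\n", i, n.time, n.word));
        }
        for (int j = 0; j < links.size(); j++) {
            Link l = links.get(j);
            sb.append(String.format(Locale.ROOT, "J=%d S=%d E=%d a=%.2f\n",
                    j, l.from, l.to, l.acoustic));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Node> nodes = new ArrayList<Node>();
        nodes.add(new Node("<s>", 0.00));
        nodes.add(new Node("IT", 0.25));
        nodes.add(new Node("</s>", 0.60));
        List<Link> links = new ArrayList<Link>();
        links.add(new Link(0, 1, -120.00));
        links.add(new Link(1, 2, -80.50));
        System.out.print(render(nodes, links));
    }
}
```

Keeping the text in memory avoids the file system entirely, at the cost of formatting and re-parsing the lattice once.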

     
  • shamsa abid

    shamsa abid - 2011-09-14

    The ps_lattice_write method of pocketsphinx isn't working for me because I'm
    running it from my Android app. The file name that I specify is an Android
    folder:

    /mnt/sdcard/edu.cmu.pocketsphinx

    The problem may be that the C code uses a file object.
    The following code is from ps_lattice.c:

    if ((fp = fopen(filename, "w")) == NULL) {
        E_ERROR("Failed to open lattice file '%s' for writing: %s\n", filename,
                strerror(errno));
        return -1;
    }

    I always get -1 returned to my calling code.

    I desperately need to get the lattice written out so I can see what it looks
    like after the decoder outputs it. Meanwhile, I'm also trying to traverse the
    lattice using the following code.

    while (nodeiterator.getnode().hyp(this.lattice) != null) {
        nodeiterator = nodeiterator.next();
        if (nodeiterator == null)
            break;
        n1 = nodeiterator.getnode().hyp(this.lattice).getHypstr();
        System.out.println(n1); // prints the word at the node
        // Compare string contents with equals(); == only compares references.
        if ("".equals(n1))
            break;
    }

    This gives me the following output:

    09-14 12:23:50.984: INFO/System.out(319): <sil>
    09-14 12:23:50.984: INFO/System.out(319): <sil>
    09-14 12:23:51.013: INFO/System.out(319): <sil>
    09-14 12:23:51.033: INFO/System.out(319): <sil>
    09-14 12:23:51.075: INFO/System.out(319): <sil>
    09-14 12:23:51.075: INFO/System.out(319): <sil>
    09-14 12:23:51.154: INFO/System.out(319): <sil>
    09-14 12:23:51.194: INFO/System.out(319): AN(2)
    09-14 12:23:51.235: INFO/System.out(319): <sil>
    09-14 12:23:51.235: INFO/System.out(319): <sil>
    09-14 12:23:51.253: INFO/System.out(319): <sil>
    09-14 12:23:51.264: INFO/System.out(319): AN(2)
    09-14 12:23:51.293: INFO/System.out(319): <sil>
    09-14 12:23:51.293: INFO/System.out(319): <sil>
    09-14 12:23:51.314: INFO/System.out(319): THE
    09-14 12:23:51.334: INFO/System.out(319): OUT
    09-14 12:23:51.354: INFO/System.out(319): AND(2)
    09-14 12:23:51.383: INFO/System.out(319): A
    09-14 12:23:51.383: INFO/System.out(319): IT
    09-14 12:23:51.424: INFO/System.out(319): <sil>
    09-14 12:23:51.433: INFO/System.out(319): AT
    09-14 12:23:51.463: INFO/System.out(319): IT
    09-14 12:23:51.463: INFO/System.out(319): IT.
    09-14 12:23:51.485: INFO/System.out(319): IT?
    09-14 12:23:51.485: INFO/System.out(319): <sil>
    09-14 12:23:51.514: INFO/System.out(319): A(2)
    09-14 12:23:51.514: INFO/System.out(319): IT.
    09-14 12:23:51.553: INFO/System.out(319): IT?
    09-14 12:23:51.553: INFO/System.out(319): IT
    09-14 12:23:51.574: INFO/System.out(319): AN
    09-14 12:23:51.574: INFO/System.out(319): IN
    09-14 12:23:51.604: INFO/System.out(319): TO(2)
    09-14 12:23:51.604: INFO/System.out(319): A(2)
    09-14 12:23:51.624: INFO/System.out(319): A(2)
    09-14 12:23:51.624: INFO/System.out(319): <sil>
    09-14 12:23:51.663: INFO/System.out(319): <sil>
    09-14 12:23:51.674: INFO/System.out(319): A(2)
    09-14 12:23:51.693: INFO/System.out(319): <sil>
    09-14 12:23:51.693: INFO/System.out(319): <sil>
    09-14 12:23:51.723: INFO/System.out(319): <sil>
    09-14 12:23:51.723: INFO/System.out(319): IN
    09-14 12:23:51.743: INFO/System.out(319): TO(2)
    09-14 12:23:51.743: INFO/System.out(319): AND(2)
    09-14 12:23:51.763: INFO/System.out(319): <sil>
    09-14 12:23:51.774: INFO/System.out(319): HAVE
    09-14 12:23:51.804: INFO/System.out(319): THAT
    09-14 12:23:51.804: INFO/System.out(319): I'VE
    09-14 12:23:51.815: INFO/System.out(319): AT
    09-14 12:23:56.863: INFO/System.out(319): <sil>
    09-14 12:23:56.873: INFO/System.out(319): A
    09-14 12:23:56.903: INFO/System.out(319):

    My code for traversal using node exits looks like this:

    this.lattice = new Lattice(this.ps);
    node = new Latnode();

    nodeiterator = new Latnode();
    nodeiterator = node.LatIterator(this.lattice);

    // my working code for iteration in lattice

    // code which iterates on node exits, currently giving repetitions
    node = nodeiterator.getnode();
    latlinkiter = new LatlinkIterator();

    while (node != null) {
        latlinkiter = node.nodeExits();

        // print the word on every exit link of the current node
        while (latlinkiter != null) {
            String linkword = latlinkiter.LinkIterLink().LinkWord(this.lattice);
            latlinkiter = latlinkiter.LinkIterNext();
            System.out.println(linkword);
        }

        nodeiterator = nodeiterator.next();
        if (nodeiterator != null)
            node = nodeiterator.getnode();
        else
            break;
    }

    and my results look like this:

    09-14 12:23:56.973: INFO/System.out(319): <sil>
    09-14 12:23:57.004: INFO/System.out(319): <sil>
    09-14 12:23:57.004: INFO/System.out(319): <sil>
    09-14 12:23:57.024: INFO/System.out(319): <sil>
    09-14 12:23:57.024: INFO/System.out(319): <sil>
    09-14 12:23:57.053: INFO/System.out(319): <sil>
    09-14 12:23:57.053: INFO/System.out(319): <sil>
    09-14 12:23:57.084: INFO/System.out(319): <sil>
    09-14 12:23:57.084: INFO/System.out(319): <sil>
    09-14 12:23:57.104: INFO/System.out(319): <sil>
    09-14 12:23:57.104: INFO/System.out(319): <sil>
    09-14 12:23:57.133: INFO/System.out(319): <sil>
    09-14 12:23:57.133: INFO/System.out(319): <sil>
    09-14 12:23:57.154: INFO/System.out(319): AN(2)
    09-14 12:23:57.154: INFO/System.out(319): AN(2)
    09-14 12:23:57.174: INFO/System.out(319): AN(2)
    09-14 12:23:57.174: INFO/System.out(319): <sil>
    09-14 12:23:57.203: INFO/System.out(319): <sil>
    09-14 12:23:57.203: INFO/System.out(319): <sil>
    09-14 12:23:57.224: INFO/System.out(319): <sil>
    09-14 12:23:57.224: INFO/System.out(319): <sil>
    09-14 12:23:57.254: INFO/System.out(319): <sil>
    09-14 12:23:57.254: INFO/System.out(319): AN(2)
    09-14 12:23:57.263: INFO/System.out(319): AN(2)
    09-14 12:23:57.273: INFO/System.out(319): AN(2)
    09-14 12:23:57.273: INFO/System.out(319): AN(2)
    09-14 12:23:57.304: INFO/System.out(319): <sil>
    09-14 12:23:57.304: INFO/System.out(319): <sil>
    09-14 12:23:57.324: INFO/System.out(319): THE
    09-14 12:23:57.324: INFO/System.out(319): THE
    09-14 12:23:57.343: INFO/System.out(319): THE
    09-14 12:23:57.343: INFO/System.out(319): THE
    09-14 12:23:57.353: INFO/System.out(319): THE
    09-14 12:23:57.374: INFO/System.out(319): THE
    09-14 12:23:57.374: INFO/System.out(319): THE
    09-14 12:23:57.403: INFO/System.out(319): THE
    09-14 12:24:19.114: INFO/System.out(319):
    09-14 12:24:19.114: INFO/System.out(319):
    09-14 12:24:19.114: INFO/System.out(319):
    09-14 12:24:19.114: INFO/System.out(319):
    09-14 12:24:19.114: INFO/System.out(319):

    I have cut many lines from the results because there were just too many.

    Now what's really bugging me is:
    1) I am unable to write the complete lattice file on Android. I even tried
    giving my hard disk path; that didn't work either, and I didn't expect it to.
    I need the file so I can judge for myself whether my traversal is
    correct or not.
    2) The traversal isn't telling me which node follows which. I need to
    traverse the lattice so that I am able to convert it into a WCN sausage. How
    do I do that?

    With thanks,
    Shamsa.

     
  • shamsa abid

    shamsa abid - 2011-09-14

    Another thing I need to ask: I even tried to make the write_lattice
    method return an array of strings containing the lattice file content,
    using char ** as the return type.
    SWIG eventually turns that into a jlong. But how would that be useful for
    accessing the array in Java?

     
  • Nickolay V. Shmyrev

    > 1) I am unable to write the complete lattice file on Android. I even tried
    > giving my hard disk path; that didn't work either, and I didn't expect it to.
    > I need the file so I can judge for myself whether my traversal is
    > correct or not.

    If you open pocketsphinx.log you can easily find the reason for the error. It
    should hint at what is wrong and why the application can't create the file.
    The reasons are always trivial; for example, you pointed to a wrong directory.
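A quick pre-flight check on the Java side can rule out the most common causes before the path ever reaches the native `fopen()` call. The sketch below is an illustration only: `LatticePathCheck` and `whyNotWritable` are hypothetical names, and the check uses nothing but `java.io.File`. On Android, writing to external storage may additionally require the `WRITE_EXTERNAL_STORAGE` permission, which this code does not verify.

```java
import java.io.File;

/** Illustrative only: explains why a file could not be created at a path. */
final class LatticePathCheck {
    /** Returns null if the path looks writable, otherwise a short reason. */
    static String whyNotWritable(String path) {
        File target = new File(path).getAbsoluteFile();
        File dir = target.getParentFile();
        if (dir == null)
            return "path has no parent directory";
        if (!dir.exists())
            return "directory does not exist: " + dir;
        if (!dir.isDirectory())
            return "not a directory: " + dir;
        if (target.isDirectory())
            return "path names a directory, not a file: " + target;
        if (!dir.canWrite())
            return "directory is not writable: " + dir;
        return null;
    }

    public static void main(String[] args) {
        // Path taken from the post above; the file name out.lat is made up.
        String path = "/mnt/sdcard/edu.cmu.pocketsphinx/out.lat";
        String reason = whyNotWritable(path);
        System.out.println(reason == null ? "looks writable" : reason);
    }
}
```

Note that passing a directory itself as the file name is reported explicitly, since `fopen(filename, "w")` fails in that case as well.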

    > 2) The traversal isn't telling me which node follows which.

    You can access this information with the ps_latnode_entries/ps_latnode_exits
    functions, which return iterators over entry links and exit links.
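To illustrate how exit links answer the "which node follows which" question, here is a minimal sketch on a plain Java model of a lattice. The `Node` class and its `exits` list are hypothetical stand-ins for what the exit-link iterator exposes; they are not the SWIG wrapper classes used earlier in this thread. The walk is breadth-first and expands each node only once, so every link is reported exactly once.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustrative only: lists "from -> to" pairs by following exit links. */
final class LatticeWalk {
    /** Hypothetical node: a word plus the nodes reachable over its exit links. */
    static final class Node {
        final String word;
        final List<Node> exits = new ArrayList<Node>();
        Node(String word) { this.word = word; }
    }

    /** Breadth-first walk from the start node; each node is expanded once. */
    static List<String> successorPairs(Node start) {
        List<String> pairs = new ArrayList<String>();
        Set<Node> seen = new HashSet<Node>();
        Deque<Node> queue = new ArrayDeque<Node>();
        queue.add(start);
        seen.add(start);
        while (!queue.isEmpty()) {
            Node from = queue.remove();
            for (Node to : from.exits) {
                pairs.add(from.word + " -> " + to.word);
                if (seen.add(to))   // enqueue a node only the first time it is reached
                    queue.add(to);
            }
        }
        return pairs;
    }

    public static void main(String[] args) {
        Node start = new Node("<sil>");
        Node it = new Node("IT");
        Node at = new Node("AT");
        Node end = new Node("</s>");
        start.exits.add(it);
        start.exits.add(at);
        it.exits.add(end);
        at.exits.add(end);
        for (String pair : successorPairs(start))
            System.out.println(pair);
    }
}
```

Iterating over all nodes and printing every exit link, as in the loop posted earlier, visits the same links but without any ordering from the start node, which makes the output harder to read as a graph.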

     
  • shamsa abid

    shamsa abid - 2011-09-14

    First issue is resolved. I actually pushed an empty text file in that
    directory and then it worked fine.

    For the second case, I was wondering whether the ps_latnode_entries and
    ps_latnode_exits functions are sufficient to traverse a lattice, because my
    ultimate goal is to create a WCN.
    How should I go about pruning the lattice? There don't seem to be any
    functions available for that.
    I also have to merge the repeated nodes; are there any functions for that?
    Or do I have to construct the WCN in memory?

     
  • Nickolay V. Shmyrev

    > I was wondering whether the ps_latnode_entries and ps_latnode_exits
    > functions are sufficient to traverse a lattice

    They are sufficient.

    > How should I go about pruning the lattice?

    /**
     * Prune all links (and associated nodes) below a certain posterior probability.
     *
     * This function assumes that ps_lattice_posterior() has already been called.
     *
     * @param beam Minimum posterior probability for links. This is
     *         expressed in the log-base used in the decoder.  To convert
     *         from linear floating-point, use
     *         logmath_log(ps_lattice_get_logmath(), prob).
     * @return number of arcs removed.
     */
    POCKETSPHINX_EXPORT
    int32 ps_lattice_posterior_prune(ps_lattice_t *dag, int32 beam);
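
The `beam` argument is an integer in the decoder's log domain, not a linear probability. The arithmetic behind the conversion mentioned in the comment can be sketched as follows. This is an assumption-laden illustration, not the sphinxbase implementation: `PosteriorBeam` is a hypothetical class, the log base of 1.0001 is assumed to match the decoder's configured log base, and no extra shift or scaling is applied. In real code the decoder's own `logmath_log()` should be used.

```java
/** Illustrative only: converts a linear probability to an integer log-domain value. */
final class PosteriorBeam {
    /** Assumption: the decoder was configured with a log base of 1.0001. */
    static final double ASSUMED_LOG_BASE = 1.0001;

    /** Returns log_base(prob), truncated to an integer. */
    static int toLogDomain(double prob, double base) {
        if (!(prob > 0.0 && prob <= 1.0))
            throw new IllegalArgumentException("prob must be in (0, 1]");
        return (int) (Math.log(prob) / Math.log(base));
    }

    public static void main(String[] args) {
        // A minimum link posterior of 0.001 expressed in the log domain.
        int beam = toLogDomain(0.001, ASSUMED_LOG_BASE);
        System.out.println("beam = " + beam);
    }
}
```

Because probabilities are at most 1, the resulting beam is zero or negative; a beam closer to zero prunes more links.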
    

    > I also have to merge the repeated nodes; are there any functions for that?

    Sorry, I'm not sure what you mean by that.

    > Or do I have to construct the WCN in memory?

    You can use files or work in memory, it doesn't matter.
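
For the in-memory route, a deliberately simplified sketch of sausage construction is shown below. It is not the algorithm used by SRILM's lattice-tool: it only groups word hypotheses whose time spans mutually overlap into one slot and sums the posteriors of identical words within a slot. Everything here is an assumption made for illustration: the class `SausageSketch`, the `Hyp` type, the use of linear posteriors, end frames treated as exclusive, and the stripping of pronunciation-variant suffixes such as `(2)` so that `AN` and `AN(2)` count as the same word.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: groups time-overlapping word hypotheses into slots. */
final class SausageSketch {
    /** Hypothetical hypothesis: word, start frame, end frame (exclusive), posterior. */
    static final class Hyp {
        final String word;
        final int start;
        final int end;
        final double posterior;
        Hyp(String word, int start, int end, double posterior) {
            this.word = word; this.start = start; this.end = end; this.posterior = posterior;
        }
    }

    /** Returns one map per slot, from word to summed posterior. */
    static List<Map<String, Double>> build(List<Hyp> hyps) {
        List<Hyp> sorted = new ArrayList<Hyp>(hyps);
        Collections.sort(sorted, new Comparator<Hyp>() {
            public int compare(Hyp a, Hyp b) { return Integer.compare(a.start, b.start); }
        });
        List<Map<String, Double>> slots = new ArrayList<Map<String, Double>>();
        Map<String, Double> current = null;
        int slotEnd = 0; // earliest end frame among the members of the current slot
        for (Hyp h : sorted) {
            if (current == null || h.start >= slotEnd) {
                current = new LinkedHashMap<String, Double>(); // open a new slot
                slots.add(current);
                slotEnd = h.end;
            } else {
                slotEnd = Math.min(slotEnd, h.end); // keeps all members mutually overlapping
            }
            String word = h.word.replaceAll("\\(\\d+\\)$", ""); // AN(2) -> AN
            Double old = current.get(word);
            current.put(word, (old == null ? 0.0 : old) + h.posterior);
        }
        return slots;
    }

    public static void main(String[] args) {
        List<Hyp> hyps = new ArrayList<Hyp>();
        hyps.add(new Hyp("AN(2)", 0, 10, 0.4));
        hyps.add(new Hyp("AN", 0, 12, 0.2));
        hyps.add(new Hyp("IN", 1, 11, 0.3));
        hyps.add(new Hyp("IT", 12, 20, 0.9));
        System.out.println(build(hyps));
    }
}
```

For a single-word utterance, the candidates to offer the user would simply be the entries of the one non-silence slot, ordered by summed posterior.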

     
