n-best-list sphinx3 problems

2008-01-16
2012-09-22
  • Masrur Doostdar

    Masrur Doostdar - 2008-01-16

    Hi,

    I'm trying to get an N-best list using the sphinx 3.7 decoder. Ideally I'd like to generate n-best lists when decoding with FSGs, but it seems sphinx does not yet support this (no word lattice generation, no astar routine), right? Having this would be very valuable for us, so could you say whether any work on it is planned for the near future?

    So for now I am trying to generate n-best lists when working with language models, but I run into some errors, and I'm not sure whether my usage is at fault.

    First I tried the simplest approach: getting the graph, initializing astar, and requesting the next hypothesis:

    dag_t *p_dag;
    astar_t *p_astar;
    p_dag = s3_decode_word_graph(_decoder);
    p_astar = astar_init(p_dag, kbcore_dict(_decoder->kbcore), kbcore_lm(_decoder->kbcore),
                         kbcore_fillpen(_decoder->kbcore),
                         cmd_ln_float64_r(cmd_ln, "-beam"),
                         cmd_ln_float32_r(cmd_ln, "-lw"));
    glist_t hyp_list = astar_next_hyp(p_astar);

    I get this error, with the corresponding stack trace:

    speech_live_recog_new: astar.c:567: astar_next_ppath: Assertion `l->node->reachable && (!l->bypass)' failed.

    Program received signal SIGABRT, Aborted.
    [Switching to Thread -1266910320 (LWP 12307)]
    0x0047f402 in __kernel_vsyscall ()
    (gdb) bt
    #0 0x0047f402 in __kernel_vsyscall ()
    #1 0x002abd40 in raise () from /lib/libc.so.6
    #2 0x002ad591 in abort () from /lib/libc.so.6
    #3 0x002a538b in __assert_fail () from /lib/libc.so.6
    #4 0x00ac9e4f in astar_next_ppath (astar=0x87e4bb8) at astar.c:567
    #5 0x00ac9f7c in astar_next_hyp (astar=0x87e4bb8) at astar.c:627
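
The assertion in the trace above tests two things about a link: that the node it refers to is marked reachable, and that the link is not a bypass link. As a rough illustration of the first condition only, here is a minimal, self-contained sketch in C that marks which nodes of a small DAG can reach the final node and then drops links that lead to dead ends. It is a toy model, not sphinx3 code: `toy_node_t`, `toy_mark_reachable`, `toy_count_bad_links` and `toy_prune_links` are hypothetical names invented for this example, and the real `dag_t` structures and the exact meaning sphinx3 gives to `reachable` may differ.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_SUCC 4

/* Hypothetical toy node; NOT the sphinx3 DAG node type. */
typedef struct {
    int succ[TOY_MAX_SUCC]; /* indices of successor nodes */
    int n_succ;             /* number of valid entries in succ */
    bool visited;           /* set once the node has been processed */
    bool reachable;         /* true if the final node can be reached from here */
} toy_node_t;

/* Depth-first marking: node i is "reachable" if it is the final node
 * or if any of its successors is. Assumes the graph has no cycles. */
static bool toy_mark_reachable(toy_node_t *nodes, int i, int end)
{
    if (nodes[i].visited)
        return nodes[i].reachable;
    nodes[i].visited = true;
    nodes[i].reachable = (i == end);
    for (int k = 0; k < nodes[i].n_succ; k++) {
        if (toy_mark_reachable(nodes, nodes[i].succ[k], end))
            nodes[i].reachable = true;
    }
    return nodes[i].reachable;
}

/* Count links whose target node is not marked reachable. */
static int toy_count_bad_links(const toy_node_t *nodes, int n_nodes)
{
    int bad = 0;
    for (int i = 0; i < n_nodes; i++)
        for (int k = 0; k < nodes[i].n_succ; k++)
            if (!nodes[nodes[i].succ[k]].reachable)
                bad++;
    return bad;
}

/* Remove every link whose target node is not marked reachable. */
static void toy_prune_links(toy_node_t *nodes, int n_nodes)
{
    for (int i = 0; i < n_nodes; i++) {
        int kept = 0;
        assert(nodes[i].n_succ <= TOY_MAX_SUCC);
        for (int k = 0; k < nodes[i].n_succ; k++)
            if (nodes[nodes[i].succ[k]].reachable)
                nodes[i].succ[kept++] = nodes[i].succ[k];
        nodes[i].n_succ = kept;
    }
}
```

For example, with links 0->1, 0->2 and 1->3 and node 3 as the final node, calling `toy_mark_reachable(nodes, 0, 3)` leaves node 2 unmarked, so 0->2 is the one link that `toy_prune_links` removes. A search that asserts on every link it follows would trip over such a link if no marking and pruning pass had been run first.
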

    OK, I thought, it seems some preprocessing has to be done on the lattice first (like marking reachability and removing unreachable nodes). So I had a look at main_astar.c and took some appropriate-looking extra code from it:

    dag_t *p_dag;
    astar_t *p_astar;
    p_dag = s3_decode_word_graph(_decoder);
    E_INFO("%5d frames, %6d nodes, %8d edges, %8d bypass\n",
           p_dag->nfrm, p_dag->nnode, p_dag->nlink, p_dag->nbypass);

    if (dict_filler_word(kbcore_dict(_decoder->kbcore), p_dag->end->wid))
        p_dag->end->wid = kbcore_dict(_decoder->kbcore)->finishwid;

    dag_remove_unreachable(p_dag);
    if (dag_bypass_filler_nodes(p_dag, 1.0, kbcore_dict(_decoder->kbcore), kbcore_fillpen(_decoder->kbcore)) < 0) {
        E_ERROR("maxedge limit (%d) exceeded\n", p_dag->maxedge);
    }
    E_INFO("%5d frames, %6d nodes, %8d edges, %8d bypass\n",
           p_dag->nfrm, p_dag->nnode, p_dag->nlink, p_dag->nbypass);

    dag_compute_hscr(p_dag, kbcore_dict(_decoder->kbcore), kbcore_lm(_decoder->kbcore), 1.0);
    dag_remove_bypass_links(p_dag);

    p_astar = astar_init(p_dag, kbcore_dict(_decoder->kbcore), kbcore_lm(_decoder->kbcore),
                         kbcore_fillpen(_decoder->kbcore),
                         cmd_ln_float64_r(cmd_ln, "-beam"),
                         cmd_ln_float32_r(cmd_ln, "-lw"));
    glist_t hyp_list = astar_next_hyp(p_astar);

    With this extra preprocessing I now get the following error and stack trace (I set a breakpoint, because otherwise I wasn't able to get the stack):

    INFO: livedecode_new.cpp(228): 64 frames, 0 nodes, 8 edges, 0 bypass
    INFO: livedecode_new.cpp(238): 64 frames, 0 nodes, 14 edges, 6 bypass
    [Switching to Thread -1267344496 (LWP 12521)]

    Breakpoint 2, lm_bg_score (lm=0x9b60430, lw1=30, lw2=65535, w2=112)
    at lm.c:1244
    1244 E_FATAL("Bad lw2 argument (%d) to lm_bg_score\n", lw2);
    Current language: auto; currently c
    (gdb) bt
    #0 lm_bg_score (lm=0x9b60430, lw1=30, lw2=65535, w2=112) at lm.c:1244
    #1 0x001400c1 in lm_tg_score (lm=0x9b60430, lw1=65535, lw2=30, lw3=65535,
    w3=112) at lm.c:1663
    #2 0x0014f8b4 in dag_compute_hscr (dag=0x9d4e3c8, dict=0x9961718,
    lm=0x9b60430, lwf=1) at dag.c:557
    #3 0x0804c2bf in n_best_list (_decoder=0x8052640) at livedecode_new.cpp:241
    #4 0x0804f48f in decode_closed_speech (m_pCSD=0x9d3fa90, frames=0x9d49988,
    num_frames=13, frame_number=@0xb475d3a0, state=WAIT_FOR_END,
    hypstr=0xb475d39c, hypsegs=0xb475d394, filter_hypstr=0xb475d398)
    at livedecode_new.cpp:474
    #5 0x0804fb36 in process_thread (aParam=0x0) at livedecode_new.cpp:632
    #6 0x00dad3db in start_thread () from /lib/libpthread.so.0
    #7 0x0044226e in clone () from /lib/libc.so.6
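
In the trace above, `lw2` in `lm_bg_score` and `lw1` and `lw3` in `lm_tg_score` are all 65535, which is 0xFFFF, the largest value of an unsigned 16-bit integer. Values like this are commonly used as a "no valid ID" sentinel. Assuming that is the case here (nothing in this thread confirms it), the fatal error would mean that a word in the graph had no ID in the language model being queried. The sketch below shows that pattern in isolation. It is a toy model, not sphinx3 code: `toy_lmwid_t`, `TOY_BAD_LMWID`, `toy_dict_to_lmwid` and `toy_lmwid_is_valid` are hypothetical names made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 16-bit language model word ID; NOT the sphinx3 type. */
typedef uint16_t toy_lmwid_t;

/* Assumed sentinel meaning "this word has no LM entry" (65535). */
#define TOY_BAD_LMWID ((toy_lmwid_t) 0xFFFF)

/* Map a dictionary word ID to an LM word ID.
 * Dictionary IDs outside the map yield the sentinel. */
static toy_lmwid_t toy_dict_to_lmwid(const toy_lmwid_t *map, size_t map_len,
                                     size_t dict_wid)
{
    assert(map != NULL);
    return dict_wid < map_len ? map[dict_wid] : TOY_BAD_LMWID;
}

/* True if the ID is neither the sentinel nor beyond the unigram table,
 * i.e. it could safely be passed on to a scoring routine. */
static bool toy_lmwid_is_valid(toy_lmwid_t id, size_t n_unigrams)
{
    return id != TOY_BAD_LMWID && id < n_unigrams;
}
```

Under these assumptions, checking `toy_lmwid_is_valid` on each history word before scoring would turn a fatal error deep inside the scorer into a condition the caller can report, for instance by naming the dictionary word that has no language model entry.
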

    So is there some apparent mistake in my usage, or any idea what could cause these problems?

    regards
    Masrur D.

     
    • Masrur Doostdar

      Masrur Doostdar - 2008-01-21

      Any ideas on my issue?

       
      • Nickolay V. Shmyrev

        Does sphinx3_astar work for you?

         
        • Masrur Doostdar

          Masrur Doostdar - 2008-01-22

          I just tried it out.
          Yes, sphinx3_astar does work. I started it with the same configuration as the decoder (same lm, dict, fdict, mdef). I have also uploaded the LM file [1]; perhaps you could check whether it is causing my problem. When I tried converting this text LM file into DMP format with lm_convert, I got a segmentation fault, so it looks somewhat suspicious...

          regards
          Masrur D.

          [1] http://www-users.rwth-aachen.de/Masrur.Doostdar/navigation-go7.lm

           
          • Nickolay V. Shmyrev

            Everything is fine with the LM; it converts properly. You only have to specify the output file name:

            sphinx3_lm_convert -i navigation.lm -o navigation.lm.dmp

            (Well, it is a bug in the converter that it does not work without -o.)

            About astar, a bit later.

             
            • Masrur Doostdar

              Masrur Doostdar - 2008-01-30

              Thanks Nickolay for your answer about the lm!

              >about astar, a bit later.

              Were you able to reproduce my error, and are you planning to look into a patch, or will you tell me what I'm doing wrong in my use of astar?

              Thanks again and regards

              Masrur D.

               
              • Masrur Doostdar

                Masrur Doostdar - 2008-02-25

                A short report on how I got around my problem. It is again a workaround, since I don't know what causes the problem.
                Instead of getting the LM to be used for astar from the kbcore struct [kbcore_lm(_decoder->kbcore)], I load it directly from the config parameters when my program initializes:

                p_nbest_lmset = lmset_init(cmd_ln_str_r(cmd_ln, "-lm"),
                                           cmd_ln_str_r(cmd_ln, "-lmctlfn"),
                                           cmd_ln_str_r(cmd_ln, "-ctl_lm"),
                                           cmd_ln_str_r(cmd_ln, "-lmname"),
                                           cmd_ln_str_r(cmd_ln, "-lmdumpdir"),
                                           cmd_ln_float32_r(cmd_ln, "-lw"),
                                           cmd_ln_float32_r(cmd_ln, "-wip"),
                                           cmd_ln_float32_r(cmd_ln, "-uw"),
                                           kbcore_dict(decoder.kbcore));

                This works to the extent that I now get the n-best list, and it looks OK overall. One thing about the language model scores in the n-best list is still strange: in every hypothesis, all words (including silence, but excluding the first word, which has a score of 0) have a language model score of 243, regardless of the language model used.

                Additionally, it would be very nice if someone (perhaps David?) could comment briefly on the issue I raised in the first post of this thread:
                >Ideally I'd like to generate n-best lists when decoding with FSGs, but it seems sphinx does not yet support this (no word lattice generation, no astar routine), right? Having this would be very valuable for us, so could you say whether any work on it is planned for the near future?

                Thanks and regards
                Masrur D.

                 
