Hello,
There are 2 ways to get nbest lists in S3.
1. Use -nbestdir param with sphinx3_decode
2. Create *lat files with sphinx3_decode and feed them to sphinx3_astar
I compared the nbest lists created by these 2 methods, and the TOP
hypothesis in the 2 lists matches 99.9% of the time. I'm testing on around 8000
wav files.
However, the number of nbest hypotheses differs, and the difference is huge
for many files. Overall, s3_astar produced 34071 fewer hypotheses in its nbest
lists than s3_decode.
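For anyone who wants to reproduce this kind of comparison, here is a minimal Python sketch. It assumes the nbest files have already been parsed into one list of word strings per utterance; the function name `compare_nbest` and the dictionary layout are hypothetical and not part of Sphinx3.

```python
from typing import Dict, List


def compare_nbest(a: Dict[str, List[str]], b: Dict[str, List[str]]) -> dict:
    """Compare two {utterance_id: hypotheses} maps over their shared utterances.

    Assumption: each hypothesis is a plain word string, best hypothesis first.
    """
    shared = sorted(set(a) & set(b))
    # Utterances whose top hypothesis is identical in both lists.
    top_match = sum(1 for u in shared if a[u][:1] == b[u][:1])
    # Per-utterance difference in list length (a minus b), only where it is non-zero.
    differing = {u: len(a[u]) - len(b[u]) for u in shared if len(a[u]) != len(b[u])}
    return {
        "utterances": len(shared),
        "top_match": top_match,
        "total_diff": sum(differing.values()),
        "differing": differing,
    }
```

Passing the s3_decode lists as `a` and the s3_astar lists as `b` gives a positive `total_diff` when s3_decode produced more hypotheses overall, and `differing` points at the utterances worth inspecting first.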
I kept the beam, wip, lw and min_endfr params the same in both cases. I
guess the difference is somehow due to the implementations in
src/programs/main_astar.c (used by s3_astar) and
src/libs3decoder/libsearch/astar.c (used by s3_decode).
Is there any reason why one of these methods of creating nbest lists should be
preferred over the other (which code is 'more correct')?
Thanks.
I should mention that I used
while using s3_decode
I've just tried this and both results are identical. Given that the settings
are the same, the lists have the same length. That's not surprising, because
both sphinx3_astar and sphinx3_decode call the same function, nbest_search.
The only difference might be in the default parameters.
Maybe you need to provide a more detailed example.
Hello!
Thanks a lot for checking, but I'm still getting different results :(
I have verified in the function trace that nbest_search() is common to both
s3_decode and s3_astar (see below).
Below that I have pasted logs and nbest lists of s3_decode and s3_astar.
I think the difference is due to the dag structure. The dag can be printed
in s3_decode with the help of the srch_FLAT_FWD_dag_dump(srch, dag) function,
but I can't do the same in s3_astar, as there is no srch structure available
there that contains the lattice history. If you know of a way to compare the
two dags (one in s3_decode and another in s3_astar), please let me know.
Should I upload the wav file and other related files for testing?
Here is my log for s3_decode
and here is the log for s3_astar
and below is the nbest list for s3_decode
below is the nbest list for s3_astar
Update:
I was able to dump the dag that is sent to nbest_search in s3_astar. It has
2 fewer nodes than the *lat file that was given as input.
and for s3_astar the lattice is as below -
For some reason there are 2 fewer nodes in the s3_astar dag ....
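One way to pin down which nodes are missing is to diff the node sections of the two dumps. The Python sketch below is only an illustration: it assumes the dump has a section introduced by a line starting with `Nodes`, followed by lines of the form `id word start_frame ...`, and that `#` starts a comment. The helper names `lattice_nodes` and `missing_nodes` are hypothetical. Nodes are matched on (word, start frame) rather than on node id, since ids may be renumbered when a lattice is reloaded.

```python
from typing import Set, Tuple

Node = Tuple[str, int]  # (word, start frame)


def lattice_nodes(text: str) -> Set[Node]:
    """Collect (word, start_frame) pairs from the Nodes section of a lattice dump."""
    nodes: Set[Node] = set()
    in_nodes = False
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue  # skip blank lines and comments
        if fields[0] == "Nodes":
            in_nodes = True  # header line of the node section (assumed layout)
            continue
        if in_nodes:
            # Assumed node line: id word start_frame [further columns ignored]
            if len(fields) < 3 or not fields[0].isdigit():
                break  # first non-node line ends the section
            nodes.add((fields[1], int(fields[2])))
    return nodes


def missing_nodes(full: str, reduced: str) -> Set[Node]:
    """Nodes present in `full` but absent from `reduced`."""
    return lattice_nodes(full) - lattice_nodes(reduced)
```

Calling `missing_nodes` with the contents of the *lat file as `full` and the dag dumped from s3_astar as `reduced` lists the nodes that were dropped during loading.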
The issue here was that dag construction created a link from to
which was stripped in astar during loading. Dag construction is now fixed
in trunk.
Thanks!