Hi all,
I would like to convert the start and end frame value of my output (see below) to an actual time point. Can I assume that each frame has the same duration, so that:
start_second = (AUDIO_DURATION/MAX_FRAME_NUMBER) * SFrm ?
Does sphinx3 provide me with the values for AUDIO_DURATION and MAX_FRAME_NUMBER somewhere?
FV:29-10-07_78.416> WORD SFrm EFrm AScr(UnNorm) LMScore AScr+LScr AScale
fv:29-10-07_78.416> <sil> 0 26 -2814963 -97464 -2912427 -2343676
fv:29-10-07_78.416> DURING(2) 27 94 -1696112 -292500 -1988612 -152794
fv:29-10-07_78.416> WORLD 95 443 10547560 -179520 10368040 14301418
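Hi, it's actually simpler than that. Every frame has the same length, 1.0/frame_rate seconds, where frame_rate is specified by the -frate configuration option. This is usually 100, so each frame is usually 10 ms long and start_second = SFrm / frame_rate. In your output, for example, DURING(2) starts at frame 27, i.e. at 27/100 = 0.27 s.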
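In case it helps, here is a minimal Python sketch of that conversion. It assumes -frate is 100 and that EFrm is inclusive (which the output above suggests, since one word ends at frame 26 and the next starts at frame 27). The names FRAME_RATE, frame_to_seconds and word_span are made up for illustration; they are not part of sphinx3.

```python
# Assumed value of the -frate option (frames per second); change it if
# your configuration uses a different frame rate.
FRAME_RATE = 100


def frame_to_seconds(frame, frame_rate=FRAME_RATE):
    """Return the time in seconds at which the given frame starts."""
    return frame / frame_rate


def word_span(sfrm, efrm, frame_rate=FRAME_RATE):
    """Return (start, end) in seconds for a word segment.

    Assumes efrm is inclusive, so the word ends where frame efrm + 1 begins.
    """
    return frame_to_seconds(sfrm, frame_rate), frame_to_seconds(efrm + 1, frame_rate)


if __name__ == "__main__":
    # (WORD, SFrm, EFrm) taken from the output quoted in the question.
    segments = [("<sil>", 0, 26), ("DURING(2)", 27, 94), ("WORLD", 95, 443)]
    for word, sfrm, efrm in segments:
        start, end = word_span(sfrm, efrm)
        print(f"{word:10s} {start:6.2f} s  to {end:6.2f} s")
```

If you only need the start time, frame_to_seconds(SFrm) is enough; word_span just adds one frame to EFrm so that consecutive words meet without a gap.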