Lee Baker - 2019-02-20

My application processes short (<10 s) commands from a constrained vocabulary and grammar.

While a command is being spoken, I process the audio as it is recorded and present the user with the current best hypothesis in real time, using ps_start_utt() / ps_process_raw() / ps_seg_iter().

After recording has stopped, I re-process the command as a whole to obtain the final result: ps_start_utt() / ps_process_raw() / ps_end_utt() / ps_seg_iter(). I process it a second time because a comment on ps_process_raw() indicates that accuracy may be higher if you process a full utterance at once; this seems to be the case.
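
A minimal sketch of the two-pass flow described above, assuming the recorded samples are kept in a growable buffer during the first pass so they can be replayed in one piece. The names `utt_buf`, `feed_fn`, `stream_chunk`, `reprocess_full` and `count_samples` are hypothetical and only for illustration. In a real application the callback would presumably wrap ps_process_raw() with its full_utt argument set to match, and ps_start_utt() / ps_end_utt() would bracket each pass.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback standing in for the decoder. In the real application
 * it would wrap ps_process_raw(), passing full_utt through. */
typedef void (*feed_fn)(const int16_t *data, size_t n, int full_utt, void *user);

/* Hypothetical growable buffer holding every sample recorded so far. */
typedef struct {
    int16_t *samples;
    size_t len, cap;
} utt_buf;

/* Pass 1: keep a copy of the chunk for pass 2, then feed it to the decoder.
 * Returns 0 on success, -1 if memory allocation fails. */
static int stream_chunk(utt_buf *b, const int16_t *chunk, size_t n,
                        feed_fn feed, void *user)
{
    if (n == 0)
        return 0;
    if (b->len + n > b->cap) {
        size_t cap = b->cap ? b->cap : 4096;
        while (cap < b->len + n)
            cap *= 2;
        int16_t *p = realloc(b->samples, cap * sizeof *p);
        if (p == NULL)
            return -1;
        b->samples = p;
        b->cap = cap;
    }
    memcpy(b->samples + b->len, chunk, n * sizeof *chunk);
    b->len += n;
    feed(chunk, n, 0, user);   /* partial utterance: full_utt = 0 */
    return 0;
}

/* Pass 2: feed the whole recording in a single call with full_utt = 1. */
static void reprocess_full(const utt_buf *b, feed_fn feed, void *user)
{
    feed(b->samples, b->len, 1, user);
}

/* Stand-in callback for trying the sketch without a decoder:
 * it only counts how many samples were fed. */
static void count_samples(const int16_t *data, size_t n, int full_utt, void *user)
{
    (void)data;
    (void)full_utt;
    *(size_t *)user += n;
}
```

With this layout every sample passes through the decoder twice, once per pass, which is the redundant work in question.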

Since I'm trying to reduce CPU time, is there a way to cut the redundant work when processing twice like this, while preserving the accuracy of processing the full utterance at once?