My application processes short (<10 s) commands drawn from a constrained vocabulary and grammar.
As a command is spoken, I process the audio while it is being recorded and show the user the current best hypothesis in real time, using ps_start_utt() / ps_process_raw() / ps_seg_iter().
After recording has stopped, I re-process the command as a whole to obtain the final result: ps_start_utt() / ps_process_raw() / ps_end_utt() / ps_seg_iter(). The reason I process it a second time is that a comment on ps_process_raw() indicates that accuracy may be higher if a full utterance is processed at once, and in my testing this seems to be the case.
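For concreteness, here is a minimal sketch of the two-pass flow described above, written so that it stands alone without the decoder library. The decoder calls are hidden behind a hypothetical callback type (`decode_fn`); in a real program the live callback would wrap the incremental ps_process_raw() call and the full callback would wrap the whole-utterance sequence. All names here (`utterance_buf`, `feed_chunk`, `finish_utterance`, `count_samples`) and the 16 kHz sample rate are assumptions for illustration only.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Room for 10 s of audio at an ASSUMED 16 kHz sample rate. */
#define MAX_SAMPLES (16000 * 10)

/* Hypothetical callback type standing in for the real decoder calls. */
typedef void (*decode_fn)(const int16_t *samples, size_t n_samples, void *ctx);

/* Holds a copy of every sample seen so far in the current utterance. */
typedef struct {
    int16_t samples[MAX_SAMPLES];
    size_t n_samples;
} utterance_buf;

/* Pass 1: store the chunk and hand it to the live decoder for a
 * real-time hypothesis. Returns how many samples were stored
 * (fewer than n if the buffer is full). */
size_t feed_chunk(utterance_buf *buf, const int16_t *chunk, size_t n,
                  decode_fn live, void *ctx)
{
    size_t room = MAX_SAMPLES - buf->n_samples;
    if (n > room)
        n = room;
    memcpy(buf->samples + buf->n_samples, chunk, n * sizeof(int16_t));
    buf->n_samples += n;
    if (live)
        live(chunk, n, ctx);
    return n;
}

/* Pass 2: hand the whole buffered utterance to the decoder in one call,
 * then reset the buffer for the next command. */
void finish_utterance(utterance_buf *buf, decode_fn full, void *ctx)
{
    if (full)
        full(buf->samples, buf->n_samples, ctx);
    buf->n_samples = 0;
}

/* Stand-in callback for illustration: only tallies the samples it sees. */
void count_samples(const int16_t *samples, size_t n_samples, void *ctx)
{
    (void)samples;
    *(size_t *)ctx += n_samples;
}
```

With the stand-in callback used for both passes, each sample is counted once in the live pass and once more in the final pass, which is the duplicated work in question.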
As I'm trying to reduce the CPU time required, is there a way to cut the redundant work of processing twice like this, while preserving the accuracy of processing the full utterance together?