What are weighted FSTs, and how are they superior to HMMs for speech
recognition? Or is that even a fair statement to make?
Thanks and regards,
A WFST gives you a uniform mathematical representation of the search space,
which lets you compress that space efficiently with the WFST modification
algorithms. For example, you can reduce the space with WFST minimization and
thus speed up decoding. You can also represent lattices as WFST structures,
which gives you a simple and powerful framework for speech decoding,
redecoding and so on.
You can learn more from the related papers, for example from the ones listed
on our wiki page:
http://cmusphinx.sourceforge.net/wiki/asr:wfst
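
To make the minimization point concrete, here is a minimal pure-Python sketch.
It is not OpenFst or Sphinx code, and it assumes an unweighted deterministic
acceptor; real WFST minimization also has to push weights before merging
states. The function name minimize and the toy cat/bat/rat lexicon are made up
for illustration.

```python
def minimize(num_states, arcs, finals):
    """Merge equivalent states of a deterministic acceptor (Moore refinement).

    arcs maps (state, label) -> next state; finals is a set of states.
    Returns (state -> merged-state id, number of merged states).
    """
    # Group outgoing arcs by source state once.
    out = {s: [] for s in range(num_states)}
    for (src, label), dst in arcs.items():
        out[src].append((label, dst))

    # Start from the coarsest split: final vs non-final states.
    block = {s: int(s in finals) for s in range(num_states)}
    while True:
        ids = {}
        refined = {}
        for s in range(num_states):
            # Two states stay together only if they agree on their current
            # block and on the block reached by every label.
            signature = (block[s],
                         tuple(sorted((l, block[d]) for l, d in out[s])))
            refined[s] = ids.setdefault(signature, len(ids))
        if len(ids) == len(set(block.values())):
            return refined, len(ids)  # no block was split: partition is stable
        block = refined


# Hypothetical toy lexicon: a trie over the letters of "cat", "bat", "rat".
arcs = {
    (0, "c"): 1, (1, "a"): 2, (2, "t"): 3,
    (0, "b"): 4, (4, "a"): 5, (5, "t"): 6,
    (0, "r"): 7, (7, "a"): 8, (8, "t"): 9,
}
merged, n_merged = minimize(10, arcs, finals={3, 6, 9})
print(n_merged)
```

The three words share the suffix "at", so the 10-state trie collapses to 4
states. That is the kind of saving the decoder gets from a minimized graph:
fewer states to keep alive during the search.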
Other than being efficient, do WFST techniques have any effect on recognition
accuracy per se? Efficiency could always be traded for accuracy.
So is WFST the future trend in speech recognition? Does this new technology
have any shortcomings vis-a-vis the HMM-based technology used in, say,
Pocketsphinx?
Thanks and regards,
It's not a future trend, it's a current trend. And there are many problems
with WFST too: it can't handle large or flexible language models. So new
approaches are being developed as well, dynamic decoders for example.
You cannot compare WFST vs HMM; those are technologies of a different scale.
A WFST decoder also uses the HMM framework. What you can compare is a static
WFST decoder vs a dynamic decoder like pocketsphinx, as in this paper:
http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=5372904
A dynamic decoder has its own advantages. It's like comparing a compiled
language with an interpreted one. Compiled languages got quite a lot of
attention in the history of computer science, but recently interpreted
languages like Python have become more popular for many applications. For
some applications compiled languages are still better.
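
A rough way to see the static vs dynamic tradeoff is the toy sketch below. It
is only an illustration, not how pocketsphinx or any real WFST decoder is
implemented. It assumes two small deterministic weighted acceptors (A and B
are hypothetical stand-ins for a lexicon graph and a language model) with
costs that add up along a path. The static version builds every reachable
pair state up front, the lazy version builds a pair state only when the
search first touches it.

```python
def expand(a, b, pair):
    """Arcs leaving a pair state: labels both machines accept, costs added."""
    qa, qb = pair
    arcs = {}
    for label, (na, wa) in a.get(qa, {}).items():
        if label in b.get(qb, {}):
            nb, wb = b[qb][label]
            arcs[label] = ((na, nb), wa + wb)
    return arcs


def compose_static(a, b, start=(0, 0)):
    """Build every reachable pair state before decoding starts."""
    graph, stack = {}, [start]
    while stack:
        pair = stack.pop()
        if pair not in graph:
            graph[pair] = expand(a, b, pair)
            stack.extend(nxt for nxt, _ in graph[pair].values())
    return graph


class LazyComposition:
    """Expand a pair state only when the search first asks for it."""

    def __init__(self, a, b):
        self.a, self.b, self.cache = a, b, {}

    def arcs(self, pair):
        if pair not in self.cache:
            self.cache[pair] = expand(self.a, self.b, pair)
        return self.cache[pair]


def path_cost(arcs_of, labels, start=(0, 0)):
    """Follow the labels from the start state and sum the arc costs."""
    state, total = start, 0.0
    for label in labels:
        state, weight = arcs_of(state)[label]
        total += weight
    return total


# Hypothetical toy machines: state -> {label: (next state, cost)}.
A = {s: {"a": ((s + 1) % 3, 1.0), "b": (s, 0.5)} for s in range(3)}
B = {0: {"a": (0, 0.0), "b": (1, 0.25)}, 1: {"a": (1, 0.0), "b": (0, 0.25)}}

static_graph = compose_static(A, B)
lazy = LazyComposition(A, B)
static_cost = path_cost(static_graph.__getitem__, "ab")
lazy_cost = path_cost(lazy.arcs, "ab")
print(len(static_graph), len(lazy.cache), static_cost, lazy_cost)
```

Both give the same cost for the input "ab", but the static graph holds all 6
pair states while the lazy one has expanded only 2. With a large language
model the static graph can become too big to build, which is where the dynamic
approach pays off; the price is doing the expansion work at decoding time.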
Hi, Nickolay,
Thanks a lot for the information.
Regards,