One of my blue sky ideas for that joyous future day when I find time has been
to use speech recognition to create a memorization aid. Such a program would
need to have a comprehensive overall vocabulary, but at any given time it
would really only need to be answering the question "does what's being spoken
match the next word in the text" (though it'd be nice to be able to discern a
couple of dozen other words: synonyms of the correct word and transpositions
of words further on in the text). I haven't looked very hard at the details of
the workings of the different Sphinx versions, but my first impression is that
the way to go with Sphinx would be to use sphinx4 and generate a specialized
grammar when a text is loaded. Am I on the right track at all? How feasible
does this sound to you?
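
To make the "does it match the next word" check concrete, here is a minimal sketch in Python of how a recognized word might be sorted into the categories mentioned above. It assumes the recognizer hands back one lowercase word at a time; the function name `classify`, the `synonyms` dictionary and the `lookahead` window are all hypothetical and not part of any Sphinx API.

```python
def classify(spoken, words, position, synonyms=None, lookahead=5):
    """Compare one recognized word against the text at `position`.

    `words` is the memorized text as a list of lowercase words.
    `synonyms` is a hypothetical dict mapping a word to acceptable synonyms.
    `lookahead` is how far ahead to look for a transposed word (an assumption).
    """
    synonyms = synonyms or {}
    expected = words[position]
    if spoken == expected:
        return "correct"
    if spoken in synonyms.get(expected, []):
        return "synonym"
    # A word that appears slightly later in the text was probably said too early.
    if spoken in words[position + 1 : position + 1 + lookahead]:
        return "transposition"
    return "wrong"
```

For example, with `words = ["to", "be", "or", "not"]`, saying "or" at position 0 would be reported as a transposition rather than a plain error.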
but my first impression is that the way to go with Sphinx would be to use
sphinx4 and generate a specialized grammar when a text is loaded.
Yes, that's a similar approach to the Aligner grammar, which aligns words and
audio. However, you need to distinguish incorrect words in the grammar (you may
need to enable the OOV branch as well, and you may need a special tool to
handle delays). It's pretty straightforward.
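
As a rough illustration of generating a specialized grammar when a text is loaded, the sketch below builds a small JSGF grammar string for one position in the text. It is not the Aligner's implementation: the helper names `tokenize` and `next_word_grammar`, the grammar name `memorize`, and the `lookahead` and `synonyms` parameters are assumptions made for the example. Loading the grammar into sphinx4 and enabling the out-of-vocabulary branch are configuration steps that are not shown here.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into plain word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def next_word_grammar(words, position, lookahead=3, synonyms=None):
    """Build a JSGF grammar accepting the expected word at `position`,
    its synonyms, and the next `lookahead` words of the text.

    `synonyms` is a hypothetical dict mapping a word to a list of synonyms.
    """
    synonyms = synonyms or {}
    expected = words[position]
    candidates = [expected]
    candidates += synonyms.get(expected, [])
    candidates += words[position + 1 : position + 1 + lookahead]
    # Remove duplicates while preserving order.
    unique = list(dict.fromkeys(candidates))
    alternatives = " | ".join(unique)
    return (
        "#JSGF V1.0;\n"
        "grammar memorize;\n"
        f"public <next> = {alternatives};\n"
    )
```

The grammar would be regenerated each time the speaker advances a word, so the recognizer only has to choose among a handful of alternatives at any moment.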
Am I on the right track at all? How feasible does this sound to you?
It's feasible.
I am looking to do the same. Can you give an update on this project?