Hi all. First of all, I should say that I'm a beginner in speech recognition.
I'd appreciate a little help with these two applications.
App1: (isolated words)
Regardless of the user, the software has to recognize isolated words from a
vocabulary of 26,000 words.
The 26,000 words would be pre-recorded by speaker A. The software then has to
recognize speaker X saying a word from this list and report which word
speaker X said.
App2: (continuous speech)
Speaker X reads a predefined text (approx. 300-500 lines). The software has to
determine which sentence is being read.
As in App1, the sentences would be pre-recorded by speaker A.
Speaker X is not necessarily a native speaker, so he/she may not pronounce
everything 100% correctly.
I need the recognition to be nearly 100% reliable.
Is it possible with pocketsphinx?
What is the way forward?
Any recommendations?
Sorry for my English.
> I need the recognition to be nearly 100% reliable. Is it
> possible with pocketsphinx?
No, it's not possible.
> What is the way forward? Any recommendations?
The way forward is to start with the requirements of the application and design
how the recognizer helps to satisfy them. For example, if you are considering a
language-learning task, there are ways to teach a language using automated
speech recognition, but they require a special design of the application as a
whole and of how it cooperates with the recognizer.
If you are considering some other type of application, you also need to design
it with the recognizer's capabilities in mind, rather than first setting
requirements for the engine that can't be satisfied.
Thank you for the answer and the recommendation.