Voice recognition is CRAP

Forum: Open Discussion

Creator: Bertrand BAROTH

Created: 2022-02-22

Updated: 2022-02-23

Bertrand BAROTH - 2022-02-22

I tested several modules. The speaker-independent models (I got a bargain on the licence for one of them: half price at RobotShop, together with an earlier version of the module) are not reliable enough. Commands that sound nothing alike get confused, and some are simply rejected. The systems are also very sensitive to speaking speed. BTW, "start" can be triggered by saying "f..ck", which suggests that the systems are rather sensitive to vowels!

One Chinese system on AliExpress comes with a vocabulary programmed at the factory ... if you order at least 200 parts! The LD3320 needs configuration in Pinyin, so it is really aimed at Chinese (or are you able to write English or French in Pinyin?).

With the speaker-dependent versions, after one week my voice seemed to have changed enough that some commands were no longer recognized, and I had to train them again. One model has to be trained while connected to a PC via USB; mine (Windows 7 x64) reported "unknown device", and I saw on a forum that I was not alone with this issue.

Forget interfacing to online systems: "Big Brother is watching (listening to) you" ... In the end, nothing beats pushbuttons!

:( :( :(

Last edit: Bertrand BAROTH 2022-02-22

Chris Roper - 2022-02-22

Many years ago I was beta testing Microsoft Voice recognition software whilst my wife was watching TV.
Someone on the TV shouted "Close the Windows" and my PC dutifully shut down.
I have never trusted voice recognition since.

Bertrand BAROTH - 2022-02-23

That sounds like an old, well-known IT joke ...
:)

Bertrand BAROTH - 2022-02-23

In the end it is "roughly" possible to use Elechouse's SimpleVR (speaker-independent) if one uses a tree structure, so that there are never more than 4 (carefully chosen) words active in the current context. But then the user has to say 5 expressions (if no error occurs!) instead of pressing 2 or 3 buttons. Hmm ... good for a demonstration, but where is the practical use? Especially since there must be some feedback to signal the current context and the word recognized (visual display or speech synthesis). Again, hmm ...
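
For anyone who wants to try the tree idea, here is a minimal sketch in Python of the bookkeeping only. It assumes the recognizer simply hands back one word as a string. The contexts, words and action names are invented for illustration and have nothing to do with the real SimpleVR command set. Each context holds at most 4 active words, and every step returns a feedback string for the display or speech synthesis.

```python
MAX_ACTIVE = 4  # limit taken from the post: never more than 4 words per context

# Hypothetical vocabulary: context name -> {spoken word: target}.
# A target is either another context name or "ACTION:<name>" for a leaf command.
TREE = {
    "root": {"light": "light", "motor": "motor"},
    "light": {"on": "ACTION:light_on", "off": "ACTION:light_off", "back": "root"},
    "motor": {
        "forward": "ACTION:motor_forward",
        "reverse": "ACTION:motor_reverse",
        "stop": "ACTION:motor_stop",
        "back": "root",
    },
}


def check_tree(tree):
    """Reject trees that exceed the word limit or point at a missing context."""
    for context, words in tree.items():
        if len(words) > MAX_ACTIVE:
            raise ValueError(f"context '{context}' has more than {MAX_ACTIVE} words")
        for target in words.values():
            if not target.startswith("ACTION:") and target not in tree:
                raise ValueError(f"context '{context}' points at unknown '{target}'")


def step(context, word):
    """Handle one recognized word; return (next_context, action, feedback)."""
    words = TREE[context]
    if word not in words:
        # Word not active here: stay put and tell the user what is accepted.
        return context, None, f"rejected '{word}', active: {', '.join(words)}"
    target = words[word]
    if target.startswith("ACTION:"):
        action = target[len("ACTION:"):]
        return "root", action, f"doing {action}"  # leaf reached, start over at root
    return target, None, f"context: {target}"


check_tree(TREE)

if __name__ == "__main__":
    context = "root"
    for heard in ["motor", "on", "forward"]:
        context, action, feedback = step(context, heard)
        print(feedback)
```

Even in this toy tree, every command costs at least two spoken words plus a feedback message, which is the overhead compared with pushbuttons complained about above.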

:(

Last edit: Bertrand BAROTH 2022-02-25
