CMU Sphinx / Forums / Help: Non-digit(english words) speech2text conversi

Hi folks,

My goal is to do offline speech-to-text conversion of English words using Sphinx-4, i.e. I have a WAV file that can contain English words (I think in SR lingo these are called non-digit utterances) and I need to convert them into text.

I am exploring the Sphinx-4 demos (HelloDigits, WavFile, Transcriber, etc.), but all of them only work with digits data.

I have the following queries:

1. What will it take to make these demos work for non-digit data (i.e. English language words)?

The readme for the Transcriber demo says that non-digit STT can be accomplished by suitably modifying the config.xml file. I went through the Sphinx-4 configuration management doc.

But can somebody help me figure out exactly which components need to be modified, and roughly what changes are required, if I want it to work for non-digits (i.e. normal English words) instead of only digits?

To use the Lattice demo, we need to get it via CVS and then build and run it.

I am behind a firewall and have tried accessing the CVS tree using WinCVS and the normal command-line cvs, but it hasn't worked for me.

The alternative suggested in the SourceForge documentation (https://sourceforge.net/docman/display_doc.php?docid=14033&group_id=1#firewall):

D:\share_for_spiff\sphinx related\sphinx4-1.0beta\bin>cvs -d :pserver:anonymous@cvs-pserver.sourceforge.net:80/cvsroot/cmusphinx co sphinx4
Unknown host cvs-pserver.sourceforge.net.

has not worked for me either (I have tried port 443 as well).

Also, I came across the HelloNGram demo and tried it out. For me the recognition accuracy was abysmally low.

Two questions here:

i. How can I improve the accuracy for this demo?
ii. Is it possible to add to the list of sentences that can be recognized?
If so, how exactly do I go about it?

Some major irritants:
a. 'the' at the beginning of a sentence is almost never recognized.
b. 'purple' is rarely recognized correctly. It is almost always recognized as 'front'.

Can Sphinx be used for enterprise-grade SR as well?
I am contemplating a scenario where the SR is done on a central server rather than on individual machines, since individual devices often have limited computation and memory capabilities.

So the idea is to have very high-grade SR done centrally on a server ("the SR server"), which would do SR for several devices that submit their individual "SR jobs" (for lack of a better word) to it.
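To make the "SR server" idea a bit more concrete, here is a rough Java sketch of what I have in mind. Everything in it is an assumption made for illustration: `SrJobServer`, `Transcriber` and `submit` are hypothetical names, not part of Sphinx-4, and the recognizer is just a stub that would have to be replaced by a real engine. It only shows the job-queue shape: devices hand over audio, a fixed pool of workers on the server processes the jobs, and each device gets its text back asynchronously.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical central "SR server": devices submit audio, a worker pool transcribes it. */
public class SrJobServer {

    /** Placeholder for the real recognizer. This is NOT a Sphinx-4 interface. */
    public interface Transcriber {
        String transcribe(byte[] audio);
    }

    private final ExecutorService workers;
    private final Transcriber transcriber;

    public SrJobServer(int workerCount, Transcriber transcriber) {
        // A fixed pool caps how many jobs are recognized at the same time.
        this.workers = Executors.newFixedThreadPool(workerCount);
        this.transcriber = transcriber;
    }

    /** Queues one "SR job"; the result becomes available asynchronously. */
    public CompletableFuture<String> submit(String deviceId, byte[] audio) {
        byte[] copy = audio.clone(); // the caller may reuse its buffer afterwards
        return CompletableFuture.supplyAsync(
                () -> deviceId + ": " + transcriber.transcribe(copy), workers);
    }

    public void shutdown() {
        workers.shutdown();
    }

    public static void main(String[] args) {
        // Stub recognizer (assumption): reports the audio size instead of real text.
        SrJobServer server = new SrJobServer(2, audio -> "<" + audio.length + " bytes of audio>");
        CompletableFuture<String> a = server.submit("device-1", new byte[16000]);
        CompletableFuture<String> b = server.submit("device-2", new byte[32000]);
        System.out.println(a.join());
        System.out.println(b.join());
        server.shutdown();
    }
}
```

In a real deployment the devices would of course reach the server over the network, and the stub would wrap an actual recognizer; the sketch leaves both out on purpose.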

Can you suggest some other options (other than Sphinx) that could meet this requirement?

I am aware of only MSS (Microsoft Speech Server).

Would Sphinx be a good choice for such a scenario?
Would changes be required to Sphinx in its current form to do this, i.e. to make it enterprise grade?

Awaiting an early reply.

Thanks a ton,

Ashish
