From: spencer l. <spe...@gm...> - 2010-01-06 17:59:37
On Sat, Jan 2, 2010 at 6:04 PM, johny jj2 <joh...@gm...> wrote:
> SpencerLord, thank you for your previous answers once more :-).
>
> I got rid of those difficulties which I explained in my previous post.
> So now let me come to questions related only to Cairo/Zanzibar. If
> those are only Cairo/Zanzibar-connected (and those are) I see no other
> place at all to get the answer to these questions than this mailing
> list (especially first two questions but also the third one).
>
> 1. First and most important question. I see things which are told by
> application are written in the file parrot.vxml. So I guess it uses
> Text To Speech to generate those. I found only one wav file, i.e.
> parrot.wav but it doesn't include e.g. "I could not hear you" or "Your
> response was out of grammar". I also cannot see where this file
> parrot.wav is used in Parrot.java or parrot.vxml or any other file. In
> other words it looks like this file is recorded but not used at all.
> Am I right that those things written in parrot.vxml are spoken by Text
> To Speech?
> Where is the usage of parrot.wav specified? And the crucial
> thing for my project - I cannot use any Text To Speech because this
> TTS requires well-trained acoustic base for my language. (I worked
> only with ASR, not with TTS but I guess there must be really much
> training for TTS). Those acoustic models are freely available and of
> good quality for English language but for my language there are no
> acoustic models available for free. So finally coming to this most
> important question - where can I specify wav files which I'd like the
> application to speak? I think the most comfortable place would be
> Parrot.java (or MyApp.java, created similarly as Parrot.java). My
> application would ask the user to speak some digits, then it will
> calculate control sum based on those digits and say to the user "The
> control sum is correct" or, in the other case, "The control sum is
> incorrect. Do you want to accept incorrect sum?". So it looks like it
> cannot be specified in vxml file because it is not a standard thing
> for voice control (by standard thing I mean something like "out of
> grammar" or "I could not hear you"). And if it is not a standard
> thing, I think it can be specified only in Parrot.java.
>

Yes, the Parrot demo uses TTS and not pre-recorded audio. I should add
an option to use pre-recorded audio. Take a look at the Jukebox demo,
which plays pre-recorded audio. For example, in this call the dylan
variable is a URL to an audio clip:

    sClient.playAndRecognizeBlocking(true, dylan, playGrammar, true);

Also note that the SpeechClient has queuePrompt and playBlocking
methods, which take two parameters. If the urlPrompt flag is true, the
second parameter is a URL to an audio file; if false, it is text to be
synthesized.

As for VXML playing audio, I am not a VXML expert, but I think this
will play a pre-recorded audio file:

    <block>
      <audio src="wav_file_URL"/>
    </block>

This will do TTS:

    <block>
      <audio>Hello, World</audio>
    </block>

I think you are talking about getting the semantic meaning from what
was spoken. That is usually done by tags in the grammars. The version
of jvoicexml that I used in this release did not have support for
extracting the semantic meaning. They may have added that since, in
which case I should release a new version. But note that even as is you
can do some conditional logic:

    <if cond="main=='quit'">
      <exit/>
    <else/>
      <clear namelist="main"/>
      <reprompt/>
    </if>

>
> 2. Other, also important thing. I don't see in sphinx-config.xml any
> line which indicates where Sphinx4 is installed. So from this it looks
> like Sphinx4 doesn't have to be installed if the directory where it is
> installed is not specified in the configuration file. But on the other
> hand Cairo/Zanzibar is responsible only for connecting Asterisk with
> Sphinx4 so it looks like Sphinx4 has to be installed. What do I lack
> in my understanding of the issue? Where to specify the directory for
> Sphinx4? Do I need to have this Sphinx4 installed after all? (I guess
> I need).
>

Cairo uses an internal Sphinx config file, which is inside the cairo
jar. You can specify your own by setting the path in cairo-config. The
Sphinx jars are also included, so there is no need to install your own
version of Sphinx.

>
> 3. Do I have all the required files in my acoustic model jar archive?
> I've got two directories (etc and model_parameters). In etc there are
> files which were input for SphinxTrain (dictionary, filler, lm,
> transcriptions). In model_parameters there is only one directory
> pl1.ci_cont. First of all I don't have any ModelLoader.class
> (http://images45.fotosik.pl/242/a07bc3c15d943928.jpg) in my pl1.jar
> file because it wasn't created by SphinxTrain. More information (if
> needed) about my acoustic model given here
> (http://www.speedyshare.com/files/20015137/foto.rar) in files 48-53
> (WSJ model), 54-64 (my way of creating the model), 65-69 (final
> version of jar file with model).
>

I cannot help you much here. I am not yet up to speed with building
acoustic models. We are using the existing WSJ 8 kHz models that are
packaged in jar files. You can specify your own models by setting up
your own sphinx-config file, but make sure that the jar files are in
the path (just put them in the lib directory). Are your acoustic models
in jar files?

>
> 4. In Parrot.java I see [CODE]private String grammar; // =
> "file:.../demo/grammar/example-loop.gram";[/CODE]. First thing, why
> can't I see any place in Parrot.java which indicates that this grammar
> is used? (Similarly the [CODE]private String prompt[/CODE] is
> specified only in parrot.vxml). And second thing, why in some places
> it is example-loop.gram and in the other parrot.gram?
>

The grammar is actually used by Sphinx. Parrot.java does pass the
grammar URL in the recognize methods. Similar answer for parrot.vxml:
the grammar does eventually get to the Sphinx engine and is used for
recognition.

>
> 5. And minor question. May you explain, please, to me this kind of
> syntax
> value="resource:/org.speechforge.cairo.server.recog.sphinx.SphinxRecEngine!/grammar"
> ? I saw similar things in Sphinx4 files and I don't know why some part
> of it is before exclamation mark and some is after. You can have a
> look here
> http://forum.idg.pl/programowanie-f119-zmiana_linka_w_pliku_xml-t193808.html
> . The only what matters is the code which I included in the post.
> First is what I had, second what I changed it to. Similarly third is
> what I had, fourth what I changed it to.
>
> Thanks for answers in advance!
> Regards!
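
On question 1, here is a rough VoiceXML sketch of the "control sum" prompt that uses only recorded audio, building on the <audio src> and <if> elements mentioned in the reply. The variable sumOk and both wav file names are made up for illustration. How sumOk gets set from the recognized digits is left open, and this has not been checked against the jvoicexml version shipped with Zanzibar.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- sumOk is a hypothetical flag; a real dialog would compute it
       from the recognized digits instead of hard-coding it -->
  <var name="sumOk" expr="false"/>

  <form id="controlSum">
    <block>
      <if cond="sumOk">
        <!-- hypothetical recording of "The control sum is correct" -->
        <prompt><audio src="sum_correct.wav"/></prompt>
      <else/>
        <!-- hypothetical recording of "The control sum is incorrect.
             Do you want to accept incorrect sum?" -->
        <prompt><audio src="sum_incorrect.wav"/></prompt>
      </if>
    </block>
  </form>
</vxml>
```

Because the <audio> elements carry no fallback text, nothing here should need TTS as long as the wav files can be fetched.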