From: Ivan U. <ia...@uk...> - 2004-09-13 11:29:39
|
FWIW I just tidied up the formatting on the 'IvansNotes' page: http://xvoice.arborius.net/xvoice-sphinx/IvansNotes Ivan |
From: Ivan U. <ia...@uk...> - 2004-09-13 11:06:25
|
Hello John

> ...
> "make_dict"
> ...
>
> /********** extracted from the script on the wiki page **********
> #!/bin/sh -xv
> rm time.html
> rm model_architecture/time.[a-z]*
> bin/make_dict etc/time.transcription
> #
> # Will create etc/word.known and etc/word.unknown files; check them once
> # you are happy.
> #
> mv etc/word.known etc/time.dic
> #
> # Make the melcep feature files
> #
> bin/make_feats etc/time.fileids
> #
> # Now we can start on the basic perl scripts
>
> /********* end of extraction ***************

make_dict requires festival and the CMUDict pronunciation dictionary (PD).
Note that (AFAIK) word.known and word.unknown are not sorted or uniq'd. I
don't know if sorting matters (except to humans), but your final
pronunciation dictionary should have unique headwords. If you don't have
festival and/or CMUDict (i.e. you have a different PD), it's not difficult
to write a script to do what make_dict does.

> In the meantime, the language model produced by the CMU tools is working,
> but I would like to add some training for individual speakers. Working on
> a project for air traffic control/pilot communications (which is highly
> structured and contained), but there is a requirement for a high level of
> accuracy.

Do you have a speaker-independent acoustic model already? The SphinxTrain
manual and FAQ have notes on adapting existing models. The manual says you
can build speaker-specific AMs with 8-10 hours of data. This might be the
best way to go if you have, or can easily get, the data.

Good luck

Ivan |
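Ivan's point about unique headwords can be handled with a one-liner. A minimal sketch (the file names and toy contents are illustrative, not taken from the recipe): keep the first pronunciation seen for each headword, then sort the result so humans can scan it.

```shell
# Toy input standing in for SphinxTrain's word.known (illustrative data).
printf 'TWO T UW\nONE W AH N\nTWO T UW\n' > word.known
# Keep the first line seen per headword (column 1), then sort.
awk '!seen[$1]++' word.known | sort > time.dic
cat time.dic
# prints:
# ONE W AH N
# TWO T UW
```

If the same word legitimately has alternate pronunciations, Sphinx dictionaries mark them explicitly (e.g. a numbered variant of the headword), so you would want to merge rather than drop duplicates in that case.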
From: Arthur C. <ar...@ux...> - 2004-09-11 19:51:36
|
Hi John,

If you need a database, try out the CMU Communicator database. It is free
to use (free in the sense of free software, not free beer). It also
contains a pretty large amount of data. You can obtain it by asking Prof.
Alex Rudnicky (air at cs dot cmu dot edu) at CMU. Of course, you need to
pay shipping and packaging fees.

Arthur
Phone (Office) : 8-9504

-----Original Message-----
From: xvo...@li... [mailto:xvo...@li...] On Behalf Of Jessica Perry Hekman
Sent: Saturday, September 11, 2004 2:52 PM
To: John Wojnaroski
Cc: xvo...@li...
Subject: Re: [Xvoice-sphinx] Wiki SphinxTrain

On Sat, Sep 11, 2004 at 11:25:46AM -0700, John Wojnaroski wrote:
> It looks like the file in that directory is written in Scheme and is not
> a binary file.

True, but the first line of it tells the shell how to execute it:

#!/bin/sh
"true"; exec /usr/local/festival/bin/festival --script $0 $*

Is festival installed on your system? That could be the problem.

j |
From: Jessica P. H. <jph...@ar...> - 2004-09-11 18:51:44
|
On Sat, Sep 11, 2004 at 11:25:46AM -0700, John Wojnaroski wrote:
> It looks like the file in that directory is written in Scheme and is not
> a binary file.

True, but the first line of it tells the shell how to execute it:

#!/bin/sh
"true"; exec /usr/local/festival/bin/festival --script $0 $*

Is festival installed on your system? That could be the problem.

j |
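The two-line header Jessica quotes is a shell "trampoline": the file is a valid shell script whose second line immediately re-executes the same file under another interpreter (festival, in make_dict's case), which then treats the header as harmless data. A minimal sketch of the same general technique, substituting python3 for festival so it runs anywhere (the polyglot header below is an illustration of the trick, not a copy of make_dict):

```shell
# Build a file that is simultaneously a shell script and a python script.
cat > trampoline <<'EOF'
#!/bin/sh
''':'
exec python3 "$0" "$@"
'''
# From here on, this file is interpreted as Python, never as shell.
print("now running under python")
EOF
chmod +x trampoline
./trampoline
# prints: now running under python
```

Under sh, `''':'` collapses to the no-op `:` command and the next line execs the real interpreter; under Python, the same lines form one triple-quoted string. If the target interpreter is missing, the exec fails, which is exactly the symptom John is seeing with make_dict when festival is not installed.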
From: John W. <ca...@mm...> - 2004-09-11 18:26:01
|
> > The build script outlined in the "recipe" is unable to find the
> > "make_dict" program. Neither can I...
> >
> > Is there something missing? Like the CMU language model tools or
> > speech_tools or ???
>
> It looks to me like it comes with SphinxTrain -- I have it in the
> scripts_pl directory. Check your distribution?

It looks like the file in that directory is written in Scheme and is not a
binary file.

/********** extracted from the script on the wiki page **********
#!/bin/sh -xv
rm time.html
rm model_architecture/time.[a-z]*
bin/make_dict etc/time.transcription
#
# Will create etc/word.known and etc/word.unknown files; check them once
# you are happy.
#
mv etc/word.known etc/time.dic
#
# Make the melcep feature files
#
bin/make_feats etc/time.fileids
#
# Now we can start on the basic perl scripts
/********* end of extraction ***************

Note that the next call to "bin/make_feats" is a shell script that winds
up calling /bin/wave2feat. Both files are there in ../time/bin along with
a bunch of logical links. "make_feats" works but "make_dict" fails. I'm
not all that familiar with Scheme... wondering if it requires some sort of
interpreter or compiler to run.

In the meantime, the language model produced by the CMU tools is working,
but I would like to add some training for individual speakers. Working on
a project for air traffic control/pilot communications (which is highly
structured and contained), but there is a requirement for a high level of
accuracy.

Thanks for your help

Regards
John W. |
From: Jessica P. H. <jph...@ar...> - 2004-09-11 13:20:13
|
On Fri, Sep 10, 2004 at 10:22:53PM -0700, John Wojnaroski wrote:
> The build script outlined in the "recipe" is unable to find the
> "make_dict" program. Neither can I...
>
> Is there something missing? Like the CMU language model tools or
> speech_tools or ???

It looks to me like it comes with SphinxTrain -- I have it in the
scripts_pl directory. Check your distribution? I would be happy to mail
the script to you as a last resort :)

j |
From: John W. <ca...@mm...> - 2004-09-11 05:23:15
|
Hi, The build script outlined in the "recipe" is unable to find the "make_dict" program. Neither can I... Is there something missing? Like the CMU language model tools or speech_tools or ??? Thanks John W. |
From: Jessica P. H. <jph...@ar...> - 2004-08-03 19:28:47
|
On Tue, 3 Aug 2004, Paul Lamere wrote:
> Sure thing. One thing I'd like to do to get the ball rolling would be to
> look at how xvoice interfaces to a particular speech engine. Do you know
> if there's a speech engine api that xvoice uses to interface to a
> particular speech engine? It'd be interesting to hook xvoice to s4 to
> allow C&C of the desktop, while the trainer folks work their training
> magic over the models to improve the large vocabulary accuracy. Once
> they're done we can hook up the dictation. Any pointers on where to look
> for the engine API would be appreciated.

xvoice uses IBM ViaVoice SDK's API, SMAPI. We talked about abstracting
that out to make hooking other engines in easier, but never actually did
it.

By the way, Paul, you might also check out the OSSRI project, which is
somewhat more active than xvoice-sphinx: http://www.ossri.org/

OSSRI has been evaluating the various sphinx projects and would definitely
be interested to hear about the sphinx developers' interest in making
sphinx work with an open source front end. I forwarded Ken's original mail
to the ossri list.

j |
From: Paul L. <Paul.Lamere@Sun.COM> - 2004-08-03 19:25:59
|
Ken:

Sure thing. One thing I'd like to do to get the ball rolling would be to
look at how xvoice interfaces to a particular speech engine. Do you know
if there's a speech engine API that xvoice uses to interface to a
particular speech engine? It'd be interesting to hook xvoice to S4 to
allow C&C of the desktop, while the trainer folks work their training
magic over the models to improve the large-vocabulary accuracy. Once
they're done we can hook up the dictation. Any pointers on where to look
for the engine API would be appreciated.

Paul

Ken Olum wrote:
> Thanks, Paul.
>
> Of course we'd be delighted to have Sphinx-4 moving in the direction
> you mention, and I'm sure we'd like to help, although having time to
> do so is always problematic. Jessica tried for quite a while to do
> something with training in Sphinx-2, but didn't really get anywhere.
>
> There is certainly a community of potential users here who are waiting
> to get xvoice going as a dictation system again. I, for one, would
> just love to get rid of NaturallySpeaking if I had a system that was
> accurate enough and no slower on the latest hardware, say, than
> NaturallySpeaking was on my old 266 MHz machine.
>
> Probably the best thing is to post to this list if "maybe" starts
> moving closer to "yes", or if you have anything that people here might
> be able to help with.
>
>    Ken |
From: Ken O. <kd...@co...> - 2004-08-03 19:17:09
|
Thanks, Paul.

Of course we'd be delighted to have Sphinx-4 moving in the direction you
mention, and I'm sure we'd like to help, although having time to do so is
always problematic. Jessica tried for quite a while to do something with
training in Sphinx-2, but didn't really get anywhere.

There is certainly a community of potential users here who are waiting to
get xvoice going as a dictation system again. I, for one, would just love
to get rid of NaturallySpeaking if I had a system that was accurate enough
and no slower on the latest hardware, say, than NaturallySpeaking was on
my old 266 MHz machine.

Probably the best thing is to post to this list if "maybe" starts moving
closer to "yes", or if you have anything that people here might be able to
help with.

   Ken |
From: Paul L. <Paul.Lamere@Sun.COM> - 2004-08-03 18:19:15
|
Ken:

I posted this to the S4 forum, but just in case you've stopped reading
there, I figured I'd send this here as well.

Your question has prompted a bit of a dialog among some of the sphinx
developers. First, the xvoice project is an important project for a
variety of reasons, and the fact that xvoice is taking a look at Sphinx-4
is a good thing. We should look seriously at your requirements and push to
make S4 work for you.

Second, there is no technical reason why S4 can't do what you need to do.
Some comparisons have been made between S4 and some of the commercial
recognition engines, and with equivalent training data S4 compares
extremely well in terms of speed and accuracy. The main barrier for us
right now to achieving the kind of accuracy that you are looking for is
the lack of speaker-dependent acoustic and language models. We currently
train for speaker-independent models. We may start taking a look at what
it will take to adapt these models for a particular speaker. With these
adapted models we may indeed be able to meet your accuracy requirements.

What this boils down to is that when we say 'no' what we really mean is
'maybe'. So ... stay tuned.

Paul |
From: Jessica P. H. <jph...@ar...> - 2004-08-03 15:25:54
|
On Tue, 3 Aug 2004, Ken Olum wrote: > Since the Sphinx4 developers have been very active and helpful on the > Sphinx4 forum, I decided to ask if they had any plans to go in a > direction that would help us. Unfortunately the answer is no. Bummer. But thank you for thinking to ask and letting the list know. I forwarded this mail to ossri so that that project would know, since they are doing a lot of the work that this project used to do. j |
From: Ken O. <kd...@co...> - 2004-08-03 12:14:07
|
Since the Sphinx4 developers have been very active and helpful on the
Sphinx4 forum, I decided to ask if they had any plans to go in a direction
that would help us. Unfortunately the answer is no. I am going to stop
reading the Sphinx forums now.

Ken

From: "SourceForge.net" <no...@so...>
Subject: [cmusphinx - Sphinx4 Open Discussion] RE: Future of Sphinx
Date: Tue, 03 Aug 2004 03:35:46 -0700

Read and respond to this message at:
https://sourceforge.net/forum/message.php?msg_id=2695060

By: lamere

Ken: This is the right place for this question. Sphinx-4 is a
speaker-independent recognition system. As such it will have a much higher
WER than would be acceptable for use as a dictation system. We do have
some provisions for speaker adaptation in the acoustic models that could
be used to improve WER; however, we have no plans at this time to go down
that path.

Paul |
From: David G. <dgr...@ne...> - 2004-07-09 04:50:08
|
Yeah. Still here. Swamped with many other projects right now though. With
any luck I can work with this project sometime this year.

On Sat, 2004-06-26 at 10:37, Filip djMedrzec Zyzniewski wrote:
> Hello.
>
> I've just discovered CMU Sphinx and I am starting to like it.
>
> I've seen on xvoice-sphinx homepage that some of you played
> with it to do various things like training.
>
> I'd like to use sphinx2 to build a simple desktop environment
> control system.
>
> For now I just parse output of sphinx2-continuous with an
> awk script. This script confirms commands using festival
> and launches what i specify to launch.
>
> What i'd like to accomplish is a system, that:
> - recognizes about 100-200 words (or phrases, like "workspace 1".
>   this way it would be even simplier)
> - does it with good accuracy
> - doesn't eat up too much CPU.
>
> My sphinx2 setup currently recognizes 22 phrases.
> I created a simple dictionary with
> http://www.speech.cs.cmu.edu/tools/lmtool.html
> (problem: how to make sphinx2 recognize ONLY phrases, not
> single words in them).
>
> For now, accuracy level is just too low. For that small
> set of phrases i thing it can be done better. I'd like
> to train sphinx. Maybe I could use polish language instead of english?
>
> I've tried http://xvoice.arborius.net/xvoice-sphinx/RunningSphinxTrain,
> but i don't know what to do with the result. It does not resemple
> hmm directory structure...
>
> Sphinx documentation is just... scary...
>
> BTW, I've tried cvoicecontrol, but it plainly sucks. Accuracy level
> is low, and CPU usage is 100% all the time.
>
> hope some of you are still here,
>
> Filip Zyzniewski

--
David Graham <dgr...@ne...> |
From: Ken O. <kd...@co...> - 2004-07-01 20:01:41
|
A message that went by on the Sphinx 4 forum a while ago gave the location
of a regularly updated page of Sphinx 3 and Sphinx 4 large-vocabulary
accuracy results: http://cmusphinx.sourceforge.net/LargeVocabResults.html

It might be worth looking here for improvement, but at the moment the
conclusion is that neither of these programs is of any use for dictation.
The best accuracy on any run was a 12% word error rate. I think these
programs do not permit speaker adaptation, which perhaps explains the poor
results.

Ken |
From: Jessica P. H. <jph...@ar...> - 2004-06-28 21:14:42
|
On Mon, 28 Jun 2004, Filip djMedrzec Zyzniewski wrote:
> time/model_parameters/time.s2models
> time/model_parameters/time.s2models/cep.256.vec
> time/model_parameters/time.s2models/cep.256.var
> time/model_parameters/time.s2models/d2cep.256.vec
> time/model_parameters/time.s2models/d2cep.256.var
> time/model_parameters/time.s2models/p3cep.256.vec
> time/model_parameters/time.s2models/p3cep.256.var
> time/model_parameters/time.s2models/xcep.256.vec
> time/model_parameters/time.s2models/xcep.256.var
> time/model_parameters/time.s2models/map
> time/model_parameters/time.s2models/phone

I would expect to see the HMMs in here somewhere. Perhaps the process died
prematurely -- check your logs.

> And how should i do this training for few dozens of sentences?

If you just want it to recognize you, I'd read the sentences a few times
and use that data; if you want it to recognize more people, they each need
to train it themselves. The more data, the better the recognition.

> It would be nice to force sphinx to recognize ONLY these sentences.
> So if I have sentences:
> foo bar
> acme xxx
>
> i would like sphinx to NOT recognize foo xxx :).

You need to build a language model that tells it this.

j |
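One workaround for the "whole phrases only" problem (a sketch of a common trick, not something from this thread's scripts) is to collapse each allowed sentence into a single compound "word" before building the dictionary and language model; the decoder then cannot mix words from different phrases because each phrase is one token.

```shell
# Toy phrase list (illustrative); each line is one allowed command.
printf 'foo bar\nacme xxx\n' > phrases.txt
# Turn each phrase into one compound token, e.g. FOO_BAR. You would then
# give that token the concatenated pronunciation in the dictionary and
# list only compound tokens in the LM training corpus.
awk '{ t = toupper($0); gsub(/ /, "_", t); print t }' phrases.txt
# prints:
# FOO_BAR
# ACME_XXX
```

The cost is that partial matches are impossible by construction, which is exactly what Filip wants for a command-and-control vocabulary.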
From: Filip d. Z. <xvo...@fi...> - 2004-06-28 19:52:58
|
On Mon, Jun 28, 2004 at 09:43:53AM -0400, Jessica Perry Hekman wrote: > On Sat, 26 Jun 2004, Filip djMedrzec Zyzniewski wrote: > > > I've tried http://xvoice.arborius.net/xvoice-sphinx/RunningSphinxTrain, > > but i don't know what to do with the result. It does not resemple > > hmm directory structure... > > What result do you get? Can you send us the resulting directory structure? here it is: time time/bin time/bin/agg_seg time/bin/bldtree time/bin/bw time/bin/cp_parm time/bin/delint time/bin/dict2tri time/bin/inc_comp time/bin/init_gau time/bin/init_mixw time/bin/kmeans_init time/bin/make_quests time/bin/mixw_interp time/bin/mk_flat time/bin/mk_mdef_gen time/bin/mk_mllr_class time/bin/mk_model_def time/bin/mk_s2cb time/bin/mk_s2hmm time/bin/mk_s2phone time/bin/mk_s2phonemap time/bin/mk_s2sendump time/bin/mk_s3gau time/bin/mk_s3mixw time/bin/mk_s3tmat time/bin/mk_ts2cb time/bin/norm time/bin/param_cnt time/bin/printp time/bin/prunetree time/bin/QUICK_COUNT time/bin/scripts_pl time/bin/tiestate time/bin/wave2feat time/bin/maketopology time/bin/make_feats time/bin/make_dict time/etc time/etc/sphinx_train.cfg time/etc/time.transcript time/etc/time.fileids time/etc/time.filler time/etc/time.phone time/feat time/feat/time0001.feat time/wav time/wav/time0001.wav time/logdir time/logdir/01.vector_quantize time/logdir/01.vector_quantize/time.vq.agg_seg.log time/logdir/02.ci_schmm time/logdir/02.ci_schmm/time.make_ci_mdef_fromphonelist.log time/logdir/02.ci_schmm/time.makeflat_cischmm.log time/logdir/02.ci_schmm/time.1-1.bw.log time/logdir/02.ci_schmm/time.1.norm.log time/logdir/03.makeuntiedmdef time/logdir/03.makeuntiedmdef/time.make_alltriphonelist.log time/logdir/04.cd_schmm_untied time/logdir/04.cd_schmm_untied/time.copycitocd.log time/logdir/04.cd_schmm_untied/time.1-1.bw.log time/logdir/04.cd_schmm_untied/time.1.norm.log time/logdir/05.buildtrees time/logdir/05.buildtrees/time.make_questions.log time/logdir/06.prunetree 
time/logdir/06.prunetree/time.build.alltriphones.mdef.log time/logdir/06.prunetree/time.prunetree.6000.log time/logdir/06.prunetree/time.tiestate.6000.log time/logdir/07.cd-schmm time/logdir/07.cd-schmm/time.copy.ci.2.cd.log time/logdir/07.cd-schmm/time.1-1.bw.log time/logdir/07.cd-schmm/time.1-2.bw.log time/logdir/07.cd-schmm/time.1.norm.log time/logdir/08.deleted_interpolation time/logdir/08.deleted_interpolation/time.deletedintrep-6000.log time/logdir/09.make_s2_models time/logdir/09.make_s2_models/time.mk_s2cb.log time/logdir/09.make_s2_models/time.mk_s2chmm.log time/logdir/09.make_s2_models/time.mk_s2sendump.log time/logdir/09.make_s2_models/time.mk_s2phonemap.log time/bwaccumdir time/bwaccumdir/time_buff_1 time/bwaccumdir/time_buff_2 time/model_parameters time/model_parameters/hub4 time/model_parameters/hub4/variances time/model_parameters/hub4/transition_matrices time/model_parameters/hub4/newfe.6000.mdef time/model_parameters/hub4/mixture_weights time/model_parameters/hub4/means time/model_parameters/hub4/time.6000.mdef time/model_parameters/time.ci_semi_flatinitial time/model_parameters/time.ci_semi_flatinitial/means time/model_parameters/time.ci_semi_flatinitial/variances time/model_parameters/time.ci_semi_flatinitial/transition_matrices time/model_parameters/time.ci_semi_flatinitial/mixture_weights time/model_parameters/time.cd_semi_untied time/model_parameters/time.cd_semi_initial time/model_parameters/time.cd_semi_6000_delinterp time/model_parameters/time.cd_semi_6000_interp time/model_parameters/time.s2models time/model_parameters/time.s2models/cep.256.vec time/model_parameters/time.s2models/cep.256.var time/model_parameters/time.s2models/d2cep.256.vec time/model_parameters/time.s2models/d2cep.256.var time/model_parameters/time.s2models/p3cep.256.vec time/model_parameters/time.s2models/p3cep.256.var time/model_parameters/time.s2models/xcep.256.vec time/model_parameters/time.s2models/xcep.256.var time/model_parameters/time.s2models/map 
time/model_parameters/time.s2models/phone time/model_architecture time/model_architecture/time.phonelist time/model_architecture/time.topology time/model_architecture/time.ci.mdef time/model_architecture/time.tree_questions time/gifs time/gifs/green-ball.gif time/gifs/red-ball.gif time/scripts_pl time/scripts_pl/00.verify time/scripts_pl/00.verify/verify_all.pl time/scripts_pl/01.vector_quantize time/scripts_pl/01.vector_quantize/agg_seg.pl time/scripts_pl/01.vector_quantize/kmeans.pl time/scripts_pl/01.vector_quantize/slave.VQ.pl time/scripts_pl/02.ci_schmm time/scripts_pl/02.ci_schmm/baum_welch.pl time/scripts_pl/02.ci_schmm/norm.pl time/scripts_pl/02.ci_schmm/norm_and_launchbw.pl time/scripts_pl/02.ci_schmm/slave_convg.pl time/scripts_pl/03.makeuntiedmdef time/scripts_pl/03.makeuntiedmdef/make_untied_mdef.pl time/scripts_pl/04.cd_schmm_untied time/scripts_pl/04.cd_schmm_untied/baum_welch.pl time/scripts_pl/04.cd_schmm_untied/makeuntiedmixw.pl time/scripts_pl/04.cd_schmm_untied/norm_and_launchbw.pl time/scripts_pl/04.cd_schmm_untied/norm.pl time/scripts_pl/04.cd_schmm_untied/slave_convg.pl time/scripts_pl/05.buildtrees time/scripts_pl/05.buildtrees/make_questions.pl time/scripts_pl/05.buildtrees/slave.treebuilder.pl time/scripts_pl/05.buildtrees/buildtree.pl time/scripts_pl/06.prunetree time/scripts_pl/06.prunetree/prunetree.pl time/scripts_pl/06.prunetree/slave.state-tie-er.pl time/scripts_pl/06.prunetree/tiestate.pl time/scripts_pl/07.cd-schmm time/scripts_pl/07.cd-schmm/baum_welch.pl time/scripts_pl/07.cd-schmm/norm.pl time/scripts_pl/07.cd-schmm/norm_and_launchbw.pl time/scripts_pl/07.cd-schmm/slave_convg.pl time/scripts_pl/08.deleted-interpolation time/scripts_pl/08.deleted-interpolation/deleted_interpolation.pl time/scripts_pl/09.make_s2_models time/scripts_pl/09.make_s2_models/make_s2_models.pl time/scripts_pl/mc time/scripts_pl/mc/mc_status.pl time/scripts_pl/mc/mc_run.pl time/scripts_pl/mc/mc_kill.pl time/scripts_pl/mc/mc_check.pl time/builds2model 
time/time.html
time/.02.bw.1.1.state.gif
time/.02.bw.1.2.state.gif

I just copied the instructions from the wiki; my system is Gentoo. I
installed sphinx (from CVS) and festival. And also downloaded/unpacked
this hub4 thing.

> I'm still here, but don't have a lot of time to spend on this stuff
> right now. So no promises, but I'll try to help.

:). I'd just like to know how to apply this training data. And how should
I do this training for a few dozen sentences?

It would be nice to force sphinx to recognize ONLY these sentences. So if
I have the sentences:

foo bar
acme xxx

I would like sphinx to NOT recognize "foo xxx" :).

bye,
Filip Zyzniewski

PS. I am subscribed, so you don't have to cc me :) |
From: Jessica P. H. <jph...@ar...> - 2004-06-28 13:42:16
|
On Sat, 26 Jun 2004, Filip djMedrzec Zyzniewski wrote:
> I've tried http://xvoice.arborius.net/xvoice-sphinx/RunningSphinxTrain,
> but i don't know what to do with the result. It does not resemble the
> hmm directory structure...

What result do you get? Can you send us the resulting directory structure?

> hope some of you are still here,

I'm still here, but don't have a lot of time to spend on this stuff right
now. So no promises, but I'll try to help.

j |
From: Filip d. Z. <xvo...@fi...> - 2004-06-26 15:37:27
|
Hello.

I've just discovered CMU Sphinx and I am starting to like it. I've seen on
the xvoice-sphinx homepage that some of you played with it to do various
things like training.

I'd like to use sphinx2 to build a simple desktop environment control
system. For now I just parse the output of sphinx2-continuous with an awk
script. This script confirms commands using festival and launches whatever
I specify.

What I'd like to accomplish is a system that:
- recognizes about 100-200 words (or phrases, like "workspace 1" -- this
  way it would be even simpler)
- does it with good accuracy
- doesn't eat up too much CPU.

My sphinx2 setup currently recognizes 22 phrases. I created a simple
dictionary with http://www.speech.cs.cmu.edu/tools/lmtool.html (problem:
how to make sphinx2 recognize ONLY phrases, not single words in them).

For now, the accuracy level is just too low. For that small set of phrases
I think it can be done better. I'd like to train sphinx. Maybe I could use
the Polish language instead of English?

I've tried http://xvoice.arborius.net/xvoice-sphinx/RunningSphinxTrain,
but I don't know what to do with the result. It does not resemble the hmm
directory structure...

Sphinx documentation is just... scary...

BTW, I've tried cvoicecontrol, but it plainly sucks. The accuracy level is
low, and CPU usage is 100% all the time.

hope some of you are still here,

Filip Zyzniewski |
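The awk-wrapper approach Filip describes can be sketched roughly as below. The phrase-to-command mapping is hypothetical, and the real sphinx2-continuous output would need extra filtering of its status lines; here a stand-in function supplies one recognized phrase per line, and the dispatcher prints the command it would run instead of executing it.

```shell
# Stand-in for: sphinx2-continuous ... | this script
recognize() {
  printf 'WORKSPACE ONE\nOPEN BROWSER\n'
}

# Map each recognized phrase to a desktop command (printed, not executed).
recognize | awk '
  /^WORKSPACE ONE$/ { print "wmctrl -s 0"; next }
  /^OPEN BROWSER$/  { print "x-www-browser &" }
'
# prints:
# wmctrl -s 0
# x-www-browser &
```

Anchoring the patterns with ^ and $ is what keeps "WORKSPACE" alone from triggering anything, which mirrors Filip's wish to match only whole phrases.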
From: <bla...@gm...> - 2003-11-28 10:42:42
|
Hi @all!!

I am a new user of the sphinx2 engine. I just need to know which file
format allphone-test is able to read, and what exactly is meant by pgm?
Hope that someone knows. Thanx a lot!! |
From: Benner S. M N. <BennerSM@Npt.NUWC.Navy.Mil> - 2003-11-03 16:12:18
|
In an effort to defend my scripts, let me make a few notes. First of all,
for some reason my browser is unable to access the xvoice-sphinx site
anymore, so I don't know exactly what the content is.

Usage for my buildlm.sh script is as follows. The script and all resource
files should be located in the CMU_Cam_Toolkit_v2/bin directory, or
wherever your LM Builder binaries are located. The argument list at the
top of buildlm.sh names your files, to be tailored to your individual
setup. Input.txt is the corpus; make sure it does not have any punctuation
such as , ' . " etc.

The purpose of make_dic.java is two-fold. First, it builds the dictionary
by looking for any new words not in $DICT, and then using cmudict06d.dic
as a reference to add the new words. Second, it conditions the corpus for
input into the LM toolkit. The LM Builder expects the corpus in a certain
syntax:

<s> (caps)CORPUS-LINE </s>

make_dic.java outputs the conditioned line into $SENT. Note: I have it set
so that on each iteration you run through this script, make_dic appends
the input to $SENT; it does not overwrite. This is so you can feed various
transcripts through and grow the LM & dictionary. However, during initial
debugging it might be a good idea to throw in a "rm $SENT" in the script
so it doesn't get bogged down.

I built and tested my script using a small vocabulary, about 200-300
words, and it took about 3 minutes to build a dictionary from scratch. If
you already have a large Sphinx dictionary built and just need the LM
Builder portion, you can comment out the java make_dic line. Just make
sure you have your corpus conditioned as described above; you can use
'sed' to do the same thing from the shell, though I forget exactly how.

I would also like to make a recommendation for the decoder settings. Since
I used a relatively small vocabulary, I was able to change the settings
from default to "wide beam" mode without sacrificing much speed. The
documentation mentions it somewhere; the settings to change from default
are:

-top 4 -topsenfrm 4 -topsenthresh -80000 -beam 1e-06 -npbeam 1e-06
-lpbeam 1e-05 -lponlybeam 0.0003 -nwbeam 0.0003 -fwdflat TRUE

My accuracy rate improved significantly with these settings; decoding time
increased by about 2-3 seconds. I hope it helps; you can get in touch with
me if you have any more questions about my scripts.

Steve |
From: Jessica P. H. <jph...@ar...> - 2003-11-03 15:07:07
|
On Sat, 1 Nov 2003, S Samba Siva Rao wrote:
> What is the input for run-cmu.sh?

run-cmu.sh corpus.txt

where corpus.txt is whatever file you want to use as your corpus; the one
I used is available for download.

> where is cmu-unstressed dictionary?

Not sure where I got that from, so I made it available at
http://xvoice.sourceforge.net/xvoice-sphinx/cmudict-unstressed.gz

> Can you clearly explain what is the difference between dictionary and
> language model.

A dictionary is a list of words and how each one is pronounced. A language
model is a file detailing which words are likely to occur and when (for
example, it is unlikely for a person to say "a the" right next to each
other).

> Can you bit clear about this script please?

I'm not sure what you need to know! I don't understand what each of the
commands does, myself. But the series of commands builds a dictionary and
a language model, basically.

j |
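To make Jessica's distinction concrete, here is a toy pair of files (the contents are invented for illustration): the dictionary maps each word to its phones, while the ARPA-format language model assigns log10 probabilities to word sequences, with `<s>` and `</s>` marking sentence boundaries.

```shell
# A two-word pronunciation dictionary.
cat > toy.dic <<'EOF'
HELLO HH AH L OW
WORLD W ER L D
EOF
# A tiny ARPA-format bigram language model (probabilities are log10).
cat > toy.arpa <<'EOF'
\data\
ngram 1=4
ngram 2=3

\1-grams:
-0.6021 <s>
-0.6021 </s>
-0.6021 HELLO
-0.6021 WORLD

\2-grams:
-0.3010 <s> HELLO
-0.3010 HELLO WORLD
-0.3010 WORLD </s>

\end\
EOF
```

The decoder needs both: the dictionary tells it what each word sounds like, and the LM tells it which word sequences to prefer, so a word missing from either file can never be recognized.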
From: S S. S. R. <ss...@cs...> - 2003-11-01 17:21:17
|
> In jy-scripts.tgz, you should see a run-cmu.sh file. Take a look at this
> file. It is a shell script which has the command that you need to run to
> build an LM. You will need to edit it to change the paths. I used this
> script to build an LM, but the LM that I built did not work.

Dear madam!

What is the input for run-cmu.sh? Where is the cmu-unstressed dictionary?
Can you clearly explain the difference between a dictionary and a language
model? Can you be a bit clearer about this script, please?

samba

-----
"When you do the common things in life in an uncommon way, you will
command the attention of the world."
- George Washington Carver (1864-1943) |
From: Jessica P. H. <jph...@ar...> - 2003-10-31 16:12:03
|
On Thu, 30 Oct 2003, S Samba Siva Rao wrote:
> Ya! Really I got stuck at this point. I read that Jessica madam made one
> LM. What about it? Is it working fine? I am unable to figure out what
> exactly I should do. Any help please.

The LM I built does not break -- it does work. But it does not recognize
very well. The project needs an LM which recognizes much better. That's
what I was hoping you might be able to do.

In jy-scripts.tgz, you should see a run-cmu.sh file. Take a look at this
file. It is a shell script which has the command that you need to run to
build an LM. You will need to edit it to change the paths. I used this
script to build an LM, but the LM that I built did not work.

In sb-scripts.tgz, you will see buildlm.sh. This is another shell script
with another set of commands. When I ran this, the Java command ran for
days before I killed it. When I re-ran the script with the Java command
commented out, it successfully built an LM -- but not a good one. (The
Java command was building the dictionary, and since it never completed, it
wasn't a very good dictionary.)

Your first step should probably be to make sure you can get back to where
I was: to run both scripts. Can you do that?

j |
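For reference, the CMU-Cambridge toolkit steps that a script like run-cmu.sh wraps are roughly as follows. This is a sketch built from the toolkit's documented tool names; exact flags vary by version, and the corpus.* file names are placeholders, not the thread's actual files.

```shell
# Count word frequencies, derive a vocabulary, then build an ARPA LM.
text2wfreq < corpus.txt | wfreq2vocab > corpus.vocab
text2idngram -vocab corpus.vocab < corpus.txt > corpus.idngram
idngram2lm -idngram corpus.idngram -vocab corpus.vocab -arpa corpus.arpa
```

Each stage reads the previous stage's output, so a bad or empty vocabulary (as with the never-finishing dictionary step Jessica describes) propagates into a poor final LM.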
From: S S. S. R. <ss...@cs...> - 2003-10-30 13:47:32
|
> Your first step would be to download and install the LM toolkit:
> http://mi.eng.cam.ac.uk/~prc14/toolkit.html

I installed the toolkit.

> Then download the scripts and files available at
> http://xvoice.sourceforge.net/xvoice-sphinx/status.php
>
> and try to build an LM with them. (I assume that at that point you'll
> have some questions; be sure to ask them on the list and not directly
> to me!)

I downloaded the scripts and files, except the long one of Jessica
madam's. Yes, really I got stuck at this point. I read that Jessica madam
made one LM. What about it? Is it working fine? I am unable to figure out
what exactly I should do. Any help, please.

samba

-----
"When you do the common things in life in an uncommon way, you will
command the attention of the world."
- George Washington Carver (1864-1943) |