From: Ivan U. <ia...@uk...> - 2004-09-13 11:29:39
|
FWIW I just tidied up the formatting on the 'IvansNotes' page: http://xvoice.arborius.net/xvoice-sphinx/IvansNotes Ivan |
From: Ivan U. <ia...@uk...> - 2004-09-13 11:06:25
|
Hello John

> ...
> "make_dict"
> ...
>
> /********** extracted from the script on the wiki page **********
> #!/bin/sh -xv
> rm time.html
> rm model_architecture/time.[a-z]*
> bin/make_dict etc/time.transcription
> #
> # Will create etc/word.known and etc/word.unknown files; check them once
> # you are happy.
> #
> mv etc/word.known etc/time.dic
> #
> # Make the melcep feature files
> #
> bin/make_feats etc/time.fileids
> #
> # Now we can start on the basic perl scripts
>
> /********* end of extraction ***************

make_dict requires festival and the CMUDict pronunciation dictionary (PD).
Note that (AFAIK) word.known and word.unknown are not sorted or uniq'd. I
don't know if sorting matters (except to humans), but your final
pronunciation dictionary should have unique headwords. If you don't have
festival and/or CMUDict (i.e. you have a different PD), it's not difficult
to write a script to do what make_dict does.

> In the meantime, the language model produced by the CMU tools is working,
> but I would like to add some training for individual speakers. Working on
> a project for air traffic control/pilot communications (which is highly
> structured and contained), but there is a requirement for a high level of
> accuracy.

Do you have a speaker-independent acoustic model already? The SphinxTrain
manual and FAQ have notes on adapting existing models. The manual says you
can build speaker-specific AMs with 8-10 hours of data. This might be the
best way to go if you have, or can easily get, the data.

Good luck

Ivan |
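Ivan's point about unique headwords can be handled with a one-liner. A minimal sketch (the file names and toy contents are illustrative, not taken from the recipe): keep the first pronunciation seen for each headword, then sort the result so humans can scan it.

```shell
# Toy input standing in for SphinxTrain's word.known (illustrative data).
printf 'TWO T UW\nONE W AH N\nTWO T UW\n' > word.known
# Keep the first line seen per headword (column 1), then sort.
awk '!seen[$1]++' word.known | sort > time.dic
cat time.dic
# prints:
# ONE W AH N
# TWO T UW
```

If the same word legitimately has alternate pronunciations, Sphinx dictionaries mark them explicitly (e.g. a numbered variant of the headword), so you would want to merge rather than drop duplicates in that case.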
From: Arthur C. <ar...@ux...> - 2004-09-11 19:51:36
|
Hi John,

If you need a database, try out the CMU Communicator database. It is free
to use (free in the sense of free software, not free beer). It also
contains a pretty large amount of data. You can obtain it by asking Prof.
Alex Rudnicky (air at cs dot cmu dot edu) at CMU. Of course, you need to
pay shipping and packaging fees.

Arthur
Phone (Office) : 8-9504

-----Original Message-----
From: xvo...@li... [mailto:xvo...@li...] On Behalf Of Jessica Perry Hekman
Sent: Saturday, September 11, 2004 2:52 PM
To: John Wojnaroski
Cc: xvo...@li...
Subject: Re: [Xvoice-sphinx] Wiki SphinxTrain

On Sat, Sep 11, 2004 at 11:25:46AM -0700, John Wojnaroski wrote:
> It looks like the file in that directory is written in Scheme and is not
> a binary file.

True, but the first line of it tells the shell how to execute it:

#!/bin/sh
"true"; exec /usr/local/festival/bin/festival --script $0 $*

Is festival installed on your system? That could be the problem.

j |
From: Jessica P. H. <jph...@ar...> - 2004-09-11 18:51:44
|
On Sat, Sep 11, 2004 at 11:25:46AM -0700, John Wojnaroski wrote:
> It looks like the file in that directory is written in Scheme and is not
> a binary file.

True, but the first line of it tells the shell how to execute it:

#!/bin/sh
"true"; exec /usr/local/festival/bin/festival --script $0 $*

Is festival installed on your system? That could be the problem.

j |
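The two-line header Jessica quotes is a shell "trampoline": the file is a valid shell script whose second line immediately re-executes the same file under another interpreter (festival, in make_dict's case), which then treats the header as harmless data. A minimal sketch of the same general technique, substituting python3 for festival so it runs anywhere (the polyglot header below is an illustration of the trick, not a copy of make_dict):

```shell
# Build a file that is simultaneously a shell script and a python script.
cat > trampoline <<'EOF'
#!/bin/sh
''':'
exec python3 "$0" "$@"
'''
# From here on, this file is interpreted as Python, never as shell.
print("now running under python")
EOF
chmod +x trampoline
./trampoline
# prints: now running under python
```

Under sh, `''':'` collapses to the no-op `:` command and the next line execs the real interpreter; under Python, the same lines form one triple-quoted string. If the target interpreter is missing, the exec fails, which is exactly the symptom John is seeing with make_dict when festival is not installed.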
From: John W. <ca...@mm...> - 2004-09-11 18:26:01
|
> > The build script outlined in the "recipe" is unable to find the
> > "make_dict" program. Neither can I...
> >
> > Is there something missing? Like the CMU language model tools or
> > speech_tools or ???
>
> It looks to me like it comes with SphinxTrain -- I have it in the
> scripts_pl directory. Check your distribution?

It looks like the file in that directory is written in Scheme and is not a
binary file.

/********** extracted from the script on the wiki page **********
#!/bin/sh -xv
rm time.html
rm model_architecture/time.[a-z]*
bin/make_dict etc/time.transcription
#
# Will create etc/word.known and etc/word.unknown files; check them once
# you are happy.
#
mv etc/word.known etc/time.dic
#
# Make the melcep feature files
#
bin/make_feats etc/time.fileids
#
# Now we can start on the basic perl scripts
/********* end of extraction ***************

Note that the next call to "bin/make_feats" is a shell script that winds
up calling /bin/wave2feat. Both files are there in ../time/bin along with
a bunch of logical links. "make_feats" works but "make_dict" fails. I'm
not all that familiar with Scheme... wondering if it requires some sort of
interpreter or compiler to run.

In the meantime, the language model produced by the CMU tools is working,
but I would like to add some training for individual speakers. Working on
a project for air traffic control/pilot communications (which is highly
structured and contained), but there is a requirement for a high level of
accuracy.

Thanks for your help

Regards
John W. |
From: Jessica P. H. <jph...@ar...> - 2004-09-11 13:20:13
|
On Fri, Sep 10, 2004 at 10:22:53PM -0700, John Wojnaroski wrote:
> The build script outlined in the "recipe" is unable to find the
> "make_dict" program. Neither can I...
>
> Is there something missing? Like the CMU language model tools or
> speech_tools or ???

It looks to me like it comes with SphinxTrain -- I have it in the
scripts_pl directory. Check your distribution? I would be happy to mail
the script to you as a last resort :)

j |
From: John W. <ca...@mm...> - 2004-09-11 05:23:15
|
Hi, The build script outlined in the "recipe" is unable to find the "make_dict" program. Neither can I... Is there something missing? Like the CMU language model tools or speech_tools or ??? Thanks John W. |
From: Jessica P. H. <jph...@ar...> - 2004-08-03 19:28:47
|
On Tue, 3 Aug 2004, Paul Lamere wrote:
> Sure thing. One thing I'd like to do to get the ball rolling would be to
> look at how xvoice interfaces to a particular speech engine. Do you know
> if there's a speech engine api that xvoice uses to interface to a
> particular speech engine? It'd be interesting to hook xvoice to s4 to
> allow C&C of the desktop, while the trainer folks work their training
> magic over the models to improve the large vocabulary accuracy. Once
> they're done we can hook up the dictation. Any pointers on where to look
> for the engine API would be appreciated.

xvoice uses IBM ViaVoice SDK's API, SMAPI. We talked about abstracting
that out to make hooking other engines in easier, but never actually did
it.

By the way, Paul, you might also check out the OSSRI project, which is
somewhat more active than xvoice-sphinx: http://www.ossri.org/

OSSRI has been evaluating the various sphinx projects and would definitely
be interested to hear about the sphinx developers' interest in making
sphinx work with an open source front end. I forwarded Ken's original mail
to the ossri list.

j |
From: Paul L. <Paul.Lamere@Sun.COM> - 2004-08-03 19:25:59
|
Ken:

Sure thing. One thing I'd like to do to get the ball rolling would be to
look at how xvoice interfaces to a particular speech engine. Do you know
if there's a speech engine API that xvoice uses to interface to a
particular speech engine? It'd be interesting to hook xvoice to S4 to
allow C&C of the desktop, while the trainer folks work their training
magic over the models to improve the large-vocabulary accuracy. Once
they're done we can hook up the dictation. Any pointers on where to look
for the engine API would be appreciated.

Paul

Ken Olum wrote:
> Thanks, Paul.
>
> Of course we'd be delighted to have Sphinx-4 moving in the direction
> you mention, and I'm sure we'd like to help, although having time to
> do so is always problematic. Jessica tried for quite a while to do
> something with training in Sphinx-2, but didn't really get anywhere.
>
> There is certainly a community of potential users here who are waiting
> to get xvoice going as a dictation system again. I, for one, would
> just love to get rid of NaturallySpeaking if I had a system that was
> accurate enough and no slower on the latest hardware, say, than
> NaturallySpeaking was on my old 266 MHz machine.
>
> Probably the best thing is to post to this list if "maybe" starts
> moving closer to "yes", or if you have anything that people here might
> be able to help with.
>
>    Ken |
From: Ken O. <kd...@co...> - 2004-08-03 19:17:09
|
Thanks, Paul.

Of course we'd be delighted to have Sphinx-4 moving in the direction you
mention, and I'm sure we'd like to help, although having time to do so is
always problematic. Jessica tried for quite a while to do something with
training in Sphinx-2, but didn't really get anywhere.

There is certainly a community of potential users here who are waiting to
get xvoice going as a dictation system again. I, for one, would just love
to get rid of NaturallySpeaking if I had a system that was accurate enough
and no slower on the latest hardware, say, than NaturallySpeaking was on
my old 266 MHz machine.

Probably the best thing is to post to this list if "maybe" starts moving
closer to "yes", or if you have anything that people here might be able to
help with.

   Ken |
From: Paul L. <Paul.Lamere@Sun.COM> - 2004-08-03 18:19:15
|
Ken:

I posted this to the S4 forum, but just in case you've stopped reading
there, I figured I'd send this here as well.

Your question has prompted a bit of a dialog among some of the sphinx
developers. First, the xvoice project is an important project for a
variety of reasons, and the fact that xvoice is taking a look at Sphinx-4
is a good thing. We should look seriously at your requirements and push to
make S4 work for you.

Second, there is no technical reason why S4 can't do what you need to do.
Some comparisons have been made between S4 and some of the commercial
recognition engines, and with equivalent training data S4 compares
extremely well in terms of speed and accuracy. The main barrier for us
right now to achieving the kind of accuracy that you are looking for is
the lack of speaker-dependent acoustic and language models. We currently
train for speaker-independent models. We may start taking a look at what
it will take to adapt these models for a particular speaker. With these
adapted models we may indeed be able to meet your accuracy requirements.

What this boils down to is that when we say 'no' what we really mean is
'maybe'. So ... stay tuned.

Paul |
From: Jessica P. H. <jph...@ar...> - 2004-08-03 15:25:54
|
On Tue, 3 Aug 2004, Ken Olum wrote: > Since the Sphinx4 developers have been very active and helpful on the > Sphinx4 forum, I decided to ask if they had any plans to go in a > direction that would help us. Unfortunately the answer is no. Bummer. But thank you for thinking to ask and letting the list know. I forwarded this mail to ossri so that that project would know, since they are doing a lot of the work that this project used to do. j |
From: Ken O. <kd...@co...> - 2004-08-03 12:14:07
|
Since the Sphinx4 developers have been very active and helpful on the
Sphinx4 forum, I decided to ask if they had any plans to go in a direction
that would help us. Unfortunately the answer is no. I am going to stop
reading the Sphinx forums now.

Ken

From: "SourceForge.net" <no...@so...>
Subject: [cmusphinx - Sphinx4 Open Discussion] RE: Future of Sphinx
Date: Tue, 03 Aug 2004 03:35:46 -0700

Read and respond to this message at:
https://sourceforge.net/forum/message.php?msg_id=2695060

By: lamere

Ken: This is the right place for this question. Sphinx-4 is a
speaker-independent recognition system. As such it will have a much higher
WER than would be acceptable for use as a dictation system. We do have
some provisions for speaker adaptation in the acoustic models that could
be used to improve WER; however, we have no plans at this time to go down
that path.

Paul |
From: David G. <dgr...@ne...> - 2004-07-09 04:50:08
|
Yeah. Still here. Swamped with many other projects right now though. With
any luck I can work with this project sometime this year.

On Sat, 2004-06-26 at 10:37, Filip djMedrzec Zyzniewski wrote:
> Hello.
>
> I've just discovered CMU Sphinx and I am starting to like it.
>
> I've seen on xvoice-sphinx homepage that some of you played
> with it to do various things like training.
>
> I'd like to use sphinx2 to build a simple desktop environment
> control system.
>
> For now I just parse output of sphinx2-continuous with an
> awk script. This script confirms commands using festival
> and launches what i specify to launch.
>
> What i'd like to accomplish is a system, that:
> - recognizes about 100-200 words (or phrases, like "workspace 1".
>   this way it would be even simplier)
> - does it with good accuracy
> - doesn't eat up too much CPU.
>
> My sphinx2 setup currently recognizes 22 phrases.
> I created a simple dictionary with
> http://www.speech.cs.cmu.edu/tools/lmtool.html
> (problem: how to make sphinx2 recognize ONLY phrases, not
> single words in them).
>
> For now, accuracy level is just too low. For that small
> set of phrases i thing it can be done better. I'd like
> to train sphinx. Maybe I could use polish language instead of english?
>
> I've tried http://xvoice.arborius.net/xvoice-sphinx/RunningSphinxTrain,
> but i don't know what to do with the result. It does not resemple
> hmm directory structure...
>
> Sphinx documentation is just... scary...
>
> BTW, I've tried cvoicecontrol, but it plainly sucks. Accuracy level
> is low, and CPU usage is 100% all the time.
>
> hope some of you are still here,
>
> Filip Zyzniewski

--
David Graham <dgr...@ne...> |
From: Ken O. <kd...@co...> - 2004-07-01 20:01:41
|
A message that went by on the Sphinx 4 forum a while ago gave the location
of a regularly updated page of Sphinx 3 and Sphinx 4 large-vocabulary
accuracy results: http://cmusphinx.sourceforge.net/LargeVocabResults.html

It might be worth looking here for improvement, but at the moment the
conclusion is that neither of these programs is of any use for dictation.
The best accuracy on any run was a 12% word error rate. I think these
programs do not permit speaker adaptation, which perhaps explains the poor
results.

Ken |
From: Jessica P. H. <jph...@ar...> - 2004-06-28 21:14:42
|
On Mon, 28 Jun 2004, Filip djMedrzec Zyzniewski wrote:
> time/model_parameters/time.s2models
> time/model_parameters/time.s2models/cep.256.vec
> time/model_parameters/time.s2models/cep.256.var
> time/model_parameters/time.s2models/d2cep.256.vec
> time/model_parameters/time.s2models/d2cep.256.var
> time/model_parameters/time.s2models/p3cep.256.vec
> time/model_parameters/time.s2models/p3cep.256.var
> time/model_parameters/time.s2models/xcep.256.vec
> time/model_parameters/time.s2models/xcep.256.var
> time/model_parameters/time.s2models/map
> time/model_parameters/time.s2models/phone

I would expect to see the HMMs in here somewhere. Perhaps the process died
prematurely -- check your logs.

> And how should i do this training for few dozens of sentences?

If you just want it to recognize you, I'd read the sentences a few times
and use that data; if you want it to recognize more people, they each need
to train it themselves. The more data, the better the recognition.

> It would be nice to force sphinx to recognize ONLY these sentences.
> So if I have sentences:
> foo bar
> acme xxx
>
> i would like sphinx to NOT recognize foo xxx :).

You need to build a language model that tells it this.

j |
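One workaround for the "whole phrases only" problem (a sketch of a common trick, not something from this thread's scripts) is to collapse each allowed sentence into a single compound "word" before building the dictionary and language model; the decoder then cannot mix words from different phrases because each phrase is one token.

```shell
# Toy phrase list (illustrative); each line is one allowed command.
printf 'foo bar\nacme xxx\n' > phrases.txt
# Turn each phrase into one compound token, e.g. FOO_BAR. You would then
# give that token the concatenated pronunciation in the dictionary and
# list only compound tokens in the LM training corpus.
awk '{ t = toupper($0); gsub(/ /, "_", t); print t }' phrases.txt
# prints:
# FOO_BAR
# ACME_XXX
```

The cost is that partial matches are impossible by construction, which is exactly what Filip wants for a command-and-control vocabulary.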
From: Filip d. Z. <xvo...@fi...> - 2004-06-28 19:52:58
|
On Mon, Jun 28, 2004 at 09:43:53AM -0400, Jessica Perry Hekman wrote: > On Sat, 26 Jun 2004, Filip djMedrzec Zyzniewski wrote: > > > I've tried http://xvoice.arborius.net/xvoice-sphinx/RunningSphinxTrain, > > but i don't know what to do with the result. It does not resemple > > hmm directory structure... > > What result do you get? Can you send us the resulting directory structure? here it is: time time/bin time/bin/agg_seg time/bin/bldtree time/bin/bw time/bin/cp_parm time/bin/delint time/bin/dict2tri time/bin/inc_comp time/bin/init_gau time/bin/init_mixw time/bin/kmeans_init time/bin/make_quests time/bin/mixw_interp time/bin/mk_flat time/bin/mk_mdef_gen time/bin/mk_mllr_class time/bin/mk_model_def time/bin/mk_s2cb time/bin/mk_s2hmm time/bin/mk_s2phone time/bin/mk_s2phonemap time/bin/mk_s2sendump time/bin/mk_s3gau time/bin/mk_s3mixw time/bin/mk_s3tmat time/bin/mk_ts2cb time/bin/norm time/bin/param_cnt time/bin/printp time/bin/prunetree time/bin/QUICK_COUNT time/bin/scripts_pl time/bin/tiestate time/bin/wave2feat time/bin/maketopology time/bin/make_feats time/bin/make_dict time/etc time/etc/sphinx_train.cfg time/etc/time.transcript time/etc/time.fileids time/etc/time.filler time/etc/time.phone time/feat time/feat/time0001.feat time/wav time/wav/time0001.wav time/logdir time/logdir/01.vector_quantize time/logdir/01.vector_quantize/time.vq.agg_seg.log time/logdir/02.ci_schmm time/logdir/02.ci_schmm/time.make_ci_mdef_fromphonelist.log time/logdir/02.ci_schmm/time.makeflat_cischmm.log time/logdir/02.ci_schmm/time.1-1.bw.log time/logdir/02.ci_schmm/time.1.norm.log time/logdir/03.makeuntiedmdef time/logdir/03.makeuntiedmdef/time.make_alltriphonelist.log time/logdir/04.cd_schmm_untied time/logdir/04.cd_schmm_untied/time.copycitocd.log time/logdir/04.cd_schmm_untied/time.1-1.bw.log time/logdir/04.cd_schmm_untied/time.1.norm.log time/logdir/05.buildtrees time/logdir/05.buildtrees/time.make_questions.log time/logdir/06.prunetree 
time/logdir/06.prunetree/time.build.alltriphones.mdef.log time/logdir/06.prunetree/time.prunetree.6000.log time/logdir/06.prunetree/time.tiestate.6000.log time/logdir/07.cd-schmm time/logdir/07.cd-schmm/time.copy.ci.2.cd.log time/logdir/07.cd-schmm/time.1-1.bw.log time/logdir/07.cd-schmm/time.1-2.bw.log time/logdir/07.cd-schmm/time.1.norm.log time/logdir/08.deleted_interpolation time/logdir/08.deleted_interpolation/time.deletedintrep-6000.log time/logdir/09.make_s2_models time/logdir/09.make_s2_models/time.mk_s2cb.log time/logdir/09.make_s2_models/time.mk_s2chmm.log time/logdir/09.make_s2_models/time.mk_s2sendump.log time/logdir/09.make_s2_models/time.mk_s2phonemap.log time/bwaccumdir time/bwaccumdir/time_buff_1 time/bwaccumdir/time_buff_2 time/model_parameters time/model_parameters/hub4 time/model_parameters/hub4/variances time/model_parameters/hub4/transition_matrices time/model_parameters/hub4/newfe.6000.mdef time/model_parameters/hub4/mixture_weights time/model_parameters/hub4/means time/model_parameters/hub4/time.6000.mdef time/model_parameters/time.ci_semi_flatinitial time/model_parameters/time.ci_semi_flatinitial/means time/model_parameters/time.ci_semi_flatinitial/variances time/model_parameters/time.ci_semi_flatinitial/transition_matrices time/model_parameters/time.ci_semi_flatinitial/mixture_weights time/model_parameters/time.cd_semi_untied time/model_parameters/time.cd_semi_initial time/model_parameters/time.cd_semi_6000_delinterp time/model_parameters/time.cd_semi_6000_interp time/model_parameters/time.s2models time/model_parameters/time.s2models/cep.256.vec time/model_parameters/time.s2models/cep.256.var time/model_parameters/time.s2models/d2cep.256.vec time/model_parameters/time.s2models/d2cep.256.var time/model_parameters/time.s2models/p3cep.256.vec time/model_parameters/time.s2models/p3cep.256.var time/model_parameters/time.s2models/xcep.256.vec time/model_parameters/time.s2models/xcep.256.var time/model_parameters/time.s2models/map 
time/model_parameters/time.s2models/phone time/model_architecture time/model_architecture/time.phonelist time/model_architecture/time.topology time/model_architecture/time.ci.mdef time/model_architecture/time.tree_questions time/gifs time/gifs/green-ball.gif time/gifs/red-ball.gif time/scripts_pl time/scripts_pl/00.verify time/scripts_pl/00.verify/verify_all.pl time/scripts_pl/01.vector_quantize time/scripts_pl/01.vector_quantize/agg_seg.pl time/scripts_pl/01.vector_quantize/kmeans.pl time/scripts_pl/01.vector_quantize/slave.VQ.pl time/scripts_pl/02.ci_schmm time/scripts_pl/02.ci_schmm/baum_welch.pl time/scripts_pl/02.ci_schmm/norm.pl time/scripts_pl/02.ci_schmm/norm_and_launchbw.pl time/scripts_pl/02.ci_schmm/slave_convg.pl time/scripts_pl/03.makeuntiedmdef time/scripts_pl/03.makeuntiedmdef/make_untied_mdef.pl time/scripts_pl/04.cd_schmm_untied time/scripts_pl/04.cd_schmm_untied/baum_welch.pl time/scripts_pl/04.cd_schmm_untied/makeuntiedmixw.pl time/scripts_pl/04.cd_schmm_untied/norm_and_launchbw.pl time/scripts_pl/04.cd_schmm_untied/norm.pl time/scripts_pl/04.cd_schmm_untied/slave_convg.pl time/scripts_pl/05.buildtrees time/scripts_pl/05.buildtrees/make_questions.pl time/scripts_pl/05.buildtrees/slave.treebuilder.pl time/scripts_pl/05.buildtrees/buildtree.pl time/scripts_pl/06.prunetree time/scripts_pl/06.prunetree/prunetree.pl time/scripts_pl/06.prunetree/slave.state-tie-er.pl time/scripts_pl/06.prunetree/tiestate.pl time/scripts_pl/07.cd-schmm time/scripts_pl/07.cd-schmm/baum_welch.pl time/scripts_pl/07.cd-schmm/norm.pl time/scripts_pl/07.cd-schmm/norm_and_launchbw.pl time/scripts_pl/07.cd-schmm/slave_convg.pl time/scripts_pl/08.deleted-interpolation time/scripts_pl/08.deleted-interpolation/deleted_interpolation.pl time/scripts_pl/09.make_s2_models time/scripts_pl/09.make_s2_models/make_s2_models.pl time/scripts_pl/mc time/scripts_pl/mc/mc_status.pl time/scripts_pl/mc/mc_run.pl time/scripts_pl/mc/mc_kill.pl time/scripts_pl/mc/mc_check.pl time/builds2model 
time/time.html
time/.02.bw.1.1.state.gif
time/.02.bw.1.2.state.gif

I just copied the instructions from the wiki; my system is Gentoo. I
installed sphinx (from CVS) and festival. And also downloaded/unpacked
this hub4 thing.

> I'm still here, but don't have a lot of time to spend on this stuff
> right now. So no promises, but I'll try to help.

:). I'd just like to know how to apply this training data. And how should
I do this training for a few dozen sentences?

It would be nice to force sphinx to recognize ONLY these sentences. So if
I have the sentences:

foo bar
acme xxx

I would like sphinx to NOT recognize "foo xxx" :).

bye,
Filip Zyzniewski

PS. I am subscribed, so you don't have to cc me :) |
From: Jessica P. H. <jph...@ar...> - 2004-06-28 13:42:16
|
On Sat, 26 Jun 2004, Filip djMedrzec Zyzniewski wrote:
> I've tried http://xvoice.arborius.net/xvoice-sphinx/RunningSphinxTrain,
> but i don't know what to do with the result. It does not resemble the
> hmm directory structure...

What result do you get? Can you send us the resulting directory structure?

> hope some of you are still here,

I'm still here, but don't have a lot of time to spend on this stuff right
now. So no promises, but I'll try to help.

j |
From: Filip d. Z. <xvo...@fi...> - 2004-06-26 15:37:27
|
Hello.

I've just discovered CMU Sphinx and I am starting to like it. I've seen on
the xvoice-sphinx homepage that some of you played with it to do various
things like training.

I'd like to use sphinx2 to build a simple desktop environment control
system. For now I just parse the output of sphinx2-continuous with an awk
script. This script confirms commands using festival and launches whatever
I specify.

What I'd like to accomplish is a system that:
- recognizes about 100-200 words (or phrases, like "workspace 1" -- this
  way it would be even simpler)
- does it with good accuracy
- doesn't eat up too much CPU.

My sphinx2 setup currently recognizes 22 phrases. I created a simple
dictionary with http://www.speech.cs.cmu.edu/tools/lmtool.html (problem:
how to make sphinx2 recognize ONLY phrases, not single words in them).

For now, the accuracy level is just too low. For that small set of phrases
I think it can be done better. I'd like to train sphinx. Maybe I could use
the Polish language instead of English?

I've tried http://xvoice.arborius.net/xvoice-sphinx/RunningSphinxTrain,
but I don't know what to do with the result. It does not resemble the hmm
directory structure...

Sphinx documentation is just... scary...

BTW, I've tried cvoicecontrol, but it plainly sucks. The accuracy level is
low, and CPU usage is 100% all the time.

hope some of you are still here,

Filip Zyzniewski |
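The awk-wrapper approach Filip describes can be sketched roughly as below. The phrase-to-command mapping is hypothetical, and the real sphinx2-continuous output would need extra filtering of its status lines; here a stand-in function supplies one recognized phrase per line, and the dispatcher prints the command it would run instead of executing it.

```shell
# Stand-in for: sphinx2-continuous ... | this script
recognize() {
  printf 'WORKSPACE ONE\nOPEN BROWSER\n'
}

# Map each recognized phrase to a desktop command (printed, not executed).
recognize | awk '
  /^WORKSPACE ONE$/ { print "wmctrl -s 0"; next }
  /^OPEN BROWSER$/  { print "x-www-browser &" }
'
# prints:
# wmctrl -s 0
# x-www-browser &
```

Anchoring the patterns with ^ and $ is what keeps "WORKSPACE" alone from triggering anything, which mirrors Filip's wish to match only whole phrases.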
From: <bla...@gm...> - 2003-11-28 10:42:42
|
Hi @all!!

I am a new user of the sphinx2 engine. I just need to know which file
format allphone-test is able to read, and what exactly is meant by pgm?
Hope that someone knows. Thanx a lot!! |
From: Benner S. M N. <BennerSM@Npt.NUWC.Navy.Mil> - 2003-11-03 16:12:18
|
In an effort to defend my scripts, let me make a few notes. First of all,
for some reason my browser is unable to access the xvoice-sphinx site
anymore, so I don't know exactly what the content is.

Usage for my buildlm.sh script is as follows. The script and all resource
files should be located in the CMU_Cam_Toolkit_v2/bin directory, or
wherever your LM Builder binaries are located. The argument list at the
top of buildlm.sh names your files, to be tailored to your individual
setup. Input.txt is the corpus; make sure it does not have any punctuation
such as , ' . " etc.

The purpose of make_dic.java is two-fold. First, it builds the dictionary
by looking for any new words not in $DICT, and then using cmudict06d.dic
as a reference to add the new words. Second, it conditions the corpus for
input into the LM toolkit. The LM Builder expects the corpus in a certain
syntax:

<s> (caps)CORPUS-LINE </s>

make_dic.java outputs the conditioned line into $SENT. Note: I have it set
so that on each iteration you run through this script, make_dic appends
the input to $SENT; it does not overwrite. This is so you can feed various
transcripts through and grow the LM & dictionary. However, during initial
debugging it might be a good idea to throw in a "rm $SENT" in the script
so it doesn't get bogged down.

I built and tested my script using a small vocabulary, about 200-300
words, and it took about 3 minutes to build a dictionary from scratch. If
you already have a large Sphinx dictionary built and just need the LM
Builder portion, you can comment out the java make_dic line. Just make
sure you have your corpus conditioned as described above; you can use
'sed' to do the same thing from the shell, though I forget exactly how.

I would also like to make a recommendation for the decoder settings. Since
I used a relatively small vocabulary, I was able to change the settings
from default to "wide beam" mode without sacrificing much speed. The
documentation mentions it somewhere; the settings to change from default
are:

-top 4 -topsenfrm 4 -topsenthresh -80000 -beam 1e-06 -npbeam 1e-06
-lpbeam 1e-05 -lponlybeam 0.0003 -nwbeam 0.0003 -fwdflat TRUE

My accuracy rate improved significantly with these settings; decoding time
increased by about 2-3 seconds. I hope it helps; you can get in touch with
me if you have any more questions about my scripts.

Steve |
From: Jessica P. H. <jph...@ar...> - 2003-11-03 15:07:07
|
On Sat, 1 Nov 2003, S Samba Siva Rao wrote:
> What is the input for run-cmu.sh?

run-cmu.sh corpus.txt

where corpus.txt is whatever file you want to use as your corpus; the one
I used is available for download.

> where is cmu-unstressed dictionary?

Not sure where I got that from, so I made it available at
http://xvoice.sourceforge.net/xvoice-sphinx/cmudict-unstressed.gz

> Can you clearly explain what is the difference between dictionary and
> language model.

A dictionary is a list of words and how each one is pronounced. A language
model is a file detailing which words are likely to occur and when (for
example, it is unlikely for a person to say "a the" right next to each
other).

> Can you bit clear about this script please?

I'm not sure what you need to know! I don't understand what each of the
commands does, myself. But the series of commands builds a dictionary and
a language model, basically.

j |
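To make Jessica's distinction concrete, here is a toy pair of files (the contents are invented for illustration): the dictionary maps each word to its phones, while the ARPA-format language model assigns log10 probabilities to word sequences, with `<s>` and `</s>` marking sentence boundaries.

```shell
# A two-word pronunciation dictionary.
cat > toy.dic <<'EOF'
HELLO HH AH L OW
WORLD W ER L D
EOF
# A tiny ARPA-format bigram language model (probabilities are log10).
cat > toy.arpa <<'EOF'
\data\
ngram 1=4
ngram 2=3

\1-grams:
-0.6021 <s>
-0.6021 </s>
-0.6021 HELLO
-0.6021 WORLD

\2-grams:
-0.3010 <s> HELLO
-0.3010 HELLO WORLD
-0.3010 WORLD </s>

\end\
EOF
```

The decoder needs both: the dictionary tells it what each word sounds like, and the LM tells it which word sequences to prefer, so a word missing from either file can never be recognized.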
From: S S. S. R. <ss...@cs...> - 2003-11-01 17:21:17
|
> In jy-scripts.tgz, you should see a run-cmu.sh file. Take a look at this
> file. It is a shell script which has the command that you need to run to
> build an LM. You will need to edit it to change the paths. I used this
> script to build an LM, but the LM that I built did not work.

Dear madam!

What is the input for run-cmu.sh? Where is the cmu-unstressed dictionary?
Can you clearly explain the difference between a dictionary and a language
model? Can you be a bit clearer about this script, please?

samba

-----
"When you do the common things in life in an uncommon way, you will
command the attention of the world."
- George Washington Carver (1864-1943) |
From: Jessica P. H. <jph...@ar...> - 2003-10-31 16:12:03
|
On Thu, 30 Oct 2003, S Samba Siva Rao wrote:
> Ya! Really I got stuck at this point. I read that Jessica madam made one
> LM. What about it? Is it working fine? I am unable to figure out what
> exactly I should do. Any help please.

The LM I built does not break -- it does work. But it does not recognize
very well. The project needs an LM which recognizes much better. That's
what I was hoping you might be able to do.

In jy-scripts.tgz, you should see a run-cmu.sh file. Take a look at this
file. It is a shell script which has the command that you need to run to
build an LM. You will need to edit it to change the paths. I used this
script to build an LM, but the LM that I built did not work.

In sb-scripts.tgz, you will see buildlm.sh. This is another shell script
with another set of commands. When I ran this, the Java command ran for
days before I killed it. When I re-ran the script with the Java command
commented out, it successfully built an LM -- but not a good one. (The
Java command was building the dictionary, and since it never completed, it
wasn't a very good dictionary.)

Your first step should probably be to make sure you can get back to where
I was: to run both scripts. Can you do that?

j |
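For reference, the CMU-Cambridge toolkit steps that a script like run-cmu.sh wraps are roughly as follows. This is a sketch built from the toolkit's documented tool names; exact flags vary by version, and the corpus.* file names are placeholders, not the thread's actual files.

```shell
# Count word frequencies, derive a vocabulary, then build an ARPA LM.
text2wfreq < corpus.txt | wfreq2vocab > corpus.vocab
text2idngram -vocab corpus.vocab < corpus.txt > corpus.idngram
idngram2lm -idngram corpus.idngram -vocab corpus.vocab -arpa corpus.arpa
```

Each stage reads the previous stage's output, so a bad or empty vocabulary (as with the never-finishing dictionary step Jessica describes) propagates into a poor final LM.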
From: S S. S. R. <ss...@cs...> - 2003-10-30 13:47:32
|
> Your first step would be to download and install the LM toolkit:
> http://mi.eng.cam.ac.uk/~prc14/toolkit.html

I installed the toolkit.

> Then download the scripts and files available at
> http://xvoice.sourceforge.net/xvoice-sphinx/status.php
>
> and try to build an LM with them. (I assume that at that point you'll
> have some questions; be sure to ask them on the list and not directly
> to me!)

I downloaded the scripts and files, except the long one of Jessica
madam's. Yes, really I got stuck at this point. I read that Jessica madam
made one LM. What about it? Is it working fine? I am unable to figure out
what exactly I should do. Any help, please.

samba

-----
"When you do the common things in life in an uncommon way, you will
command the attention of the world."
- George Washington Carver (1864-1943) |