The training data is the set of 321 speakers, from both the speaker
independent and speaker dependent portions in the training and development
test sets in the wsj0 and wsj1 database.
Hi, can you please advise which of the following WSJ corpora were used to
train the WSJ acoustic models available with Sphinx 4?
http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC93S6A
http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC93S6B
http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC93S6C
http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC94S13A
http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC94S13B
http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC94S13C
Is there a document that gives these details?
Many thanks. Peter
http://cmusphinx.sourceforge.net/wiki/sphinx4:wsjtasks33optimization
So both http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC93S6A and
http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC94S13A were used. It's the
default CMU WSJ setup.
Many thanks nshmyrev.
This is probably the wrong place to request this.
I would like to obtain a typical sample of the audio from the corpora used to
build your WSJ acoustic models.
I am writing from Europe and I would like to get a feel for the rate and type
of speech and accent to feed into my program.
This may be forbidden under the terms of agreement with the supplier; in that
case, someone speaking in the manner of the training set audio would suffice.
Any suggestions would be most helpful.
Peter
You can find samples in pocketsphinx/test/data/wsj
Fantastic.
Thank you
I refer to the page you directed me to:
http://cmusphinx.sourceforge.net/wiki/sphinx4:wsjtasks33optimization
Where is the parent of this page?
It is not apparent on http://cmusphinx.sourceforge.net/wiki/
There is a page http://cmusphinx.sourceforge.net/wiki/sphinx4:webhome and it refers to
"WSJTaskS3.3Optimization S3.3 Decoder optimization for WSJ" with the link:
http://cmusphinx.sourceforge.net/wiki/sphinx4:wsjtasks3.3optimization_s3.3_decoder_optimization_for_wsj
This link is broken.
Can you please advise which page carries the link to:
http://cmusphinx.sourceforge.net/wiki/sphinx4:wsjtasks33optimization
In addition, and still regarding that page, there is discussion about the
variables: language weight (lw), relative beam width (beam), and new word beam
width (nwbeam). From the config.xml files I am familiar with:
<config>
    <property name="absoluteBeamWidth" value="500"/>
    <property name="relativeBeamWidth" value="1E-80"/>
    <property name="absoluteWordBeamWidth" value="20"/>
    <property name="relativeWordBeamWidth" value="1E-60"/>
    <property name="wordInsertionProbability" value="1E-16"/>
    <property name="languageWeight" value="7.0"/>
    <property name="silenceInsertionProbability" value=".1"/>
    <property name="frontend" value="epFrontEnd"/>
    <property name="recognizer" value="recognizer"/>
    <property name="showCreations" value="false"/>
</config>
But not new word beam width (nwbeam). Is this a Sphinx 3 setting that is not
available under that name in Sphinx 4?
Many thanks
There is no parent
Sphinx 3 beams are different from Sphinx 4 beams, so you can't use the same values.
If you are looking for WSJ config for sphinx4, you can find it in
tests/performance/wsj5k or tests/performance/wsj20k.
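If it helps when comparing the properties quoted above against the ones in those test configs, here is a rough Python sketch that pulls the top-level property values out of a Sphinx 4 style config.xml and lists the ones that differ between two files. It uses only the standard library. The names SAMPLE_CONFIG, global_properties and diff_properties are made up for this illustration, and the sample XML is just the values quoted earlier in the thread, not the contents of any file under tests/performance.

```python
import xml.etree.ElementTree as ET

# Illustrative sample built from the properties quoted earlier in the thread.
# A real Sphinx 4 config.xml also contains <component> elements and more.
SAMPLE_CONFIG = """<config>
    <property name="absoluteBeamWidth" value="500"/>
    <property name="relativeBeamWidth" value="1E-80"/>
    <property name="absoluteWordBeamWidth" value="20"/>
    <property name="relativeWordBeamWidth" value="1E-60"/>
    <property name="wordInsertionProbability" value="1E-16"/>
    <property name="languageWeight" value="7.0"/>
    <property name="silenceInsertionProbability" value=".1"/>
</config>"""


def global_properties(xml_text):
    """Return the top-level <property> name/value pairs as a dict of strings."""
    root = ET.fromstring(xml_text)
    # Only direct children of <config> are read; properties nested inside
    # <component> elements are deliberately ignored.
    return {p.get("name"): p.get("value") for p in root.findall("property")}


def diff_properties(a, b):
    """Return {name: (value_in_a, value_in_b)} for names whose values differ."""
    names = sorted(set(a) | set(b))
    return {n: (a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)}


if __name__ == "__main__":
    props = global_properties(SAMPLE_CONFIG)
    for name in sorted(props):
        print(name, "=", props[name])
```

To compare two real files, read each one into a string, pass both through global_properties, and hand the two dicts to diff_properties. Values are kept as strings so that notations such as 1E-80 are shown exactly as written in the file.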