Which corpus was used to train the WSJ acoustic model?

2010-04-09
2012-09-22
  • peterentropy

    peterentropy - 2010-04-11

    Many thanks nshmyrev.

    This is probably the wrong place to request this.

    I would like to obtain a typical sample of the audio from the corpora used to
    build your WSJ acoustic models.

    I am writing from Europe and I would like to get a feel for the rate and type
    of speech and accent to feed into my program.

    This may be forbidden under the terms of agreement with the supplier; in that
    case, a recording of someone speaking in the manner of the training set audio
    would suffice.

    Any suggestions would be most helpful.

    Peter

     
  • Nickolay V. Shmyrev

    You can find samples in pocketsphinx/test/data/wsj
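
    To get a feel for the rate and format of those samples, a minimal sketch
    along these lines could summarize a recording. It assumes the file is a
    standard PCM WAV file (a headerless .raw file would need its sample rate
    supplied by hand), and describe_wav is a hypothetical helper name, not part
    of pocketsphinx.

```python
import wave


def describe_wav(path):
    """Return basic properties of a PCM WAV file as a dict."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": rate,
            # Duration is the frame count divided by frames per second.
            "duration_s": frames / float(rate),
        }
```

    Calling describe_wav on any WAV file from that directory should report the
    sample rate and duration, which can then be compared with the audio your own
    program will receive.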

     
  • peterentropy

    peterentropy - 2010-04-11

    Fantastic.

    Thank you

     
  • peterentropy

    peterentropy - 2010-04-11

    I refer to the page you directed me to:
    http://cmusphinx.sourceforge.net/wiki/sphinx4:wsjtasks33optimization

    Where is the parent of this page?

    It is not apparent on http://cmusphinx.sourceforge.net/wiki/

    There is a page http://cmusphinx.sourceforge.net/wiki/sphinx4:webhome and it
    refers to "WSJTaskS3.3Optimization S3.3 Decoder optimization for WSJ" with
    the link:
    http://cmusphinx.sourceforge.net/wiki/sphinx4:wsjtasks3.3optimization_s3.3_decoder_optimization_for_wsj
    This link is broken.

    Can you please tell me which page links to:
    http://cmusphinx.sourceforge.net/wiki/sphinx4:wsjtasks33optimization

    In addition, and still regarding that page, there is discussion about the
    variables language weight (lw), relative beam width (beam), and new word beam
    width (nwbeam). From the config.xml files I am familiar with:

    <config>
        <property name="absoluteBeamWidth" value="500"/>
        <property name="relativeBeamWidth" value="1E-80"/>
        <property name="absoluteWordBeamWidth" value="20"/>
        <property name="relativeWordBeamWidth" value="1E-60"/>
        <property name="wordInsertionProbability" value="1E-16"/>
        <property name="languageWeight" value="7.0"/>
        <property name="silenceInsertionProbability" value=".1"/>
        <property name="frontend" value="epFrontEnd"/>
        <property name="recognizer" value="recognizer"/>
        <property name="showCreations" value="false"/>
    </config>

    But not new word beam width (nwbeam). Is this a Sphinx 3 setting that is not
    available under that name in Sphinx 4?

    Many thanks
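
    For comparing such settings between two configuration files, a small sketch
    like the following could list the top-level properties. The embedded XML is
    a shortened, made-up stand-in for a real config.xml, and global_properties
    is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Shortened stand-in for a Sphinx 4 style config.xml (illustrative only).
SAMPLE_CONFIG = """<config>
    <property name="absoluteBeamWidth" value="500"/>
    <property name="relativeBeamWidth" value="1E-80"/>
    <property name="languageWeight" value="7.0"/>
</config>"""


def global_properties(xml_text):
    """Map each top-level <property> name to its value string."""
    root = ET.fromstring(xml_text)
    # findall("property") matches direct children of <config> only.
    return {p.get("name"): p.get("value") for p in root.findall("property")}


props = global_properties(SAMPLE_CONFIG)
for name in ("absoluteBeamWidth", "relativeBeamWidth", "languageWeight"):
    print(name, "=", props.get(name))
```

    Running the same function over two files and comparing the resulting
    dictionaries would show which beam-related names one file has and the other
    lacks.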

     
  • Nickolay V. Shmyrev

    Where is the parent of this page?

    There is no parent

    But not new word beam width (nwbeam). Is this a Sphinx 3 setting that is not
    available under that name in Sphinx 4?

    sphinx3 beams are different from sphinx4 beams, so you can't use the same values.
    If you are looking for WSJ config for sphinx4, you can find it in
    tests/performance/wsj5k or tests/performance/wsj20k.
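
    As general background on why beam values are tied to one decoder, here is an
    illustrative sketch of beam pruning. It is not Sphinx code. It assumes the
    usual reading of the property names above: an absolute beam caps the number
    of surviving hypotheses, and a relative beam is a ratio applied to the best
    score. The function name and the scores are made up.

```python
import math


def prune(scores, absolute_beam, relative_beam):
    """Keep the hypotheses that survive both beams.

    scores: dict mapping hypothesis label -> log score (higher is better).
    absolute_beam: maximum number of hypotheses to keep.
    relative_beam: linear-domain ratio such as 1e-80; a hypothesis is dropped
        when its score falls below best_score * relative_beam.
    """
    if not scores:
        return {}
    best = max(scores.values())
    # Multiplying by the ratio becomes adding its logarithm in the log domain.
    threshold = best + math.log(relative_beam)
    survivors = {h: s for h, s in scores.items() if s >= threshold}
    ranked = sorted(survivors.items(), key=lambda kv: kv[1], reverse=True)
    return dict(ranked[:absolute_beam])


scores = {"a": -10.0, "b": -50.0, "c": -400.0}
kept = prune(scores, absolute_beam=2, relative_beam=1e-80)
print(sorted(kept))
```

    With these made-up scores, "c" falls outside the relative beam while "a" and
    "b" survive. Because the threshold depends on how a decoder scales its
    scores, a ratio tuned for one decoder does not carry over to another.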

     
