error at .wav file while training acoustic

Help
2010-11-10
2012-09-22
  • Nasir Hussain

    Nasir Hussain - 2010-11-11

    Oops, my bad, a typo. It should be:

    perl $SPHINXTRAINDIR/scripts_pl/RunAll.pl

     
  • Basit Mahmood

    Basit Mahmood - 2010-11-11

    I have run this command (./scripts_pl/RunAll.pl). Here is the extract from
    the tutorial:

    To train just run

    ./scripts_pl/RunAll.pl

    and it will go through all the required stages. It will take a few minutes to
    train. On large databases, training could take a month.

    During the stages the most important stage is the first one which checks that
    everything is configured correctly and your input data is consistent. Do not
    ignore the errors reported on the first 00.verify_all step.

    The typical output during training will look like:

    Baum welch starting for 2 Gaussian(s), iteration: 3 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    Normalization for iteration: 3
    Current Overall Likelihood Per Frame = 30.6558644286942
    Convergence Ratio = 0.633864444461992
    Baum welch starting for 2 Gaussian(s), iteration: 4 (1 of 1)
    0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
    Normalization for iteration: 4

    These scripts process all the steps required to train the model. In the scripts
    directory (./scripts_pl), there are several directories numbered sequentially
    from 00 through 99. Each directory has either a file named slave*.pl or
    a single file with extension .pl. Go through the directories in order and
    execute either the slave*.pl or the single .pl file, as below.

    perl scripts_pl/00.verify/verify_all.pl
    perl scripts_pl/10.vector_quantize/slave.VQ.pl
    perl scripts_pl/20.ci_hmm/slave_convg.pl
    perl scripts_pl/30.cd_hmm_untied/slave_convg.pl
    perl scripts_pl/40.buildtrees/slave.treebuilder.pl
    perl scripts_pl/45.prunetree/slave-state-tying.pl
    perl scripts_pl/50.cd_hmm_tied/slave_convg.pl
    perl scripts_pl/90.deleted_interpolation/deleted_interpolation.pl

    The scripts will launch jobs on your machine, and the jobs will take a few
    minutes each to run through. Before you run any script, note the directory
    contents of your current directory. After you run each slave*.pl note the
    contents again. Several new directories will have been created. These
    directories contain files which are being generated in the course of your
    training. At this point you need not know about the contents of these
    directories, though some of the directory names may be self explanatory and
    you may explore them if you are curious.

    Now I am asking about the next step:

    These scripts process all the steps required to train the model. In the scripts
    directory (./scripts_pl), there are several directories numbered sequentially
    from 00 through 99. Each directory has either a file named slave*.pl or
    a single file with extension .pl. Go through the directories in order and
    execute either the slave*.pl or the single .pl file, as below.
    ....

    Here I am getting the error that I told you about:

    cd scripts_pl/

    You have new mail in /var/spool/mail/root

    cd 00.verify/

    ls

    verify_all.pl

    verify_all.pl

    -bash: verify_all.pl: command not found

    perl verify_all.pl

    Configuration (e.g. etc/sphinx_train.cfg) not defined
    Compilation failed in require at verify_all.pl line 47.
    BEGIN failed--compilation aborted at verify_all.pl line 47.
    You have new mail in /var/spool/mail/root

    cd ..

    You have new mail in /var/spool/mail/root

    cd ..

    Thanks

     
  • Basit Mahmood

    Basit Mahmood - 2010-11-11

    Sorry it is working :) I was confused :) sorry

    perl scripts_pl/00.verify/verify_all.pl

    MODULE: 00 verify training files
    O.S. is case sensitive ("A" != "a").
    Phones will be treated as case sensitive.
    Phase 1: DICT - Checking to see if the dict and filler dict agrees with the
    phonelist file.
    Found 23 words using 27 phones
    Phase 2: DICT - Checking to make sure there are not duplicate entries in the
    dictionary
    Phase 3: CTL - Check general format; utterance length (must be positive);
    files exist
    Phase 4: CTL - Checking number of lines in the transcript should match lines
    in control file
    Phase 5: CTL - Determine amount of training data, see if n_tied_states seems
    reasonable.
    Total Hours Training: 0.00349594017094017
    This is a small amount of data, no comment at this time
    Phase 6: TRANSCRIPT - Checking that all the words in the transcript are in the
    dictionary
    Words in dictionary: 20
    Words in filler dictionary: 3
    Phase 7: TRANSCRIPT - Checking that all the phones in the transcript are in
    the phonelist, and all phones in the phonelist appear at least once

    perl scripts_pl/01.vector_quantize/slave.VQ.pl

    MODULE: 01 Vector Quantization
    Skipped for continuous models
    You have new mail in /var/spool/mail/root

    thanks
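The two runs above differ only in the working directory: inside 00.verify the script could not find its configuration, while from the training root it ran cleanly. That suggests the scripts look for etc/sphinx_train.cfg relative to the current directory. A minimal sketch of a guard for this, assuming that behaviour (the function name find_train_root is hypothetical and not part of SphinxTrain):

```shell
# Walk up from the current directory until etc/sphinx_train.cfg is found.
# Prints the directory that contains it, or returns 1 if none is found.
find_train_root() {
    dir=$(pwd)
    while [ "$dir" != "/" ]; do
        if [ -r "$dir/etc/sphinx_train.cfg" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
        dir=$(dirname "$dir")
    done
    return 1
}

# Usage, from anywhere inside the training tree:
#   cd "$(find_train_root)" && perl scripts_pl/00.verify/verify_all.pl
```

Running every step from the directory this prints keeps the relative scripts_pl/... paths from the tutorial valid.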

     
  • Nasir Hussain

    Nasir Hussain - 2010-11-12

    Hello Basit,

    Sorry it is working :) I was confused :) sorry

    Mistakes only happen by mistake, right?
    If you have any problem, you know where to find me :)

    -Nasir

     
  • Basit Mahmood

    Basit Mahmood - 2010-11-12

    Hello Nasir,
    Hope you are fine. Sorry to bother you again :). I have run all the
    commands that the tutorial listed.

    perl scripts_pl/00.verify/verify_all.pl
    perl scripts_pl/10.vector_quantize/slave.VQ.pl
    perl scripts_pl/20.ci_hmm/slave_convg.pl
    perl scripts_pl/30.cd_hmm_untied/slave_convg.pl
    perl scripts_pl/40.buildtrees/slave.treebuilder.pl
    perl scripts_pl/45.prunetree/slave-state-tying.pl
    perl scripts_pl/50.cd_hmm_tied/slave_convg.pl
    perl scripts_pl/90.deleted_interpolation/deleted_interpolation.pl

    But when I run the decode command, I get this:

    ./scripts_pl/decode/slave.pl

    MODULE: DECODE Decoding using models previously trained
    Decoding 7 segments starting at 0 (part 1 of 1)
    0%
    Aligning results to find error rate
    Can't open /usr/basit/sphinx/tutorial/testdb/result/testdb-1-1.match
    word_align.pl failed with error code 65280 at ./scripts_pl/decode/slave.pl
    line 172.

    perl scripts_pl/decode/slave.pl

    MODULE: DECODE Decoding using models previously trained
    Decoding 7 segments starting at 0 (part 1 of 1)
    0%
    Aligning results to find error rate
    Can't open /usr/basit/sphinx/tutorial/testdb/result/testdb-1-1.match
    word_align.pl failed with error code 65280 at scripts_pl/decode/slave.pl line
    172.

    cd ..

    Is this because of my small vocabulary, or is something going wrong
    here?

    Also in the model_parameters directory there are several directories. Like

    testdb.cd_cont_1000
    testdb.cd_cont_1000_1
    testdb.cd_cont_1000_2
    testdb.cd_cont_1000_4
    testdb.cd_cont_1000_8
    testdb.cd_cont_untied
    testdb.ci_cont
    testdb.ci_cont_flatinitial
    testdb.cd_cont_initial

    Which one is the model that I should point to in my Sphinx4 XML config file?
    Also, in the XML config file there is

    <property name="location" value="the path to the model folder, for example <your_training_folder>/model_parameters/<your_model_name>.cd_cont_<senones>"> </property>

    If I put this model, say testdb.cd_cont_1000, in my Windows XP folder
    (E:\basit\testdb.cd_cont_1000), then this will become

    <property name="location" value="E:\basit\testdb.cd_cont_1000"> </property>

    and the same goes for lm.DMP, .dic and .filler. Is that right?
    Thanks

     
  • Nasir Hussain

    Nasir Hussain - 2010-11-12

    Hello Basit,

    Is this because of my small vocabulary, or is something going wrong
    here?

    Frankly, I don't know about the decode step, as I haven't done it
    myself.

    Which one is the model that I should point to in my Sphinx4 XML config file?

    You can choose either
    testdb.cd_cont_1000 or testdb.ci_cont,
    depending on your test wav files. As I know about your wav files, for
    your purpose you can use testdb.ci_cont, since you have a very small vocabulary.
    You could have chosen testdb.cd_cont_1000 if you had continuous
    speech in your sentences.

    Also, in the XML config file there is " <property name="location" value="the path to the model folder, for example <your_training_folder>/model_parameters/<your_model_name>.cd_cont_<senones>"> </property> "
    If I put this model, say testdb.cd_cont_1000, in my Windows XP folder
    (E:\basit\testdb.cd_cont_1000), then this will become " <property name="location" value="E:\basit\testdb.cd_cont_1000"> </property> " and the same goes for
    lm.DMP, .dic and .filler. Is that right? Thanks

    Yes Exactly...:)

    -Nasir

     
  • Basit Mahmood

    Basit Mahmood - 2010-11-12

    Hi,
    It's nice to hear from you again. Thanks :) You nailed it :) I want to ask one thing:
    can I test my model to see its progress? I found this tutorial

    http://sphinx.subwiki.com/sphinx/index.php/Hello_World_Decoder_QuickStart_Guide

    I have also created a file cfgfile

    -samprate 16000
    -hmm /usr/basit/sphinx/tutorial/testdb/model_parameters/testdb.cd_cont_1000
    -dict /usr/basit/sphinx/tutorial/testdb/etc/testdb.dic
    -fdict /usr/basit/sphinx/tutorial/testdb/etc/testdb.filler
    -lm /usr/basit/sphinx/tutorial/testdb/etc/testdb.lm.DMP

    But I am unable to run this command:

    sphinx3_livepretend ctlfile . cfgfile

    Where is sphinx3_livepretend located?

    Also, I want to ask one thing in advance about the .gram file. As I will use my
    own model, which words should I put in the .gram file when I create it? Only the
    words that I used in my model, or can the .gram file contain any kind of
    words?

    Secondly, suppose I train my model again for a large vocabulary. In that
    case, should the same sentence be spoken by different people? As you know, my
    model contains only my own voice. When I train for a large
    vocabulary, the words or sentences will have to be spoken by different people.
    Suppose the sentence

    MY NAME IS BASIT

    is said by three people: Nasir, Asad and Shahrukh. Then how will I list it
    in my .fileids and .transcription files? Like this

    basit/myname, Asad/myname, Shahrukh/myname, Nasir/myname

    and .transcription file will be like

    MY NAME IS BASIT (basit/myname) (Asad/myname) (Nasir/myname)

    Or, in other words, what steps should I follow when I train my model for a large
    vocabulary?

    Too many questions, phew :)
    Thanks

     
  • Nasir Hussain

    Nasir Hussain - 2010-11-12

    Hello Basit,

    Sorry for replying late. I was kind of busy with something.

    Where is sphinx3_livepretend located?

    This is a Sphinx 3 command. You need to install Sphinx 3 as described in the
    tutorial, and install it with sudo privileges :)

    Also, I want to ask one thing in advance about the .gram file. As I will use my
    own model, which words should I put in the .gram file when I create it? Only the
    words that I used in my model, or can the .gram file contain any kind of
    words?

    The .gram file should contain the words that you want to be recognised. It
    should not contain random words that you don't want to be recognised.

    Secondly, suppose I train my model again for a large vocabulary. In that
    case, should the same sentence be spoken by different people? As you know, my
    model contains only my own voice. When I train for a large
    vocabulary, the words or sentences will have to be spoken by different people.
    Suppose the sentence

    It depends entirely on your needs. If you want your model to be
    speaker-specific, then you can record just your own voice for the acoustic model. You can
    simply record paragraphs from a book or similar, and that's all. But if you want your
    model to recognise other people too, then you will have to add the voices of
    other people to your model to get good recognition :)

    is said by three people: Nasir, Asad and Shahrukh. Then how will I list it
    in my .fileids and .transcription files? Like this " basit/myname, Asad/myname,
    Shahrukh/myname, Nasir/myname "

    Yes, the file structure should be like this.

    and .transcription file will be like " MY NAME IS BASIT
    (basit/myname) (Asad/myname) (Nasir/myname) "

    No. It should be like this:

    <s> MY NAME IS BASIT </s> (basit/myname)
    <s> MY NAME IS BASIT </s> (Asad/myname)
    <s> MY NAME IS BASIT </s> (Nasir/myname)
    

    -Nasir
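The layout described above (one transcription line per recording, each wrapped in <s> ... </s> and followed by the utterance ID in parentheses) can be generated rather than typed by hand. A minimal sketch, assuming one utterance ID per line in the .fileids file; the function name write_training_lists and the file base name used in the usage comment are hypothetical:

```shell
# write_training_lists BASE UTT "SENTENCE" SPEAKER...
# Writes BASE.fileids and BASE.transcription with one line per speaker.
write_training_lists() {
    base=$1
    utt=$2
    sentence=$3
    shift 3
    # Start from empty files so that reruns do not append duplicates.
    : > "$base.fileids"
    : > "$base.transcription"
    for speaker in "$@"; do
        printf '%s/%s\n' "$speaker" "$utt" >> "$base.fileids"
        printf '<s> %s </s> (%s/%s)\n' "$sentence" "$speaker" "$utt" \
            >> "$base.transcription"
    done
}

# Usage (the base name etc/testdb_train is an assumption):
#   write_training_lists etc/testdb_train myname "MY NAME IS BASIT" basit Asad Nasir
```

Generating both files in the same loop keeps them in the same order with the same number of lines, which is what the Phase 4 check in verify_all.pl compares.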

     
  • Basit Mahmood

    Basit Mahmood - 2010-11-12

    Thanks :) No problem :) These things happen :)
    Actually, after some time I have to make a model that recognizes other
    people. As regards the .transcription file, it is clear how I will make the
    transcription file for the same sentence spoken by different people. But in
    .fileids, should I separate the entries with commas or put each on its own line? That is, is the syntax
    comma separated, like

    Asad/myname, Shahrukh/myname, Nasir/myname

    or newline separated

    Asad/myname
    Shahrukh/myname
    Nasir/myname

    As regards the grammar file, I want to ask the following. You know that I have a very small
    vocabulary in my model (testdb). Suppose I train a model with only three words
    (please, suppose only :) ): hello, Nasir, Basit. Just these three words. Now,
    when I make the grammar file, can I use words like "animal", "how r u", "what's
    going wrong" in my .gram file? My model doesn't have the words that I am using
    in the grammar file. In other words, does the grammar file depend on the
    vocabulary used in the model, or is it a separate thing that doesn't care about
    the acoustic model and language model?

    Thanks

     
  • Nasir Hussain

    Nasir Hussain - 2010-11-13

    Hello Basit,

    Actually, after some time I have to make a model that recognizes other
    people. As regards the .transcription file, it is clear how I will make the
    transcription file for the same sentence spoken by different people. But in
    .fileids, should I separate the entries with commas or put each on its own line? That is, is the syntax
    comma separated, like " Asad/myname, Shahrukh/myname, Nasir/myname " or newline
    separated " Asad/myname Shahrukh/myname Nasir/myname "

    Use a new line for each entry :D

    As regards the grammar file, I want to ask the following. You know that I have a very small
    vocabulary in my model (testdb). Suppose I train a model with only three words
    (please, suppose only :) ): hello, Nasir, Basit. Just these three words. Now,
    when I make the grammar file, can I use words like "animal", "how r u", "what's
    going wrong" in my .gram file? My model doesn't have the words that I am using
    in the grammar file. In other words, does the grammar file depend on the
    vocabulary used in the model, or is it a separate thing that doesn't care about
    the acoustic model and language model?

    Lol. The grammar file depends on the dictionary file. If a word is not
    present in the dictionary, then you cannot use it in your grammar. The
    acoustic model, on the other hand, is a purely probabilistic model. Unless you feed it
    a huge amount of data, you cannot expect good
    results. Do you get what I am saying?

    -Nasir
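Since a grammar can only use words that have a dictionary entry, it can help to check a word list against the .dic file before writing the grammar. A minimal sketch, assuming the dictionary has one entry per line with the word in the first column followed by its phones, and that matching is case sensitive; the function name missing_from_dict is hypothetical:

```shell
# missing_from_dict DICT WORD...
# Prints every WORD that has no entry in DICT (first column of each line).
missing_from_dict() {
    dict=$1
    shift
    for word in "$@"; do
        # awk exits 0 only if some line's first field equals the word.
        awk -v w="$word" '$1 == w { found = 1 } END { exit !found }' "$dict" \
            || printf '%s\n' "$word"
    done
}

# Usage:
#   missing_from_dict etc/testdb.dic HELLO NASIR ANIMAL
```

Any word it prints would need a dictionary entry, with a pronunciation made of phones the acoustic model was trained on, before the grammar can use it.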

     
  • Basit Mahmood

    Basit Mahmood - 2010-11-22

    Hello Nasir,
    Sorry. You know why I am saying sorry to you; I mentioned it in the email
    that I sent you. Anyway, you mean to say that the grammar file does not depend
    on the acoustic model, but it does depend on the dictionary. The grammar file is not
    concerned with the pronunciation of different speakers. It just relies on the
    words or sentences that are defined in the dictionary. Is that right?

     