I can't run the acoustic model for corpus an4 (Help me!)

Help
2017-11-27
2019-07-06
  • Fabiano Luz

    Fabiano Luz - 2017-11-27

    I can't generate the acoustic model for corpus an4

    Hi everyone, I'm trying to train the acoustic model with sphinxtrain-5prealpha. I am using Mac OS X 10.11.6.

    I installed the packages:

    sphinxbase-5prealpha
    sphinxtrain-5prealpha
    pocketsphinx-5prealpha

    according to: https://cmusphinx.github.io/wiki/tutorialam/#configuring-parallel-jobs-to-speedup-the-training

    I am using the an4 corpus, which I downloaded from http://www.speech.cs.cmu.edu/databases/an4/
    with the audio in SPHERE format.

    In the folder "sphinxtrain-5prealpha" I am running the commands:

    • sphinxtrain -t an4 setup (OK)
    • cd an4 (OK)
    • sphinxtrain run

    I get this in my output:

    Last login: Mon Nov 27 10:09:40 on console
    -bash: /Library/Java/JavaVirtualMachines/jdk1.8.0_111.jdk/Contents/Home/: is a directory
    spgpfileserv:~ fabiano.luz$ cd Dropbox/AIProjects/ASR/sphinxtrain-5prealpha/
    spgpfileserv:sphinxtrain-5prealpha fabiano.luz$ sphinxtrain -t an4 setup
    Sphinxtrain path: /usr/local/lib/sphinxtrain
    Sphinxtrain binaries path: /usr/local/libexec/sphinxtrain
    Setting up the database an4
    spgpfileserv:sphinxtrain-5prealpha fabiano.luz$ cd an4
    spgpfileserv:an4 fabiano.luz$ sphinxtrain run
    Sphinxtrain path: /usr/local/lib/sphinxtrain
    Sphinxtrain binaries path: /usr/local/libexec/sphinxtrain
    Running the training
    MODULE: 000 Computing feature from audio files
    Extracting features from  segments starting at  (part 1 of 1) 
    Extracting features from  segments starting at  (part 1 of 1) 
    Feature extraction is done
    MODULE: 00 verify training files
        Phase 1: Checking to see if the dict and filler dict agrees with the phonelist file.
            Found 133 words using 34 phones
        Phase 2: Checking to make sure there are not duplicate entries in the dictionary
        Phase 3: Check general format for the fileids file; utterance length (must be positive); files exist
        Phase 4: Checking number of lines in the transcript file should match lines in fileids file
        Phase 5: Determine amount of training data, see if n_tied_states seems reasonable.
            Estimated Total Hours Training: 0.702463888888889
            This is a small amount of data, no comment at this time
        Phase 6: Checking that all the words in the transcript are in the dictionary
            Words in dictionary: 130
            Words in filler dictionary: 3
        Phase 7: Checking that all the phones in the transcript are in the phonelist, and all phones in the phonelist appear at least once
    MODULE: 0000 train grapheme-to-phoneme model
    Skipped (set $CFG_G2P_MODEL = 'yes' to enable)
    MODULE: 01 Train LDA transformation
    Skipped (set $CFG_LDA_MLLT = 'yes' to enable)
    MODULE: 02 Train MLLT transformation
    Skipped (set $CFG_LDA_MLLT = 'yes' to enable)
    MODULE: 05 Vector Quantization
    Skipped for continuous models
    MODULE: 10 Training Context Independent models for forced alignment and VTLN
    Skipped:  $ST::CFG_FORCEDALIGN set to 'no' in sphinx_train.cfg
    Skipped:  $ST::CFG_VTLN set to 'no' in sphinx_train.cfg
    MODULE: 11 Force-aligning transcripts
    Skipped:  $ST::CFG_FORCEDALIGN set to 'no' in sphinx_train.cfg
    MODULE: 12 Force-aligning data for VTLN
    Skipped:  $ST::CFG_VTLN set to 'no' in sphinx_train.cfg
    MODULE: 20 Training Context Independent models
        Phase 1: Cleaning up directories:
        accumulator...logs...qmanager...models...
        Phase 2: Flat initialize
    do "etc/sphinx_train.cfg" failed, '.' is no longer in @INC; did you mean do "./etc/sphinx_train.cfg"? at /usr/local/lib/sphinxtrain/scripts/20.ci_hmm/../prepare/../lib/SphinxTrain/Config.pm line 65.
    Configuration (e.g. etc/sphinx_train.cfg) not defined
    Compilation failed in require at /usr/local/lib/sphinxtrain/scripts/20.ci_hmm/../prepare/maketopology.pl line 43.
    BEGIN failed--compilation aborted at /usr/local/lib/sphinxtrain/scripts/20.ci_hmm/../prepare/maketopology.pl line 43.
    ERROR: This step had 1 ERROR messages and 0 WARNING messages.  Please check the log file for details.
        Phase 3: Forward-Backward
            Baum welch starting for 1 Gaussian(s), iteration: 1 (1 of 1)
    Waiting for /Users/fabiano.luz/Dropbox/AIProjects/ASR/sphinxtrain-5prealpha/an4/model_parameters/an4.ci_cont_flatinitial/mixture_weights
    Waiting for /Users/fabiano.luz/Dropbox/AIProjects/ASR/sphinxtrain-5prealpha/an4/model_parameters/an4.ci_cont_flatinitial/mixture_weights
    Waiting for /Users/fabiano.luz/Dropbox/AIProjects/ASR/sphinxtrain-5prealpha/an4/model_parameters/an4.ci_cont_flatinitial/mixture_weights
    Waiting for /Users/fabiano.luz/Dropbox/AIProjects/ASR/sphinxtrain-5prealpha/an4/model_parameters/an4.ci_cont_flatinitial/mixture_weights
    Waiting for /Users/fabiano.luz/Dropbox/AIProjects/ASR/sphinxtrain-5prealpha/an4/model_parameters/an4.ci_cont_flatinitial/mixture_weights
    ...
    ...
    It keeps printing this line forever.
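The message `'.' is no longer in @INC` in the output above is what Perl prints from version 5.26 onwards, where the current directory was removed from the module search path, so the relative `do "etc/sphinx_train.cfg"` in Config.pm no longer finds the file. A minimal sketch of a workaround follows, assuming the installed Perl is 5.26 or newer and honours the `PERL_USE_UNSAFE_INC` environment variable. The helper name `run_with_dot_in_inc` is hypothetical, and nobody in this thread confirmed this as the fix.

```shell
# run_with_dot_in_inc CMD [ARGS...]
# Run CMD with PERL_USE_UNSAFE_INC=1 so that Perl scripts started by CMD
# get the current directory back in @INC (Perl 5.26+ behaviour).
# The variable is set for this one command only.
run_with_dot_in_inc() {
    PERL_USE_UNSAFE_INC=1 "$@"
}

# Usage, from inside the an4 directory:
#   perl -e 'print "$^V\n"'            # check which Perl version is in use
#   run_with_dot_in_inc sphinxtrain run
```

Setting the variable per command keeps the less safe search path limited to the training run instead of the whole shell session.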
    

    The command "sphinxtrain run" generates three folders (000.comp_feat, 05.vector_quantize and 20.ci_hmm)
    and the log file "an4.makeflat_cihmm.log":

    Current configuration:
    [NAME]      [DEFLT] [VALUE]
    -example    no  no
    -help       no  no
    -mixwfn         /Users/fabiano.luz/Dropbox/AIProjects/ASR/sphinxtrain-5prealpha/an4/model_parameters/an4.ci_cont_flatinitial/mixture_weights
    -moddeffn       /Users/fabiano.luz/Dropbox/AIProjects/ASR/sphinxtrain-5prealpha/an4/model_architecture/an4.ci.mdef
    -ndensity   256 1
    -nstream    4   1
    -tmatfn         /Users/fabiano.luz/Dropbox/AIProjects/ASR/sphinxtrain-5prealpha/an4/model_parameters/an4.ci_cont_flatinitial/transition_matrices
    -topo           /Users/fabiano.luz/Dropbox/AIProjects/ASR/sphinxtrain-5prealpha/an4/model_architecture/an4.topology
    
    INFO: model_def_io.c(573): Model definition info:
    INFO: model_def_io.c(574): 34 total models defined (34 base, 0 tri)
    INFO: model_def_io.c(575): 136 total states
    INFO: model_def_io.c(576): 102 total tied states
    INFO: model_def_io.c(577): 102 total tied CI states
    INFO: model_def_io.c(578): 34 total tied transition matrices
    INFO: model_def_io.c(579): 4 max state/model
    INFO: model_def_io.c(580): 4 min state/model
    ERROR: "topo_read.c", line 118: Unable to open /Users/fabiano.luz/Dropbox/AIProjects/ASR/sphinxtrain-5prealpha/an4/model_architecture/an4.topology for reading
    : No such file or directory
    main.c(83): Reading model definition file /Users/fabiano.luz/Dropbox/AIProjects/ASR/sphinxtrain-5prealpha/an4/model_architecture/an4.ci.mdef
    main.c(90): 34 models defined
    Mon Nov 27 10:39:03 2017
    
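The log above reports that an4.topology could not be opened. That is consistent with maketopology.pl having aborted in Phase 2, in which case the flat-initialize step never writes the mixture_weights file that the Forward-Backward phase keeps waiting for. A small sketch for checking which of those files actually exist; the function name `missing_files` is hypothetical and the relative paths are taken from the log.

```shell
# missing_files BASE FILE...
# Print every FILE (given relative to BASE) that does not exist.
# Prints nothing when all files are present.
missing_files() {
    base=$1
    shift
    for rel in "$@"; do
        [ -e "$base/$rel" ] || echo "$rel"
    done
}

# Usage, from inside the an4 directory:
#   missing_files . \
#       model_architecture/an4.ci.mdef \
#       model_architecture/an4.topology \
#       model_parameters/an4.ci_cont_flatinitial/mixture_weights
```

If an4.topology is listed as missing, the endless "Waiting for ..." lines are a symptom of the earlier failure rather than a separate problem.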

    If I change the sphinx_train.cfg file:

    from:

    $CFG_HMM_TYPE = '.cont.'; # Sphinx 4, PocketSphinx

    to:

    $CFG_HMM_TYPE = '.semi.'; # PocketSphinx

    I get this:

    spgpfileserv:an4 fabiano.luz$ sphinxtrain run
    Sphinxtrain path: /usr/local/lib/sphinxtrain
    Sphinxtrain binaries path: /usr/local/libexec/sphinxtrain
    Running the training
    MODULE: 000 Computing feature from audio files
    Extracting features from  segments starting at  (part 1 of 1) 
    Extracting features from  segments starting at  (part 1 of 1) 
    Feature extraction is done
    MODULE: 00 verify training files
        Phase 1: Checking to see if the dict and filler dict agrees with the phonelist file.
            Found 133 words using 34 phones
        Phase 2: Checking to make sure there are not duplicate entries in the dictionary
        Phase 3: Check general format for the fileids file; utterance length (must be positive); files exist
        Phase 4: Checking number of lines in the transcript file should match lines in fileids file
        Phase 5: Determine amount of training data, see if n_tied_states seems reasonable.
            Estimated Total Hours Training: 0.702463888888889
            This is a small amount of data, no comment at this time
        Phase 6: Checking that all the words in the transcript are in the dictionary
            Words in dictionary: 130
            Words in filler dictionary: 3
        Phase 7: Checking that all the phones in the transcript are in the phonelist, and all phones in the phonelist appear at least once
    MODULE: 0000 train grapheme-to-phoneme model
    Skipped (set $CFG_G2P_MODEL = 'yes' to enable)
    MODULE: 01 Train LDA transformation
    Skipped for multistream setup, see CFG_NUM_STREAMS configuration
    LDA/MLLT only has sense for single stream features
    Skipping LDA training
    MODULE: 02 Train MLLT transformation
    Skipped for multistream setup, see CFG_NUM_STREAMS configuration
    LDA/MLLT only has sense for single stream features
    Skipping MLLT training
    MODULE: 05 Vector Quantization
    do "etc/sphinx_train.cfg" failed, '.' is no longer in @INC; did you mean do "./etc/sphinx_train.cfg"? at /usr/local/lib/sphinxtrain/scripts/05.vector_quantize/../lib/SphinxTrain/Config.pm line 65.
    Configuration (e.g. etc/sphinx_train.cfg) not defined
    Compilation failed in require at /usr/local/lib/sphinxtrain/scripts/05.vector_quantize/agg_seg.pl line 51.
    BEGIN failed--compilation aborted at /usr/local/lib/sphinxtrain/scripts/05.vector_quantize/agg_seg.pl line 51.
    

    In this case I do not get any errors in the log files, but the run only generates two folders: 000.comp_feat and 05.vector_quantize.

    My file "sphinx_train.cfg" can be viewed at:

    https://www.dropbox.com/s/o4yqztb6trh3zg7/sphinx_train.cfg?dl=0

    I'm a beginner with Sphinx. Could anyone help me?

     

    Last edit: Fabiano Luz 2017-11-27
    • Nickolay V. Shmyrev

      Clone all required packages (sphinxbase, pocketsphinx, sphinxtrain) from github and reinstall them.
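The reply above can be sketched roughly as follows. The repository URLs assume the packages live under the cmusphinx organisation on GitHub, the function names are hypothetical, and the build commands in the trailing comment are an assumption that may differ between versions, so each repository's README should be checked first.

```shell
# sphinx_repo_urls: print one URL per package named in the reply.
sphinx_repo_urls() {
    for name in sphinxbase pocketsphinx sphinxtrain; do
        echo "https://github.com/cmusphinx/$name"
    done
}

# clone_sphinx_repos DEST: clone each repository into DEST/<name>.
# Needs git and network access; stops at the first failed clone.
clone_sphinx_repos() {
    dest=$1
    mkdir -p "$dest" || return 1
    for url in $(sphinx_repo_urls); do
        git clone "$url" "$dest/${url##*/}" || return 1
    done
}

# Usage:
#   clone_sphinx_repos "$HOME/src/cmusphinx"
# Then build and install each package as its README describes; for the
# autotools-based versions this was typically (assumption):
#   ./autogen.sh && make && sudo make install
```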

       
  • Fabiano Luz

    Fabiano Luz - 2017-12-18

    I got this in my "an4.makeflat_cihmm.log" file:

    Current configuration:
    [NAME]      [DEFLT] [VALUE]
    -example    no  no
    -help       no  no
    -mixwfn         /Users/fabiano.luz/Dropbox/AIProjects/ASR/sphinxtrain-5prealpha/an4/model_parameters/an4.ci_cont_flatinitial/mixture_weights
    -moddeffn       /Users/fabiano.luz/Dropbox/AIProjects/ASR/sphinxtrain-5prealpha/an4/model_architecture/an4.ci.mdef
    -ndensity   256 1
    -nstream    4   1
    -tmatfn         /Users/fabiano.luz/Dropbox/AIProjects/ASR/sphinxtrain-5prealpha/an4/model_parameters/an4.ci_cont_flatinitial/transition_matrices
    -topo           /Users/fabiano.luz/Dropbox/AIProjects/ASR/sphinxtrain-5prealpha/an4/model_architecture/an4.topology
    
    INFO: model_def_io.c(573): Model definition info:
    INFO: model_def_io.c(574): 34 total models defined (34 base, 0 tri)
    INFO: model_def_io.c(575): 136 total states
    INFO: model_def_io.c(576): 102 total tied states
    INFO: model_def_io.c(577): 102 total tied CI states
    INFO: model_def_io.c(578): 34 total tied transition matrices
    INFO: model_def_io.c(579): 4 max state/model
    INFO: model_def_io.c(580): 4 min state/model
    ERROR: "topo_read.c", line 118: Unable to open /Users/fabiano.luz/Dropbox/AIProjects/ASR/sphinxtrain-5prealpha/an4/model_architecture/an4.topology for reading
    : No such file or directory
    main.c(83): Reading model definition file /Users/fabiano.luz/Dropbox/AIProjects/ASR/sphinxtrain-5prealpha/an4/model_architecture/an4.ci.mdef
    main.c(90): 34 models defined
    Mon Dec 18 15:46:41 2017
    

    Can someone help me?

     
  • Fabiano Luz

    Fabiano Luz - 2017-12-18

    I reinstalled Perl and managed to make a little progress. Now I have the following error:

    Failed to open /Users/fabiano.luz/Dropbox/AIProjects/ASR/sphinxtrain-5prealpha/an4/etc/feat.params: No such file or directory at /usr/local/lib/sphinxtrain/scripts/20.ci_hmm/../lib/SphinxTrain/Util.pm line 652.
            Current Overall Likelihood Per Frame = -151.880882996072
            Baum welch starting for 1 Gaussian(s), iteration: 2 (1 of 1)
    
     
  • waleed.makarem

    waleed.makarem - 2019-07-06

    Did you resolve the above issue? I have the same issue:
    Failed to open D:/Sphinx/ara/etc/feat.params: No such file or directory at D:\sphinx\sphinxtrain\scrips\lib/SphinxTrain/Util.pm line 652.
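Both reports above fail because etc/feat.params is absent from the database folder. A minimal sketch for checking this before starting a run, assuming that the setup step is what places sphinx_train.cfg and feat.params in the etc/ subfolder, as the tutorial describes. The function name `check_setup_files` is hypothetical, and re-running setup is a suggestion to try, not a fix confirmed in this thread.

```shell
# check_setup_files [DIR]
# Return 0 only if DIR/etc contains both configuration files that the
# training scripts in this thread try to read. DIR defaults to ".".
check_setup_files() {
    dir=${1:-.}
    status=0
    for f in sphinx_train.cfg feat.params; do
        if [ ! -f "$dir/etc/$f" ]; then
            echo "missing: $dir/etc/$f" >&2
            status=1
        fi
    done
    return $status
}

# Usage, from inside the database directory:
#   check_setup_files . || echo "consider re-running: sphinxtrain -t an4 setup"
```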

     
