Adapting a default acoustic model: segmentation fault when reading mdef file

Help
2016-08-24
  • Dino The Dinosaur

    Hello!

    I ran into problems when trying to adapt a continuous Russian acoustic model (zero_ru.cd_cont_4000), at the step where statistics are collected. My command and log are below:

    $ ./bw -hmmdir ru2 -moddeffn ru2/mdef.txt -ts2cbfn .cont. -feat 1s_c_d_dd -cmn current -agc none -dictfn ru.dic -ctlfn ru_train.fileids -lsnfn ru_train.transcription -accumdir .
    INFO: main.c(229): Compiled on Aug 4 2016 at 10:44:45
    Current configuration:
    [NAME] [DEFLT] [VALUE]
    -2passvar no no
    -abeam 1e-100 1.000000e-100
    -accumdir .
    -agc none none
    -agcthresh 2.0 2.000000e+000
    -bbeam 1e-100 1.000000e-100
    -cb2mllrfn .1cls. .1cls.
    -cepdir
    -cepext mfc mfc
    -ceplen 13 13
    -ckptintv 0
    -cmn live current
    -cmninit 40,3,-1 40,3,-1
    -ctlfn ru_train.fileids
    -diagfull no no
    -dictfn ru.dic
    -example no no
    -fdictfn
    -feat 1s_c_d_dd 1s_c_d_dd
    -fullvar no no
    -help no no
    -hmmdir ru
    -latdir
    -latext
    -lda
    -ldadim 0 0
    -lsnfn ru_train.transcription
    -lw 11.5 1.150000e+001
    -maxuttlen 0 0
    -meanfn
    -meanreest yes yes
    -mixwfn
    -mixwreest yes yes
    -mllrmat
    -mmie no no
    -mmie_type rand rand
    -moddeffn ru/mdef.txt
    -mwfloor 0.00001 1.000000e-005
    -npart 0
    -nskip 0
    -outphsegdir
    -outputfullpath no no
    -part 0
    -pdumpdir
    -phsegdir
    -phsegext phseg phseg
    -runlen -1 -1
    -sentdir
    -sentext sent sent
    -spthresh 0.0 0.000000e+000
    -svspec
    -timing yes yes
    -tmatfn
    -tmatreest yes yes
    -topn 4 4
    -tpfloor 0.0001 1.000000e-004
    -ts2cbfn .cont.
    -varfloor 0.00001 1.000000e-005
    -varfn
    -varnorm no no
    -varreest yes yes
    -viterbi no no

    INFO: feat.c(715): Initializing feature stream to type: '1s_c_d_dd', ceplen=13, CMN='current', VARNORM='no', AGC='none'
    INFO: cmn.c(143): mean[0]= 12.00, mean[1..12]= 0.0
    INFO: main.c(318): Reading ru/mdef.txt
    Segmentation fault

    I tried adding the -lda option, but the log then looks like this:

    $ ./bw -hmmdir ru2 -moddeffn ru2/mdef -ts2cbfn .cont. -feat 1s_c_d_dd -cmn current -agc none -dictfn ru.dic -ctlfn ru_train.fileids -lsnfn ru_train.transcription -accumdir -lda ru/feature_transform .
    INFO: main.c(229): Compiled on Aug 4 2016 at 10:44:45
    ERROR: "cmd_ln.c", line 604: Unknown argument name 'ru/feature_transform'
    ERROR: "cmd_ln.c", line 701: Failed to parse arguments list
    ERROR: "cmd_ln.c", line 750: Failed to parse arguments list, forced exit

    What could be causing this?

    Here is my working directory without the audio files (they shouldn't matter here anyway, I think): http://www.megafileupload.com/snn0/2gis_adapt.zip

     
    • Nickolay V. Shmyrev

      It should be

      $ ./bw -hmmdir ru2 -moddeffn ru2/mdef -ts2cbfn .cont. -feat 1s_c_d_dd -cmn current -agc none -dictfn ru.dic -ctlfn ru_train.fileids -lsnfn ru_train.transcription -accumdir . -lda ru/feature_transform
      

      The dot is the value of the -accumdir argument (the current folder), so it has to come directly after -accumdir.
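
      Why the misplaced dot produces "Unknown argument name 'ru/feature_transform'" can be illustrated with a minimal sketch of a parser that reads the command line as strict flag/value pairs. This is not the actual cmd_ln.c code; the function name `parse_flag_pairs` and the `KNOWN` set are hypothetical and exist only for illustration.

```python
def parse_flag_pairs(argv, known_flags):
    """Pair each flag with the single token that follows it.

    Illustrative only: a simplified stand-in for a flag/value parser,
    not the real sphinxbase implementation.
    """
    parsed = {}
    i = 0
    while i < len(argv):
        name = argv[i]
        if name not in known_flags:
            # The token in a "flag" position is not a known flag name.
            raise ValueError(f"Unknown argument name '{name}'")
        if i + 1 >= len(argv):
            raise ValueError(f"Argument value for '{name}' missing")
        parsed[name] = argv[i + 1]  # the next token is taken as the value
        i += 2
    return parsed


# Hypothetical subset of flags, just enough for the example.
KNOWN = {"-accumdir", "-lda"}

# Correct order: the dot directly follows -accumdir.
good = parse_flag_pairs(["-accumdir", ".", "-lda", "ru/feature_transform"], KNOWN)
print(good)

# Order from the failing command: "-lda" is consumed as the value of
# -accumdir, so "ru/feature_transform" lands in a flag position.
try:
    parse_flag_pairs(["-accumdir", "-lda", "ru/feature_transform", "."], KNOWN)
except ValueError as err:
    print(err)
```

      Under this assumption, the second call fails on 'ru/feature_transform', which matches the first error line in the log above.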

       
      • Dino The Dinosaur

        Oh, right, my bad.
        It still gives a segmentation fault, though.

         
        • Nickolay V. Shmyrev

          To debug the crash, you need to provide information about the software you are using: which OS, which sphinxtrain version, and so on. If you are on Linux, you could provide the output of the tool under valgrind:

          valgrind ./bw -hmmdir ru2 -moddeffn ru2/mdef -ts2cbfn .cont. -feat 1s_c_d_dd -cmn current -agc none -dictfn ru.dic -ctlfn ru_train.fileids -lsnfn ru_train.transcription -accumdir . -lda ru/feature_transform
          

          Your phoneset is also incompatible. You need to use the phoneset from ru2/ru2.dic, not ru.dic. The ru.dic you have in the root folder is not quite aligned with Russian phonetics either. Stress is very important for Russian.
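
          A phoneset mismatch like this can be checked before running bw. The sketch below assumes a plain-text dictionary with one entry per line (a word followed by its phones, separated by whitespace) and takes the model's phone list as a ready-made set. The helper names and the sample phones are hypothetical, not taken from ru.dic or ru2.dic.

```python
def dictionary_phones(dict_lines):
    """Collect every phone symbol used in the dictionary entries."""
    phones = set()
    for line in dict_lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank lines and entries without a pronunciation
        phones.update(fields[1:])  # fields[0] is the word itself
    return phones


def phones_missing_from_model(dict_lines, model_phones):
    """Return the dictionary phones that the acoustic model does not define."""
    return sorted(dictionary_phones(dict_lines) - set(model_phones))


# Invented sample data: the model marks stress ("aa", "oo"),
# while the dictionary uses unstressed symbols only.
model_phones = {"a", "aa", "oo", "k", "t"}
dict_lines = [
    "kot k o t",
    "kak k a k",
]
print(phones_missing_from_model(dict_lines, model_phones))
```

          Any phone reported here appears in the dictionary but not in the model, which is the kind of incompatibility described above.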

           
  • Dino The Dinosaur

    Yes, my mistakes are downright embarrassing! I figured out what the issue was. I copied the wrong binaries into the folder (I work in Cygwin, but copied the Windows binaries) and mixed up the dictionaries (I used the one I had made myself when trying to train an acoustic model).

    Thank you for your help!

     

    Last edit: Dino The Dinosaur 2016-08-24
