gmm-align-compiled did not successfully decode file

Help
Jake
2014-12-19
2015-01-03
  • Jake

    Jake - 2014-12-19

    I'm using Kaldi to train a model on my data, and the process stopped right after "train_mono.sh: Aligning data". The log file shows the following:

    gmm-align-compiled --transition-scale=1.0 --acoustic-scale=0.1 --self-loop-scale=0.1 --beam=6 --retry-beam=24 'gmm-boost-silence --boost=1.25 1 exp/mono/1.mdl - |' 'ark:gunzip -c exp/mono/fsts.10.gz|' 'ark,s,cs:apply-cmvn --utt2spk=ark:data/train/split20/10/utt2spk scp:data/train/split20/10/cmvn.scp scp:data/train/split20/10/feats.scp ark:- | add-deltas ark:- ark:- |' 'ark,t:|gzip -c >exp/mono/ali.10.gz'
    gmm-boost-silence --boost=1.25 1 exp/mono/1.mdl -
    WARNING (gmm-boost-silence:main():gmm-boost-silence.cc:82) The pdfs for the silence phones may be shared by other phones (note: this probably does not matter.)
    LOG (gmm-boost-silence:main():gmm-boost-silence.cc:93) Boosted weights for 5 pdfs, by factor of 1.25
    LOG (gmm-boost-silence:main():gmm-boost-silence.cc:103) Wrote model to -
    add-deltas ark:- ark:-
    apply-cmvn --utt2spk=ark:data/train/split20/10/utt2spk scp:data/train/split20/10/cmvn.scp scp:data/train/split20/10/feats.scp ark:-
    WARNING (gmm-align-compiled:main():gmm-align-compiled.cc:143) Retrying utterance aa-ve070829 with beam 24
    WARNING (gmm-align-compiled:main():gmm-align-compiled.cc:172) Did not successfully decode file aa-ve070829, len = 4638
    ....

    All audio files (1-2 minutes per utterance) failed at this step. Could you please help me figure out why? Thanks a lot.
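To see how widespread the failures are, the warnings can be pulled out of the log automatically. The following is a minimal sketch, not part of Kaldi: it assumes the warning text has exactly the form shown in the log above, and the 10 ms frame shift used to convert `len` into seconds is an assumption that should be checked against the feature configuration. The function name `failed_utterances` is hypothetical.

```python
import re

# Pattern copied from the warning format shown in the log above.
FAILED_RE = re.compile(r"Did not successfully decode file (\S+), len = (\d+)")

FRAME_SHIFT_S = 0.01  # assumption: 10 ms frame shift


def failed_utterances(log_text):
    """Return {utterance_id: num_frames} for every failed alignment in the log."""
    return {m.group(1): int(m.group(2)) for m in FAILED_RE.finditer(log_text)}


sample_log = """\
WARNING (gmm-align-compiled:main():gmm-align-compiled.cc:143) Retrying utterance aa-ve070829 with beam 24
WARNING (gmm-align-compiled:main():gmm-align-compiled.cc:172) Did not successfully decode file aa-ve070829, len = 4638
"""

failed = failed_utterances(sample_log)
for utt, frames in sorted(failed.items()):
    # Report each failed utterance with its approximate duration.
    print(f"{utt}: {frames} frames (~{frames * FRAME_SHIFT_S:.1f} s)")
```

Running it over each per-job alignment log would give a list of failed utterance ids and lengths to compare against the ones that aligned.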

     
    • Jan "yenda" Trmal

      You can try either increasing the beam sizes (--beam and --retry-beam) or
      splitting the audio into smaller chunks (if feasible).
      y.

       
    • Feiteng Li

      Feiteng Li - 2015-01-03

      hi,
      Maybe the bad (retried) wav files don't contain all the words in the transcript.
      In my experience, such wavs often stop early, so some words at the end of the transcript were never recorded.

      Feiteng
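One rough way to screen for this kind of mismatch is to compare the number of words in each transcript with the length of the audio. The sketch below is only a heuristic under stated assumptions: the 10 ms frame shift and the speaking-rate ceiling of 5 words per second are both assumed values, the function name `suspicious_utterances` is hypothetical, and the frame counts and transcripts are made-up examples.

```python
FRAME_SHIFT_S = 0.01   # assumption: 10 ms frame shift
MAX_WORDS_PER_S = 5.0  # assumption: hypothetical upper bound on speaking rate


def suspicious_utterances(num_frames, transcripts, max_rate=MAX_WORDS_PER_S):
    """Return ids of utterances whose transcript looks too long for the audio."""
    flagged = []
    for utt, words in transcripts.items():
        frames = num_frames.get(utt)
        if not frames:
            continue  # no length information for this utterance
        rate = len(words.split()) / (frames * FRAME_SHIFT_S)
        if rate > max_rate:
            flagged.append(utt)
    return sorted(flagged)


# Made-up data: utterance id -> frame count, utterance id -> transcript.
num_frames = {"utt1": 1000, "utt2": 300}
transcripts = {
    "utt1": "one two three four five six",  # 6 words in about 10 s
    "utt2": " ".join(["word"] * 30),        # 30 words in about 3 s
}
print(suspicious_utterances(num_frames, transcripts))
```

A flagged utterance is not proof of a truncated recording; it is only a candidate worth listening to.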

       
  • Jake

    Jake - 2014-12-19

    Thanks, Jan. What beam size values (--beam and --retry-beam) do you recommend?

    Splitting the long audio is a great idea, but it will take time to implement. I saw that the LibriSpeech ICASSP paper uses audio segmentation for long speech files, but the librispeech recipe under the example directory doesn't seem to include that work.

     
    • Jan "yenda" Trmal

      You will have to experiment with it. It is OK if not all of the utterances
      align at this stage. You can also pick out some utterances that are short
      and align well; monophone training does not need much data.
      Yes, splitting the audio without any additional information would take
      some time to implement. That's why I said "if feasible" :)
      y.
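For the experimenting, it may help to write down the candidate settings up front. The sketch below only generates (beam, retry-beam) pairs to try; it does not run Kaldi. The 1:4 ratio mirrors the --beam=6 --retry-beam=24 values in the log above, while the doubling schedule and the function name `beam_candidates` are arbitrary assumptions, not recommended values.

```python
def beam_candidates(start_beam=6, factor=2, steps=4, retry_ratio=4):
    """Return a list of (beam, retry_beam) pairs, widening the beam each step."""
    pairs = []
    beam = start_beam
    for _ in range(steps):
        pairs.append((beam, beam * retry_ratio))
        beam *= factor
    return pairs


for beam, retry_beam in beam_candidates():
    # These are the two options named in the thread; values are illustrative.
    print(f"--beam={beam} --retry-beam={retry_beam}")
```

Wider beams make alignment slower, so it is probably worth stopping at the first setting where most utterances align.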

       
      • Gilles Boulianne

        Another solution is to select a subset of the shortest audio files for the first few steps of training (which do not require a lot of data).

        Gilles
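Selecting the shortest files can be sketched as follows, assuming a text listing with one "<utt-id> <num-frames>" pair per line (for example, the kind of output a frame-counting tool would produce; that format is an assumption here). The function names `read_lengths` and `shortest_utterances` and the sample data are hypothetical. Kaldi's utils/subset_data_dir.sh also has a --shortest option that, as far as I know, does the same job on a data directory.

```python
def read_lengths(lines):
    """Parse '<utt-id> <num-frames>' lines into a dict, skipping malformed lines."""
    lengths = {}
    for line in lines:
        parts = line.split()
        if len(parts) == 2:
            lengths[parts[0]] = int(parts[1])
    return lengths


def shortest_utterances(num_frames, n):
    """Return the ids of the n shortest utterances, shortest first."""
    # Ties are broken by utterance id so the result is deterministic.
    return sorted(num_frames, key=lambda utt: (num_frames[utt], utt))[:n]


# Made-up listing; only the first length comes from the log in this thread.
sample_lines = [
    "utt_a 4638",
    "utt_b 812",
    "utt_c 1590",
    "utt_d 640",
]
lengths = read_lengths(sample_lines)
print(shortest_utterances(lengths, 2))
```

The selected ids can then be used to build a smaller training set for the monophone stage, with the full data brought back in for later stages.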

         
    • Guoguo Chen

      Guoguo Chen - 2014-12-19

      I'm running some experiments for this right now. That will be finished pretty soon.

      Guoguo