train_mmi.sh Issue: lattice-boost-ali is unable to open zip alignments

2015-07-15
  • Angel Castro

    Angel Castro - 2015-07-15

    Hi everyone,

    I have been trying to use train_mmi.sh to train a model with boosted MMI. Without the boost option it seems to work properly, but once I gave it a value the following error came up:

    lattice-boost-ali --b=0.3 --silence-phones=1:2:3:4:5:6:7:8:9:10:11:12:13:14:15 exp/tri4b_dnn_multi_ali/final.mdl scp:exp/tri4b_dnn_multi_bmmi/lat.scp 'ark,p:gunzip -c exp/tri4b_dnn_multi_ali/ali..gz |' ark:-
    WARNING (lattice-boost-ali:LoadCurrent():util/kaldi-table-inl.h:224) TableReader: failed to open file gunzip
    ERROR (lattice-boost-ali:Value():util/kaldi-table-inl.h:143) TableReader: failed to load object from gunzip (to suppress this error, add the permissive (p, ) option to the rspecifier.
    WARNING (lattice-boost-ali:Close():kaldi-io.cc:446) Pipe gunzip -c exp/tri4b_dnn_multi_ali/ali.
    .gz | had nonzero return status 13
    ERROR (lattice-boost-ali:Value():util/kaldi-table-inl.h:143) TableReader: failed to load object from gunzip (to suppress this error, add the permissive (p, ) option to the rspecifier.

    At first I didn't have the p option in the ark:gunzip ... rspecifier. I then added it, but nothing changed. I also double-checked that the alignments didn't contain any errors.

    Could anyone please tell me what I am doing wrong?

    Cheers,
    Angel

     
    • Jan "yenda" Trmal

      Do you have gunzip on PATH?
      y.

       
      • Daniel Povey

        Daniel Povey - 2015-07-15

        I don't think his issue is coming from his archive that starts with
        "gunzip". I think it's more likely that his file
        exp/tri4b_dnn_multi_bmmi/lat.scp contains something like "foo gunzip"
        as one of its lines.

        Dan
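One way to check for the kind of entry described above is to scan the scp file for lines whose entry after the key is a single word that does not name an existing file. The following is only a rough sketch: the function name `suspicious_scp_lines` and the sample data are made up for illustration, and it assumes each line has the form "<key> <rxfilename>", optionally with a trailing ":offset".

```shell
#!/bin/sh
# Hypothetical helper (not part of Kaldi): print scp lines whose entry after
# the key is one bare word that is not an existing file, e.g. "foo gunzip".
suspicious_scp_lines() {
  # $1: path to an scp file with lines of the form "<key> <rxfilename>"
  while read -r key rest; do
    case "$rest" in
      *" "*) continue ;;   # multi-word entries (e.g. piped commands) are skipped
    esac
    target=${rest%:*}      # drop a trailing ":offset" if there is one
    if [ ! -e "$rest" ] && [ ! -e "$target" ]; then
      printf '%s %s\n' "$key" "$rest"
    fi
  done < "$1"
}

# Made-up sample data: one valid archive entry and one truncated entry.
tmp=$(mktemp -d)
touch "$tmp/lat.ark"
printf '%s\n' "utt1 $tmp/lat.ark:42" "utt2 gunzip" > "$tmp/lat.scp"
suspicious_scp_lines "$tmp/lat.scp"   # prints: utt2 gunzip
rm -rf "$tmp"
```

Any line it prints is worth inspecting by hand; an empty result does not prove the scp file is correct.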

         
        • Angel Castro

          Angel Castro - 2015-07-15

          Hi Dan,

          You were absolutely right. The problem was that the $dir/lat.scp produced by the train_mmi.sh script when the --boost option had a non-zero value contained lines like:

          UTT-ID gunzip:

          After further inspection, the bug derives from:

          if [[ "$boost" != "0.0" && "$boost" != 0 ]]; then
            # make lattice scp with same order as the shuffled feature scp
            awk '{ if(r==0) { latH[$1]=$2; }
                   if(r==1) { if(latH[$1] != "") { print $1" "latH[$1] } }
                 }' $denlatdir/lat.scp r=1 $dir/train.scp > $dir/lat.scp

          So the awk command was copying only field $2 and dropping the rest, even though each line actually has up to five fields ($1 to $5) under the default field separator.

          I made this small change and now it works:

          awk '{ if(r==0) { latH[$1]=substr($0,length($1) + 2); }
                 if(r==1) { if(latH[$1] != "") { print $1" "latH[$1] } }
               }' $denlatdir/lat.scp r=1 $dir/train.scp > $dir/lat.scp

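          So the substr($0,length($1) + 2) replacement keeps everything after the first column. substr positions start at 1, so $1 occupies positions 1 to length($1), the separating space sits at position length($1) + 1, and the rest of the line starts at length($1) + 2. This assumes the line starts with the key and that a single separator character follows it.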

          Thank you Dan and Yenda for your help
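The difference between the two awk lookups can be reproduced on a small example. This is a minimal sketch, not the actual train_mmi.sh code: the function names and the sample scp contents are invented for illustration, and output goes to stdout rather than to $dir/lat.scp.

```shell
#!/bin/sh
# Original lookup: stores only the second whitespace-separated field.
reorder_scp_buggy() {
  awk '{ if (r == 0) { latH[$1] = $2 }
         if (r == 1) { if (latH[$1] != "") { print $1 " " latH[$1] } }
       }' "$1" r=1 "$2"
}

# Patched lookup from the post: stores everything after the key.
reorder_scp_fixed() {
  awk '{ if (r == 0) { latH[$1] = substr($0, length($1) + 2) }
         if (r == 1) { if (latH[$1] != "") { print $1 " " latH[$1] } }
       }' "$1" r=1 "$2"
}

# Made-up sample data: piped lattice entries with five fields per line,
# and a feature scp listing the same keys in a different order.
tmp=$(mktemp -d)
printf '%s\n' 'utt1 gunzip -c lat.1.gz |' 'utt2 gunzip -c lat.2.gz |' > "$tmp/lat.scp"
printf '%s\n' 'utt2 feats.ark:20' 'utt1 feats.ark:10' > "$tmp/train.scp"

reorder_scp_buggy "$tmp/lat.scp" "$tmp/train.scp"
# prints:
# utt2 gunzip
# utt1 gunzip

reorder_scp_fixed "$tmp/lat.scp" "$tmp/train.scp"
# prints:
# utt2 gunzip -c lat.2.gz |
# utt1 gunzip -c lat.1.gz |
rm -rf "$tmp"
```

In both cases the output follows the order of the second file, which is the purpose of the original snippet. Only the stored value differs.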

           
          • Daniel Povey

            Daniel Povey - 2015-07-15

            Karel, could you please fix this?
            I think a comment explaining what the "r=1" thing is doing would be
            helpful, too; that seems like a quite obscure feature of awk.
            Dan
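For readers unfamiliar with the "r=1" construct: awk treats a command-line operand of the form var=value as an assignment, and performs it at the point where it would otherwise have opened that operand as a file. Placed between two file names, it therefore takes effect after the first file has been read and before the second one starts. A minimal sketch, with a made-up function name and made-up input files:

```shell
#!/bin/sh
# Hypothetical demo: label each input line by the file it came from,
# using an assignment operand placed between the two file names.
tag_by_file() {
  # r is unset (compares equal to 0) while "$1" is read;
  # "r=1" is applied just before awk starts reading "$2".
  awk '{ tag = (r == 1) ? "second" : "first"; print tag ": " $0 }' "$1" r=1 "$2"
}

tmp=$(mktemp -d)
printf '%s\n' a b > "$tmp/one.txt"
printf '%s\n' c   > "$tmp/two.txt"
tag_by_file "$tmp/one.txt" "$tmp/two.txt"
# prints:
# first: a
# first: b
# second: c
rm -rf "$tmp"
```

A common alternative for two-file processing is the NR==FNR test, although that one misbehaves when the first file is empty, whereas the assignment operand does not depend on line counts.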


             
            • Karel Vesely

              Karel Vesely - 2015-07-15

              Ok, I'll fix that. Thanks for finding the bug!
              K.


               
            • Angel Castro

              Angel Castro - 2015-07-15

              Yes, it is actually a very cool awk feature that lets you control how different input files are parsed. I didn't know you could do that. Thanks, Karel, for the lesson.

               
      • Angel Castro

        Angel Castro - 2015-07-15

        Hi Yenda,

        Yes, gunzip is on the path. I even unzipped the files into a single file and tried to parse it directly, and it still showed the same message.

         
        • Jan "yenda" Trmal

          Then check the scp file as Dan suggested.
          Also, just an idea: how many files are there? The kernel has a
          built-in limit on the length of the command line, or there might
          be a problem with the wildcard substitution. Please try just one file
          (without wildcards) to see whether it is related to this.
          y.


           
  • Karel Vesely

    Karel Vesely - 2015-07-15

    Hi, it should be working well now! Thanks for finding the bug!
    K.

     
  • Angel Castro

    Angel Castro - 2015-07-15

    No problem, thanks for fixing it.