
Problem creating OnlineNnet2FeaturePipeline in decoding

2015-07-12
2015-07-14
  • Adam Dahlgren

    Adam Dahlgren - 2015-07-12

    Hi!

    I'm developing a native C++ application that uses Kaldi for online decoding. When I run my prototype code, it complains about the creation of the OnlineNnet2FeaturePipeline.

    OnlineNnet2FeaturePipelineConfig feature_config;
    ...
    ParseOptions po(NULL);
    feature_config.Register(&po);
    nnet2_decoding_config.Register(&po);
    endpoint_config.Register(&po);
    wordboundary_config.Register(&po);
    po.ReadConfigFile("online_nnet2_decoding.conf");

    nnet2_decoding_config.decodable_opts.acoustic_scale=0.1;
    nnet2_decoding_config.decoder_opts.max_active=7000;
    nnet2_decoding_config.decoder_opts.beam=15.0;
    nnet2_decoding_config.decoder_opts.lattice_beam=6.0;
    ...
    OnlineNnet2FeaturePipelineInfo feature_info(feature_config);
    ...
    feature_info.ivector_extractor_info.use_most_recent_ivector = true;
    feature_info.ivector_extractor_info.greedy_ivector_extractor = true;
    ...
    OnlineNnet2FeaturePipeline feature_pipeline(feature_info);

    Now this line results in the following error:

    ERROR (AcceptWaveform():online-feature.cc:57) Sampling frequency mismatch, expected 16000, got 8000

    with the empty online_nnet2_decoding.conf file.

    If I change online_nnet2_decoding.conf to
    --feature-type=mfcc
    --mfcc-config=mfcc.conf

    and mfcc.conf to
    --sample-frequency=8000 <-- freq. of audio data

    I get the following problem

    ERROR (NnetComputer():nnet-compute.cc:70) Feature dimension is 13 but network expects 113

    [stack trace: ]
    kaldi::KaldiGetStackTrace()
    kaldi::KaldiErrorMessage::~KaldiErrorMessage()
    kaldi::nnet2::NnetComputer::NnetComputer(kaldi::nnet2::Nnet const&, kaldi::CuMatrixBase<float> const&, bool, kaldi::nnet2::Nnet*)
    kaldi::nnet2::NnetComputation(kaldi::nnet2::Nnet const&, kaldi::CuMatrixBase<float> const&, bool, kaldi::CuMatrixBase<float>*)
    kaldi::nnet2::DecodableNnet2Online::ComputeForFrame(int)
    ...
    kaldi::SingleUtteranceNnet2Decoder::AdvanceDecoding()

    And changing mfcc.conf to include --num-ceps=113

    gives the following error

    KALDI_ASSERT: at SubMatrix:kaldi-matrix.cc:1427, failed: static_cast<UnsignedMatrixIndexT>(ro) < static_cast<UnsignedMatrixIndexT>(M.num_rows_) && static_cast<UnsignedMatrixIndexT>(co) < static_cast<UnsignedMatrixIndexT>(M.num_cols_) && static_cast<UnsignedMatrixIndexT>(r) <= static_cast<UnsignedMatrixIndexT>(M.num_rows_ - ro) && static_cast<UnsignedMatrixIndexT>(c) <= static_cast<UnsignedMatrixIndexT>(M.num_cols_ - co)

    Adding --num-mel-bins=113 does not solve it either, producing other errors linked to start and stop index being the same (113 for both I suppose).

    So, has anyone experienced the same issue, and/or could anyone point me towards a solution to this problem?

    I am sorry for the formatting of this post; I'm new to SourceForge. If there is any other information you need, I'm happy to provide it.

    Kind regards,
    Adam

     
    • Daniel Povey

      Daniel Povey - 2015-07-12

      Firstly, if the model was trained for 16kHz data you cannot recognize
      8kHz data, and if you upsample that data it won't work either. You'd
      have to use a model trained for 8kHz data, e.g. a Switchboard model.
      Regarding the other issue, with the 13 vs 40 dimension, you need to
      use the original mfcc.conf that it was using in the config directory
      supplied on kaldi-asr.org. It will specify 40-dimensional MFCCs.

      Dan
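
The sampling frequency mismatch can be caught before any audio reaches the decoder by reading the rate from the WAV header and comparing it with the rate the model expects. The following is only a minimal sketch in plain C++ with no Kaldi dependency. `ReadWavSampleRate` and `WavMatchesModelRate` are hypothetical helper names, not Kaldi functions, and the code assumes a canonical RIFF/WAVE header in which the "fmt " chunk directly follows the RIFF descriptor.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper (not part of Kaldi): reads the sample rate from a
// canonical RIFF/WAVE header. Returns false if the bytes do not match the
// assumed layout ("RIFF" at 0, "WAVE" at 8, "fmt " at 12).
bool ReadWavSampleRate(const std::vector<unsigned char>& header,
                       std::uint32_t* sample_rate) {
  if (header.size() < 28) return false;
  if (std::memcmp(header.data(), "RIFF", 4) != 0) return false;
  if (std::memcmp(header.data() + 8, "WAVE", 4) != 0) return false;
  if (std::memcmp(header.data() + 12, "fmt ", 4) != 0) return false;
  // The sample rate is a little-endian 32-bit integer at byte offset 24.
  const unsigned char* p = header.data() + 24;
  *sample_rate = static_cast<std::uint32_t>(p[0]) |
                 (static_cast<std::uint32_t>(p[1]) << 8) |
                 (static_cast<std::uint32_t>(p[2]) << 16) |
                 (static_cast<std::uint32_t>(p[3]) << 24);
  return true;
}

// Hypothetical helper: true only when the header is readable and its rate
// equals the rate the model's feature configuration expects.
bool WavMatchesModelRate(const std::vector<unsigned char>& header,
                         std::uint32_t expected_rate) {
  std::uint32_t actual = 0;
  return ReadWavSampleRate(header, &actual) && actual == expected_rate;
}
```

With the audio from this thread, `WavMatchesModelRate(header, 16000)` would return false for an 8 kHz file. As noted above, that calls for a model trained on 8 kHz data rather than resampling.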

       
      • Adam Dahlgren

        Adam Dahlgren - 2015-07-14

        Thank you, Dan! I first worked around this problem by editing the conf files, but after getting the whole thing to run, the output was only <eps> symbols. After reading your answer and checking kaldi-asr.org again, I realized that I'd failed the sed step of the guide at http://kaldi.sourceforge.net/online_decoding.html, producing empty conf files! So thank you. I now have another issue, but I'll make that a separate post.

        Adam
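
Because the root cause here was a set of empty conf files, a small startup check can flag that case before the decoder falls back to defaults. The following is a minimal sketch, not Kaldi code. `CountConfigOptions` is a hypothetical helper, and it assumes the usual Kaldi config layout of one `--name=value` option per line with `#` starting a comment.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper (not part of Kaldi): counts lines that look like
// options ("--name=value"), ignoring blank lines and anything after '#'.
int CountConfigOptions(std::istream& in) {
  int count = 0;
  std::string line;
  while (std::getline(in, line)) {
    // Drop a trailing comment, if any.
    const std::string::size_type hash = line.find('#');
    if (hash != std::string::npos) line.erase(hash);
    // Skip lines that are empty once leading whitespace is ignored.
    const std::string::size_type first = line.find_first_not_of(" \t\r");
    if (first == std::string::npos) continue;
    // Count only lines whose first non-blank characters are "--".
    if (line.compare(first, 2, "--") == 0) ++count;
  }
  return count;
}
```

An application could open each conf file with `std::ifstream`, pass the stream to `CountConfigOptions`, and stop with a clear message when the result is 0, instead of calling `po.ReadConfigFile` on a file that configures nothing.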