Kaldi / Discussion / Help: Data dimension is different than model dimension

Lucian Georgescu - 2015-02-18

Hi,

I want to set up a project using my own audio files and language model, but keeping the acoustic model from the Tedlium project.
The first two decodings (mono and tri1) completed successfully, but I get this error when I try to run tri2 and tri3:

gmm-latgen-faster-parallel --num-threads=4 --max-active=2000 --beam=10.0 --lattice-beam=6.0 --acoustic-scale=0.083333 --allow-partial=true --word-symbol-table=exp/tri3/graph/words.txt exp/tri3/final.mdl exp/tri3/graph/HCLG.fst 'ark,s,cs:apply-cmvn --utt2spk=ark:data/test/split1/1/utt2spk scp:data/test/split1/1/cmvn.scp scp:data/test/split1/1/feats.scp ark:- | add-deltas ark:- ark:- |' 'ark:|gzip -c > exp/tri3/decode_test.si/lat.1.gz'
apply-cmvn --utt2spk=ark:data/test/split1/1/utt2spk scp:data/test/split1/1/cmvn.scp scp:data/test/split1/1/feats.scp ark:-
add-deltas ark:- ark:-
ERROR (gmm-latgen-faster-parallel:LogLikelihoodZeroBased():decodable-am-diag-gmm.cc:50) Dim mismatch: data dim = 39 vs. model dim = 40
ERROR (gmm-latgen-faster-parallel:LogLikelihoodZeroBased():decodable-am-diag-gmm.cc:50) Dim mismatch: data dim = 39 vs. model dim = 40
ERROR (gmm-latgen-faster-parallel:LogLikelihoodZeroBased():decodable-am-diag-gmm.cc:50) Dim mismatch: data dim = 39 vs. model dim = 40
ERROR (gmm-latgen-faster-parallel:LogLikelihoodZeroBased():decodable-am-diag-gmm.cc:50) Dim mismatch: data dim = 39 vs. model dim = 40
terminate called after throwing an instance of 'std::runtime_error'
what(): ERROR (gmm-latgen-faster-parallel:LogLikelihoodZeroBased():decodable-am-diag-gmm.cc:50) Dim mismatch: data dim = 39 vs. model dim = 40

[stack trace: ]
kaldi::KaldiGetStackTrace()
kaldi::KaldiErrorMessage::~KaldiErrorMessage()
kaldi::DecodableAmDiagGmmUnmapped::LogLikelihoodZeroBased(int, int)
kaldi::DecodableAmDiagGmmScaled::LogLikelihood(int, int)
kaldi::LatticeFasterDecoder::ProcessEmitting(kaldi::DecodableInterface*)
kaldi::LatticeFasterDecoder::Decode(kaldi::DecodableInterface*)
kaldi::DecodeUtteranceLatticeFasterClass::operator()()
kaldi::TaskSequencer<kaldi::DecodeUtteranceLatticeFasterClass>::RunTask(void*)
/lib/x86_64-linux-gnu/libpthread.so.0(+0x7e9a) [0x7fefc1081e9a]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7fefc05902ed]

I understand that my data dimension differs from the model dimension. How can I fix this?

Thanks.

- Jan "yenda" Trmal - 2015-02-18
  
  Perhaps you forgot to perform splicing?
  y.
  
  - Daniel Povey - 2015-02-18
    
    Actually, since the data dim is 39 and the model dim is 40, I think it's
    most likely that he is using delta+accel features while the model expects
    LDA+MLLT features.
    Lucian, have a look at the decoding command line that was used when you
    decoded the original data on which the model was trained. Most likely it
    has a splicing step (splice-feats) followed by a projection step
    (transform-feats); you should replicate that when you decode.
    If you are using a decoding script, it will automatically pick up the
    LDA+MLLT features if you copy final.mat to the directory where your
    final.mdl lives. Also make sure to copy splice_opts and cmvn_opts to that
    directory.
    
    Dan
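
The advice above can be sketched as a small shell helper. This is only an illustration, assuming the directory layout from the original post (exp/tri3, data/test/split1/1); `build_feats` is a hypothetical function, not part of Kaldi, and it is loosely modeled on how a decoding script might switch between the delta pipeline and the splice + transform pipeline depending on whether final.mat is present next to final.mdl. The 39 in the error is consistent with 13 cepstra plus deltas and accelerations (assuming default MFCC settings), while the 40 would be the output dimension of the LDA+MLLT projection.

```shell
#!/usr/bin/env bash
# Sketch only: build the feature rspecifier for the decoder.
# build_feats is a hypothetical helper; the paths are assumptions
# taken from the command line quoted in the thread.

build_feats() {
  local srcdir="$1"   # directory holding final.mdl (and final.mat for LDA+MLLT)
  local sdata="$2"    # one split of the test data, e.g. data/test/split1/1
  local cmvn_opts=""
  local splice_opts=""

  # Reuse the options saved at training time, if they were copied over.
  if [ -f "$srcdir/cmvn_opts" ]; then cmvn_opts=$(cat "$srcdir/cmvn_opts"); fi
  if [ -f "$srcdir/splice_opts" ]; then splice_opts=$(cat "$srcdir/splice_opts"); fi

  local base="apply-cmvn $cmvn_opts --utt2spk=ark:$sdata/utt2spk scp:$sdata/cmvn.scp scp:$sdata/feats.scp ark:-"

  if [ -f "$srcdir/final.mat" ]; then
    # LDA+MLLT model: splice neighbouring frames, then project with final.mat.
    echo "ark,s,cs:$base | splice-feats $splice_opts ark:- ark:- | transform-feats $srcdir/final.mat ark:- ark:- |"
  else
    # Delta model: append delta and acceleration coefficients.
    echo "ark,s,cs:$base | add-deltas ark:- ark:- |"
  fi
}
```

With final.mat, splice_opts and cmvn_opts copied into exp/tri3, `feats=$(build_feats exp/tri3 data/test/split1/1)` would yield the splice-feats / transform-feats pipeline, which can then be passed to gmm-latgen-faster-parallel in place of the add-deltas pipeline shown in the failing command.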
    

Lucian Georgescu - 2015-02-18

I copied those files and now it works. Thank you!
