I tried searching the forum for any previous questions on training or adapting a network on all the utterances from a particular speaker -- in the spirit of LIN adaptation (http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=687200, http://research.microsoft.com/pubs/230082/IS141354.pdf). I'm using nnet1 (Karel's DNN implementation).
I'm looking for an nnet1 binary with functionality similar to gmm-est-fmllr-gpost, which can take a --spk2utt option to estimate fMLLR matrices from all of a speaker's utterances after decoding with a speaker-independent (SI) model (unsupervised adaptation). Instead of fMLLR, I would like to estimate speaker-dependent transforms, as is done in LIN adaptation, using one of the nnet-train-frmshuff-style binaries. I don't see such an option in nnet-train-frmshuff, but I do see a binary called nnet-train-perutt, which, as I understand it, shuffles frames within an utterance.
If the kind of functionality I'm looking for doesn't exist, what might be the best way of going about implementing it?
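To make the request concrete, the workflow I have in mind would look roughly like the hypothetical sketch below (not an existing recipe). It assumes an SI net si.nnet whose components carry <LearnRateCoef> 0 so they stay frozen, a LIN lin.init holding an identity <AffineTransform> with <LearnRateCoef> 1, and per-utterance pdf posteriors post.ark from a first-pass SI decode; all file names are placeholders:

# Prepend the trainable LIN to the frozen SI network:
nnet-concat lin.init si.nnet lin_si.init

# Per-speaker fine-tuning: select each speaker's utterances from spk2utt
# (lines of the form "spk utt1 utt2 ..."); only the LIN gets updated,
# because the SI components have zero learn-rate coefficients.
while read spk utts; do
  echo $utts | tr ' ' '\n' > uttlist
  utils/filter_scp.pl uttlist feats.scp > feats_${spk}.scp
  nnet-train-frmshuff --learn-rate=0.0001 \
    scp:feats_${spk}.scp ark:post.ark lin_si.init lin_${spk}.nnet
done < spk2utt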
Any help would be appreciated.
That paper is very old and I doubt the results are applicable to modern systems.
Various people have published papers on adaptation techniques like
what you mention, but the results were always very disappointing. So
no, it's not supported, and probably won't be unless these methods
start to look promising.
(Papers that reported improvements often used supervised adaptation
data, which is an unusual scenario, and the improvements were often
still quite small).
Dan
Hi,
no, the adaptation of NN weights to individual speakers is not supported.
The current state-of-the-art approach is to use iVector-based features
and/or fMLLR features computed by an auxiliary GMM model.
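For example, the fMLLR route estimates one transform per speaker with the GMM tools and applies it to the features the DNN consumes. Roughly (a sketch with placeholder file names, assuming first-pass alignments ali.ark from an auxiliary GMM tri.mdl):

# Estimate one fMLLR matrix per speaker from all of that speaker's data:
ali-to-post ark:ali.ark ark:- | \
  gmm-est-fmllr --fmllr-update-type=full --spk2utt=ark:spk2utt \
    tri.mdl scp:feats.scp ark:- ark:trans.ark

# Apply the per-speaker transforms to produce fMLLR features for the DNN:
transform-feats --utt2spk=ark:utt2spk ark:trans.ark \
  scp:feats.scp ark:feats_fmllr.ark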
The functionality of the binaries is as follows:
nnet-train-frmshuff - SGD with frame shuffling; useful for feed-forward NNs.
nnet-train-perutt - SGD with per-utterance updates (i.e. without frame
shuffling); the utterance list is shuffled. It's useful for training
recurrent networks.
But neither binary does speaker adaptation...
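For illustration, typical invocations of the two trainers look roughly like this (learning rates and file names are placeholders):

# feed-forward DNN: frames shuffled across utterances in a large buffer
nnet-train-frmshuff --learn-rate=0.008 --minibatch-size=256 \
  scp:feats.scp ark:targets.post nnet.init nnet.iter1

# recurrent network: one update per utterance, utterance order shuffled
nnet-train-perutt --learn-rate=0.0001 \
  scp:feats.scp ark:targets.post nnet.init nnet.iter1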
Best regards,
Karel.