LIN Adaptation, Karel DNN

  • speechmahine

    speechmahine - 2015-06-30

    I tried searching the forum for any previous questions on training or adapting a network on all the utterances from a particular speaker, in the spirit of LIN adaptation (http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=687200, http://research.microsoft.com/pubs/230082/IS141354.pdf). I'm using nnet1 (Karel's DNN implementation).

    I'm looking for an nnet1 binary with functionality similar to gmm-est-fmllr-gpost, which takes a --spk2utt option and estimates fMLLR matrices from all of a speaker's utterances after decoding with an SI model (unsupervised adaptation). Instead of fMLLR matrices, I would like to estimate speaker-dependent transforms as is done in LIN adaptation, using one of the nnet-train-frmshuff-... binaries. I don't see such an option in nnet-train-frmshuff, but I do see a binary called nnet-train-perutt which, as I understand it, shuffles frames from within an utterance.
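
    For illustration, the idea described above can be sketched in a few lines of plain Python. This is a toy under stated assumptions, not nnet1 code: the "network" is a single frozen softmax layer V, the targets are assumed to come from a first-pass decode with the SI model, and every function name here (parse_spk2utt, adapt_lin, adapt_all_speakers, ...) is hypothetical. Only the spk2utt text layout (speaker ID followed by its utterance IDs) follows Kaldi's convention.

```python
import math

def matvec(M, x):
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def softmax(z):
    top = max(z)
    e = [math.exp(v - top) for v in z]
    total = sum(e)
    return [v / total for v in e]

def parse_spk2utt(lines):
    """Kaldi spk2utt text format: '<speaker> <utt1> <utt2> ...' per line."""
    table = {}
    for line in lines:
        fields = line.split()
        if fields:
            table[fields[0]] = fields[1:]
    return table

def apply_lin(W, b, x):
    """Speaker-dependent affine transform of one input frame."""
    return [h + c for h, c in zip(matvec(W, x), b)]

def mean_loss(frames, labels, V, W, b):
    """Average cross-entropy of the frozen classifier V on transformed frames."""
    total = 0.0
    for x, y in zip(frames, labels):
        total -= math.log(softmax(matvec(V, apply_lin(W, b, x)))[y])
    return total / len(frames)

def adapt_lin(frames, labels, V, lr=0.1, epochs=20):
    """Estimate (W, b) for one speaker by SGD; V is read but never updated."""
    dim = len(frames[0])
    # Start from the identity so the unadapted network is reproduced exactly.
    W = [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
    b = [0.0] * dim
    for _ in range(epochs):
        for x, y in zip(frames, labels):
            p = softmax(matvec(V, apply_lin(W, b, x)))
            # Cross-entropy gradient at the logits, then back through V.
            d_logit = [pk - (1.0 if k == y else 0.0) for k, pk in enumerate(p)]
            d_h = [sum(V[k][i] * d_logit[k] for k in range(len(V)))
                   for i in range(dim)]
            for i in range(dim):
                for j in range(dim):
                    W[i][j] -= lr * d_h[i] * x[j]
                b[i] -= lr * d_h[i]
    return W, b

def adapt_all_speakers(spk2utt, feats, targets, V):
    """Pool every utterance of a speaker and fit one transform per speaker."""
    transforms = {}
    for spk, utts in spk2utt.items():
        frames = [x for u in utts for x in feats[u]]
        labels = [y for u in utts for y in targets[u]]
        transforms[spk] = adapt_lin(frames, labels, V)
    return transforms
```

    Here feats and targets are assumed to be dicts keyed by utterance ID (lists of frames and of per-frame class indices). The only point of the sketch is the data flow: group utterances by speaker, keep the network fixed, and train just the identity-initialized input transform.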

    If the kind of functionality I'm looking for doesn't exist, what might be the best way of going about implementing it?

    Any help would be appreciated.

     
    • Daniel Povey

      Daniel Povey - 2015-06-30

      That paper is very old and I doubt the results are applicable to modern systems.
      Various people have published papers on adaptation techniques like
      what you mention, but the results were always very disappointing. So
      no, it's not supported, and probably won't be unless these methods
      start to look promising.
      (Papers that reported improvements often used supervised adaptation
      data, which is an unusual scenario, and the improvements were often
      still quite small).

      Dan

       
      • Karel Vesely

        Karel Vesely - 2015-06-30

        Hi,
        no, adapting the NN weights to individual speakers is not supported.
        The current state-of-the-art technique is to use iVector-based
        features and/or fMLLR features computed by an auxiliary GMM model.

        The binaries work as follows:
        nnet-train-frmshuff - SGD with frame shuffling, useful for feedforward NNs.
        nnet-train-perutt - SGD with per-utterance updates (i.e. without frame
        shuffling); the list of utterances is shuffled. It is useful for
        training recurrent networks.
        But neither of them does speaker adaptation...
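
        The difference between the two update schemes described above can be sketched as follows. This is a simplified illustration in Python, not the actual implementation of either binary: the function names are hypothetical, utterances are represented only by their frame counts, and pooling all frames at once is an assumption made to keep the example short.

```python
import random

def frame_shuffled_batches(utt_lengths, batch_size, rng):
    """Pool frames from all utterances, shuffle them, cut into minibatches."""
    frames = [(utt, t) for utt, n in utt_lengths.items() for t in range(n)]
    rng.shuffle(frames)
    return [frames[i:i + batch_size] for i in range(0, len(frames), batch_size)]

def per_utterance_batches(utt_lengths, rng):
    """Shuffle the utterance order only; each update sees one whole
    utterance with its frames kept in time order."""
    order = list(utt_lengths)
    rng.shuffle(order)
    return [[(utt, t) for t in range(utt_lengths[utt])] for utt in order]

if __name__ == "__main__":
    lengths = {"utt1": 3, "utt2": 2, "utt3": 4}  # toy frame counts
    rng = random.Random(0)
    print(frame_shuffled_batches(lengths, 4, rng))
    print(per_utterance_batches(lengths, rng))
```

        Keeping frames in time order within one utterance is what a recurrent network needs, while a feedforward network treats frames independently and can take them in any order.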

        Best regards,
        Karel.
