Hi All,
I am a new user of Kaldi and have little background knowledge of the toolkit. Right now I am setting up a benchmark to compare Kaldi with our own DNN training toolkit, so I am considering the following quick plan for the comparison:
1) convert our data alignment files to Kaldi format
2) do DNN training with Kaldi
3) convert Kaldi-trained DNN model back to our own DNN format for testing
However, when I took a look at the DNN training scripts (for Dan's implementation) in swbd, 'steps/nnet2/train_pnorm_accel2.sh', I noticed that the initialization of the shallow network below does not seem to be done from the alignments, and that a tree is needed:
nnet-am-init $alidir/tree $lang/topo "nnet-init --srand=$srand $dir/nnet.config -|" $dir/0.mdl
I am wondering, for Dan's DNN implementation, how the DNN model is initialized on the algorithm side in Kaldi, and why the tree is needed. The "Dan's DNN implementation" section on the Kaldi homepage does not have enough information about this.
Thanks,
Yan
It needs that stuff because the .nnet files contain the
transition-model as well as the actual neural net. In most situations
the transition model is not used though. Getting rid of this might
require writing new binaries.
Also the nnet2 setup uses nonlinearity types that probably do not
exist in your setup (p-norm, normalize-layer, splicing layers). If it
is a speech task it would probably be much less work to just train a
Kaldi acoustic model, and the performance will probably be better
also.
Dan
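For reference, both parts mentioned here are visible if you dump an nnet2 model to text with the standard nnet2 tools; a minimal sketch, assuming an illustrative model path:

nnet-am-copy --binary=false exp/nnet2/final.mdl - | head -40
nnet-am-info exp/nnet2/final.mdl

The text dump begins with the <TransitionModel> block, followed by the <Nnet> itself; nnet-am-info prints a per-component summary of the net.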
Thanks Dan!
What are 'splicing layers'? They do not seem to be mentioned in the documentation; maybe that is my misunderstanding.
SpliceComponent. For this type of thing you will have to search the
code, not the documentation; the documentation is only very
high-level.
Dan
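For illustration, a SpliceComponent line in an nnet2 nnet.config looks something like the following (the dimension and context widths are invented for this example; check the SpliceComponent code for the exact option names):

SpliceComponent input-dim=40 left-context=4 right-context=4

This expands each 40-dimensional input frame to 40 * (4 + 1 + 4) = 360 dimensions by concatenating 4 frames of context on each side.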
On Tue, Jun 2, 2015 at 5:00 PM, Yan Yin riyijiye1976@users.sf.net wrote:
Thanks Dan.
Does your implementation only support the configurations you mentioned (p-norm, normalization layer, splicing layers), or does it also support a pretty standard DNN configuration?
Yan
It does support more standard configurations but the performance of
those is not always quite as good, and it hasn't been tuned as
recently. Actually ReLUs sometimes give better performance than
p-norm, but we always train them with the normalization layer to
ensure stability during training, and you can't test without that
layer being included. So without adding that to your toolkit you
wouldn't be able to do the comparison. Anyway that would probably be
the least of your problems.
Dan
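To make the pairing concrete (the layer dimension here is invented), the ReLU nonlinearity in an nnet2 nnet.config is followed directly by the normalization component:

RectifiedLinearComponent dim=1024
NormalizeComponent dim=1024

The NormalizeComponent rescales each frame's activation vector to a fixed norm, which is the stabilization trick referred to above.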
On Thu, Jun 4, 2015 at 12:53 PM, Yan Yin riyijiye1976@users.sf.net wrote:
Thanks Dan.
So in the case of ReLU, you are saying that without normalization the training is not stable. We have been training ReLU nets without any normalization and have not seen stability issues. We also tried mean-normalized SGD, and it did not turn out to help. So does the stability issue without the normalization layer have something to do with your parallelization and optimization methods (parameter averaging and natural gradient)?
I am OK with a ReLU net with a normalization layer, decoding with our decoder. In decoding, the normalization layer should be treated as a standard layer, with no extra support needed on our decoder side. Is there any existing recipe with a ReLU net? It is OK if it is not well tuned.
Regarding splicing layers, I know these handle the left and right feature context, but looking at nnet.config this still looks confusing to me. We use the pretty standard approach of directly feeding features with context to the input layer. At this point I am trying to quickly get some idea without needing to look into the source code (I am pretty new to Kaldi); in particular, I want to see whether extra support is needed from our decoder for splicing layers. Overall, my first goal is to set up a plan quickly; moving forward, I will certainly need to look into more code details.
By the way, do you have a sample run with all output dirs for either your WSJ or Switchboard DNN recipe somewhere that I can access?
Thanks,
Yan
Not really. The natural gradient actually improves the stability.
People who train ReLUs with many layers usually have to resort to some
kind of trick to stabilize it, this happens to be the trick we have
chosen.
In the nnet2 code the splicing is done internally to the network but
you could just discard the SpliceComponent and do it externally.
However the current ReLU recipes that we are using (e.g.
steps/nnet2/train_multisplice_accel2.sh if you set --pnorm-input-dim
and --pnorm-output-dim to the same value) actually also do splicing at
intermediate layers so your framework wouldn't be able to handle it.
We don't currently have any ReLU recipes that don't do that.
You can look on kaldi-asr.org and see if there is something.
You obviously have a lot of questions, because you've chosen to use
Kaldi in a way that is inherently quite difficult. I'm a busy person
and I'm not going to be able to hold your hand and take you through
all the things you need to do.
Dan
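As a sketch of what discarding the SpliceComponent and splicing externally could look like on the Kaldi feature side (paths and context widths are illustrative), the same context expansion exists as a feature-level operation:

splice-feats --left-context=4 --right-context=4 scp:data/train/feats.scp ark:spliced.ark

The network's input dimension would then have to match the spliced feature dimension, with no SpliceComponent in nnet.config.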
Thanks Dan.
I understand; I will not expect that much help when I really start playing with the tool. Currently, all these general questions are just to estimate the effort the work will require.
<< However the current ReLU recipes that we are using (e.g.
<< steps/nnet2/train_multisplice_accel2.sh if you set --pnorm-input-dim
<< and --pnorm-output-dim to the same value) actually also do splicing at
<< intermediate layers so your framework wouldn't be able to handle it.
<< We don't currently have any ReLU recipes that don't do that.
Just to confirm: I expect the modification of the ReLU multi-splice recipe (so that the internal splicing at intermediate layers is not done) to be just at the shell-script level, via configuration changes. Is that right, or is it at the C++ code level as well?
Yan
Yes, the changes are at the command-line level; you would just remove
all the splicing specifications that say layer1/xxx and layer2/xxx and
so on, leaving only the layer0 one.
Dan
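A sketch of the kind of edit being described, with invented splice-index values rather than the recipe's actual defaults: train_multisplice_accel2.sh takes a --splice-indexes option with one entry per spliced layer, and the change is to drop everything except the layer0 entry.

--splice-indexes "layer0/-2:-1:0:1:2 layer1/-1:2 layer3/-3:3"
--splice-indexes "layer0/-2:-1:0:1:2"

The first form splices at hidden layers too; the second does input-only splicing.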
On Thu, Jun 4, 2015 at 4:50 PM, Yan Yin riyijiye1976@users.sf.net wrote:
Hi Dan,
Regarding setting the ReLU activation component instead of p-norm: from what you mentioned earlier in this thread (--pnorm-input-dim and --pnorm-output-dim set to the same value), I can add the line below in nnet.config:
PnormComponent input-dim=$pnorm_input_dim output-dim=$pnorm_input_dim p=?
I believe the value of p does not really matter in this case; or do I not even need to specify p=? in the above line?
In the meantime, I am wondering how such a p-norm setting (same input and output dim, i.e. group size 1) would end up being the same as the ReLU activation, given that y = max(0, x) for ReLU while y = (|x|^p)^(1/p) = |x| for such a p-norm.
Thanks,
Yan
No, what I was talking about relates to the TDNN scripts, which use the RectifiedLinearComponent in that case. You have to use the RectifiedLinearComponent if you want ReLU.
Dan
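In nnet.config terms, the substitution asked about above would therefore use the ReLU component rather than a group-size-1 PnormComponent (reusing the dimension variable from the proposed line):

RectifiedLinearComponent dim=$pnorm_input_dim

The two are genuinely different functions: with input-dim equal to output-dim the p-norm reduces to y = |x| regardless of p, so for x = -3 it gives 3, whereas ReLU gives max(0, -3) = 0.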