Hi All,
From the existing Kaldi recipes, I see that we usually start with a network that has one hidden layer, and further hidden layers are gradually inserted into the network during training. Now I want to train a network with a smaller low-rank layer between the top hidden layer and the output layer. In nnet.config the input dimension of the output layer needs to match the output dimension of the hidden layer, so when the low-rank layer is later inserted at the end during training, won't this cause a problem (the output dim of the low-rank layer will not match the input dim of the output layer defined in nnet.config)?
If that is the case, I am wondering how this can be done properly.
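To make the mismatch concrete, here is a minimal sketch of the shape problem I expect (the dimensions are made up for illustration, not taken from any actual config):

    import numpy as np

    # Hypothetical dimensions, for illustration only.
    hidden_dim = 1024      # output dim of the top hidden layer
    bottleneck_dim = 256   # output dim of the inserted low-rank layer
    num_pdfs = 3000        # output (softmax) dimension

    # Output layer weights as declared in nnet.config: its input dim
    # must equal the hidden layer's output dim.
    W_out = np.random.randn(num_pdfs, hidden_dim)

    h = np.random.randn(hidden_dim)       # activation of the top hidden layer
    z = np.random.randn(bottleneck_dim)   # activation after the low-rank layer

    y = W_out @ h    # fine: (num_pdfs, hidden_dim) times (hidden_dim,)
    y = W_out @ z    # raises ValueError: dimension mismatch (1024 vs 256)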
thanks,
Yan
An extreme case: if I want to build a shrinking network (where the hidden layer sizes keep decreasing), how can I achieve this?
One easy way to do this is simply to set up the neural network with two affine components after the final hidden nonlinearity and before the output softmax, with a smaller dimension in between. The program nnet-am-limit-rank-final will do this for you by performing an SVD on the last layer, if you are using the nnet2 setup (assuming that program still works...). I never ended up using this in the standard recipes because it didn't improve the WER results, and at the time I wasn't so interested in pure speed improvements.
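For intuition, here is a rough numpy sketch of what an SVD-based rank limit on the last layer amounts to (this is only an illustration under made-up dimensions and rank, not the actual nnet-am-limit-rank-final code): the final weight matrix is factored by a truncated SVD into two smaller matrices, which become the two affine components with a small dimension in between.

    import numpy as np

    # Made-up dimensions and rank, for illustration only.
    hidden_dim, num_pdfs, rank = 1024, 3000, 256

    W = np.random.randn(num_pdfs, hidden_dim)   # weights of the original final affine layer
    b = np.random.randn(num_pdfs)               # its bias

    # Truncated SVD: W is approximately U[:, :rank] @ diag(s[:rank]) @ Vt[:rank, :].
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    W1 = s[:rank, None] * Vt[:rank, :]   # first affine component:  hidden_dim -> rank (no bias)
    W2 = U[:, :rank]                     # second affine component: rank -> num_pdfs (keeps bias b)

    # y = W2 @ (W1 @ h) + b approximates the original W @ h + b, and the
    # parameter count drops from num_pdfs*hidden_dim to rank*(hidden_dim + num_pdfs).
    rel_err = np.linalg.norm(W - W2 @ W1) / np.linalg.norm(W)
    print(f"relative approximation error: {rel_err:.3f}")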
Dan