Hi all,
I am wondering if there is any randomness in the DNN training recipes such as steps/nnet2/train_tanh.sh.
I’m asking because I noticed that when I run the same script twice, I get slightly different results. What could be the source of randomness in the script, and how can I get exactly the same result each time I run it?
Thanks!
Probably the most important source of randomness that can't be removed is
that if you are using the CPU-based training, there are multiple threads
and they are not synchronized (asynchronous SGD), so the result differs
slightly depending on the order of execution. Some authors, when
publishing results with DNNs, run the same thing a few times and report
the average.
Dan
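As a toy illustration of the asynchronous-SGD point above (a made-up sketch, not Kaldi code): several threads update one shared parameter, each update is applied atomically, but the order in which the threads get in is decided by the OS scheduler, so the final value differs slightly from run to run. The learning rate, targets, and step count here are arbitrary.

// Toy model of asynchronous SGD (not Kaldi code): the final value of w
// depends on how the OS interleaves the threads, so it varies between runs.
#include <cstdio>
#include <mutex>
#include <thread>
#include <vector>

int main() {
  double w = 0.0;              // shared "model parameter"
  std::mutex mu;               // each update is atomic, but NOT ordered
  const double lr = 0.1;
  std::vector<double> targets = {1.0, -2.0, 3.0, 0.5};  // one per worker thread

  std::vector<std::thread> workers;
  for (double t : targets) {
    workers.emplace_back([&w, &mu, lr, t]() {
      for (int step = 0; step < 1000; ++step) {
        std::lock_guard<std::mutex> lock(mu);
        w -= lr * (w - t);     // update depends on the current w, so order matters
      }
    });
  }
  for (auto &th : workers) th.join();

  std::printf("final w = %.17g\n", w);  // full precision; differs run to run
  return 0;
}

The real CPU-based training, as described above, does not even serialize the individual updates, so the same effect is only stronger there.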
Thank you. I am using the GPU for the nnet2 recipe; does the randomness still occur then?
There are a lot of places where rand() is called in the code... mostly it
should still be deterministic if you run on the same hardware and OS, but
there are probably some cases where it's not deterministic. And maybe some
GPU computations can give slightly different results because of scheduling
issues. I'm not sure exactly what the source of the randomness is; I
generally haven't viewed being deterministic as a super-important feature
that we should spend a lot of effort to preserve.
Dan
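On the rand() point, a small standalone sketch (again not Kaldi code, and the seed 777 is arbitrary): with a fixed seed, rand() produces the same sequence every time you run the same binary, which is why seeded code paths stay reproducible on the same hardware and OS; the C standard does not fix the generator itself, though, so a different libc may give a different sequence.

// Seeded rand() is reproducible per binary/libc, though not across platforms.
#include <cstdio>
#include <cstdlib>

int main() {
  std::srand(777);  // fixed seed: identical draws on every run of this binary
  for (int i = 0; i < 5; ++i) {
    // Draws like these (e.g. for shuffling or perturbing data) repeat exactly
    // as long as the seed, the binary, and the C library stay the same.
    std::printf("%d ", std::rand() % 100);
  }
  std::printf("\n");
  return 0;
}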