Stopping criteria: change in error per epoch?

Help
Anonymous
2009-12-03
2012-12-24

  • Anonymous
    2009-12-03

    Hi all NN lovers!

    I see Neuroph as a very promising framework for my project, which involves function approximation and forecasting. I am especially interested in backpropagation with momentum. However, I wasn't able to find a stopping criterion based on the change in the average error per epoch. That is, learning stops when the absolute rate of change in the average squared error per epoch is sufficiently small. I understand that this criterion is widely used.

    Maybe it could be implemented relatively easily in SupervisedLearning.doLearningEpoch(), in the same condition where totalNetworkError is checked.

    What do you think? Could this be implemented, or did I just miss the part of the code where it already is?

    Anyway, thank you to the developers of the current framework! It is great!

    Riksi

     
  • Zoran Sevarac
    2009-12-05

    Hi Riksi,

    Thank you for your support. We're very glad you like the framework, and we hope it will provide a good solution for your project.

    Your suggestion is very interesting and could be very useful, and it isn't implemented yet. It could be implemented easily the way you described; training would then require one more setting, the amount of error change that acts as the stop condition. This feature could also be optional. However, I'm not sure it is always a good thing to do, since sometimes the error function doesn't change much for many iterations and then suddenly drops. We'll certainly investigate this and most likely implement it in a future release, and if you have some experience with this, we would very much appreciate your help and contribution. At the moment we're also considering other backpropagation improvements, such as a dynamic learning rate.

    Thanks again, and I hope we'll collaborate on this in the future.

    Best regards,
    Zoran

     

  • Anonymous
    2009-12-07

    Zoran,

    Unfortunately, I cannot say that I am an expert in this field - yet! ;-) But I have a good book where this stopping criterion is explained: Haykin, Neural Networks, page 173.

    However, I will be happy to help in any way I can!

    Riksi

     
  • Elle
    2009-12-07

    Hi Riksi,

    As Zoran mentioned, the error profile of a network may stay flat (or even rise) and then suddenly drop, so a fixed per-epoch error threshold may not be optimal. This raises a larger issue about training a network, which I won't go into here.

    However, I believe a tutorial on time-series prediction will be available shortly. Have a look through it, as it mentions a few 'areas of further work' towards the end. Focussing on the learning algorithm and error function is not always the correct approach. Data sampling and network topology can have huge effects on the accuracy and convergence profile of the function approximation.

    When it's published, have a look; we would love to hear your thoughts.

    Best wishes

    Laura

     

  • Anonymous
    2009-12-07

    Hi Laura,

    Thank you for your comments. A tutorial on time-series prediction wouldn't harm me. I will be happy to read that when it is ready. :)

    Riksi

     

  • Anonymous
    2010-01-19

    Thanks for your tutorial,

    In my problem I have to set up automatic NN training. The NN will retrain itself periodically (e.g. once a month) to adapt to changed conditions. I won't be able to train the NN manually every time. I can choose the NN's parameters and the learning parameters beforehand, but it would be good if the NN could stop the training automatically. Then I wouldn't have to use the same number of iterations for every training run. Also, I believe this feature could be useful for other Neuroph users, because it would reduce the number of parameters that need to be chosen.

    What if I implement the change as I proposed it in my first post? Then you could check it and decide whether you want to use it in the Neuroph library.

    Best regards,
    Riksi

     
  • Elle
    2010-01-20

    Hi Riksi,

    It's a good idea you suggest. Also, if you need to keep retraining periodically, I can see your point now; sorry for being slow initially. Sevarac and I have had a few very preliminary chats about a more rounded framework for training, i.e. a more automated way rather than the manual method currently implemented.

    So it would be good to maybe work with you to implement this, taking your requirements into account as well. What do you think? Maybe 'we' could work on a design together, get it agreed in principle with the other members, and then implement it together. What do you reckon? After all, two heads are better than one in my book.

    I assume you are looking for an offline learning process, i.e. train the NN (automatically) on the new data, then swap the new NN for the old NN in the production environment once everyone is happy that it still works as well?
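
A minimal sketch of the offline retrain-and-swap idea raised in this post, assuming the new network is accepted only if its validation error is no worse than that of the network currently in use. `ModelHolder` and `swapIfNotWorse` are hypothetical names for illustration and are not part of Neuroph; the model type and the error measure are left to the caller.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.ToDoubleFunction;

/** Hypothetical holder (not Neuroph API) for the network currently serving predictions. */
final class ModelHolder<M> {
    private final AtomicReference<M> current;

    ModelHolder(M initial) {
        current = new AtomicReference<>(initial);
    }

    /** Returns the model that is currently in use. */
    M get() {
        return current.get();
    }

    /**
     * Swaps in the freshly trained candidate only if its validation error is
     * no worse than the current model's. Returns true if the swap happened.
     */
    boolean swapIfNotWorse(M candidate, ToDoubleFunction<M> validationError) {
        M old = current.get();
        double candidateError = validationError.applyAsDouble(candidate);
        double currentError = validationError.applyAsDouble(old);
        if (candidateError <= currentError) {
            // compareAndSet fails only if another thread replaced the model meanwhile
            return current.compareAndSet(old, candidate);
        }
        return false;
    }
}
```

The periodic job would train a new network on fresh data, then call `swapIfNotWorse` with an error function evaluated on held-out data; readers of `get()` never see a half-trained network.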

    Let me know what you think.

    Best wishes

    Laura

     

  • Anonymous
    2010-01-20

    Hi Laura,

    I totally agree that 'we' is always better than 'I'. That way we will get better results for sure. How should we start? Should we do the planning and discussion in this thread or somewhere else?

    I'll be happy to contribute to this project!

    Riksi

     
  • Elle
    2010-01-20

    Hi Riksi,

    Well, I think the best way to start would be to see what Sevarac has to say regarding the approach. After all, he is taking the lead at the moment, so I am more than happy to follow his advice on the project management and approach.

    I had started writing something up regarding the requirements for this upgrade, but it is still a work in progress. Personally, I'd rather we all worked on the requirements; then maybe the two of us come up with a design document and run it past the other members. Then we split the work in two: you build half, I build half, and we test each other's halves. Make sense?

    Gosh, maybe I should take my architect's hat off! ;-)

    Laura

     

  • Anonymous
    2010-01-21

    Laura,

    That all makes sense. Let's hear what Sevarac has to say. Maybe you can send me what you have planned for the requirements. I'll send you my email.

    Riksi

     
  • Zoran Sevarac
    2010-01-21

    Hi, all I have to say is go, go, go! :)

    I think the stopping criterion suggested by Riksi can be done easily, so I suggest that you add it right now and then we'll test it.
    Do it in the doLearningEpoch method, as you said. I'm also thinking about one additional feature:

    stop when the absolute rate of change in the average squared error per epoch is sufficiently small for n iterations, where n can be customized

    That's because the error change might stall for a while and then drop. This way we can say: stop training only if the error change stays very small for some large number of iterations. Does that make sense to you?
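
A minimal sketch of the rule described above, written as a standalone helper rather than as a change to Neuroph itself. The class name `ErrorChangeStopCondition`, its parameters, and the idea of calling it once per epoch are assumptions for illustration; setting n to 1 gives the simpler criterion from the first post.

```java
/**
 * Hypothetical helper (not part of Neuroph): signals a stop once the absolute
 * change in the per-epoch average squared error has stayed below a threshold
 * for a given number of consecutive epochs.
 */
final class ErrorChangeStopCondition {
    private final double minErrorChange;   // threshold on |E(t) - E(t-1)|
    private final int requiredEpochs;      // the customizable "n"
    private double previousError = Double.NaN;
    private int smallChangeCount = 0;

    ErrorChangeStopCondition(double minErrorChange, int requiredEpochs) {
        if (minErrorChange < 0 || requiredEpochs < 1) {
            throw new IllegalArgumentException("threshold must be >= 0 and n >= 1");
        }
        this.minErrorChange = minErrorChange;
        this.requiredEpochs = requiredEpochs;
    }

    /** Call once per epoch with that epoch's average squared error. */
    boolean shouldStop(double epochError) {
        if (!Double.isNaN(previousError)
                && Math.abs(epochError - previousError) < minErrorChange) {
            smallChangeCount++;
        } else {
            // A large change (or the very first epoch) resets the streak.
            smallChangeCount = 0;
        }
        previousError = epochError;
        return smallChangeCount >= requiredEpochs;
    }
}
```

Because the counter resets on any large change, a plateau followed by a sudden drop does not end training unless the plateau lasts at least n epochs.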

    Regarding the training framework, I've also been working on this and have a rough specification, and we'll put all that together.
    Maybe we could use the Neuroph Trac system to collaborate on this:
    https://sourceforge.net/apps/trac/neuroph

    Trac provides a wiki and a ticket system for bugs/features/tasks, so it will be a lot easier than emailing our specs to each other.
    If you agree, we can start creating pages on Trac.

    By the way, I've made significant improvements to Backprop; I'll let you know more soon, once I finish testing and commit them. I also did some thread synchronization, which resulted in training that is more than 5 times faster. I'll let you know when I commit those changes, so you can work with the latest version of the code.

    Zoran

     

  • Anonymous
    2010-01-21

    Hi Zoran,

    I think it is a good idea to add the number of iterations for which the change in error has to stay small. I added a Feature Request to the tracker. Maybe you can check it and change it if you see fit. After that I will implement it.

    I am happy to hear that you have been making backprop learning even better. :)

    Riksi

     
  • Zoran Sevarac
    2010-01-22

    That's it, you can implement it. Be sure to check out the latest code from SVN; there are many changes.

     
  • Zoran Sevarac
    2010-01-23

    Hey Riksi, check out wootton's comment on the tracker for this task; it makes a lot of sense.

     
  • Zoran Sevarac
    2010-01-31

    Hi Riksi, have you tried this?
    If you have not, I'll add it; it's not a big deal.

    I just realised that this could be a very interesting feature for checking whether the learning rate should be changed…

     

  • Anonymous
    2010-02-23

    Hi Zoran and Laura,
    I'm very sorry to come back to this so late; I have been kept busy by my work. I did try to implement the stopping criterion, but I didn't find it helpful in the learning process. There was still randomness in the learning process: training gave me very different results between consecutive runs of exactly the same code.

    Instead of the stopping criterion that I proposed, I controlled the training in another way. I trained the NN one epoch at a time. Between epochs I tested the NN with an independent test set. If the test-set error decreased, I increased the maximum number of iterations by one and started the learning again. If the test-set error increased between epochs, I stopped the learning. I believe this method is called early stopping. It gave me better results, but it didn't remove all the randomness in the learning process: I still got different results when I ran the same code twice or more.

    The randomness is caused by the initialization of the weights with random numbers. If I remove the usage of Weight.random(), I get the same results every time with the same code. But I guess that removing the random initialization is not supported by the literature. So I tried smaller random numbers: I set the weights to be between . This gave me better results. Now if I run the same code twice, I don't get exactly the same results, but the results are close to each other. Maybe smaller initial values for the weights could be considered for the next release. I'm not sure what the literature says about the size of the initial weights, but these values worked for me. Also, a simple built-in function for training a NN one epoch at a time could be useful.
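
The epoch-by-epoch procedure described in this post can be sketched as follows. This is an illustration under stated assumptions, not Neuroph code: `EarlyStopping.train` is a hypothetical helper, and the two callbacks stand in for "train the network for one epoch" and "measure the error on the independent test set".

```java
import java.util.function.DoubleSupplier;

/** Hypothetical early-stopping loop (not Neuroph API). */
final class EarlyStopping {
    /**
     * Trains one epoch at a time and stops as soon as the held-out error
     * increases between epochs, or after maxEpochs. Returns the number of
     * epochs that were run.
     */
    static int train(Runnable trainOneEpoch, DoubleSupplier heldOutError, int maxEpochs) {
        double previousError = heldOutError.getAsDouble();   // error before any training
        int epochs = 0;
        while (epochs < maxEpochs) {
            trainOneEpoch.run();
            epochs++;
            double error = heldOutError.getAsDouble();
            if (error > previousError) {
                break;   // held-out error went up: stop training
            }
            previousError = error;
        }
        return epochs;
    }
}
```

Note that when the loop stops, the network holds the weights from the epoch in which the held-out error rose; a common refinement is to keep a copy of the weights from the best epoch and restore it afterwards.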

    Anyway, thanks again for doing this great framework! My gratitude goes to a new iteration round. :)

    Riksi   
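
On the run-to-run differences reported in the post above: one common way to keep random initial weights while making runs repeatable is to draw them from a seeded generator. The sketch below assumes a symmetric uniform interval; `SmallWeightInit`, `range`, and `seed` are hypothetical names, and since the post does not state the interval that was used, `range` is only a placeholder to be chosen by the caller.

```java
import java.util.Random;

/** Hypothetical helper (not Neuroph API) for small, repeatable initial weights. */
final class SmallWeightInit {
    /** Returns count weights drawn uniformly from [-range, range) using a fixed seed. */
    static double[] uniform(int count, double range, long seed) {
        Random rng = new Random(seed);   // same seed => same weights on every run
        double[] weights = new double[count];
        for (int i = 0; i < count; i++) {
            // nextDouble() is in [0, 1), so this maps to [-range, range)
            weights[i] = (2.0 * rng.nextDouble() - 1.0) * range;
        }
        return weights;
    }
}
```

With the seed fixed, two training runs start from identical weights, so any remaining differences come from elsewhere (for example, the order in which training samples are presented).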

     
  • Zoran Sevarac
    2010-02-24

    Well Riksi, this is your lucky day :)
    Almost all of the features you mentioned are implemented and available in the new release candidate, which can be downloaded at https://sourceforge.net/projects/neuroph/files/neuroph_2.4RC1.zip/download

    A solution for the randomness problem is also provided in that release; see the discussion here for more details.

     
  • Zoran Sevarac
    2010-02-24

    By the way, it would be very helpful if you tested these new features and let us know your results.
    You could also publish your tests on our wiki at https://sourceforge.net/apps/trac/neuroph/ which is under construction at the moment.

     

  • Anonymous
    2010-02-24

    Nice! This is my lucky day! I'll see the new features.

     
  • Yoosef
    2011-09-21

    Hi everybody,
    I think I have the same problem too. The error function returns just one value, this number doesn't change, and the learning process of the RBF network never stops.
    I am using "neuroph-2.5.1RC2".
    I'm working on a simulation project that involves estimating function values. It is really important to me, and I need help urgently.
    Thanks for your help.

     
  • Zoran Sevarac
    2011-09-24

    Hi Yoosef,

    The problem is that, at the moment, the RBF network lacks the k-means clustering algorithm for learning the hidden layer, so it learns only by using LMS for the output layer. Very likely the problem you're trying to solve cannot be solved with LMS alone. You can try implementing k-means clustering to find the best settings for the hidden RBF layer: the number of neurons and the RBF function settings. I hope to add that feature soon, but I cannot tell you when.
    For more info about RBF learning see:
    http://www.physicsarchives.com/index.php/courses/670
    http://homepages.gold.ac.uk/nikolaev/311rbf2.htm
    http://en.wikipedia.org/wiki/K-means_clustering
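
As a starting point for the k-means suggestion above, here is a minimal standalone sketch of Lloyd's algorithm that picks k centers from the training inputs. It is not Neuroph code: `KMeansCenters` is a hypothetical class, and using the first k points as initial centers is a simplification chosen to keep the example deterministic. How the resulting centers are wired into the RBF hidden layer is left open.

```java
import java.util.Arrays;

/** Hypothetical sketch (not Neuroph API): plain k-means to choose RBF centers. */
final class KMeansCenters {
    /** Runs Lloyd's algorithm on the input vectors and returns k centers. */
    static double[][] fit(double[][] points, int k, int maxIterations) {
        if (k < 1 || k > points.length) {
            throw new IllegalArgumentException("need 1 <= k <= number of points");
        }
        int dim = points[0].length;
        double[][] centers = new double[k][];
        for (int c = 0; c < k; c++) {
            centers[c] = points[c].clone();   // initial centers: the first k points
        }
        int[] assignment = new int[points.length];
        Arrays.fill(assignment, -1);

        for (int iter = 0; iter < maxIterations; iter++) {
            // Assignment step: attach every point to its nearest center.
            boolean changed = false;
            for (int p = 0; p < points.length; p++) {
                int nearest = nearestCenter(points[p], centers);
                if (nearest != assignment[p]) {
                    assignment[p] = nearest;
                    changed = true;
                }
            }
            if (!changed) {
                break;   // converged: no point switched cluster
            }
            // Update step: move each center to the mean of its points.
            double[][] sums = new double[k][dim];
            int[] counts = new int[k];
            for (int p = 0; p < points.length; p++) {
                counts[assignment[p]]++;
                for (int d = 0; d < dim; d++) {
                    sums[assignment[p]][d] += points[p][d];
                }
            }
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) {
                    continue;   // an empty cluster keeps its old center
                }
                for (int d = 0; d < dim; d++) {
                    centers[c][d] = sums[c][d] / counts[c];
                }
            }
        }
        return centers;
    }

    /** Returns the index of the center with the smallest squared Euclidean distance. */
    static int nearestCenter(double[] point, double[][] centers) {
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int c = 0; c < centers.length; c++) {
            double dist = 0;
            for (int d = 0; d < point.length; d++) {
                double diff = point[d] - centers[c][d];
                dist += diff * diff;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = c;
            }
        }
        return best;
    }
}
```

The number of clusters k corresponds to the number of hidden RBF neurons, and the spread of the points assigned to each center can guide the width of the corresponding RBF function.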

    Good luck,
    Zoran

     
  • Yoosef
    2011-09-27

    Hi Zoran,
    Thanks for your guidance. I am working on a "Rescue Simulation Project" and I want to use a function estimator like an RBF network in one part of it.
    Is there a complete network in Neuroph, similar to the RBF network, that I could use for this project?
    Thanks again for your help.

     
  • Zoran Sevarac
    2011-09-28

    You're welcome. Try a Multi Layer Perceptron; that could work.

    Zoran