I'm testing EKF at present, hope to have it ready soon. EKF, however, depends on the colt library - another dependency.
Anyhow, I'll probably fix up RTRL as well. I have learned a lot from the EKF, which is a plugin; RTRL should probably be one too.
It is very nice that you are experimenting with new things, but please make RTRL a plugin too. I would also like to know if you are still interested in supporting RTRL.
EKF is working! For now, this means that all the matrices have their columns and rows match whenever matrix multiplication is done... Still testing it a bit and refining, will upload at some stage, but again, note the colt dependency. Next RTRL will be a plugin as mentioned.
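For anyone following along, the row/column check mentioned above can be sketched roughly like this. It uses plain Java arrays rather than colt, and the names (MatrixShapes, multiply) are made up for illustration; this is not the code from the EKF plugin, just a minimal version of the idea, assuming non-empty rectangular matrices.

```java
// Hypothetical helper, not part of joone or colt.
final class MatrixShapes {
    // Multiplies a (n x k) by b (k x m), failing loudly if the inner sizes differ.
    // ASSUMPTION: both arrays are non-empty and rectangular.
    static double[][] multiply(double[][] a, double[][] b) {
        int n = a.length;
        int k = a[0].length;
        int m = b[0].length;
        if (b.length != k) {
            throw new IllegalArgumentException(
                "cannot multiply " + n + "x" + k + " by " + b.length + "x" + m);
        }
        double[][] c = new double[n][m];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < m; j++) {
                double sum = 0.0;
                for (int p = 0; p < k; p++) {
                    sum += a[i][p] * b[p][j];
                }
                c[i][j] = sum;
            }
        }
        return c;
    }
}
```

Failing at the multiplication itself makes a shape bug show up where it happens instead of as a strange number several steps later.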
Sounds great, looking forward to trying it out. I would also like to know if you are interested in writing some documentation for RTRL and EKF to be included in the user guide?
Documentation?! The code is commented really well and I give references in the class headers, which should do. I could give some overview of the techniques maybe if that is what you are looking for.
EKF looks fine, but again no fireworks in terms of success in training a net. And I again have some practical questions. Do you include the input nodes in the state vector or only nodes that are in U, with U the RTRL U set of nodes that are not in the input layer? I've tried both without any real difference, but I hope just U as this saves dramatically on the calcs. Convergence remains a problem in the test net - it does converge but on a horrible RMSE.
BTW, EKF works beautifully as a plugin. RTRL to follow suit - if only I then knew what I now know about joone.... The worst piece of code ever written in the universe must be org.joone.helpers.structure.NeuralNetMatrix.java. OK, sorry Paulo, it is actually a very clever piece of code, albeit lacking a few - ahem - comments.
From the EKF working paper: "One clear result that emerged from these experiments is that the noise covariances must be made sufficiently large or successful learning will not occur." I've calculated these from the data, but since the network I'm testing with is not making headway, I am barking up every tree. Do I have a 0.001? A 0.01, anybody? How about a 0.1.....
I meant documentation about how to use EKF and RTRL, like the section about SON in the JooneCompleteGuide. There has to be some documentation about them so people can start using them.
Phew, I'm now onto 1,000,000,000 and the net is still diverging. Anybody out there ever trained a net using EKF? Or RTRL?
I see your intention with the docs. Keep in mind that EKF/RTRL is for the command line only at present, and a sample is in each class's main method. The new plugin should be easy to wrap in a GUI though....
Found this one paper where they suggest using 100 and where they discuss using a small learning rate, which they incorporate as 1/LR in the EKF. Thus if LR=0.001, then they are using 1000 in the EKF.
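As a sketch of the arithmetic above: the learning rate is simply inverted before it goes into the filter. Where exactly the 1/LR value enters the EKF is not stated here, so using it as the scale of a diagonal matrix in the second method is purely an assumption, and the names (EkfScale, fromLearningRate, scaledIdentity) are made up for illustration.

```java
// Hypothetical helper, not part of joone or colt.
final class EkfScale {
    // Converts a backprop-style learning rate into the 1/LR constant.
    static double fromLearningRate(double learningRate) {
        if (!(learningRate > 0.0)) {
            throw new IllegalArgumentException("learning rate must be positive");
        }
        return 1.0 / learningRate;
    }

    // Builds an n x n matrix with the given value on the diagonal, zeros elsewhere.
    // ASSUMPTION: the constant scales a diagonal matrix; the post does not say which one.
    static double[][] scaledIdentity(int n, double value) {
        double[][] m = new double[n][n];
        for (int i = 0; i < n; i++) {
            m[i][i] = value;
        }
        return m;
    }
}
```

So `EkfScale.fromLearningRate(0.001)` gives the 1000 from the example, and a learning rate of 0.01 corresponds to the suggested 100.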
My feeling is that recurrent nets are not going to be trained by clicking a few things in a gui.......
EKF now working for both FFN and RNN networks. What a struggle, with bugs present in almost all the working papers I have looked at...... Now working on RTRL again.
Yea! Some progress. Reworked RTRL and it is also working now, albeit at a super small learning rate. All of these (RTRL, EKF FFN and EKF RNN) are now working on a few linear dummy networks. I *think* they will also work on non-linear and big nets, but will toy with these soon.
Just an update. EKF seems to be working, but is exceptionally slow. On a production network, with 200 inputs, it takes a day for one iteration. I killed it after three, and the RMSE was 0 in the initial run, then something like 0.12 and then again 0.12 - very small improvement. The data's RMSE is something like 0.06, so a long way to go. RTRL also works, and at least there I could push through around 200 cycles in the same time. The optimised net's out of sample performance was dismal, but what can you expect after a mere 200 iterations......
Anyhow, the code seems to work now so let me know if you are interested. It comes with many snags - EKF requires colt and to get RTRL to work I had to fiddle in many joone places. My idea is to submit it later on, when things are more stable, maybe as a new project or a new branch given all the tweaks.
Another update. Lost a few days tracking down NaNs to the way in which joone speeds up calculations in the tanh layer. Then lost even more days due to a bug in OpenJDK when calculating tanh. Calculate tanh in different threads a few thousand times and it goes weird on you.... Anyhow, completed an optimisation run of a very simple recurrent network during the night using RTRL and EKF and both seem to work. RTRL gives the best performance at this stage. EKF takes forever - RTRL probably took less than an hour while EKF used the rest of the time.
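A rough way to check whether a given JVM shows this kind of tanh problem is to compute reference values on one thread and then recompute them many times on several threads. This is only a sketch, not the test that was actually used: the class name, the thread and repeat counts, and the 1e-12 tolerance are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical check, not part of joone.
final class TanhThreadCheck {
    // Returns how many worker results were NaN or differed from the reference.
    static long countMismatches(int threads, int repeats, double[] inputs) {
        // Reference values computed once on the calling thread.
        final double[] expected = new double[inputs.length];
        for (int i = 0; i < inputs.length; i++) {
            expected[i] = Math.tanh(inputs[i]);
        }
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                Callable<Long> task = () -> {
                    long bad = 0;
                    for (int r = 0; r < repeats; r++) {
                        for (int i = 0; i < inputs.length; i++) {
                            double y = Math.tanh(inputs[i]);
                            // ASSUMPTION: 1e-12 is tight enough to catch a real fault.
                            if (Double.isNaN(y) || Math.abs(y - expected[i]) > 1e-12) {
                                bad++;
                            }
                        }
                    }
                    return bad;
                };
                results.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> f : results) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("tanh check did not complete", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

On a healthy JVM a call such as `TanhThreadCheck.countMismatches(4, 1000, new double[]{-2.0, -0.5, 0.0, 0.5, 2.0})` should come back as zero; anything else points at the runtime rather than the network code.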
Will upload the code somewhere at some stage. My wife suggested I call it july - very good idea. July comes after June and was developed mostly in July......
Now looking at java clusters to try and speed up the EKF and also at MIDI software to try and use the recurrent nets to compose some music - just a fun exercise.
Nice work ferra. I look forward to trying it out once it has been made into a plugin. Nice work so far.
Thanks! RTRL looks fine but EKF - either there is still a bug or it's not that great or it needs more coaching. Works on small networks, not as good as RTRL on big ones, and terribly slow. Anyhow, I'll write up some kind of a manual and then commit it all as a cvs branch.
A bit off topic... I've been struggling a lot with cvs lately. Switched over to Fedora 9 and cervisia has not been the same since. After many frustrations I tried tkcvs, which works well locally, but constantly needs a password when working remotely on sf. So back to cervisia, which managed to mess up the pristine RC2.0.0 as soon as I got it working. At least I still have a pristine copy, but don't tell cervisia about that.
Once all is working I'll commit the new branch, pristine RC2.0.0 and the manual. In the meantime, don't be shy to tell me about any good GUI Linux CVS frontends that you know about.....