2010-07-19 05:53:25 PDT
First, let me introduce myself. My name is Jeff Heaton, and I am the lead developer for the Encog project. Encog is a Java/C#/Silverlight framework for neural networks and other machine learning structures. I became aware of Neuroph through some articles that benchmarked Neuroph against Encog. I took a look at the Neuroph code and was immediately impressed with the very high-level, neuron-based way that it represents a range of neural network architectures. I also read [url=http://neuroph.sourceforge.net/improving_performance.html]Zoran's article on improving Neuroph performance[/url], where he talked about building a core that hides much of the complexity of multicore, GPU, matrix operations and other performance-oriented architecture decisions.
This made me start to think: the Encog project is already working on the second generation of just such a core. Could we not share some technology and get the best of both? We were already isolating the core into a separate framework for other reasons. So I sent Zoran an email, and for the past week we've been discussing how we might work together. As a result, I am joining the Neuroph team to help with this integration and with other advances as well.
I also plan to use Neuroph extensively in my "Introduction to Neural Networks in Java" book. I don't use Encog for that, because multicore programming, GPU, etc. just get in the way of explaining such things as how a neural network calculates and how backpropagation works. So I will be pushing for wider adoption of Neuroph as well.
So now I am looking at how to make Neuroph use the same engine as Encog. This will effectively cause both frameworks to produce the same benchmark results. It will also mean that Neuroph will support GPU acceleration and multicore training. I want to present my plan for how to do this here, before I actually implement it. My main goals are to minimize any breakage to Neuroph backwards compatibility, and to keep with Neuroph's design philosophy.
Here is how I think we should actually do this. I am still evolving this idea, and am not quite ready to write code yet. I want to get input from people here as well.
I think we should leave all of the existing Neuroph training and network calculation code as it is now. The Neuroph backpropagation classes are very easy to follow, and really do a good job of showing how the backpropagation calculation works. I think at some point soon, and I will help with this, we should add resilient propagation (RPROP) and batch training to Neuroph. Implementing RPROP inside of Neuroph, using only Neuroph technology, will also allow RPROP to be demonstrated as effectively as Neuroph demonstrates backprop. But that will be another discussion, and it does not have to be done to make use of the kernel.
To implement the kernel, we need to cause Neuroph to store a network's weights inside of an Encog flat network. An Encog flat network basically stores weights as a double array. Fundamentally, both Encog and Neuroph store neural networks as a bunch of doubles. The difference is how they are represented. Neuroph stores them in Weight classes that are contained in Connection classes. This makes the network easier to visualize, but means much more work for the computer to get to all the values during training.
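To make the "bunch of doubles" point concrete, here is a minimal sketch of copying per-layer weight matrices into a single array. The class name FlatLayoutSketch, the flatten method and the layer-by-layer, row-by-row ordering are all assumptions made for illustration; the actual layout used by the Encog flat network may well differ.

```java
// Hypothetical illustration only: this is NOT Encog's FlatNetwork and the
// ordering chosen here (layer by layer, row by row) is an assumption.
public class FlatLayoutSketch {

    /** Copies every weight of every layer into one contiguous double array. */
    public static double[] flatten(double[][][] layers) {
        int total = 0;
        for (double[][] layer : layers) {
            for (double[] row : layer) {
                total += row.length;
            }
        }
        double[] flat = new double[total];
        int pos = 0;
        for (double[][] layer : layers) {
            for (double[] row : layer) {
                System.arraycopy(row, 0, flat, pos, row.length);
                pos += row.length;
            }
        }
        return flat;
    }

    public static void main(String[] args) {
        // Layer 0 has 2 neurons with 2 inputs each; layer 1 has 1 neuron with 2 inputs.
        double[][][] layers = {
            {{0.1, 0.2}, {0.3, 0.4}},
            {{0.5, 0.6}}
        };
        double[] flat = flatten(layers);

        // A trainer can now visit every weight with one tight loop,
        // without walking Connection and Weight objects.
        double sumOfSquares = 0.0;
        for (double w : flat) {
            sumOfSquares += w * w;
        }
        System.out.println(flat.length + " weights, sum of squares = " + sumOfSquares);
    }
}
```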
I suggest we create a subclass of Weight, called FlatWeight, that would basically hold a reference to a double array and an index. This would be a direct mapping between Neuroph and the Encog flat network's weight array. This way there would be no need to ever translate between Neuroph and Encog neural network formats. The data would already be in Encog format, and ready to go. Neuroph would never know the difference because it would still be working with Weight objects. The getValue on the weight would simply read the weight from the Encog flat network. The Neuroph NeuralNetwork class would be extended to hold a reference to an Encog FlatNetwork, which is what all the FlatWeights would point to. If you want Neuroph to behave as it always has, just put regular Neuroph Weight objects into the Connection objects, and make the flatNetwork attribute of your Neuroph NeuralNetwork class be null, and everything is the same as it always was.
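A rough sketch of what such a FlatWeight could look like follows. The Weight class here is a simplified stand-in written for this example, not the real Neuroph class, and the setValue override and constructor signature are assumptions; only the idea of "array reference plus index" comes from the proposal above.

```java
// Hypothetical sketch: Weight below is a stand-in, not Neuroph's actual class.
public class FlatWeightSketch {

    /** Stand-in for a Neuroph-style weight: one object holding one double. */
    public static class Weight {
        private double value;

        public Weight(double value) {
            this.value = value;
        }

        public double getValue() {
            return value;
        }

        public void setValue(double value) {
            this.value = value;
        }
    }

    /** Proposed subclass: a view onto one slot of a shared weight array. */
    public static class FlatWeight extends Weight {
        private final double[] weights; // array owned by the flat network
        private final int index;        // this weight's position in that array

        public FlatWeight(double[] weights, int index) {
            super(0.0); // the inherited field is unused by this subclass
            this.weights = weights;
            this.index = index;
        }

        @Override
        public double getValue() {
            return weights[index];
        }

        @Override
        public void setValue(double value) {
            weights[index] = value;
        }
    }

    public static void main(String[] args) {
        double[] flat = {0.1, -0.4, 0.7};
        Weight w = new FlatWeight(flat, 1);

        System.out.println(w.getValue()); // prints -0.4

        flat[1] = 0.25; // an engine updating the array directly...
        System.out.println(w.getValue()); // ...is visible through the Weight: prints 0.25

        w.setValue(0.9); // and writes through the Weight land in the array
        System.out.println(flat[1]); // prints 0.9
    }
}
```

Because code that holds a Weight reference only ever calls getValue and setValue, it does not need to know whether the value lives in the object or in the shared array, so nothing has to be copied between the two representations.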
Now, to train, I will provide a new LearningRule-derived class to Neuroph. It will be called something like FlatLearning. It can be added to the neural network to cause it to take advantage of the Encog Engine's multithreaded or even GPU training. You will also be able to specify whether you want backprop, RPROP, etc. Once it is set as the learning rule for the NeuralNetwork class, you work with the neural network just as you would any Neuroph neural network.
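As a very rough sketch of the shape this could take, the example below trains directly on a shared weight array. Everything in it is hypothetical: the LearningRule interface is a stand-in rather than Neuroph's class, the Algorithm enum and constructor are invented for illustration, and the "engine" is replaced by plain single-threaded batch gradient descent on one linear neuron, which is far simpler than real backprop or anything the Encog Engine does.

```java
import java.util.Arrays;

// Hypothetical sketch: none of these types are real Neuroph or Encog classes.
public class FlatLearningSketch {

    /** Stand-in for a Neuroph-style learning rule. */
    public interface LearningRule {
        void learn(double[][] inputs, double[] targets, int epochs);
    }

    /** Invented selector for the training algorithm; only BACKPROP is sketched. */
    public enum Algorithm { BACKPROP, RPROP }

    /** Trains in place on the shared flat weight array. */
    public static class FlatLearning implements LearningRule {
        private final double[] weights;
        private final Algorithm algorithm;
        private final double learningRate;

        public FlatLearning(double[] weights, Algorithm algorithm, double learningRate) {
            this.weights = weights;
            this.algorithm = algorithm;
            this.learningRate = learningRate;
        }

        /** Output of a single linear neuron: the weighted sum of its inputs. */
        public double output(double[] input) {
            double sum = 0.0;
            for (int i = 0; i < weights.length; i++) {
                sum += weights[i] * input[i];
            }
            return sum;
        }

        @Override
        public void learn(double[][] inputs, double[] targets, int epochs) {
            if (algorithm != Algorithm.BACKPROP) {
                throw new UnsupportedOperationException(algorithm + " is not sketched here");
            }
            for (int epoch = 0; epoch < epochs; epoch++) {
                // Batch training: accumulate the gradient over all rows first.
                // A real engine could split this loop across threads or a GPU.
                double[] gradient = new double[weights.length];
                for (int row = 0; row < inputs.length; row++) {
                    double error = targets[row] - output(inputs[row]);
                    for (int i = 0; i < weights.length; i++) {
                        gradient[i] += error * inputs[row][i];
                    }
                }
                // Then apply one update to the flat array.
                for (int i = 0; i < weights.length; i++) {
                    weights[i] += learningRate * gradient[i] / inputs.length;
                }
            }
        }
    }

    public static void main(String[] args) {
        double[] flat = {0.0, 0.0}; // shared array that FlatWeight objects would point into
        LearningRule rule = new FlatLearning(flat, Algorithm.BACKPROP, 0.5);

        // Training data generated from y = 2*x0 - 1*x1.
        double[][] inputs = {{1, 0}, {0, 1}, {1, 1}};
        double[] targets = {2, -1, 1};
        rule.learn(inputs, targets, 500);

        // The weights should end up close to 2 and -1.
        System.out.println(Arrays.toString(flat));
    }
}
```

The point of the sketch is only that the learning rule never touches Connection or Weight objects. It reads and writes the array, and any weight objects mapped onto that array see the trained values afterwards.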
This is basically how I see it fitting in. There will be additional implementation details. But I think it can be a really clean integration.
Any suggestions or comments are very welcome.
Jeff