I'm trying to use the neural network demo included in SpeakerIdentApp but haven't been able to run it successfully. I'm using the following arguments: --train -raw -aggr -nn.
When I run it with the stable version of MARF (0.3.0.5, marf-0.3.0-devel-20060226) it seems to eat more and more memory until the computer runs out of it. How much memory would be enough?
When I use the latest development snapshot (0.3.0.6, marf-0.3.0-devel-20070108) an exception is thrown. I am attaching the exception here. The part between the <BEGIN> and </END> tags repeats about 70 times. I think this is probably caused by a recursion that never ends.
Exception in thread "main" java.lang.StackOverflowError
at java.io.ObjectStreamClass$FieldReflector.getPrimFieldValues(Unknown Source)
at java.io.ObjectStreamClass.getPrimFieldValues(Unknown Source)
at java.io.ObjectOutputStream.defaultWriteFields(Unknown Source)
at java.io.ObjectOutputStream.defaultWriteObject(Unknown Source)
<BEGIN>
at java.util.ArrayList.writeObject(Unknown Source)
at sun.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at java.io.ObjectStreamClass.invokeWriteObject(Unknown Source)
at java.io.ObjectOutputStream.writeSerialData(Unknown Source)
at java.io.ObjectOutputStream.writeOrdinaryObject(Unknown Source)
at java.io.ObjectOutputStream.writeObject0(Unknown Source)
at java.io.ObjectOutputStream.defaultWriteFields(Unknown Source)
at java.io.ObjectOutputStream.writeSerialData(Unknown Source)
at java.io.ObjectOutputStream.writeOrdinaryObject(Unknown Source)
at java.io.ObjectOutputStream.writeObject0(Unknown Source)
at java.io.ObjectOutputStream.writeObject(Unknown Source)
</END>
Prior to running each one of these tests I cleaned up the workspace, deleting the gzbin files generated by MARF. I also ran SpeakerIdentApp with the --reset argument before each run. The tests were run several times, with the same results, on a Linux machine and on one running MS Windows.
I also tried SpeakerIdentApp using these arguments: --train -raw -aggr -eucl. This test works gracefully.
Thanks in advance!
Best regards,
Silvio
Hi Silvio,
Thanks for your report. Here are some points I can offer now in the
reply:
First, our Neural Network (NN) is known not to behave (i.e. run) well
with the default settings and a large number of features on the
input layer. "Large" here means roughly 300 or more features. FFT and
Aggregator produce 512 and 522 features respectively (so the input layer
will have as many neurons). We even explicitly avoid those configurations
in our default training script (testing.sh/testing.bat):
...
# XXX: We cannot cope gracefully right now with these combinations in the
# typical PC/JVM set up --- too many links in the fully-connected NNet,
# so can run out of memory quite often; hence, skip them for now.
if("$class" == "-nn" && ("$feat" == "-fft" || "$feat" == "-randfe" || "$feat" == "-aggr")) then
echo "skipping..."
continue
endif
...
The NN has three layers by default, dynamically created as follows:
The number of input layer neurons always equals the number of incoming
features f (the length of the feature vector), and the size h of the middle
hidden layer is h = |f - n|; if f = n, then h = f/2. By default, the network
is _fully-interconnected_. (n is currently 32, corresponding to a 32-bit
integer.)
Thus, with -aggr -nn you get 522 * (522 - 32) * 32 = 8184960 links between
neurons that have to be stored and traversed. This is where it would run out
of memory with the default JVM memory limits and the available physical
memory.
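To make the arithmetic above easy to repeat for other feature counts, here is a minimal sketch. It only restates the sizing rule and the product quoted in this reply; the class and method names are made up for illustration and are not part of MARF.

```java
public class LayerSizes {
    // Hidden-layer rule as stated in the reply: h = |f - n|, or f/2 when f == n.
    // f is the number of features; n is the constant from the reply (currently 32).
    static int hiddenSize(int f, int n) {
        return f == n ? f / 2 : Math.abs(f - n);
    }

    // Link figure computed the same way the reply computes it: f * h * n.
    static long linkEstimate(int f, int n) {
        return (long) f * hiddenSize(f, n) * n;
    }

    public static void main(String[] args) {
        int n = 32;
        // 522 = -aggr, 512 = -fft, 100 = the reduced feature count suggested in this thread.
        for (int f : new int[] {100, 512, 522}) {
            System.out.println(f + " features: hidden layer " + hiddenSize(f, n)
                    + ", link estimate " + linkEstimate(f, n));
        }
    }
}
```

By the same formula, 100 features gives 100 * 68 * 32 = 217600, which is why cutting the feature count helps so much.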
Why is that so? Well, NN research is vast and people have different
ideas about NN interconnectivity, activation functions, how many hidden
layers there should be and what size they should be, etc. It takes a lot of
theory and experimentation to examine, and we did not have much manpower
to do so for NN. You are welcome to volunteer though ;-)
You can still get it to run without running out of memory (though it will
still be slow) by increasing the amount of memory the JVM is allowed to use, e.g.
java -Xmx2048m ... SpeakerIdentApp ...
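If it is unclear whether the -Xmx setting actually took effect, a small check like the one below can be run with the same JVM options. This is a generic sketch using the standard Runtime API; the class name is made up and it is not part of MARF.

```java
public class HeapLimit {
    // Maximum heap the JVM will attempt to use, in MiB.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap available to this JVM: " + maxHeapMiB() + " MiB");
    }
}
```

Running it as java -Xmx2048m HeapLimit should report a value close to 2048; the exact number can differ slightly between JVMs.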
The stack overflow exception you quote above is a new beast. I will
have to look into it to see why the recursion is there; it may be a
"newly" introduced bug.
You can also try -nn with a feature option other than -aggr or -fft (or
reduce the number of features -aggr and -fft produce to around 100).
I have not managed to release another CVS snapshot even though there have
been numerous fixes since the last snapshot release. Would you mind trying
the following .jar? It has debug symbols included, so we would see the
source code line numbers instead of "Unknown Source"...
http://users.encs.concordia.ca/~mokhov/marf/
Thanks again and looking forward to hearing from you soon,
-s
If you carefully read the old stack trace again (like I've done just now), you will notice there aren't any MARF classes in it.
Despite that, I did what you asked and here's the output:
Exception in thread "main" java.lang.StackOverflowError
at java.io.ObjectStreamClass$FieldReflector.getPrimFieldValues(Unknown Source)
at java.io.ObjectStreamClass.getPrimFieldValues(Unknown Source)
at java.io.ObjectOutputStream.defaultWriteFields(Unknown Source)
at java.io.ObjectOutputStream.defaultWriteObject(Unknown Source)
<BEGIN>
at java.util.ArrayList.writeObject(Unknown Source)
at sun.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at java.io.ObjectStreamClass.invokeWriteObject(Unknown Source)
at java.io.ObjectOutputStream.writeSerialData(Unknown Source)
at java.io.ObjectOutputStream.writeOrdinaryObject(Unknown Source)
at java.io.ObjectOutputStream.writeObject0(Unknown Source)
at java.io.ObjectOutputStream.defaultWriteFields(Unknown Source)
at java.io.ObjectOutputStream.writeSerialData(Unknown Source)
at java.io.ObjectOutputStream.writeOrdinaryObject(Unknown Source)
at java.io.ObjectOutputStream.writeObject0(Unknown Source)
at java.io.ObjectOutputStream.writeObject(Unknown Source)
</END>
Again, the part between the <BEGIN> and </END> tags repeats several times.
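A guess at what the repeating frames suggest, not something confirmed in this thread: default Java serialization walks an object graph recursively, so a long chain of objects that reference each other through ArrayLists can exhaust the thread stack without any infinite loop. The sketch below reproduces that effect with a hypothetical Node class standing in for a neuron; it is not MARF code, and the chain length is an arbitrary choice.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;

public class DeepGraphSerialization {
    // Hypothetical stand-in for a neuron that keeps its outgoing links in a list.
    static class Node implements Serializable {
        private static final long serialVersionUID = 1L;
        final ArrayList<Node> outputs = new ArrayList<Node>();
    }

    // Builds a chain head -> node -> node ... iteratively, so building never recurses.
    static Node buildChain(int length) {
        Node head = new Node();
        Node current = head;
        for (int i = 1; i < length; i++) {
            Node next = new Node();
            current.outputs.add(next);
            current = next;
        }
        return head;
    }

    // Returns true if serializing the graph overflows the stack.
    static boolean overflowsWhenSerialized(Node root) {
        try {
            ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream());
            try {
                out.writeObject(root); // recurses once per nested object
                return false;
            } catch (StackOverflowError e) {
                return true;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("chain of 10: overflow = " + overflowsWhenSerialized(buildChain(10)));
        System.out.println("chain of 100000: overflow = " + overflowsWhenSerialized(buildChain(100000)));
    }
}
```

If this is what happens in MARF, raising the thread stack size with the JVM's -Xss option might postpone the error, but whether that applies here is an assumption that would need to be tested.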
After that I decided to do some more debugging, so I did the following.
I modified the restoreGzipBinary() method of the StorageManager class like this:
...
public synchronized void restoreGzipBinary()
throws StorageException
{
	try
	{
		// Debug prints added around each step; "ANTES" is Portuguese for "before".
		System.out.println("ANTES oFIS");
		FileInputStream oFIS = new FileInputStream(this.strFilename);
		System.out.println("oFIS");
		GZIPInputStream oGZIS = new GZIPInputStream(oFIS);
		System.out.println("oGZIS");
		ObjectInputStream oOIS = new ObjectInputStream(oGZIS);
		System.out.println("oOIS");
		this.oObjectToSerialize = (Serializable)oOIS.readObject();
		System.out.println("oObjectToSerialize");
...
I also modified the NeuralNetwork class' restore() method like this:
...
public void restore()
throws StorageException
{
	// Debug prints added to show which dump mode is taken.
	switch(this.iCurrentDumpMode)
	{
		case DUMP_GZIP_BINARY:
			System.out.println("restore DUMP_GZIP_BINARY");
			restoreGzipBinary();
			break;
		case DUMP_BINARY:
			System.out.println("restore DUMP_BINARY");
			restoreBinary();
			break;
		default:
			System.out.println("super.restore");
			super.restore();
	}
	//restoreXML();
}
...
This is the output:
ANTES oFIS
restore DUMP_GZIP_BINARY
ANTES oFIS
Exception in thread "main" java.lang.StackOverflowError
at java.io.ObjectStreamClass$FieldReflector.getPrimFieldValues(Unknown Source)
at java.io.ObjectStreamClass.getPrimFieldValues(Unknown Source)
at java.io.ObjectOutputStream.defaultWriteFields(Unknown Source)
at java.io.ObjectOutputStream.defaultWriteObject(Unknown Source)
at java.util.ArrayList.writeObject(Unknown Source)
at sun.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at java.io.ObjectStreamClass.invokeWriteObject(Unknown Source)
at java.io.ObjectOutputStream.writeSerialData(Unknown Source)
at java.io.ObjectOutputStream.writeOrdinaryObject(Unknown Source)
at java.io.ObjectOutputStream.writeObject0(Unknown Source)
...
What puzzles me is that System.out.println("ANTES oFIS"); is reached but System.out.println("oFIS"); isn't. Also, System.out.println("ANTES oFIS"); is reached twice.
Maybe this is because an exception was triggered in restoreGzipBinary(), or because the application runs multiple threads.
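One possibility that would fit the first guess, offered only as an assumption: the gzbin files had been deleted before the run, and the FileInputStream constructor throws FileNotFoundException when the file does not exist, so the line after it never executes. The sketch below shows that behaviour in isolation; the class name, messages, and file name are invented for the example.

```java
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class MissingFileTrace {
    // Records which steps are reached when opening the given file.
    static List<String> trace(String filename) {
        List<String> steps = new ArrayList<String>();
        try {
            steps.add("before open");
            FileInputStream in = new FileInputStream(filename); // throws if the file is missing
            steps.add("after open");
            in.close();
        } catch (FileNotFoundException e) {
            steps.add("caught FileNotFoundException");
        } catch (IOException e) {
            steps.add("caught IOException");
        }
        return steps;
    }

    public static void main(String[] args) {
        // Hypothetical path that is assumed not to exist.
        System.out.println(trace("no-such-dir-xyz/no-such-file.gzbin"));
    }
}
```

Note also that the frames in the trace belong to ObjectOutputStream, the writing side, so the overflow itself may be happening while the network is dumped rather than in restoreGzipBinary().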