This is a new CVS snapshot release featuring a number of
bug fixes and the completed integration of a couple of
feature requests. As the latest bleeding-edge code
release, it may also contain the latest bugs, although
it passes the standard testing runs. Details of the
changes are provided below in the ChangeLog. Any
constructive feedback is welcome.
Download:
https://sourceforge.net/project/showfiles.php?group_id=63118&release_id=446448
* Fix bug #1539695, where a NullPointerException was thrown because
a valid instance of a feature extraction module was not necessarily available
with the new API of train(double[]) and classify(double[]). The problem
was first spotted and fixed in the Distributed MARF and released in its
PoC-demo; the fix has now made it to the main tree. The patch remedies the
problem by always keeping a local reference to the feature array regardless
of where it comes from.
* Fix two bugs in the Zipf's Law implementation, namely #1459461 and #1551592.
In the latter, the case where just a corpus filename is supplied was overlooked
and resulted in an invalid-options error. The former, an irregularly
occurring ArrayIndexOutOfBounds, was due to the fact that for large
corpora (and depending on the tokenizer settings) there exist words
with an occurrence frequency greater than 100, while we were collecting up
to one hundred only. For now this is fixed by issuing a warning that
beyond the page boundary no C(f,w) is computed. This ought to change
for the general case in the near future. Also clean up some comments
in the StatisticalObject.
* Implement a Serializable MARF Configuration object placeholder. This
basic implementation completes Feature Request #1539611 and is
already present in the PoC-demo version of the Distributed MARF.
All the config parameters can be encapsulated inside this object
and passed around over the network or serialized to the file system,
which may aid recovery or replication of the same configuration.
* Add a reader that allows pre-loading a file into a set of byte array
buffers, e.g. for later transmission. It originated as ByteArrayFileLoader
in the Distributed MARF and was renamed to ByteArrayFileReader. This
completes Feature Request #1539612.
* Add a basic JUnit test suite for the Sample class, specifically for its
serializability.
* Refactor Sample and MARFAudioFileFormat.
This decouples MARFAudioFileFormat as a dependency of Sample so that Sample
is truly Serializable, which fixes bug #1541308. An instance of
MARFAudioFileFormat is reconstructed upon deserialization in
the readObject() implementation. equals() is implemented to test
whether the saved and re-loaded samples are actually the same.
* Perform some minor code and comment cleanups.
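
The Sample refactoring entry above describes a common Java serialization
pattern: a non-serializable helper is kept out of the serialized form and
rebuilt in readObject(), and equals() is used to compare the saved and
re-loaded objects. The following is only a minimal sketch of that pattern,
not the actual MARF code. DemoSample, DemoAudioFormat, formatCode and
SampleRoundTripDemo are hypothetical names invented for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;

// Hypothetical stand-in for a format descriptor that is NOT Serializable.
class DemoAudioFormat {
    final int code;

    DemoAudioFormat(int code) {
        this.code = code;
    }
}

// Hypothetical stand-in for a sample; this is not the MARF Sample class.
class DemoSample implements Serializable {
    private static final long serialVersionUID = 1L;

    private final double[] data;
    private final int formatCode;              // plain int, serialized as-is
    private transient DemoAudioFormat format;  // skipped on write, rebuilt on read

    DemoSample(double[] data, int formatCode) {
        this.data = data.clone();
        this.formatCode = formatCode;
        this.format = new DemoAudioFormat(formatCode);
    }

    DemoAudioFormat getFormat() {
        return format;
    }

    // Called by the serialization runtime; restores the transient field.
    private void readObject(ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        in.defaultReadObject();
        this.format = new DemoAudioFormat(this.formatCode);
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof DemoSample)) return false;
        DemoSample that = (DemoSample) other;
        return formatCode == that.formatCode && Arrays.equals(data, that.data);
    }

    @Override
    public int hashCode() {
        return 31 * Arrays.hashCode(data) + formatCode;
    }
}

class SampleRoundTripDemo {
    // Serializes the sample to memory and reads it back.
    static DemoSample roundTrip(DemoSample sample)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(sample);
        }
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            return (DemoSample) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        DemoSample original = new DemoSample(new double[] {0.5, -0.25}, 7);
        DemoSample copy = roundTrip(original);
        System.out.println("equal after reload: " + original.equals(copy));
        System.out.println("format rebuilt: " + (copy.getFormat() != null));
    }
}
```

A serializability test along the lines of the JUnit entry above could
assert that roundTrip(sample).equals(sample) holds and that the rebuilt
format is not null.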
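
The ByteArrayFileReader entry above mentions pre-loading a file into a set
of byte array buffers for later transmission. As a rough illustration of
that idea only, the sketch below reads a file into fixed-size chunks. It
does not reproduce the ByteArrayFileReader API; the class name
ChunkedFilePreloader, the preload() method and the chunkSize parameter are
assumptions made for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not the MARF ByteArrayFileReader.
class ChunkedFilePreloader {
    // Reads the whole file into buffers of at most chunkSize bytes each.
    // Every buffer except possibly the last one is exactly chunkSize long.
    static List<byte[]> preload(Path file, int chunkSize) throws IOException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<byte[]> chunks = new ArrayList<>();
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[chunkSize];
            int filled = 0;
            int read;
            // read() may return fewer bytes than requested, so keep
            // filling the current buffer until it is complete.
            while ((read = in.read(buffer, filled, chunkSize - filled)) != -1) {
                filled += read;
                if (filled == chunkSize) {
                    chunks.add(buffer);
                    buffer = new byte[chunkSize];
                    filled = 0;
                }
            }
            if (filled > 0) {
                // Trim the final, partially filled buffer.
                chunks.add(Arrays.copyOf(buffer, filled));
            }
        }
        return chunks;
    }
}
```

The resulting list can then be sent buffer by buffer, or serialized as a
whole, without touching the file system again.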