I've been once more prodded into activity by Ilya, and so the SPR source repository is once again active on SourceForge. After my poll of preferred version control systems, I decided that it was best to stick with the centralised version control that we know and love, so the repository is now using Subversion (sort of "CVS++", if you're unfamiliar: most cvs commands will continue to work, with the "cvs" command itself exchanged for "svn"). I encourage wannabe contributors to use Bazaar (http://bazaar-vcs.org) with the bzr-svn plugin to make versioned patches and send them to me, thus avoiding the problem of needing lots of people to have commit access to the main repository.
I've just pushed up a new release of SPR, which does the ROOT detection / absence-handling better, and also uses Ilya's preferred compilation flags (using ROOT will downgrade the optimisation level somewhat).
Sorry to those who know that I've basically had this version ready for release for about 2 months and haven't had the time/energy to do the final push online. Oops.
On that note, I'd like again to appeal for anyone who's actively using the package to consider becoming a maintainer... while I hope to get more active again in future, it's been a very non-SPR year for me!
I've just pushed out a new release of SPR, version 3.3.1, which just fixes a few little bugs and a big compilation problem involving ROOT's linker flags. The whole ROOT detection and linking system is now pretty robust, I think.
(Note that the version number system has changed to the "time-invariant" major.minor.patch scheme used by most GNU projects, but this is essentially a bug-fix of SPR-08-02-00.)
SPR management has been taken over by Andy Buckley (andy dot buckley at durham dot ac dot uk). Please send your requests for support and development to him with a cc to me.
Most important changes in this release:
1) Gene Expression Programming implemented by Julian Bunn is now ready for use. See section 2.12 in README.
2) Piti Ongmongkolkul implemented matrix generation for the multi-class learner by Allwein-Schapire-Singer. The generator is capable of optimizing minimal or average Hamming distance between rows (classes) and/or columns (classifiers). See Section 2a.4 in README.
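To make the quantities in item 2 concrete, here is a minimal sketch of how the minimal and average Hamming distances between rows (classes) and columns (classifiers) of a code matrix could be computed. It is written in Python for brevity and is not SPR code: the function names are hypothetical, the example matrix is a plain one-vs-all code for three classes, and entries are assumed to be +1 or -1 only (no special treatment of zero entries).

```python
from itertools import combinations


def hamming(u, v):
    """Number of positions at which two equal-length code words differ."""
    if len(u) != len(v):
        raise ValueError("code words must have equal length")
    return sum(a != b for a, b in zip(u, v))


def pairwise_hamming(vectors):
    """Return (minimal, average) Hamming distance over all pairs of vectors."""
    dists = [hamming(u, v) for u, v in combinations(vectors, 2)]
    if not dists:
        raise ValueError("need at least two vectors")
    return min(dists), sum(dists) / len(dists)


def columns(matrix):
    """Transpose a row-major matrix so that each column becomes a vector."""
    return [list(col) for col in zip(*matrix)]


# Hypothetical example: one-vs-all code matrix, rows = classes, columns = classifiers.
one_vs_all = [
    [+1, -1, -1],
    [-1, +1, -1],
    [-1, -1, +1],
]

row_min, row_avg = pairwise_hamming(one_vs_all)           # separation between classes
col_min, col_avg = pairwise_hamming(columns(one_vs_all))  # separation between classifiers
print(row_min, row_avg, col_min, col_avg)
```

A generator of the kind described above would search over candidate matrices and keep the one that scores best on whichever of these four numbers the user asks to optimize; the search itself is not shown here.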
I have accepted a position with The MathWorks and am relocating to the Boston area in late May. As of mid May I will no longer develop SPR due to the non-competition agreement and the lack of time.
In the foreseeable future, I will continue to fix bugs and answer questions on SPR usage. Of course, this only covers algorithms implemented before June 2008. Bug fixes will be applied to the latest SPR release at Sourceforge and in some unusual circumstances can be propagated to earlier releases. There will be no support for customized SPR installs at BaBar, CMS or elsewhere. All queries should be sent to inarsky spamdies yahoo diesagain com.
SPR-08-00-00 implements a new multiclass learning algorithm and a new framework for handling missing values. See Release Notes and README.
SPR-07-08-00 implements the "add n remove r" model for variable selection, input variable normalization, and various fixes. See README and HISTORY for details.
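For readers unfamiliar with the idea, the sketch below shows one common reading of an "add n remove r" search: greedily add the n variables that improve a figure of merit most, then drop the r variables whose removal hurts least, and repeat until the subset reaches the desired size. This is a generic illustration in Python, not SPR's implementation, which may differ in detail; the function name, the `score` callback and the toy weights are all hypothetical.

```python
def add_n_remove_r(variables, score, n, r, target_size):
    """Greedy 'add n, remove r' subset search.

    score(subset) must return a number; higher means a better subset.
    """
    if not 0 <= r < n:
        raise ValueError("need 0 <= r < n so that the subset grows")
    selected, remaining = [], list(variables)
    while len(selected) < target_size and remaining:
        # Forward step: add up to n variables, one at a time, best first.
        for _ in range(min(n, len(remaining))):
            best = max(remaining, key=lambda v: score(selected + [v]))
            selected.append(best)
            remaining.remove(best)
        if len(selected) >= target_size or not remaining:
            break
        # Backward step: drop r variables whose removal leaves the best score.
        for _ in range(min(r, len(selected) - 1)):
            worst = max(selected,
                        key=lambda v: score([s for s in selected if s != v]))
            selected.remove(worst)
            remaining.append(worst)
    # Trim any overshoot from the last forward step.
    while len(selected) > target_size:
        worst = max(selected,
                    key=lambda v: score([s for s in selected if s != v]))
        selected.remove(worst)
    return selected


# Toy figure of merit (hypothetical): additive weights, with a penalty
# when the two redundant variables "a" and "b" are used together.
WEIGHTS = {"a": 3.0, "b": 2.0, "c": 1.0, "d": 0.5}


def toy_score(subset):
    total = sum(WEIGHTS[v] for v in subset)
    if "a" in subset and "b" in subset:
        total -= 2.5
    return total


chosen = add_n_remove_r(["a", "b", "c", "d"], toy_score, n=2, r=1, target_size=3)
print(chosen)
```

In a real analysis the `score` callback would train and evaluate a classifier on the candidate subset, which is why the number of calls, and hence the choice of n and r, matters for run time.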
Release SPR-07-06-01 introduces Principal Component Analysis implemented within a general framework for variable transformation and feature selection. Feature extraction methods other than PCA will be added in the future.
This is a much more mature (compared to the early 7 series) release of the ROOT interface. Several features have been added and several bugs have been fixed. The release has been tested against Root 5.14 and 5.16.
Major changes since SPR-07-02-01 include:
- Filtering on user-defined input classes. The user can now redefine classes as formulas by supplying a function that takes data variables as input and computes an integer class label. See README "7. Handling input variables" and exampleUserCuts.cc.
Users who like Root do not need to build two SPR versions - one for Ascii input/output and another one for Root input/output. If the package is built against Root, the user can now choose between the two formats for input and output. The input and output formats can be chosen independently, that is, you can opt to read from Ascii and write to Root or the other way around. See https://sourceforge.net/project/shownotes.php?group_id=178324&release_id=544539
If anyone experienced problems with SprOutputWriterApp crashing at the very end, that's because of a stupid bug that has been there since release V06-06-00. It is now fixed.
I updated INSTALL instructions for builds against Root. See
The user is now asked to set several Root environment variables and run the root/makedict.sh script before building the package. This is to avoid a possible conflict between rootcint installed on your machine and the default SprAdapterDict class autogenerated by rootcint on my desktop. Bug reports are encouraged.
I posted two SPR releases today, SPR-06-07-01 and SPR-07-00-00.
SPR-06-07-01 fixes (finally) the multi-class learner bug that caused incorrect assignment of classes for decision trees. Unfortunately, when I announced the fix last Friday, I fixed one thing but missed another. After the latest fix, I ran a bunch of tests on extreme cases (perfectly separated classes, classes with 1 event etc) and I don't see any problems. This bug really should be fixed now.
See release notes in https://sourceforge.net/project/shownotes.php?group_id=178324&release_id=522792
I updated INSTALL instructions with a list of known problems and fixes. See http://statpatrec.cvs.sourceforge.net/statpatrec/StatPatternRecognition/INSTALL?revision=1.5&view=markup
Tag V06-00-00 implements new methods, various convenience features and non-critical fixes.
The major change is the introduction of SprCombiner, a method for combining predictions by several powerful classifiers. A combiner class had been in the package for a long time but had limited capabilities. The re-worked combiner can combine arbitrary classifiers trained on arbitrary subsets of data. This is implemented using variable name matching. A section has been added in README giving instructions on using the combiner.
This version implements various convenience features upon request from Babar PID group. See HISTORY.
SPR-05-01-00 implements various improvements and minor fixes on top of the previous tag. For details, see HISTORY. The main change is the introduction of SprBaggerApp, an executable capable of bagging any classifier.
This tag introduces SprClassifierReader. Reading the saved configuration of a trained classifier from a file is handled by this class for any classifier. In previous tags, this functionality was spread among several reader classes such as SprAdaBoostDecisionTreeReader, SprBaggerDecisionTreeReader etc. Introduction of this new class required making numerous small changes to classifier formats. Because of these, this tag is not backwards compatible. You won't be able to read from classifier configuration files produced prior to this tag.
It slipped my mind and I forgot to put it in SPR-04-05-01
File release SPR-04-05-01 introduces a trainable neural net and boosted neural nets, also various minor fixes.
For the record, here is the list of all classifiers implemented in the package at the moment:
----------------------------------------
- decision split (or stump)
- decision tree (2 flavors)
- bump hunter (PRIM)
- LDA and QDA
- logistic regression
- boosting (discrete and real AdaBoost, epsilon-Boost)
- random forest
- arc-x4
- interfaces to two SNNS neural nets:
* backprop neural net with a logistic activation function
* RBF
StatPatternRecognition was first introduced in early 2005 and used by the High Energy Physics community for data analysis. Two notes describing the package were posted at the xxx.lanl.gov physics archive in summer 2005. Since then I have personally distributed this package under GPL to about 50 people, mostly from HEP but also from other fields of academia/industry. The package has also been available from the BaBar CVS (open to the whole BaBar Collaboration) and posted at the phystat.org archive. That version depended on CLHEP and CERNLIB (HEP-specific packages) and assumed that it was the user's responsibility to provide a Makefile.