From: william b. <wil...@us...> - 2009-01-25 08:20:00
Update of /cvsroot/jboost/jboost/scripts
In directory fdv4jf1.ch3.sourceforge.com:/tmp/cvs-serv23938/scripts

Added Files:
	VisualizeScores.DEMO.README
Removed Files:
	VisualizeScores.DEMO
Log Message:
File rename

--- NEW FILE: VisualizeScores.DEMO.README ---
Score visualization demo using spambase demo data and nfold validation:

Assumptions
==========
0. You have java, jython, and python installed.
1. You have downloaded the jboost dist from sourceforge.net and built jboost from source. You CANNOT use the prebuilt jboost.jar (version 1.4 or earlier) as we are using a yet-unreleased version of jboost.
2. You have set up your CLASSPATH and JBOOST_DIR environment variables per the install instructions on http://jboost.sourceforge.net/install.html
3. You are running on linux or mac. Instructions are for BASH shell. Mod as appropriate for your environment.

The Demo steps
==============
We are going to do nFold validation on the spambase demo data found in $JBOOST_DIR/demo.

1. Add some libs to your (existing) java CLASSPATH:

> export CLASSPATH=$CLASSPATH:$JBOOST_DIR/lib/jcommon-1.0.8.jar:$JBOOST_DIR/lib/jfreechart-1.0.10.jar:$JBOOST_DIR/lib/swing-layout-1.0.jar

2. cd to the demo directory:

> cd $JBOOST_DIR/demo

3. Add an index number to each row of example data and shuffle the examples. We have a python script to do this for you.

> ls spam*
spambase.data   spambase.test
spambase.spec   spambase.train

> ../scripts/AddRandomIndex.py spambase
wrote spambase_idx.data.
wrote spambase_idx.spec.

> ls spam*
spambase.data   spambase.test    spambase_idx.data
spambase.spec   spambase.train   spambase_idx.spec

4. Now we are going to perform Nfold crossvalidation using these new spambase_idx.* files. You can look at the nfold.py script to decipher the command line args.

> ../scripts/nfold.py --booster=LogLossBoost --folds=2 --data=spambase_idx.data --spec=spambase_idx.spec --rounds=50 --tree=ADD_ALL --generate
k: 0 start:0 end:2300
k: 1 start:2300 end:4600
*=---------------------------------------------------------------------=-*
* Fold 0 |
*============
java -Xmx1000M -cp :/Users/wbeaver/Documents/workspace/jboost/dist/jboost.jar:/Users/wbeaver/Documents/workspace/jboost/lib/concurrent.jar:/Users/wbeaver/Documents/workspace/jboost/lib/jcommon-1.0.8.jar:/Users/wbeaver/Documents/workspace/jboost/lib/jfreechart-1.0.10.jar:/Users/wbeaver/Documents/workspace/jboost/lib/swing-layout-1.0.jar jboost.controller.Controller -b LogLossBoost -p 3 -a -1 -S trial0 -n trial.spec -ATreeType ADD_ALL -numRounds 50
Fileloader adding . to path.
WARNING: configuration file jboost.config not found. Continuing...
Found trial.spec
Found trial0.train
Found trial.spec
Found trial0.test
Booster type: jboost.booster.LogLossBoost
Read 100 training examples
Read 200 training examples
Read 300 training examples
Read 400 training examples
Read 500 training examples
Read 600 training examples
[snip] you get the idea.

5. The results are placed in the directory ./cvdata-mm-dd-hh-mm-ss/<TREE-TYPE>

6. Now run the visualizer in the scripts directory. Assuming jython is not on your path, be explicit:

> ~/jython2.2.1/jython ../scripts/VisualizeScores.py cvdata-mm-dd-hh-mm-ss/ADD_ALL/trial

(Note: this example output shows the cvdata-mm-dd-hh-mm-ss for my test run.)

['cvdata-01-16-17-44-23/ADD_ALL/trial0.test.boosting.info', 'cvdata-01-16-17-44-23/ADD_ALL/trial1.test.boosting.info']
cvdata-01-16-17-44-23/ADD_ALL/trial0.test.boosting.info
cvdata-01-16-17-44-23/ADD_ALL/trial1.test.boosting.info
index=0, a.length=4600
index=1, a.length=4600
index=2, a.length=4600
index=3, a.length=4600
index=4, a.length=4600
index=5, a.length=4600
index=6, a.length=4600
[snip]

Assuming all is set up, you will then see the GUI.

7. Select a range of examples to output for a given iteration using the slider and press the "Print Example Indices" button in the lower left corner. This will save a file called "SelectedExamples.txt" into the directory you ran the GUI from (in this case $JBOOST_DIR/demo). Using this file, you can write a parser to drill back to your original examples.

[end]

--- VisualizeScores.DEMO DELETED ---
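
For step 7 of the README, here is a minimal sketch of what such a parser might look like. The README does not describe the layout of SelectedExamples.txt or of spambase_idx.data, so the sketch assumes (a) SelectedExamples.txt holds integer indices separated by whitespace or commas, and (b) each row of spambase_idx.data is comma-separated with the index in the first field. The function names and the index_column parameter are hypothetical; check the real files and adjust before relying on this.

```python
import re


def read_selected_indices(path):
    """Return the set of integer indices found in the file at `path`.

    ASSUMPTION: indices are plain integers separated by whitespace
    and/or commas. Tokens that are not all digits are ignored.
    """
    with open(path) as fh:
        tokens = re.split(r"[\s,]+", fh.read())
    return {int(tok) for tok in tokens if tok.isdigit()}


def find_examples(data_path, indices, index_column=0, sep=","):
    """Return the rows of `data_path` whose index field is in `indices`.

    ASSUMPTION: one example per line, fields separated by `sep`, with
    the integer index stored in field number `index_column`. A trailing
    ';' on the row, if present, is dropped before splitting.
    """
    matches = []
    with open(data_path) as fh:
        for line in fh:
            row = line.strip().rstrip(";")
            if not row:
                continue
            fields = [f.strip() for f in row.split(sep)]
            try:
                idx = int(fields[index_column])
            except (ValueError, IndexError):
                # Skip rows that do not carry a parsable index.
                continue
            if idx in indices:
                matches.append(row)
    return matches
```

Run from $JBOOST_DIR/demo, usage could be as simple as find_examples("spambase_idx.data", read_selected_indices("SelectedExamples.txt")). If the index turns out to be the last field rather than the first, pass index_column=-1.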