Hi,
Attached is a patch that fixes the “predictdistribution” command of “waffles_learn”.
The previous version didn’t work: it threw an exception, its required input was unclear, and its output wasn’t compatible with the rest of the waffles_learn commands.
The current implementation requires the same parameters as “predict” and returns the predictions in the form of ARFF output.
It also uses the corresponding relation, column and value names (the latter for categorical columns).
The implementation tries to use memory sparingly, avoiding excessive allocations and copying.
Here is what the output of the new implementation looks like when predicting the distribution for 1 and 2 label columns:
./waffles-code/bin/waffles_learndbg predictdistribution randomforest1.model ./ionosphere.arff > predictdistribution.1.out
@RELATION ionosphere
@ATTRIBUTE b numeric
@ATTRIBUTE g numeric
@data
0.1,0.9
0.8,0.2
0,1
0.9,0.1
...
./waffles-code/bin/waffles_learndbg predictdistribution randomforest2.model ./ionosphere.arff -labels 33,34 > predictdistribution.2.out
@RELATION ionosphere
@ATTRIBUTE a34-mean numeric
@ATTRIBUTE a34-variance numeric
@ATTRIBUTE b numeric
@ATTRIBUTE g numeric
@data
-0.427727,0.0638013,0.2,0.8
-0.079298,0.0136724,0.9,0.1
-0.319678,0.0165995,0,1
0.8812,0.127021,0.9,0.1
...
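For reading the two-label output, the following is a minimal Python sketch, not part of the patch. It assumes the column naming seen in the output above: a numeric label yields a "<name>-mean" and a "<name>-variance" column, and every remaining column is the probability of one categorical value. The function name summarize is hypothetical, and the rows are copied from the output above.

```python
import math

# Attribute names and rows copied from the two-label output shown above.
attributes = ["a34-mean", "a34-variance", "b", "g"]
rows = [
    [-0.427727, 0.0638013, 0.2, 0.8],
    [-0.079298, 0.0136724, 0.9, 0.1],
    [-0.319678, 0.0165995, 0.0, 1.0],
    [0.8812, 0.127021, 0.9, 0.1],
]


def summarize(attributes, row):
    """Split one output row into numeric and categorical predictions.

    Assumption (inferred from the output above, not from the patch itself):
    numeric labels appear as '<name>-mean' / '<name>-variance' pairs and all
    other columns are class probabilities.
    """
    values = dict(zip(attributes, row))
    numeric = {}  # label name -> (mean, standard deviation)
    probs = {}    # categorical value -> probability
    for name, value in values.items():
        if name.endswith("-mean"):
            base = name[: -len("-mean")]
            variance = values[base + "-variance"]
            numeric[base] = (value, math.sqrt(variance))
        elif name.endswith("-variance"):
            continue  # already consumed together with its mean
        else:
            probs[name] = value
    return numeric, probs


for row in rows:
    numeric, probs = summarize(attributes, row)
    print(numeric, probs)
```

A categorical value whose name happens to end in "-mean" or "-variance" would be misread by this sketch, so it is only suitable for data like the example here.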
Here are the predictions for the same data, this time made with “predict”:
./waffles-code/bin/waffles_learndbg predict randomforest.model ./ionosphere.arff > predict.1.out
@RELATION ionosphere
@ATTRIBUTE class {b,g}
@DATA
g
b
g
b
...
./waffles-code/bin/waffles_learndbg predict randomforest2.model ./ionosphere.arff -labels 33,34 > predict.2.out
@RELATION ionosphere
@ATTRIBUTE a34 real
@ATTRIBUTE class {b,g}
@DATA
-0.427727,g
-0.079298,b
-0.319678,g
0.8812,b
...
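The single-label outputs above can be cross-checked against each other: for each row, the class with the highest probability in the “predictdistribution” output should match the class reported by “predict”. Below is a minimal Python sketch of that check, not part of the patch. The helper names parse_arff and most_likely_classes are hypothetical, the parser handles only the simple dense ARFF shown in this post (not the full ARFF format), and the sample text is the first four rows from the outputs above with the trailing "..." left out.

```python
def parse_arff(text):
    """Return (attribute_names, rows) from a small, dense ARFF document."""
    names, rows, in_data = [], [], False
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and ARFF comments
        lower = line.lower()
        if in_data:
            rows.append(line.split(","))
        elif lower.startswith("@attribute"):
            names.append(line.split()[1])
        elif lower.startswith("@data"):
            in_data = True
    return names, rows


def most_likely_classes(distribution_text):
    """Pick, per row, the attribute name with the highest probability."""
    names, rows = parse_arff(distribution_text)
    result = []
    for row in rows:
        probs = [float(v) for v in row]
        # On a tie, the first column wins; whether "predict" breaks ties
        # the same way is not known here.
        result.append(names[probs.index(max(probs))])
    return result


DISTRIBUTION = """@RELATION ionosphere
@ATTRIBUTE b numeric
@ATTRIBUTE g numeric
@data
0.1,0.9
0.8,0.2
0,1
0.9,0.1
"""

PREDICT = """@RELATION ionosphere
@ATTRIBUTE class {b,g}
@DATA
g
b
g
b
"""

predicted = [row[0] for row in parse_arff(PREDICT)[1]]
print(most_likely_classes(DISTRIBUTION) == predicted)  # prints True
```

For the four rows shown, both commands agree on g, b, g, b.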
The models were built on ionosphere.arff, and the predictions were made on the same dataset.
The dataset is attached. The output files with the predictions mentioned above are also attached for reference.
If you have any notes or feedback on the patch, please let me know and I’ll take care of them as soon as possible.
Best regards,
Krasi
P.S. To apply the patch over master:
git apply --stat predictdistribution.patch
git apply --check predictdistribution.patch
git apply < predictdistribution.patch
Last edit: Krasimir Marinov 2015-04-03
Since I'm unable to add attachments on topic creation, I'm replying to myself with the attachments.
Please find the updated patch, fixing a compilation error on Mac OS.
Here is the patch in "standard" format rather than as a git patch.
Thanks Krasimir! I have applied the patch and pushed it. (If you plan to do more development in the future, I could add you as a developer to the project. Just let me know.)
Mike
Thanks Mike! I'd indeed be happy to be added as a developer to the project! I use it extensively, so whenever a need for a feature or a fix comes up, I plan to contribute it. I run it on almost all Linux flavours (including Docker containers) and on Mac OS.
Thanks again for your time and the library!
done.