From: Martin G. <mar...@gm...> - 2013-08-15 10:18:00
Hi,

First of all: hello to everybody on the list, and thanks to the authors of Clus for providing such a nice tool.

We are using Clus as one possible multi-label classification (MLC) method, applied to our toxicological dataset. The data has around 700 instances, around 25 labels, and lots (!) of missing labels. Unfortunately, Clus (and the other MLC methods) do not perform as well as we would like: Clus predictions reach only around 60 percent accuracy and AUC.

Are there any settings especially suited to datasets with many missing labels? So far, I have tried GainRatio instead of VarianceReduction as the heuristic.

I have also disabled pruning, but this had no effect: the results are identical to those with Pruning=C4.5. Is there an explanation for this?

Thanks and kind regards,
Martin

P.S.: I have adjusted the Clus source code to commons-math3-3.2 and weka-3-7-6. If anyone is interested, I can share my changes.

--
Dipl-Inf. Martin Gütlein
Phone: +49 (0)761 203 8442 (office)
       +49 (0)177 623 9499 (mobile)
Email: gue...@in...