[Clus-general] parameter tweaking

Hi,

First of all: hello to everybody on the list, and thanks to the
authors of Clus for providing such a nice tool.

We are using Clus as one possible multi-label classification (MLC)
method, applied to our toxicological dataset. The data has around 700
instances, around 25 labels, and lots (!) of missing labels.

Unfortunately, Clus (like the other MLC methods) does not perform as
well as we would like (Clus predictions reach only around 60 percent
accuracy and AUC).

Are there any settings especially suited for datasets with many
missing labels? So far, I have tried using GainRatio instead of
VarianceReduction as the heuristic.

I have also disabled pruning, but it had no effect (the results are
identical to those with Pruning=C4.5). Is there an explanation for this?

Thanks and kind regards,
Martin

P.S.:
I have adjusted the Clus source code to commons-math3-3.2 and
weka-3-7-6. If anyone is interested, I can share my changes.

-- 
Dipl-Inf. Martin Gütlein
Phone:
+49 (0)761 203 8442 (office)
+49 (0)177 623 9499 (mobile)
Email:
gue...@in...