Name Modified Size
tmc2007-500-train.arff 2014-09-30 3.2 MB
tmc2007-500-test.arff 2014-09-30 1.0 MB
README.txt 2014-09-30 661 Bytes
yeast-train.arff 2014-09-30 1.5 MB
yeast-test.arff 2014-09-30 918.5 kB
scene-train.arff 2014-09-30 3.1 MB
scene-test.arff 2014-09-30 3.1 MB
music-test.arff 2014-09-30 133.0 kB
medical-train.arff 2014-09-30 98.9 kB
genbase-train.arff 2014-09-30 1.7 MB
genbase-test.arff 2014-09-30 758.7 kB
enron-train.arff 2014-09-30 607.8 kB
enron-test.arff 2014-09-30 324.2 kB
emotions-train.arff 2014-09-30 252.1 kB
emotions-test.arff 2014-09-30 132.0 kB
Totals: 15 items, 16.8 MB
These datasets are supplied as train-test splits, in the same way as those at http://mulan.sourceforge.net/datasets-mlc.html
They are useful for comparing against published results, e.g., Madjarov et al., "An extensive experimental comparison of methods for multi-label learning" (among many others).
Here the -C flag is set in the header of each file with the number of labels for Meka, so it does not need to be specified at run time. Simply use the flags:

	 -t emotions-train.arff -T emotions-test.arff
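
As a minimal sketch of how these flags might be passed to Meka from a script, the following Python snippet builds the full argument list. The classifier name (meka.classifiers.multilabel.BR) and the classpath (./lib/*) are assumptions chosen for illustration and are not stated in this README; adjust both to match the local Meka installation.

```python
import shlex


def meka_command(train_path, test_path,
                 classifier="meka.classifiers.multilabel.BR",  # assumed classifier
                 classpath="./lib/*"):                         # assumed jar location
    """Build the argument list for a Meka run on a train/test split.

    No -C option is added, because the label count is already stored
    in the header of each dataset file.
    """
    return ["java", "-cp", classpath, classifier,
            "-t", train_path, "-T", test_path]


cmd = meka_command("emotions-train.arff", "emotions-test.arff")
# Print a shell-quoted form that can be copied into a terminal.
print(shlex.join(cmd))
```

The list can be handed directly to subprocess.run(cmd, check=True) once Java and the Meka jars are available.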

Note that these datasets are in 'Mulan format' (with the class attributes at the end). Meka will automatically move the class attributes to the beginning before classification. This may take a few extra seconds for very large datasets.
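
To illustrate how a -C value stored in the header relates to the position of the class attributes, here is a rough Python sketch. It assumes the option appears in the @relation line in the form "-C N", and that a negative N means the last |N| attributes are the labels (Mulan layout) while a positive N means the first N. Both conventions are assumptions to check against the actual files, and the header string used below is hypothetical rather than copied from any dataset.

```python
import re


def label_count_from_header(relation_line):
    """Return the integer that follows -C in an @relation line, or None."""
    match = re.search(r"-C\s+(-?\d+)", relation_line)
    return int(match.group(1)) if match else None


def label_indices(num_attributes, c):
    """Return zero-based indices of the label attributes.

    Assumed convention: c > 0 selects the first c attributes,
    c < 0 selects the last |c| attributes.
    """
    if c >= 0:
        return list(range(c))
    return list(range(num_attributes + c, num_attributes))


# Hypothetical header line, for illustration only.
header = "@relation 'example: -C -6'"
c = label_count_from_header(header)
print(c, label_indices(78, c))
```

Reading only the @relation line is enough here, so the check stays fast even for the larger files.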
Source: README.txt, updated 2014-09-30