Output folder contents (name, last modified):
wilt 2020-05-13
wdbc 2020-05-13
wall-robot-navigation 2020-05-13
vowel 2020-05-13
vehicle 2020-05-13
tic-tac-toe 2020-05-13
texture 2020-05-13
steel-plates-fault 2020-05-13
splice 2020-05-13
spambase 2020-05-13
sick 2020-05-13
semeion 2020-05-13
segment 2020-05-13
satimage 2020-05-13
qsar-biodeg 2020-05-13
phoneme 2020-05-13
PhishingWebsites 2020-05-13
pendigits 2020-05-13
pc3 2020-05-13
pc1 2020-05-13
ozone-level-8hr 2020-05-13
optdigits 2020-05-13
MiceProtein 2020-05-13
mfeat-zernike 2020-05-13
mfeat-pixel 2020-05-13
mfeat-morphological 2020-05-13
mfeat-karhunen 2020-05-13
mfeat-fourier 2020-05-13
mfeat-factors 2020-05-13
madelon 2020-05-13
letter 2020-05-13
kr-vs-kp 2020-05-13
kc2 2020-05-13
kc1 2020-05-13
jm1 2020-05-13
isolet 2020-05-13
Internet-Advertisements 2020-05-13
ilpd 2020-05-13
har 2020-05-13
GesturePhaseSegmentationProcessed 2020-05-13
first-order-theorem-proving 2020-05-13
eucalyptus 2020-05-13
dresses-sales 2020-05-13
dna 2020-05-13
diabetes 2020-05-13
cylinder-bands 2020-05-13
credit-g 2020-05-13
credit-approval 2020-05-13
cnae-9 2020-05-13
cmc 2020-05-13
climate-model-simulation-crashes 2020-05-13
churn 2020-05-13
car 2020-05-13
breast-w 2020-05-13
blood-transfusion-service-center 2020-05-13
Bioresponse 2020-05-13
banknote-authentication 2020-05-13
balance-scale 2020-05-13
analcatdata_dmft 2020-05-13
analcatdata_authorship 2020-05-13
Totals: 60 Items

Here you can find all the files and supplementary materials for the paper "Decoding machine learning benchmarks", published in BRACIS20. The files are organized as follows:

Results BRACIS

  • This folder gathers all the generated results that are used in the paper.
    datasets.csv: List of the OpenML IDs of the datasets used in the paper. This list serves as input for the first script.
    clf_rating.csv: File containing the classifier rating ranking that is shown in Table 1 of the paper.
    Real_clf_nemenyi.csv: P-value matrix resulting from the Nemenyi calculation for the real classifiers.
    IRT_param_freq.txt: File that shows the percentages of difficult, discriminating and easy-to-guess instances for all datasets.
    modelosML.txt: File that lists all hyperparameters used in ML models that were analyzed in the paper.
    Fluxograma.png: Flowchart of the decodIRT execution, shown in Figure 1 of the paper.
    graph_percIRT.png: Image of the graph that compares the percentages of difficult and discriminating instances of the datasets, shown in Figure 2 of the paper.
    jm1_score.png: Image of the comparison chart between the True-Score obtained by the classifiers in the "jm1" dataset, shown in Figure 3 of the paper.
    heatmap_realclf.png: Image of the heatmap used to analyze the results of the Nemenyi test, shown in Figure 4 of the paper.
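
    As an illustration of how a p-value matrix such as Real_clf_nemenyi.csv might be read, the minimal sketch below lists the classifier pairs whose Nemenyi p-value falls below a chosen significance level. It assumes a square CSV with classifier names in both the header row and the first column. That layout, the function name `significant_pairs`, the classifier names and the sample values are all assumptions made for illustration and are not taken from the repository.

```python
import csv
import io


def significant_pairs(csv_text, alpha=0.05):
    """Return (clf_a, clf_b, p) for each unordered pair with p < alpha.

    Assumed (hypothetical) layout: header row and first column both hold
    the classifier names; the remaining cells hold pairwise p-values.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    names = rows[0][1:]  # skip the empty top-left corner cell
    pairs = []
    for i, row in enumerate(rows[1:]):
        # Only visit the upper triangle so each pair is reported once.
        for j in range(i + 1, len(names)):
            p = float(row[j + 1])
            if p < alpha:
                pairs.append((row[0], names[j], p))
    return pairs


# Made-up values, for illustration only.
sample = """,A,B,C
A,1.0,0.01,0.6
B,0.01,1.0,0.2
C,0.6,0.2,1.0
"""
print(significant_pairs(sample))
```

    To apply the same idea to the real file, the text could be read from disk and passed in, once the actual layout has been checked.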

Output

  • This folder contains all the results generated for each dataset after the execution of each of the three scripts. All results are divided into folders named after each dataset. Each folder contains the following files:

    • Results of the decodIRT_OtML script:
      dataset_name.csv: The file without a suffix contains the responses of each ML model, covering both the real and the artificial classifiers.
      dataset_name_acuracia.csv: The file with the suffix “_acuracia” contains a table with the average accuracy of each real classifier during cross-validation.
      dataset_name_final.csv: The file with the suffix “_final” contains a table with the accuracy of the real classifiers on the instances held out for testing.
      dataset_name_irt.csv: Like the file without a suffix (dataset_name.csv), this file holds a response matrix. However, it contains a response vector for every real, artificial and MLP classifier. This matrix is used to generate the IRT item parameters in the second script.
      dataset_name_mlp.csv: Contains the final accuracy obtained by the first set of MLP classifiers after classification.
      dataset_name_test.csv: List of all data instances that belong to the test set.
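
      To make the idea of a response matrix concrete, the sketch below builds one from predictions on a test set: each classifier gets a vector with 1 where it classified the instance correctly and 0 otherwise. The function names, the classifier names and the dictionary-based layout are hypothetical; the orientation and exact format of the CSV files produced by decodIRT_OtML are not described here and may differ.

```python
def response_matrix(true_labels, predictions):
    """Map each classifier name to its binary response vector.

    predictions: dict of classifier name -> list of predicted labels,
    aligned with true_labels (hypothetical in-memory layout).
    """
    return {
        name: [int(pred == truth) for pred, truth in zip(preds, true_labels)]
        for name, preds in predictions.items()
    }


def accuracy(responses):
    """Fraction of correct answers in one response vector."""
    return sum(responses) / len(responses)


# Toy data, for illustration only.
y_test = [0, 1, 1, 0]
preds = {"clf_a": [0, 1, 0, 0], "clf_b": [1, 1, 1, 0]}
matrix = response_matrix(y_test, preds)
print(matrix["clf_a"], accuracy(matrix["clf_a"]))
```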

    • Results of the decodIRT_MLtIRT script:
      irt_item_param.csv: Table containing all item parameters (Difficulty, Discrimination and Guessing) generated for the test set instances.
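
      The three item parameters listed above match those of the three-parameter logistic (3PL) IRT model. Assuming that model is the one in use, which is not stated here, the probability that a respondent with ability theta answers an item correctly can be sketched as follows. The function name is hypothetical, and some implementations additionally multiply the exponent by a scaling constant (commonly 1.7), which this sketch omits.

```python
import math


def icc_3pl(theta, discrimination, difficulty, guessing):
    """Probability of a correct response under the standard 3PL model.

    P(theta) = c + (1 - c) / (1 + exp(-a * (theta - b)))
    with a = discrimination, b = difficulty, c = guessing.
    """
    logistic = 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))
    return guessing + (1.0 - guessing) * logistic


# When ability equals difficulty, the logistic term is 0.5,
# so the result is halfway between the guessing floor and 1.
print(icc_3pl(theta=0.0, discrimination=1.0, difficulty=0.0, guessing=0.2))
```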

    • Results of the decodIRT_analysis script:
      score_disPositivo.csv: Table containing the True-Score obtained by each real classifier, considering only the instances with positive discrimination.
      score_total.csv: Table containing the True-Score obtained by each real classifier, considering all instances.
      theta_list.csv: Table that shows the final Theta value obtained by each real classifier.
      dataset_name_score.png: Image of the comparison chart between the True-Score obtained by the real and artificial classifiers in the dataset.
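
      In standard IRT, a true score is the sum over items of the probability of a correct response at a given ability. The sketch below applies that textbook definition under a 3PL model, with an option to keep only items with positive discrimination, mirroring the distinction between score_total.csv and score_disPositivo.csv. Whether decodIRT_analysis computes its True-Score in exactly this way is an assumption; the function name and the sample items are hypothetical.

```python
import math


def true_score(theta, items, positive_only=False):
    """Sum of 3PL success probabilities over items.

    items: iterable of (discrimination, difficulty, guessing) tuples.
    positive_only: if True, skip items whose discrimination is <= 0.
    """
    total = 0.0
    for a, b, c in items:
        if positive_only and a <= 0:
            continue
        total += c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))
    return total


# Two made-up items: one with positive and one with negative discrimination.
items = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
print(true_score(0.0, items))
print(true_score(0.0, items, positive_only=True))
```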

Scripts

Source: README.md, updated 2020-05-28