Peter Reutemann

Using KeplerWeka

Screenshots

The gallery hosts screenshots of example workflows.

Example workflows

Each release of the KeplerWeka actors (binary or source code) contains a sub-directory called workflows with example workflows (incl. the necessary files) that explain most of the available actors.

You can also download the workflows separately: go to the download section of the workflows and select the appropriate one.

After unpacking the workflows (Windows users will need an archive program that can handle gzip'ed tar files, e.g., 7-zip), you can just open them in Kepler and run them. Some of the workflows might require you to adjust the file/directory paths of datasets/models/etc.

I want to…

...perform a task with the KeplerWeka actors, but don't know how to connect them. The following sections cover some basic usage of these actors, each with a short description, followed by the example workflows that perform this task.

NB: The following information is always based on the latest release.

...load a dataset

  • Use the FileReader actor. It automatically determines the converter for the data, based on the extension of the filename, but you can also specify an explicit converter and configure it to your needs.
  • Set the filename of the dataset that you want to load.
  • The dataset output port emits the whole dataset at once (batch mode); the instance output port emits it row by row (used for incremental classifiers).
  • example workflow(s):
    • batch mode
      • associator-01-build_output
      • attribute_selection-01-display_output
      • attribute_selection-02-plot_merit
      • classifier-02-crossvalidate-output_models_summary_roc
      • classifier-03-random_split
      • classifier-04-testset_evaluation
      • classifier-06-crossvalidate_multiple_files
      • classifier-07-crossvalidate_multiple_times-save_roc
      • classifier-08-crossvalidate_multiple_files-on_multiple_classifiers
      • clusterer-01-build-output_summary_graph
      • clusterer-02-crossvalidate-output_summary
      • clusterer-03-build_and_predict
      • clusterer-04-crossvalidate-output_summary-on_multiple_setups
      • misc-01-filter_train_and_test (train)
      • misc-02-train_test_generation_with random_split
      • misc-03-matrix_conversions
    • incremental mode
      • classifier-05-incremental_train_and_predict
      • misc-01-filter_train_and_test (test)
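
The two ideas above (picking a converter from the file extension, and batch versus row-by-row output) can be illustrated with a small plain-Java sketch. It does not use the Kepler or Weka APIs: the class `DatasetLoadingSketch`, its methods and the extension table are hypothetical and only mirror the behaviour described for the FileReader actor.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Consumer;

class DatasetLoadingSketch {
    // Illustrative extension-to-converter table (an assumption for this
    // sketch, not read from Weka or KeplerWeka).
    static final Map<String, String> CONVERTERS = Map.of(
            "arff", "ArffLoader",
            "csv", "CSVLoader");

    /** Returns the converter name for a filename, or null if the extension is unknown. */
    static String converterFor(String filename) {
        int dot = filename.lastIndexOf('.');
        if (dot < 0 || dot == filename.length() - 1) {
            return null; // no extension to go by
        }
        String ext = filename.substring(dot + 1).toLowerCase(Locale.ROOT);
        return CONVERTERS.get(ext);
    }

    /** Batch mode: all rows are handed over in a single call. */
    static void emitBatch(List<double[]> rows, Consumer<List<double[]>> datasetPort) {
        datasetPort.accept(rows);
    }

    /** Incremental mode: one call per row. */
    static void emitIncremental(List<double[]> rows, Consumer<double[]> instancePort) {
        for (double[] row : rows) {
            instancePort.accept(row);
        }
    }
}
```

A consumer attached to the batch side sees the data once, as a whole; a consumer attached to the incremental side is called once per row, which is what an incremental classifier needs.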

...filter data

  • The Filter actor allows you to apply any Weka filter to the data.
  • The inputOne/outputOne ports are for the first batch of data (a weka.core.Instances object), used for initializing the filter.

        ... --[somePort - inputOne] --> Filter --[outputOne - somePort]--> ...
    
  • The inputTwo/outputTwo ports are used to pass through a second batch of data (another weka.core.Instances object), using the already initialized filter.

        ... --[somePort - inputOne] --> +---------+ --[outputOne - somePort]--> ...
                                        |  Filter |
        ... --[somePort - inputTwo] --> +---------+ --[outputTwo - somePort]--> ...
    
  • The inputSingle/outputSingle ports can be used to pass single weka.core.Instance objects through the filter.
    Caution: batch filters will be trained with the first weka.core.Instance object that is passed through the filter. This, of course, makes no sense for filters like ReplaceMissingValues, as no useful mean/mode can be derived from a single instance.

        ... --[somePort - inputSingle] --> Filter --[outputSingle - somePort]--> ...
    
  • example workflow(s):

    • misc-01-filter_train_and_test
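
Why the first batch matters can be shown with a plain-Java sketch of a mean-imputing filter. This is not the Weka ReplaceMissingValues implementation: the class `MeanImputerSketch` and its methods are hypothetical, rows are plain `double[]` arrays, and `NaN` stands in for a missing value.

```java
class MeanImputerSketch {
    private double[] means; // learned from the first batch only

    /** First batch: learn per-column means from the observed values, then fill the gaps. */
    double[][] initAndFilter(double[][] firstBatch) {
        if (firstBatch.length == 0) {
            throw new IllegalArgumentException("first batch must not be empty");
        }
        int cols = firstBatch[0].length;
        means = new double[cols];
        for (int c = 0; c < cols; c++) {
            double sum = 0;
            int observed = 0;
            for (double[] row : firstBatch) {
                if (!Double.isNaN(row[c])) {
                    sum += row[c];
                    observed++;
                }
            }
            // If nothing was observed, no mean can be derived for this column.
            means[c] = observed == 0 ? Double.NaN : sum / observed;
        }
        return filter(firstBatch);
    }

    /** Later batches: reuse the stored means without re-learning them. */
    double[][] filter(double[][] batch) {
        if (means == null) {
            throw new IllegalStateException("filter has not been initialized");
        }
        double[][] out = new double[batch.length][];
        for (int r = 0; r < batch.length; r++) {
            out[r] = batch[r].clone();
            for (int c = 0; c < out[r].length; c++) {
                if (Double.isNaN(out[r][c])) {
                    out[r][c] = means[c];
                }
            }
        }
        return out;
    }
}
```

Initializing this sketch with a single row that has a missing value leaves that value missing, because there is nothing to average. That is the situation the caution above warns about.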

...build an associator

  • TODO
  • example workflow(s):
    • associator-01-build_output

...perform attribute selection

  • TODO
  • example workflow(s):
    • attribute_selection-01-display_output
    • attribute_selection-02-plot_merit

...train a classifier and output its model

  • TODO
  • example workflow(s):
    • classifier-01-train_on_dataset-output_tree
    • classifier-02-crossvalidate-output_models_summary_roc
    • classifier-03-random_split
    • classifier-04-testset_evaluation

...train a classifier and save the generated model

  • You will need a Classifier and a ModelWriter actor.
  • Connect the built output port of the Classifier actor with the model input port of the ModelWriter.

        ... --> Classifier --[built - model]--> ModelWriter
    
  • Set the correct filename of the model file in the ModelWriter actor.

  • The modelType property must be set to Classifier.
  • example workflow(s):
    • classifier-01-train_on_dataset-output_tree
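
Weka stores models as serialized Java objects. Assuming the ModelWriter actor does the same (an assumption, not confirmed on this page), what ends up in the model file can be sketched with standard Java serialization. The class `ModelFileSketch` and the `ToyModel` stand-in are hypothetical and do not use the Kepler or Weka APIs.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

class ModelFileSketch {
    /** Hypothetical stand-in for a trained model. */
    static class ToyModel implements Serializable {
        private static final long serialVersionUID = 1L;
        final double threshold;

        ToyModel(double threshold) {
            this.threshold = threshold;
        }

        String classify(double x) {
            return x >= threshold ? "yes" : "no";
        }
    }

    /** Writes the model object to the given file. */
    static void write(Path file, Serializable model) {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(model);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads a model object back; the caller casts it to the expected type. */
    static Object read(Path file) {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("model class is not on the classpath", e);
        }
    }
}
```

The file itself only holds an object, so whoever reads it has to know which kind of model to expect and cast accordingly.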

...output the classifier predictions

  • TODO
  • example workflow(s):
    • classifier-05-incremental_train_and_predict

...use a serialized model

  • TODO

...cross-validate a classifier

  • TODO
  • example workflow(s):
    • classifier-02-crossvalidate-output_models_summary_roc
    • classifier-06-crossvalidate_multiple_files
    • classifier-07-crossvalidate_multiple_times-save_roc
    • classifier-08-crossvalidate_multiple_files-on_multiple_classifiers

...display a classifier's ROC

  • TODO
  • example workflow(s):
    • classifier-02-crossvalidate-output_models_summary_roc
    • classifier-07-crossvalidate_multiple_times-save_roc

...build a clusterer and save the generated model

  • You will need a Clusterer and a ModelWriter actor.
  • Connect the built output port of the Clusterer actor with the model input port of the ModelWriter.

        ... --> Clusterer --[built - model]--> ModelWriter
    
  • Set the correct filename of the model file in the ModelWriter actor.

  • The modelType property must be set to Clusterer.
  • example workflow(s):
    • clusterer-01-build-output_summary_graph

...output the cluster assignments

  • TODO
  • example workflow(s):
    • clusterer-03-build_and_predict

...run the Experimenter

  • The Experiment actor allows you to perform the same experiments and evaluations as the Basic setup in the Weka Experimenter. In addition, you can feed weka.core.Instances, File and String objects (and arrays of these objects) into this actor, which enables experiments with dynamically generated data. File and String objects point to datasets.
  • For the evaluation of the experiment, you need the ExperimentEvaluation actor. This actor allows you to set the same parameters for performing the test as the Weka Experimenter.
  • For displaying the generated result, just use a Display or TextDisplay actor.

        ... --[somePort - input]--> Experiment --[setup - input]--> ExperimentEvaluation --[output - input]--> Display
    
  • example workflow(s):

    • experiment-01-multiple_files
    • experiment-02-multiple_files-parameter_sweep

Related

Wiki: Home
Wiki: InstallingKeplerWeka-1.0.x
Wiki: InstallingKeplerWeka-2.0
Wiki: InstallingKeplerWeka
