Re: [Databionic-ESOM-User] Background Visualizations

Katerina Mitrokotsa wrote:
> Dear Fabian,
> could you please let me know
> 1. Can using different background visualizations (P-Matrix, U-Matrix)
> lead to different classification results?
> Is the algorithm performed differently for different visualizations
> (P-Matrix, Two-Match, U-Matrix)?

I'm not sure if I understand your question, but let me try to clarify.
The U-Matrix and the Two-Match are both distance-based visualizations,
i.e. large values (= mountains) represent large distances in the dataset.
These visualizations can be used with the floodfill option on the class
mask page to select valleys as clusters, if this is what you mean by
classification.
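
To illustrate the idea of distance-based heights and selecting a valley
by flood fill, here is a minimal Python sketch. It is not the code used
in the ESOM tools: the function names u_heights and flood_fill, the
4-neighbour non-toroidal grid, and the fixed threshold are all
assumptions made for this example.

```python
import numpy as np

def u_heights(weights):
    """Simplified U-height: mean distance from each neuron's weight
    vector to its 4 grid neighbours. weights has shape (rows, cols, dim).
    Assumes a flat (non-toroidal) grid with at least two neurons."""
    rows, cols, _ = weights.shape
    total = np.zeros((rows, cols))
    count = np.zeros((rows, cols))
    for dr, dc in ((0, 1), (1, 0)):          # right and down neighbours
        a = weights[: rows - dr, : cols - dc]
        b = weights[dr:, dc:]
        d = np.linalg.norm(a - b, axis=2)    # distance of each neighbour pair
        total[: rows - dr, : cols - dc] += d
        total[dr:, dc:] += d
        count[: rows - dr, : cols - dc] += 1
        count[dr:, dc:] += 1
    return total / count

def flood_fill(heights, seed, threshold):
    """Mark all cells connected to seed whose height is below threshold."""
    rows, cols = heights.shape
    mask = np.zeros((rows, cols), dtype=bool)
    stack = [seed]
    while stack:
        r, c = stack.pop()
        if not (0 <= r < rows and 0 <= c < cols):
            continue
        if mask[r, c] or heights[r, c] >= threshold:
            continue
        mask[r, c] = True
        stack.extend([(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)])
    return mask

# Toy map: the left half and the right half hold very different weights,
# so a "mountain" of large distances separates two flat valleys.
weights = np.zeros((4, 6, 1))
weights[:, 3:, 0] = 10.0
heights = u_heights(weights)
valley = flood_fill(heights, seed=(0, 0), threshold=1.0)
```

In this toy map the fill started in the left valley stops at the ridge
between the two halves, which is the behaviour you want when a valley
should become one cluster.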

The P-Matrix is a density-based visualization, i.e. large values
represent large densities in the data. This should _not_ be used with
the floodfill automation; maybe we should add an inverted P-Matrix or an
inverted floodfill for this. But as with any visualization, you can
always manually label regions as clusters. For the P-Matrix you would
typically select connected regions with large densities.
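
As a rough illustration of a density-based height and of the "inverted"
idea mentioned above, here is a small Python sketch. It assumes that the
density at a neuron is the number of data points within a fixed radius
of its weight vector; the function name p_heights, the radius value, and
the inversion formula are assumptions for this example, not the ESOM
tools' implementation.

```python
import numpy as np

def p_heights(weights, data, radius):
    """Count, for each neuron, the data points within radius of its
    weight vector. weights: (rows, cols, dim), data: (n, dim)."""
    rows, cols, dim = weights.shape
    flat = weights.reshape(-1, dim)
    # Pairwise distances between every neuron and every data point.
    dists = np.linalg.norm(flat[:, None, :] - data[None, :, :], axis=2)
    return (dists <= radius).sum(axis=1).reshape(rows, cols)

# Toy example: two neurons, three data points.
weights = np.array([[[0.0], [5.0]]])
data = np.array([[0.0], [0.1], [5.0]])
density = p_heights(weights, data, radius=0.5)

# Hypothetical inversion: dense regions become valleys (small values),
# so a valley-based fill would select them.
inverted = density.max() - density
```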

> 2. In order to see different visualizations of the data, do I have to
> select the visualization (P-Matrix, Two-Match, U-Matrix etc) and then
> train the data?

No. First you train and then you can display different visualizations
for the same trained ESOM. They do not affect the training.

> 3. If I select only a few features (enabled) from the component tab (not
> all that have been used in the training procedure) and press update,
> then the U-Matrix (or the corresponding selected background) is updated
> accordingly, as if the training had been done using only the enabled
> features (from the component tab)?

Almost right. All components are used during training. After the
training you can use the components tab to select a subset for the
calculation of the visualization. This is not equivalent to a training
with a subset of the components, however, especially if the subset is
small. You can select a subset for training by setting the column key in
the *.lrn files to 0 for columns that should not be used.
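
A small Python sketch of what selecting components after training
amounts to: the trained weight vectors stay the same, and only the
distance used for the visualization ignores the disabled components.
The function name neighbour_distances and the boolean enabled mask are
assumptions for this example, not part of the ESOM tools.

```python
import numpy as np

def neighbour_distances(weights, enabled):
    """Distances between horizontally adjacent neurons, computed only
    over the enabled components. weights: (rows, cols, dim)."""
    enabled = np.asarray(enabled, dtype=bool)
    w = weights[:, :, enabled]               # trained weights, masked
    return np.linalg.norm(w[:, 1:] - w[:, :-1], axis=2)

# A 1x2 map with two components per neuron, trained on both components.
weights = np.array([[[0.0, 0.0], [3.0, 4.0]]])
all_components = neighbour_distances(weights, [True, True])
first_only = neighbour_distances(weights, [True, False])
```

The weights themselves were still shaped by all components during
training, which is why this differs from training on the subset.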

bye
fabian