From: <fa...@in...> - 2005-04-30 12:42:44
Hi Michael, let me just add a few comments to what Mario said.

- you don't necessarily need more neurons than data points. it all depends on the view of your data that you want to have. first off, ESOM need to be large, otherwise they would be k-Means-like SOM where one neuron is one cluster. large starts somewhere above 1000 neurons; we hardly ever go below 50x82=4100 (~4096=64x64, which we used before we found out about rectangular maps). with a small dataset you will have enough room on the map to see cluster structure (if there is any) _and_ inner-cluster relations. the more data you have, the less room there will be, and data points are placed on top of each other. you will still see the global structure, but fewer details. enlarging the map will help, but of course slows down the training. i would always start with the default size and go larger if it seems necessary based on the result. i have also successfully used sampling on a large (30K) dataset: i trained 50x82 maps on a 3K sample, identified clusters with the class mask tool, and used the classification mode to transfer the result to the complete data.

- start radius: like Mario said, about half the smaller grid dimension. a value that is too large will 'waste' the early training episodes, because almost the whole map will be pulled back and forth by the updates. what you want is to have part of the map pulled towards a cluster by the updates of the corresponding data points, and other parts towards other clusters. if you start with too small a radius, there is a danger of 'losing' neurons that will never be pulled anywhere and keep their random values from the initialization.

and even though you didn't ask, someone else might soon:

- end radius: small (=1) if you want a lot of detail, a little larger to concentrate on coarser structures.

- episodes: the number of training episodes isn't closely related to the choice of the other parameters. some publications mention several thousand training episodes; this is a complete waste of computing power. somewhere between 20 and 50 should provide a slow enough cooling of the parameters.

the toy examples like hepta are a good starting point to explore the behaviour of ESOM under different parameter settings. you do have to make some fairly extreme choices, however, to really make it not work. from our experience over the years we consider it fairly robust w.r.t. the parameters.

please note the technical report, which covers some of the above questions, but with a less hands-on touch:

[Ultsch 2005b] Ultsch, A., Moerchen, F.: ESOM-Maps: tools for clustering, visualization, and classification with Emergent SOM, Technical Report, Dept. of Mathematics and Computer Science, University of Marburg, Germany, No. 46, (2005)
http://www.mathematik.uni-marburg.de/~databionics/downloads/papers/ultsch05esom.pdf

bye
fabian

p.s. Michael did reply to the list, only a sf filter sent it to the list admin (me) for approval first.
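[Editor's note: a minimal sketch illustrating the parameter advice above. This is not the Databionics ESOM tool itself; the data, the linear cooling schedule, the learning-rate decay, and the `class_mask` array are placeholder assumptions. It shows a 50x82 toroidal map, a start radius of about half the smaller grid dimension, an end radius of 1, around 30 episodes, and the "train on a sample, transfer labels to the full data" workflow mentioned in the mail.]

```python
import numpy as np

rows, cols = 50, 82            # "large" ESOM grid, as recommended above
start_radius = rows / 2        # ~half the smaller grid dimension
end_radius = 1                 # small end radius -> fine detail
episodes = 30                  # 20-50 episodes is usually enough

rng = np.random.default_rng(0)
data = rng.normal(size=(1000, 4))                 # placeholder sample data
weights = rng.normal(size=(rows, cols, data.shape[1]))

# grid coordinates, precomputed once for the neighborhood computation
grid_r, grid_c = np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij")

def toroidal_dist(r, c):
    """Grid distance from neuron (r, c) to every neuron, wrapping around
    the edges so the map behaves like a borderless torus."""
    dr = np.minimum(np.abs(grid_r - r), rows - np.abs(grid_r - r))
    dc = np.minimum(np.abs(grid_c - c), cols - np.abs(grid_c - c))
    return np.sqrt(dr ** 2 + dc ** 2)

def best_match(x):
    """Best-matching unit: neuron whose weight vector is closest to x."""
    diff = weights - x
    return np.unravel_index(np.argmin(np.einsum("ijk,ijk->ij", diff, diff)),
                            (rows, cols))

for epoch in range(episodes):
    # linear cooling of the neighborhood radius from start to end value
    radius = start_radius + (end_radius - start_radius) * epoch / (episodes - 1)
    lr = 0.5 * (1.0 - epoch / episodes)           # simple learning-rate decay
    for x in rng.permutation(data):
        bmu = best_match(x)
        # Gaussian neighborhood around the BMU on the toroidal grid:
        # a large radius early on moves big parts of the map towards clusters,
        # a small radius later refines local detail.
        h = np.exp(-(toroidal_dist(*bmu) ** 2) / (2 * radius ** 2))
        weights += lr * h[:, :, None] * (x - weights)

# "classification mode" step: transfer cluster labels found on the sample to
# the complete data set by assigning each point to its best-matching neuron.
# class_mask stands in for the per-neuron labels painted with the class mask
# tool; here it is just a random placeholder.
full_data = rng.normal(size=(5000, 4))            # placeholder "complete" data
class_mask = rng.integers(0, 5, size=(rows, cols))
labels = np.array([class_mask[best_match(x)] for x in full_data])
```

The toroidal distance is what makes the map borderless; on a bordered map, neurons at the edges have fewer neighbors and the picture gets distorted, which is one reason the large rectangular/toroidal grids are preferred here.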