Re: [Databionic-ESOM-User] Parameter Selection & ...More

hi,

the ESOM training has not been proven to converge in the sense that
k-Means, or more generally EM, does, except for simple 1D settings. the
'convergence' is simulated by cooling the parameters, the neighborhood
in particular (a constant learning rate doesn't hurt).
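
as a rough illustration of what 'cooling' means here, a minimal python
sketch. the linear schedule, the gaussian kernel and all numbers (radius
24 down to 1, learning rate 0.5) are assumptions picked for illustration,
not the ESOM implementation or its defaults:

```python
import math

def linear_cooling(start, end, epoch, n_epochs):
    """Linearly interpolate a parameter from start (epoch 0) to end (last epoch)."""
    if n_epochs <= 1:
        return end
    frac = epoch / (n_epochs - 1)
    return start + (end - start) * frac

def gaussian_neighborhood(grid_dist, radius):
    """Update weight for a neuron at grid distance grid_dist from the best match."""
    return math.exp(-(grid_dist ** 2) / (2.0 * radius ** 2))

n_epochs = 20
learning_rate = 0.5  # held constant; only the neighborhood is cooled

for epoch in range(n_epochs):
    radius = linear_cooling(24.0, 1.0, epoch, n_epochs)
    # effective step size for a neuron 10 grid cells away from the best match
    step = learning_rate * gaussian_neighborhood(10, radius)
    print(f"epoch {epoch:2d}  radius {radius:5.2f}  step at distance 10: {step:.6f}")
```

as the radius shrinks, neurons far from the best match hardly move any
more, so the map settles even though nothing has converged in the formal
sense.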

you can obtain exactly reproducible maps by _not_ using "-p" (permute
data) and by using "-i pca", as explained by Mario and Christian, but
this may not be what you want. "-p" in particular should always be active
if your data is sorted in a particular order, e.g. by known clusters as for many
of the toy examples. training the map with all data points from one
cluster before introducing points from other regions in the data space
may distort the map towards the first cluster. the pca initialization is
a good way of obtaining visually more comparable maps from several runs.
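
to make the two sources of run-to-run variation concrete, here is a toy
1D SOM in python. it is only a sketch under assumptions: the even-spread
initialization stands in for a deterministic initialization such as pca,
and train_som, its parameters and the data are made up for illustration,
not taken from the ESOM tools:

```python
import random

def train_som(data, n_neurons=5, epochs=10, permute=False, rng=None):
    """Toy 1D SOM over scalar data; deterministic unless permute=True."""
    rng = rng or random.Random()
    lo, hi = min(data), max(data)
    # deterministic initialization: spread prototypes evenly over the data range
    weights = [lo + (hi - lo) * i / (n_neurons - 1) for i in range(n_neurons)]
    for epoch in range(epochs):
        # neighborhood radius on the grid, cooled linearly but never below 1
        radius = max(1, round((n_neurons // 2) * (1 - epoch / epochs)))
        order = list(data)
        if permute:
            rng.shuffle(order)  # new presentation order in every epoch
        for x in order:
            # best match: the prototype closest to the data point
            bm = min(range(n_neurons), key=lambda i: abs(weights[i] - x))
            for i in range(n_neurons):
                if abs(i - bm) <= radius:
                    weights[i] += 0.3 * (x - weights[i])
    return weights

data = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]  # sorted by cluster, like many toy examples

run1 = train_som(data)
run2 = train_som(data)
print(run1 == run2)  # True: no permutation and a deterministic initialization

run3 = train_som(data, permute=True, rng=random.Random(1))
run4 = train_som(data, permute=True, rng=random.Random(2))
print(run3)
print(run4)  # permuted runs typically differ in detail
```

without permutation every epoch ends on the points of the last cluster,
which is the kind of order effect the permutation is meant to average
out.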

if there are clusters in the data, they should show up on different maps
run with the same parameters. the local neighborhood of a few points you
described is usually not expected to be the same over several runs. i
don't know the dimensionality of your data, but assume it is > 3d.
imagine what happens to the 2D grid of ESOM prototypes in the high
dimensional space. it adjusts to the data: many prototypes are placed
where the data resides, while the grid is stretched in the regions
between. the following picture shows this for the chainlink dataset:

http://www.mathematik.uni-marburg.de/~databionics/de//images/chainlink_esom3d.png

if you have, say, a 10-dimensional space and are looking at some data
points inside a densely populated region, there is no way of predicting
how the 2D ESOM grid will locally adjust to this 10-dimensional cloud,
thus the local neighborhood relations will not show reproducible
behaviour. if your 5 points are relatively far from each other in an
otherwise empty region, they should be represented on different maps in
a similar way, but this is a special case.
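
if you want to put a number on how much the local neighborhood of an
instance changes between two maps, something along these lines could
work. it is a sketch only: the bestmatch positions below are invented,
the 50x82 toroid grid is an assumption, and the helper names are
hypothetical rather than part of the ESOM tools:

```python
def toroid_dist(a, b, rows, cols):
    """Euclidean distance between two grid positions on a toroid (borders wrap)."""
    dr = abs(a[0] - b[0])
    dc = abs(a[1] - b[1])
    dr = min(dr, rows - dr)
    dc = min(dc, cols - dc)
    return (dr * dr + dc * dc) ** 0.5

def map_neighbors(bestmatches, key, k, rows, cols):
    """The k instances whose bestmatches lie closest to that of `key` on the map."""
    origin = bestmatches[key]
    others = sorted(
        (other for other in bestmatches if other != key),
        # ties are broken by instance id so the result is deterministic
        key=lambda other: (toroid_dist(origin, bestmatches[other], rows, cols), other),
    )
    return set(others[:k])

def neighbor_overlap(run1, run2, key, k, rows, cols):
    """Jaccard overlap of the k map neighbors of `key` in two runs (1.0 = identical)."""
    n1 = map_neighbors(run1, key, k, rows, cols)
    n2 = map_neighbors(run2, key, k, rows, cols)
    return len(n1 & n2) / len(n1 | n2)

# invented bestmatch positions (row, column) for three runs
run_a = {"234": (10, 10), "456": (10, 11), "789": (11, 10), "123": (9, 10), "999": (40, 60)}
run_b = {"234": (49, 0), "456": (49, 81), "789": (0, 0), "123": (48, 0), "999": (20, 40)}
run_c = {"234": (10, 10), "456": (10, 11), "789": (30, 30), "123": (35, 70), "999": (11, 10)}

print(neighbor_overlap(run_a, run_b, "234", 3, 50, 82))  # 1.0
print(neighbor_overlap(run_a, run_c, "234", 3, 50, 82))  # 0.5
```

run_b looks completely different from run_a on screen (the group sits
across the map border), yet the neighbors of "234" are the same. in
run_c one neighbor has been swapped, so the overlap drops.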

bye
fabian

Christian Stamm wrote:
> Dear Michael,
> 
> the ESOM algorithm is indeed non-convergent. Every map you train will be
> unique. This is because of the random initialization of the map and the
> (optional) permutation of the input data during the training process. The
> overall structure of the map will be similar, but inter- as well as
> intra-cluster neighbourhoods may be twisted or sorted in another fashion,
> without being less meaningful, though. E.g. the U-Matrix view on the map
> will reveal where large or small distances are present.
> 
> welcome to the user community!
> 
> mfg Christian
> 
> Michael Dell Junior said:
> 
>>Dear Mario  & Fabian,
>>
>>
>>Thank you very much for your valuable comments and prompt response. I will
>>implement your recommendations on an immediate basis.
>>
>>
>>I have one other request pertaining to training with the Databionics
>>ESOM tool. What should one do if two *identical runs* (in terms of
>>parameter selection and training set) yield a different end result in
>>terms of the proximity/clustering of different instances to one another?
>>(I am aware that the visual part might be different on each run, but my
>>expectation would be that the underlying structural sorting of the
>>instances should be the same. E.g. in Run 1, instance "234" is surrounded
>>by instances "456", "789" & "123". Shouldn't the same "234" instance be
>>surrounded by the same "456", "789" & "123" instances in Run 2?)
>>
>> Is this a sign of non-convergence? Is this a sign of some other
>>underlying process that I am not aware of? Is there a random component
>>that I am not aware of?
>>
>>Your comments and suggestion on this issue will be very much appreciated.
>>
>>
>>Regards,
>>Michael
>>
> 
> _______________________________________________
> Databionic-ESOM-User mailing list
> Dat...@li...
> https://lists.sourceforge.net/lists/listinfo/databionic-esom-user