Re: [Databionic-ESOM-User] iris data set


On Friday 12 October 2007, davide cittaro wrote:
> Hi again, I was trying to better understand how ESOM tools work and
> perform. I tried with a well-known data set: Iris classification. I
> used the one provided in Weka, converted to iris.lrn+iris.cls.
> I tried a standard esom training run, with a 64x64 map, 30 epochs, 31
> starting radius and default values for all the rest...

64x64 is way too large for the iris data set. The set has only 150 samples 
vs. 4096 neurons...

24x24 or 16x32 are a better starting point. Also don't forget to preprocess 
the data. ZT and Robust-ZT, which are built into the ESOM tools, are a start, 
but there might be more sophisticated transformations.
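
For illustration, here is a rough sketch of what such a preprocessing step could look like outside the ESOM tools. The plain version is the usual z-transform (zero mean, unit standard deviation per column). The robust version is only an assumption on my side (median and scaled MAD per column); the Robust-ZT in the ESOM tools may be defined differently. The function names are made up for this example.

```python
import numpy as np

def zt(data):
    """Plain z-transform: zero mean and unit standard deviation per column.

    Expects a 2-D array with one sample per row.
    """
    data = np.asarray(data, dtype=float)
    mean = data.mean(axis=0)
    std = data.std(axis=0)
    std = np.where(std == 0, 1.0, std)  # leave constant columns unscaled
    return (data - mean) / std

def robust_zt(data):
    """Robust variant (assumed definition): median and scaled MAD per column."""
    data = np.asarray(data, dtype=float)
    median = np.median(data, axis=0)
    # 1.4826 makes the MAD comparable to the standard deviation
    # for normally distributed data.
    mad = 1.4826 * np.median(np.abs(data - median), axis=0)
    mad = np.where(mad == 0, 1.0, mad)
    return (data - median) / mad

if __name__ == "__main__":
    # Toy values, not the real iris measurements.
    toy = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
    print(zt(toy))
    print(robust_zt(toy))
```

The robust variant is less affected by a few extreme values, which matters if the data contains outliers.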

> I was pretty surprised to see the results. I have strong evidence
> that class 1 (iris setosa) is a separate entity (it is surrounded by
> a circular "mountain") but... Actually the remaining iris classes
> seem to be in the same group, they lie in the same "plane" or
> "valley". Selecting data from class 2 or class 3 makes them visually
> separated but, I repeat, in the same plane.

The problem is that class 1 is surrounded by very high "mountains" which 
hide the fine structures on the map. Try playing with the "clip" slider 
on the "View" panel and you will see that the other 2 classes are also 
separated by "hills".
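
To show why clipping helps, here is a small sketch of the idea. It is not how the slider in the ESOM tools is implemented; it only caps a grid of heights at a chosen percentile, so that the color scale is no longer dominated by the highest ridge. The function name and the percentile parameter are hypothetical.

```python
import numpy as np

def clip_heights(heights, percentile=90.0):
    """Cap all heights above the given percentile at that percentile."""
    heights = np.asarray(heights, dtype=float)
    ceiling = np.percentile(heights, percentile)
    return np.minimum(heights, ceiling)

def to_unit_range(heights):
    """Rescale heights to [0, 1], as a color map would do."""
    heights = np.asarray(heights, dtype=float)
    span = heights.max() - heights.min()
    if span == 0:
        return np.zeros_like(heights)
    return (heights - heights.min()) / span

if __name__ == "__main__":
    # Toy grid: one very high "mountain" next to a low "hill".
    toy = np.array([[0.1, 0.2], [0.1, 5.0]])
    print(to_unit_range(toy))                    # hill is almost invisible
    print(to_unit_range(clip_heights(toy, 75)))  # hill takes more of the range
```

After clipping, the small differences between the low values take up a larger share of the color range, which is why the "hills" become visible.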

What you can do is train a separate ESOM for these two classes. It will 
probably separate them.
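
Picking out the two remaining classes before training could look roughly like this. The sketch assumes the data and the class labels are already loaded as arrays (reading and writing the .lrn/.cls files is not shown), and the helper name is made up.

```python
import numpy as np

def select_classes(data, labels, keep):
    """Return only the rows whose class label is in `keep`."""
    data = np.asarray(data)
    labels = np.asarray(labels)
    mask = np.isin(labels, list(keep))
    return data[mask], labels[mask]

if __name__ == "__main__":
    # Toy values, not the real iris measurements.
    toy_data = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]])
    toy_labels = np.array([1, 2, 3, 2])
    subset, subset_labels = select_classes(toy_data, toy_labels, {2, 3})
    print(subset)
    print(subset_labels)
```

The subset would then be preprocessed and trained on its own, so the map no longer has to represent class 1.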

Also, there are some strange outliers in this set. Results improve if you 
remove them.

> Hi again, and sorry for bothering one more time. I've been reading
> "Emergence in Self Organizing Feature Maps" (Ultsch A.) and I figured
> out that I may need U*C clustering tools to complete analysis of my
> data.

AFAIK, it hasn't been implemented in ESOM-tools. Maybe it's in the Matlab 
stuff, but I don't really know.

Greets Niko