Re: [Databionic-ESOM-User] iris data set

Davide,

I disagree with Niko a bit on the size:

You can use 4k neurons for the iris data as long as your starting
radius is large enough not to leave any areas of the map untouched.
I like to compare the size of the map to the resolution of a monitor
that lets you peek into the high-dimensional space. The more detail
you want, the larger the map should be. With very small data sets
like iris, there is of course a limit beyond which you are simply
wasting computing power and not gaining anything.
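
To put numbers on the size question, here is a small sketch that counts
how many neurons each data point gets for the map sizes mentioned in this
thread. The sample count of 150 is the size of the standard Iris data set
and is an assumption about the file actually used; the function name is
made up for illustration.

```python
# Neurons available per data point for the map sizes discussed here.
# N_SAMPLES = 150 is the size of the standard Iris data set; this is an
# assumption about the iris.lrn file used in this thread.
N_SAMPLES = 150


def neurons_per_sample(rows, cols, n_samples=N_SAMPLES):
    """Return the number of map neurons per data point."""
    return rows * cols / n_samples


# Map sizes taken from the thread: 64x64, 24x24 and 16x32.
ratios = {
    (rows, cols): neurons_per_sample(rows, cols)
    for rows, cols in [(64, 64), (24, 24), (16, 32)]
}

for (rows, cols), ratio in ratios.items():
    print(f"{rows}x{cols}: {rows * cols} neurons, {ratio:.1f} per sample")
```

A higher ratio means more "pixels" per data point, which costs training
time and only pays off if the starting radius still covers the whole map.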

It is well known that two of the iris classes are closer to each other
than to the third. This can also be seen from a PCA projection.
As Niko explained, the sliders or retraining on a subset can help
uncover such finer structures.
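
A quick way to check the class distances without a full PCA is to compare
the class centroids directly. The sketch below uses the commonly quoted
per-class feature means of the standard Iris data (sepal length, sepal
width, petal length, petal width, in cm); treat these values as an
assumption and recompute them from your own iris.lrn if in doubt. It is a
centroid comparison, not a PCA.

```python
import math

# Commonly quoted class means of the standard Iris data set, in cm:
# (sepal length, sepal width, petal length, petal width).
# Assumed values; recompute from your own data file to be sure.
centroids = {
    "setosa": (5.006, 3.428, 1.462, 0.246),
    "versicolor": (5.936, 2.770, 4.260, 1.326),
    "virginica": (6.588, 2.974, 5.552, 2.026),
}


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


names = list(centroids)
distances = {
    (p, q): euclidean(centroids[p], centroids[q])
    for i, p in enumerate(names)
    for q in names[i + 1:]
}

for (p, q), d in sorted(distances.items(), key=lambda item: item[1]):
    print(f"{p} - {q}: {d:.2f}")
```

With these means, the versicolor and virginica centroids are much closer
to each other than either is to setosa, which matches the single large
"mountain" around class 1 on the map.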

I heard a rumor that someone is implementing U*C for the ESOM
tools. If you don't want to wait, ask Prof. Ultsch for the Matlab version.

fabian

Niko Efthymiou wrote:
> On Friday 12 October 2007, davide cittaro wrote:
>   
>> Hi again, I was trying to better understand how ESOM tools work and
>> perform. I tried with a well-known data set: Iris classification. I
>> used the one provided in Weka, converted to iris.lrn+iris.cls.
>> I tried a standard ESOM training run, with a 64x64 map, 30 epochs, a
>> starting radius of 31 and default values for all the rest...
>>     
>
> 64x64 is way too large for the iris data set. The set is only 150 samples 
> vs. 4096 neurons...
>
> 24x24 or 16x32 are a starting point. Also don't forget to preprocess the 
> data. ZT and Robust-ZT, which are built into the ESOM tools, are a start, 
> but there might be more sophisticated transformations.
>
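
For reference, a minimal sketch of what a z-transform does to one feature
column, assuming ZT means the usual standardization to mean 0 and
standard deviation 1. Whether the ESOM tools use the population or the
sample standard deviation is not stated in this thread, and Robust-ZT is
not covered here; the function name and the sample values are
illustrative only.

```python
from statistics import mean, pstdev


def z_transform(column):
    """Scale a list of numbers to mean 0 and (population) std 1."""
    m = mean(column)
    s = pstdev(column)
    if s == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(x - m) / s for x in column]


# Illustrative values for a single feature column (in cm).
sepal_length = [5.1, 4.9, 7.0, 6.4, 6.3, 5.8]
scaled = z_transform(sepal_length)
print([round(v, 2) for v in scaled])
```

Applying this per column keeps a feature with a large numeric range from
dominating the distance computation during training.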
>   
>> I was pretty surprised to see the results. I have strong evidence
>> that class 1 (iris setosa) is a separate entity (it is surrounded by
>> a circular "mountain") but... Actually the remaining iris classes
>> seem to be in the same group; they lie in the same "plane" or
>> "valley". Selecting data from class 2 or class 3 makes them visually
>> separated but, I repeat, in the same plane.
>>     
>
> The problem is that class 1 is surrounded by very high "mountains" which 
> hide the fine structures on the map. Try playing with the "clip" slider 
> on the "View" panel and you will see that the other 2 classes are also 
> separated by "hills".
>
> What you can do is train a separate ESOM for these. It will probably 
> separate them.
>
> Also, there are some strange outliers in this set; results improve if you 
> remove them.
>
>   
>> Hi again, and sorry for bothering one more time. I've been reading
>> "Emergence in Self Organizing Feature Maps" (Ultsch A.) and I figured
>> out that I may need U*C clustering tools to complete the analysis of my
>> data.
>>     
>
> AFAIK, it hasn't been implemented in ESOM-tools. Maybe it's in the Matlab 
> stuff, but I don't really know.
>
> Greets Niko