From: davide c. <daw...@gm...> - 2007-10-12 08:58:17
|
Hi again,

I was trying to better understand how the ESOM tools work and perform, so I tried them on a well-known data set: Iris classification. I used the version provided with Weka, converted to iris.lrn + iris.cls. I ran a standard ESOM training with a 64x64 map, 30 epochs, a starting radius of 31, and default values for everything else.

I was pretty surprised by the results. I have strong evidence that class 1 (Iris setosa) is a separate entity (it is surrounded by a circular "mountain"), but the remaining two Iris classes seem to be in the same group: they lie in the same "plane" or "valley". Selecting data from class 2 or class 3 shows them visually separated but, I repeat, in the same plane.

I've been reading the publications to better understand how to use ESOM, but it would be much more helpful if someone posted on this mailing list :-)

Thanks
d |
From: Niko E. <ne...@ne...> - 2007-10-12 19:45:56
|
On Friday 12 October 2007, davide cittaro wrote:
> Hi again, I was trying to better understand how ESOM tools work and
> perform. I tried with a well-known data set: Iris classification. I
> used the one provided in Weka, converted in iris.lrn+iris.cls.
> I tried a standard esom training run, with a 64x64 map, 30 epochs, 31
> starting radius and default values for all the rest...

64x64 is way too large for the Iris data set. The set has only 150 samples vs. 4096 neurons.

24x24 or 16x32 is a better starting point. Also, don't forget to preprocess the data. ZT and Robust-ZT, which are built into the ESOM tools, are a start, but there might be more sophisticated transformations.

> I was pretty surprised to see the results. I have a strong evidence
> that class 1 (iris setosa) is a separated entity (it is surrounded by
> a circular "mountain") but... Actually the remaining iris classes
> seem to be in the same group, they lie in the same "plane" or
> "valley". Selecting data from class 2 or class 3 makes them visually
> separated but, I repeat, in the same plane.

The problem is that class 1 is surrounded by very high "mountains", which hide the fine structures on the map. Try playing with the "clip" slider on the "View" panel and you will see that the other two classes are also separated by "hills".

What you can also do is train a separate ESOM for just these two classes. It will probably separate them.

Also, there are some strange outliers in this set; results improve if you remove them.

> Hi again, and sorry for bothering one more time. I've been reading
> "Emergence in Self Organizing Feature Maps" (Ultsch A.) and I figured
> out that I may need U*C clustering tools to complete analysis of my
> data.

AFAIK, it hasn't been implemented in the ESOM tools. Maybe it's in the Matlab stuff, but I don't really know.

Greets
Niko |
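To illustrate why clipping can reveal the lower "hills", here is a minimal sketch. It assumes that clipping caps all heights at some upper quantile before they are rescaled for display; how the "clip" slider in the ESOM tools is actually implemented is not confirmed here. The function names, the `clip_fraction` parameter, and the toy height map are all made up for the example.

```python
def rescale(heights):
    """Plain min-max rescaling to [0, 1], without clipping."""
    flat = [h for row in heights for h in row]
    low, high = min(flat), max(flat)
    span = (high - low) or 1.0
    return [[(h - low) / span for h in row] for row in heights]


def clip_and_rescale(heights, clip_fraction=0.9):
    """Cap heights at a quantile (assumed behaviour), then rescale to [0, 1]."""
    flat = sorted(h for row in heights for h in row)
    # Height at the chosen quantile position in the sorted values.
    cap = flat[min(len(flat) - 1, int(clip_fraction * len(flat)))]
    low = flat[0]
    span = (cap - low) or 1.0
    return [[(min(h, cap) - low) / span for h in row] for row in heights]


# Hypothetical 3x4 height map: a very high ridge (10.0) on the right and
# a low hill (0.3) between two flat valleys (0.1).
toy = [
    [0.1, 0.3, 0.1, 10.0],
    [0.1, 0.3, 0.1, 10.0],
    [0.1, 0.3, 0.1, 10.0],
]

plain = rescale(toy)
clipped = clip_and_rescale(toy, clip_fraction=0.7)

print("hill without clipping:", round(plain[0][1], 3))
print("hill with clipping:   ", round(clipped[0][1], 3))
```

Without clipping, the low hill takes up only a tiny part of the display range because the ridge dominates the scale. After capping, the hill is as visible as the ridge, which matches the effect described above.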
From: Davide C. <daw...@gm...> - 2007-10-12 20:05:13
|
On Oct 12, 2007, at 9:40 PM, Niko Efthymiou wrote:
> On Friday 12 October 2007, davide cittaro wrote:
>> Hi again, I was trying to better understand how ESOM tools work and
>> perform. I tried with a well-known data set: Iris classification. I
>> used the one provided in Weka, converted in iris.lrn+iris.cls.
>> I tried a standard esom training run, with a 64x64 map, 30 epochs, 31
>> starting radius and default values for all the rest...
>
> 64x64 is way too large for the Iris data set. The set has only 150
> samples vs. 4096 neurons.
>
> 24x24 or 16x32 is a better starting point. Also, don't forget to
> preprocess the data. ZT and Robust-ZT, which are built into the ESOM
> tools, are a start, but there might be more sophisticated
> transformations.

Oh great! Thank you! Is there a recommended neurons/samples ratio?

>> Hi again, and sorry for bothering one more time. I've been reading
>> "Emergence in Self Organizing Feature Maps" (Ultsch A.) and I figured
>> out that I may need U*C clustering tools to complete analysis of my
>> data.
>
> AFAIK, it hasn't been implemented in the ESOM tools. Maybe it's in the
> Matlab stuff, but I don't really know.

D'oh! Thank you for the hints and the answers so far.

dawe
Blog http://daweonline.net
Flickr http://flickr.com/photos/daweonline/ |
From: Niko E. <ne...@Ma...> - 2007-10-12 20:40:10
|
On Friday 12 October 2007, Davide Cittaro wrote:
> Oh great! Thank you! Is there some recommendation about neurons/
> samples ratio?

Well, it's been quite a while, but avoid maps that are too big or too small. In general you should have more neurons than data points (unless you have a great number of duplicates). As you might know, the height that is visualized is the average distance of a neuron to its direct neighbours.

Niko |
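The height definition above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses a flat (non-toroidal) grid, takes "direct neighbours" to mean the four neurons above, below, left, and right, and uses Euclidean distance between weight vectors. The ESOM tools may use a different grid topology or neighbourhood. The function name `u_heights` and the toy weights are hypothetical.

```python
import math


def u_heights(weights):
    """Average distance of each neuron's weight vector to its direct neighbours.

    weights[r][c] is the weight vector of the neuron at row r, column c.
    Assumes a flat grid and a 4-neighbourhood (an assumption, see above).
    """
    rows, cols = len(weights), len(weights[0])
    heights = []
    for r in range(rows):
        row_heights = []
        for c in range(cols):
            dists = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    dists.append(math.dist(weights[r][c], weights[nr][nc]))
            # A 1x1 grid has no neighbours; report height 0 in that case.
            row_heights.append(sum(dists) / len(dists) if dists else 0.0)
        heights.append(row_heights)
    return heights


# Hypothetical 2x3 map: the two left columns hold similar weights, the
# right column holds very different ones.
toy_weights = [
    [(0.0, 0.0), (0.0, 0.1), (5.0, 5.0)],
    [(0.1, 0.0), (0.1, 0.1), (5.0, 5.1)],
]

heights = u_heights(toy_weights)
for row in heights:
    print([round(h, 2) for h in row])
```

Neurons inside a region of similar weights get a low height (a "valley"), while neurons next to the jump in weights get a high one (a "mountain").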
From: <fa...@my...> - 2007-10-13 00:16:58
|
Davide,

I disagree with Niko on the size a bit: you can use 4k neurons for the Iris data as long as your starting radius is large enough not to leave any areas of the map untouched. I like to compare the size of the map to the resolution of a monitor that lets you peek into the high-dimensional space. The more detail you want, the larger the map should be. With very small data sets like Iris there is, of course, a limit beyond which you are simply wasting computing power and not gaining anything.

It is well known that two of the Iris classes are closer to each other than to the third. This can also be seen from a PCA projection. As Niko explained, the sliders or retraining on a subset can help uncover such finer structures.

I heard a rumor that someone is implementing U*C for the ESOM tools. If you don't want to wait, ask Prof. Ultsch for the Matlab version.

fabian |
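The point about two Iris classes being closer to each other can be checked roughly without training a map. The sketch below compares Euclidean distances between per-class feature means. The means are the commonly cited, rounded values for the 150-sample Iris data (sepal length, sepal width, petal length, petal width, in cm); treat them as approximate, since published versions of the data set differ slightly. Comparing raw class means is a simplification chosen for the example and is not a substitute for the PCA projection mentioned above.

```python
import math
from itertools import combinations

# Approximate per-class means (cm), rounded; assumed from the commonly
# cited summary statistics of the 150-sample Iris data set.
class_means = {
    "setosa": (5.006, 3.428, 1.462, 0.246),
    "versicolor": (5.936, 2.770, 4.260, 1.326),
    "virginica": (6.588, 2.974, 5.552, 2.026),
}

# Euclidean distance between every pair of class means.
distances = {
    (a, b): math.dist(class_means[a], class_means[b])
    for a, b in combinations(class_means, 2)
}

closest_pair = min(distances, key=distances.get)

for pair, d in distances.items():
    print(pair, round(d, 2))
print("closest pair:", closest_pair)
```

On these means, versicolor and virginica are much closer to each other than either is to setosa, which is consistent with setosa standing out behind high "mountains" while the other two share a "valley".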
From: davide c. <daw...@gm...> - 2007-10-14 18:37:15
|
Thank you all for the answers about the size of the map; I now have a "brighter" idea about ESOM...

> I heard a rumor that someone is implementing U*C for the ESOM
> tools. If you don't want to wait ask Prof. Ultsch for the Matlab version.

Mmm, I'll do that for sure, though I'll have to check whether it works in Octave (I don't have a Matlab license).

d |