From: <fa...@Ma...> - 2006-03-20 21:32:18
|
Hi Katerina,

the format of your file looks ok. the data itself looks quite weird: there are only a few unique values in most columns, with many duplicates. this seems to throw the pareto density estimation off course into an endless loop that finally runs out of memory. the actual training succeeds, only the subsequent centering of the map around the closest neuron is affected.

you can work around this using the command line option "-dc" (= don't center), which is unfortunately only available on the command line, not in the gui. since our resources are currently very limited I don't know when this can be fixed. to me it looks like a very unlikely case caused by your particular dataset.

I strongly recommend that you look closer at your data before you feed it into ESOM or any clustering algorithm, in particular at the scales of the variables, which differ tremendously in your file.

bye
fabian

> Dear Fabian,
> I am facing some problems using eSOM and your help would be very valuable
> for me.
> I am using eSOM for an .lrn file and when I try to perform training (with
> the default parameters) I receive the following message
>
> spheres for 18-tile contain on average 65% of the data searching pareto
> radius ...
>
> and finally
>
> at databionics.math.ParetoDensity.getParetoRadius(ParetoDensity.java:156)
> at databionics.math.ParetoDensity.getParetoRadius(ParetoDensity.java:323)
>
> The training can not be completed it stops in 95%.
>
> I don't understand what is wrong??
> I suppose that something is wrong with the lrn file used for the
> training.
> I attach the file used for the training.
>
> I would appreciate it if you could help me.
>
> Thanking you in advance.
> Katerina |
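A quick way to follow the advice about looking at the data first is to count the unique values and the value range of every column before training. The following is only a minimal sketch in Python, not part of the ESOM tools; the function name `column_summary` and the sample rows are made up for illustration, and the data is assumed to be already loaded as rows of numbers.

```python
def column_summary(rows):
    """Return (unique_count, minimum, maximum) for each column of numeric rows."""
    summary = []
    for column in zip(*rows):  # transpose: iterate over columns
        summary.append((len(set(column)), min(column), max(column)))
    return summary


# Hypothetical data: few unique values per column and very different scales.
rows = [
    [0.0, 1200.0, 1.0],
    [0.0, 1200.0, 2.0],
    [1.0, 350000.0, 1.0],
    [0.0, 1200.0, 1.0],
]

for index, (unique, low, high) in enumerate(column_summary(rows)):
    print(f"column {index}: {unique} unique values, range {low} .. {high}")
```

Columns with only a handful of unique values, or ranges that differ by several orders of magnitude, are the kind of thing the reply above warns about.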
From: <mit...@we...> - 2006-03-20 21:04:51
|
Dear Fabian,

I am facing some problems using eSOM and your help would be very valuable for me. I am using eSOM with an .lrn file, and when I try to perform training (with the default parameters) I receive the following message

spheres for 18-tile contain on average 65% of the data searching pareto radius ...

and finally

at databionics.math.ParetoDensity.getParetoRadius(ParetoDensity.java:156)
at databionics.math.ParetoDensity.getParetoRadius(ParetoDensity.java:323)

The training cannot be completed; it stops at 95%.

I don't understand what is wrong. I suppose that something is wrong with the lrn file used for the training. I attach the file used for the training.

I would appreciate it if you could help me.

Thanking you in advance.
Katerina

>> @fabian could you please set the reply-to headers for the list?
>
> done
>
>> Katerina Mitrokotsa wrote:
>>
>>> I have recently tried to use ESOM and although I have found it really
>>> interesting I can't understand if there is a way to inspect neuron
>>> values. Does this tool permit us to see which samples of data
>>> correspond to which neuron?
>>
>> You can select the samples in the Data tab at the bottom, which are
>> then highlighted.
>
> addition: you can also select neurons in the map with the data mouse
> (activate the leftmost icon in the toolbar). the data points assigned to
> these neurons will be displayed in the data tab at the bottom. you can
> also load a *.names file with text labels for the data points. these
> will be displayed in the last columns of the data table.
>
>>> Furthermore which is the procedure in order to use a dataset for
>>> training and then another dataset for testing.
>>
>> To do this you have to add classmasks to your ESOM and then use the
>> Project tool to see if the test set is projected into the correct
>> classes. The process isn't automated as far as i know (fabian?)
>
> creating the class masks is manual (in cvs there is some semi-automated
> support with flood filling already). projection is automated and can be
> run via the menu or the command line. short summary:
>
> - create two separate *.lrn for training and test data
> - train ESOM with training data
> - optional: load *.cls with known classification of training data
> - identify clusters and create class mask (also *.cls)
> - load *.lrn with test data
> - project this data on ESOM
> - save newly created *.cls for test data
> - optional: analyze *.cls for test data, e.g. compare to *.cls with
> known classification of test data.
>
> we offer no tools for the last step which is rather easy however. i
> could post some matlab code, if you wish.
>
> bye
> fabian |
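The last step of the quoted summary (comparing the projected *.cls for the test data with a known classification) could look roughly like the Python sketch below. It is not the matlab code offered in the thread. It assumes a simplified file layout in which header lines start with "%" and every other line holds a key followed by a class number; the manual remains the authority on the real *.cls format. The helper names `parse_cls` and `agreement` are made up for illustration.

```python
def parse_cls(text):
    """Parse simplified *.cls content into a {key: class_label} mapping."""
    labels = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and header lines
        key, label = line.split()[:2]
        labels[int(key)] = int(label)
    return labels


def agreement(predicted, known):
    """Fraction of keys present in both mappings that carry the same class."""
    shared = predicted.keys() & known.keys()
    if not shared:
        return 0.0
    matches = sum(1 for key in shared if predicted[key] == known[key])
    return matches / len(shared)


# Hypothetical file contents: projected classes vs. known classes for 4 keys.
projected = parse_cls("% 4\n1 1\n2 1\n3 2\n4 2\n")
reference = parse_cls("% 4\n1 1\n2 2\n3 2\n4 2\n")
print(agreement(projected, reference))
```

This only makes sense if the class numbers in the class mask mean the same thing as the numbers in the known classification; if the clusters were numbered independently, the labels have to be matched up first.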
From: <fa...@in...> - 2006-02-18 11:18:09
|
released. enjoy. |
From: <fa...@in...> - 2006-02-13 17:46:38
|
released. for news see website. please test, as this will become 1.0 later this week if no major complaints come in. bye fabian |
From: <fa...@in...> - 2006-01-31 08:33:31
|
Hi, we fixed a severe bug in the ESOM tools. previous releases did not correctly read data files with numbers in scientific notation that use a small "e" to indicate the exponent, e.g. 4.678342e02 was read as 4.6 instead of 467.8342. this went unnoticed so far, because all our files have capital E's. sorry about that. bye fabian |
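For files that still have to be read by an older release, one possible workaround is to rewrite the exponent markers as capital E's before loading. The Python sketch below is only an illustration and not part of the ESOM tools; the function name `uppercase_exponents` is made up, and header lines are assumed to start with "%" so that they can be left untouched.

```python
import re

# A lower-case "e" that sits between a digit (or decimal point) and an
# optionally signed digit is treated as an exponent marker.
EXPONENT = re.compile(r"(?<=[0-9.])e(?=[+-]?[0-9])")


def uppercase_exponents(line):
    """Rewrite exponent markers in one data line; leave header lines alone."""
    if line.lstrip().startswith("%"):
        return line
    return EXPONENT.sub("E", line)


print(uppercase_exponents("1\t4.678342e02\t3e-1"))
print(float("4.678342e02") == float("4.678342E02"))  # both spellings are the same number
```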
From: <fa...@in...> - 2005-12-12 10:56:22
|
http://www.mathematik.uni-marburg.de/~databionics/en//images/stx_umx.jpg was created using a java3d tool that is not available anymore. just recently someone contacted me with an interest in developing 3d support for the esom tools, but I don't know if and when it will be available.

the picture in http://www.mathematik.uni-marburg.de/~databionics/en//downloads/papers/ultsch05clustering.pdf was created using matlab. the esom tools already include matlab code to load the files generated by the java tools, and I will add the code used for 3D plotting to the next version.

it should be pretty easy to import the output of the esom tools (*.umx for the heights and *.bm for the best matches) into other 3d tools. the file formats are documented in the manual. please let me know if you need further assistance and report any success.

personally, I like working in 2D better. it's easier to manually select data and create classes. 3D looks better of course, but it does not necessarily aid the analysis.

bye
fabian

Alejandro Garcia wrote:
> Hi! I've just discovered your ESOM tools and I think they'll be very
> useful to me, I'm working in data analysis in a neuropsychological
> hospital since oct 2003, we are also working with the Lokomat System
> http://www.hocoma.ch/ for gait training rehab data analysis so I'm very
> interested in your work.
> I've started with the Fundamental Clustering Problem Suite to get
> insight and I could not find how do you generate views like this:
> this kind of 3D landscape of the U-Matrix is fantastic but I cannot
> create it using the tools in the View tab and can't find how you do it
> I could only generate the top view of the U-Matrix but could not also
> generate the 3D view like for example Figure 3a) page 77 in here
> http://www.mathematik.uni-marburg.de/~databionics/en//downloads/papers/ultsch05clustering.pdf
> Thanks |
From: Alejandro G. <al...@ya...> - 2005-12-11 12:44:13
|
Hi! I've just discovered your ESOM tools and I think they'll be very useful to me. I have been working in data analysis in a neuropsychological hospital since Oct 2003, and we are also working with the Lokomat System http://www.hocoma.ch/ for gait training rehab data analysis, so I'm very interested in your work.

I've started with the Fundamental Clustering Problem Suite to get insight, and I could not find out how you generate views like this: http://www.mathematik.uni-marburg.de/~databionics/en//images/stx_umx.jpg

This kind of 3D landscape of the U-Matrix is fantastic, but I cannot create it using the tools in the View tab and can't find out how you do it. I could only generate the top view of the U-Matrix, not the 3D view like for example Figure 3a) on page 77 in here: http://www.mathematik.uni-marburg.de/~databionics/en//downloads/papers/ultsch05clustering.pdf

Thanks |
From: <fa...@in...> - 2005-11-14 10:41:17
|
I fixed the performance problems with very high dimensional data in the new version 0.9.5. the dataset with 18 rows and > 9000 features ran through with default settings on my system. you still have to adjust the memory parameter as described below; I'm not aware of a way to have java use as much memory as is available.

bye
fabian

Fabian Mörchen wrote:
> the error on map initialization is most likely the "out of memory"
> error. i will try to make this error message appear more prominently.
>
> it somewhat works with your file on my system if I change the java
> memory setting in the start scripts:
>
> - open the file <ESOM folder>/bin/esomstart[.bat] with a text editor
> - change -Xmx512m to -Xmx1800m (or higher for even larger files)
>
> now I end up with the program running at 100% cpu, but no results after
> a few hours. checking in the code I found a bottleneck we need to
> remove before you should try again: the covariance matrix of the data
> is calculated even if it is not needed for the map initialization (it is
> only needed for the PCA initialization), with 9000 variables obviously a
> demanding task. fixing this will take a while as we are all very busy
> with other things, sorry.
>
> bye
> fabian |
From: <fa...@in...> - 2005-11-09 19:54:33
|
the error on map initialization is most likely the "out of memory" error. i will try to make this error message appear more prominently.

it somewhat works with your file on my system if I change the java memory setting in the start scripts:

- open the file <ESOM folder>/bin/esomstart[.bat] with a text editor
- change -Xmx512m to -Xmx1800m (or higher for even larger files)

now I end up with the program running at 100% cpu, but no results after a few hours. checking in the code I found a bottleneck we need to remove before you should try again: the covariance matrix of the data is calculated even if it is not needed for the map initialization (it is only needed for the PCA initialization), with 9000 variables obviously a demanding task. fixing this will take a while as we are all very busy with other things, sorry.

bye
fabian

Yu Shi wrote:
> Thanks Fabian, this is the data file I used. I found the problem might
> not be the cls file or integer cause I tried your examples and there
> are same messages and just ignore them and start training. My problem
> is no action performs after "initilializing map", the CPU load is 0
> after that. I wonder whether this file works on your machine. Thanks
> for your time and help.
>
> BR,
>
> Shi
>
> On 11/9/05, Fabian Mörchen <fa...@ma...> wrote:
>
>> The software should be able to handle this many features, provided you
>> have enough memory. Recall that even if you have only 18 examples, the
>> map contains a 9k vector for each neuron.
>>
>> I generated a file like you described and did not get the same error, I
>> only ran out of memory, because there are some other demanding tasks
>> currently running on my machine.
>>
>> Do you have by any chance a "3" somewhere in the 3rd line of the *.lrn
>> file? That would be the deprecated code for a classification. Or did you
>> supply a *.cls file with the argument -cls? Otherwise I wouldn't know
>> why the Java class ClsFile is used at all. Note that the --cls option
>> is not needed for unsupervised training, it is rather used for debugging
>> purposes. To find out why your file is not parsed, you could also send
>> it to me (not the list!) if that is ok.
>>
>> bye
>> fabian |
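To get a feeling for why the heap setting matters with this many features, the memory for the neuron weight vectors and for a full covariance matrix can be estimated with a few lines of Python. This is a back-of-the-envelope sketch only: the 50 x 82 map size and the 8 bytes per value (Java double) are assumptions made for the example, not figures taken from this thread, and the function names are made up.

```python
MIB = 1024 ** 2


def weights_mib(map_rows, map_columns, features, bytes_per_value=8):
    """Approximate size of all neuron weight vectors in MiB."""
    return map_rows * map_columns * features * bytes_per_value / MIB


def covariance_mib(features, bytes_per_value=8):
    """Approximate size of a dense features x features covariance matrix in MiB."""
    return features * features * bytes_per_value / MIB


# Assumed map size 50 x 82; 9175 features as in the data set discussed above.
print(f"weights:    {weights_mib(50, 82, 9175):.0f} MiB")
print(f"covariance: {covariance_mib(9175):.0f} MiB")
```

Under these assumptions a single copy of the weights alone takes up more than half of a 512 MB heap, and a dense covariance matrix would not fit at all, which is consistent with the out-of-memory explanation given above.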
From: <fa...@Ma...> - 2005-11-09 10:26:58
|
The U*Matrix is not implemented in Java yet. We might be able to offer it as a function within the Matlab interface included in the ESOM tools; I have to check on this. A Java implementation is planned for early next year.

bye
fabian

mit...@we... wrote:
> How can we produce the U*-Matrix?
> Is ESOM tool capable of producing U*-Matrix?
> thanks
> Katerina |
From: Yu S. <sla...@gm...> - 2005-11-08 23:41:04
|
Hi. I just downloaded ESOM 0.9.4 today and tried to build a map of 36 genes with 9,175 features. The data file is correct because the main values have been tested in Genecluster2, Genesis and other clustering tools. I changed it to the required format as:

% 36
% 9176
% 9 1 1 .... (9,172 times of '1') 1
% Key C1 C2 ............ C9175
1 0 0.2 ............ 0.03
...
36 ................

However, this data file cannot be trained. Each time I start the training, the message says "missing input file for type: ClsFile is not an integer". If I reduce the number of features to a small one, it is ok. So I wonder whether the program can handle 9,000 features for clustering. If not, how many features can it handle at most? Thanks.

BR,

Shi |
From: <mit...@we...> - 2005-11-08 17:46:45
|
How can we produce the U*-Matrix? Is ESOM tool capable of producing U*-Matrix? thanks Katerina |
From: <fa...@in...> - 2005-11-02 19:48:17
|
Katerina Mitrokotsa wrote:
> Dear Fabian,
> could you please let me know
> 1. If using different background visualizations (P-Matrix, U-Matrix)
> may predict different classification results.
> Is the algorithm performed differently for different Visualizations
> (P-Matrix, Two-Match, U-Matrix)?

I'm not sure if I understand your question, but let me try to clarify. The U-Matrix and the Two-Match are both distance based visualizations, i.e. large values (= mountains) represent large distances in the dataset. These visualizations can be used with the floodfill option on the class mask page to select valleys as clusters, if this is what you mean by classification. The P-Matrix is a density based visualization, i.e. large values represent large densities in the data. This should _not_ be used with the floodfill automation; maybe we should add an inverted P-Matrix or an inverted floodfill for this. But as for any visualization you can always manually label regions as clusters; for the P-Matrix you would typically select connected regions with large densities.

> 2. In order to see different visualizations of the data I have to
> select the visualization (P-Matrix, Two-Match, U-Matrix etc) and then
> train the data?

No. First you train and then you can display different visualizations for the same trained ESOM. They do not affect the training.

> 3. If I select only a few features (enabled) from the tab component (not
> all that have been used in training procedure) and press update
> then the U-Matrix (or the corresponding selected background) is updated
> accordingly as if the training has been done using only the enabled
> features (from the component tab)???

Almost right. All components are used during training. After the training you can use the components tab to select a subset for the calculation of the visualization. This is not equivalent to a training with a subset of the components, however, especially if the subset is small. You can select a subset for training by setting the column key in the *.lrn files to 0 for columns not to be used.

bye
fabian |
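To make the idea of a distance based visualization concrete, and to show what restricting the calculation to a subset of components means, here is a small Python sketch of a U-Matrix style height computation. It is a simplification under stated assumptions and not the code of the ESOM tools: each height is the mean Euclidean distance of a neuron's weight vector to its four direct grid neighbours, the grid does not wrap around at the borders, and the names `distance` and `u_matrix` are made up.

```python
import math


def distance(a, b, components):
    """Euclidean distance restricted to the given component indices."""
    return math.sqrt(sum((a[i] - b[i]) ** 2 for i in components))


def u_matrix(weights, components=None):
    """Mean distance from each neuron to its 4 grid neighbours (no wrap-around)."""
    rows, cols = len(weights), len(weights[0])
    if components is None:
        components = range(len(weights[0][0]))  # use all components
    heights = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            neighbours = [
                weights[nr][nc]
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if 0 <= nr < rows and 0 <= nc < cols
            ]
            total = sum(distance(weights[r][c], n, components) for n in neighbours)
            heights[r][c] = total / len(neighbours) if neighbours else 0.0
    return heights


# 1 x 3 map with 2-dimensional weights: a jump between the 2nd and 3rd neuron.
weights = [[[0.0, 0.0], [0.0, 1.0], [5.0, 1.0]]]
print(u_matrix(weights))                  # all components
print(u_matrix(weights, components=[1]))  # only the second component
```

The trained weights are identical in both calls; only the heights change. That is the difference to training on a subset of the columns, where the weights themselves would come out differently.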
From: Katerina M. <mit...@un...> - 2005-11-02 18:28:10
|
Dear Fabian,

could you please let me know:

1. Whether using different background visualizations (P-Matrix, U-Matrix) may lead to different classification results. Is the algorithm performed differently for different visualizations (P-Matrix, Two-Match, U-Matrix)?

2. In order to see different visualizations of the data, do I have to select the visualization (P-Matrix, Two-Match, U-Matrix etc.) and then train the data?

3. If I select only a few features (enabled) from the components tab (not all that have been used in the training procedure) and press update, is the U-Matrix (or the corresponding selected background) then updated as if the training had been done using only the enabled features (from the components tab)?

Thank you in advance for your valuable help,

Best regards,
Katerina |
From: <fa...@in...> - 2005-10-31 19:34:11
|
you have to encode them by numbers because labels like "green" or "hot" will not be read by the program. i think encoding something like

age: "1-10","11-20","21-45","46-60" as 1,2,3,4

or

gender: "male", "female" as 0,1

will give intuitive results using the Euclidean distance. if you have something like

color: "green","red","blue","yellow","black"

however, it is unclear what the best encoding is. if you use 1..5, then you get (green-black)^2 = 16 and (blue-yellow)^2 = 1 as components in the sum of a Euclidean distance. Is that justified if all you want to express is "different color"? In this case it might be better to use 5 binary variables, one for each color. This is a general problem in data mining, however, and not ESOM specific.

bye
fabian

p.s. please answer to the list.

mit...@we... wrote:
> I don't want to use special symbolic distance functions. I use the
> euclidean distance. Do you think that by using some symbolic features
> after encoding them to numbers (e.g. 0,1) will contribute positively to
> the classification of the datasets? |
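The difference between the two encodings of the color attribute can be checked with a few lines of Python. This is just a sketch of the arithmetic discussed above; the helper names are made up, and the binary encoding is the usual one-variable-per-color scheme.

```python
COLORS = ["green", "red", "blue", "yellow", "black"]


def integer_code(color):
    """Encode a color as a single number 1..5."""
    return [COLORS.index(color) + 1]


def binary_code(color):
    """Encode a color as 5 binary variables, one per color."""
    return [1 if name == color else 0 for name in COLORS]


def squared_distance(a, b):
    """Sum of squared differences, i.e. the squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


for first, second in (("green", "black"), ("blue", "yellow")):
    as_integer = squared_distance(integer_code(first), integer_code(second))
    as_binary = squared_distance(binary_code(first), binary_code(second))
    print(f"{first} vs {second}: integer code {as_integer}, binary code {as_binary}")
```

With the integer code the contribution depends on the arbitrary order of the colors (16 versus 1), while with the binary code every pair of different colors contributes the same amount (2).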
From: <fa...@in...> - 2005-10-31 15:26:09
|
You can certainly feed ordinal or nominal attributes to the tools if you encode them, e.g. as natural numbers. Letters or strings are not allowed as data entries in *.lrn files.

The question is whether Euclidean (or any other implemented distance function) is meaningful on this encoding for your data. Further, the final prototypes that are assigned to each neuron will almost surely not consist of natural numbers, since the updating of the map uses small vector differences as learning steps. But they could be seen as approximations to symbolic prototypes.

If you want to use special symbolic distance functions (e.g. Hamming) you would have to implement them first, and I can give you hints on how to do it. In addition, the update step should be modified accordingly. Both should be comparatively easy to do.

bye
fabian

mit...@we... wrote:
> May we use not only continuous but also symbolic features with ESOM?
> thank you in advance,
> Katerina |
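For reference, the Hamming distance mentioned above simply counts the positions in which two vectors of symbols differ. The Python sketch below only illustrates the measure; it is not an extension of the ESOM tools (which are written in Java), and it does not cover the modified update step that would also be needed.

```python
def hamming_distance(a, b):
    """Number of positions at which two equally long sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


# Encoded nominal attributes, e.g. (color, gender, age group).
print(hamming_distance([1, 0, 2], [1, 1, 0]))
```

Unlike the Euclidean distance on integer codes, the result does not depend on which numbers were chosen for the symbols, only on whether they are equal.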
From: <mit...@we...> - 2005-10-31 15:10:15
|
May we use not only continuous but also symbolic features with ESOM? thank you in advance, Katerina |
From: <fa...@in...> - 2005-10-25 19:41:07
|
see http://databionic-esom.sourceforge.net/changes-report.html |
From: <fa...@in...> - 2005-08-05 15:30:47
|
changes:

- Moved view options to bottom tab.
- Display of names for bestmatches.
- Display of round bestmatches.
- Display of classes as letters on bestmatches.
- Class mask can be created with manual floodfill and semi-automated algorithm by Fabien Moutarde.
- Slide show problems fixed.
- Projection and classification problems fixed.
- Component background can display negative values.
- Contours are displayed correctly with clipping.
- Broken selection of components fixed.
- Updated documentation. |
From: <fa...@in...> - 2005-06-23 16:08:31
|
>> did you save the class mask explicitely? this should be done >> automatically but might be a point of error. > > How can the class mask (.cls file ) be created automatically? > I though that when we want to create a class mask (.cls) we > > enable the classmask tab and select regions on the map with the > polygon. a right mouse click will finish the selection and one class will > be created and shown in the tab. > > When I have created all the classes I select to save the cls file > manually from the "class mask" tab. > Right or wrong? So how is a class mask file created automatically? right. you always have to _create_ the class mask manually. the question was, however, whether this mask gets saved to disk in a *.cls file automatically. it does not seem to be, i just checked it. even saving it manually did't save all problems. try this workaround: - create the class mask - save the class mask - load the class mask now the following tools should definitely know about the existence of the class mask and use it. this does not explain the bm=-1 messages you reported earlier, however, i did not get these in my test. but maybe you can get one step further and tell us about it. > 2.When I want to create the cls file for the test set sometimes when I > select classify the .cls file is not created automatically. > I think that sometimes the cls file is created in the "classes" tab and > can be saved from there. > Right or wrong? both. the classification result should be listed in the classes tab and got saved in a file on disk (*_projected.cls). you can always manually save it under a different name or folder if you wish. > 3.Sometimes when I load an .lrn file and select to perform > z-transformation the file that is created includes non readable > characters and I cannot explain why is this happening. this sounds weird. which characters? maybe a "?"? that would mean a NaN and could be caused by a column with constant value (-> std = 0 -> division by zero). 
> 4. Does the whole procedure of training with a train set, creating a class
> mask, projecting a test set and creating a new cls file for the train
> set belong in the category of unsupervised classification?

no. unsupervised classification is a term sometimes used for clustering. i think the term is misleading, because (supervised) classification and (unsupervised) clustering are completely different problems.

the procedure you describe is classification because you have a labeled training set that you assume to be the ground truth and use it to define the class mask. this is similar to training a decision tree. in fact, projecting data on an esom with a classmask is a somewhat sophisticated version of k-nearest neighbor classification with k = 1. a new data point is assigned the same label as its nearest prototype in the data space as defined by the trained esom. but you have the additional benefit of visually defining the classes and allowing regions with no class label.

if you create the class mask without any given labels, you have an unsupervised process. then i would call it clustering with a sample and applying the results to a larger data set with esom classification.

bye
fabian |
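The comparison with 1-nearest-neighbor classification can be sketched as follows. This is only an illustration under stated assumptions, not the ESOM code: the function name, the array layout of the weights, the dictionary form of the class mask, and the use of 0 for "no class" are all hypothetical.

```python
import numpy as np

def classify_by_class_mask(data, weights, class_mask, unlabeled=0):
    """
    data:       (n, d) array of data points
    weights:    (rows, cols, d) array of trained neuron weight vectors
    class_mask: dict mapping (row, col) -> class label;
                neurons that are not listed carry no class label
    """
    rows, cols, d = weights.shape
    flat = weights.reshape(rows * cols, d)
    labels = []
    for x in data:
        # Best match: the neuron whose weight vector is closest to x
        # in data space (Euclidean distance), i.e. the nearest prototype.
        bm = int(np.argmin(np.linalg.norm(flat - x, axis=1)))
        pos = divmod(bm, cols)  # flat index back to (row, col)
        # The point inherits the label of its best match, if there is one.
        labels.append(class_mask.get(pos, unlabeled))
    return labels

# Hypothetical 1 x 3 map with 2-dimensional weights.
weights = np.array([[[0.0, 0.0], [5.0, 5.0], [10.0, 10.0]]])
class_mask = {(0, 0): 1, (0, 2): 2}  # the middle neuron has no class
data = np.array([[0.1, 0.2], [9.0, 9.0], [5.0, 4.0]])
print(classify_by_class_mask(data, weights, class_mask))  # [1, 2, 0]
```

The third point lands on the unlabeled neuron, which corresponds to the "regions with no class label" mentioned above.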
From: Niko E. <ne...@Ma...> - 2005-06-23 15:17:40
|
Katerina Mitrokotsa wrote:
> 2. When I want to create the cls file for the test set, sometimes when I
> select classify the .cls file is not created automatically.
> I think that sometimes the cls file is created in the "classes" tab and
> can be saved from there.
> Right or wrong?

There are two different types of .cls files. One associates data with classes and the other associates neurons (map regions) with classes. (@fabian: maybe it would be a good idea to change the extension for one of them. .cm for classmask?)

> 3. Sometimes when I load an .lrn file and select to perform
> z-transformation the file that is created includes non readable
> characters and I cannot explain why this is happening.

Could you send a sample of such a corrupted file? I guess it has to do with Windows vs. Unix line breaks.

Niko

--
Niko Efthymiou Tel: 06421/898565
Geschw.-Scholl-Str.11a www: www.mathematik.uni-marburg.de/~nefthy
35039 Marburg pgp-key: 0xE6BF2487 @ www.keyserver.net |
From: Katerina M. <mit...@un...> - 2005-06-23 15:04:45
|
Hi Fabian, Mario

1. Fabian Mörchen wrote:
> did you save the class mask explicitly? this should be done automatically but might be a point of error.

How can the class mask (.cls file) be created automatically?
I thought that when we want to create a class mask (.cls) we enable the classmask tab and select regions on the map with the polygon. a right mouse click will finish the selection and one class will be created and shown in the tab.
When I have created all the classes I select to save the cls file manually from the "class mask" tab.
Right or wrong? So how is a class mask file created automatically?

2. When I want to create the cls file for the test set, sometimes when I select classify the .cls file is not created automatically.
I think that sometimes the cls file is created in the "classes" tab and can be saved from there.
Right or wrong?

3. Sometimes when I load an .lrn file and select to perform z-transformation the file that is created includes non readable characters and I cannot explain why this is happening.

4. Does the whole procedure of training with a train set, creating a class mask, projecting a test set and creating a new cls file for the train set belong in the category of unsupervised classification?

Thank you in advance for your valuable help,
Katerina

>_______________________________________________
>Databionic-ESOM-User mailing list
>Dat...@li...
>https://lists.sourceforge.net/lists/listinfo/databionic-esom-user |
From: <fa...@in...> - 2005-06-20 08:34:50
|
mit...@we... wrote:
> Dear Mario,
> I followed the whole procedure.
> Used one train data set (about 1000 records)
> apply z-transformation
> trained the data
> load the cls file
> created the class mask .cls file

did you save the class mask explicitly? this should be done automatically but might be a point of error.

> Should the test data set have the same size (in records) as the train data
> set?

no, the number of records can be different.

> Can I use this tool for really big files about 60.000 records (10 features)?
> or is it too slow?

depends on what 'slow' means for you. the size you describe should definitely be doable overnight on a modern computer, probably in a couple of hours. if the data contains large redundancies, you should consider sampling to speed up the turnaround time of your experiments.

bye
fabian |
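The sampling suggestion could look like the sketch below. It assumes the records are already in memory as a list of rows. The function name and the fixed seed are hypothetical choices, and plain uniform sampling is only one option; it does nothing special about rare classes.

```python
import random

def sample_rows(rows, k, seed=0):
    """Draw k rows uniformly at random without replacement, keeping file order."""
    if k >= len(rows):
        return list(rows)
    rng = random.Random(seed)  # fixed seed so that experiments are repeatable
    chosen = sorted(rng.sample(range(len(rows)), k))
    return [rows[i] for i in chosen]

# Hypothetical data set: 60,000 records with 10 features each.
rows = [[float(i)] * 10 for i in range(60000)]
subset = sample_rows(rows, 5000)
print(len(subset))  # 5000
```

A map trained on the sample can then be used to project and classify the full data set, which keeps each training run short.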
From: <noe...@Ma...> - 2005-06-20 08:06:38
|
> applied z-transformation
> project the data (while project was running it appeared for every row
> bm==-1 I don't know if this is relevant for the problem I faced later)
> and when I selected from the menu tools--> classify
>
> no cls file was created
> error creating cls file
>
> what did I do wrong???
> Should the test data set have the same size (in records) as the train
> data set?

Hi Katerina

I believe bm==-1 is an error. Do you have the same number of features (columns) in your two datasets? The weight vector of each neuron on the grid (map) has as many dimensions as your first data set has features, and you cannot project data with a different number of features onto that grid (map).

> Can I use this tool for really big files about 60.000 records (10
> features)? or is it too slow?

just go and try it

mario |
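The feature-count check suggested here can be sketched like this. It assumes both data sets have already been read into lists of rows holding only the feature columns (no key or class column); the function names are hypothetical and no .lrn parsing is shown.

```python
def feature_count(rows):
    """Return the common number of columns, or raise if the rows differ in length."""
    counts = {len(r) for r in rows}
    if len(counts) != 1:
        raise ValueError(f"inconsistent column counts: {sorted(counts)}")
    return counts.pop()

def can_project(train_rows, test_rows):
    """True if the test data has the same number of features as the training data."""
    return feature_count(train_rows) == feature_count(test_rows)

# Hypothetical data: the second test set has one feature too many.
train = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
test_ok = [[0.7, 0.8, 0.9]]
test_bad = [[0.7, 0.8, 0.9, 1.0]]
print(can_project(train, test_ok))   # True
print(can_project(train, test_bad))  # False
```

The number of records plays no role in this check. Only the number of columns has to match.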
From: <mit...@we...> - 2005-06-19 18:47:46
|
Dear Mario,

I followed the whole procedure.
Used one train data set (about 1000 records)
apply z-transformation
trained the data
load the cls file
created the class mask .cls file
loaded the test data (about 2250 records)
applied z-transformation
project the data (while project was running it appeared for every row bm==-1 I don't know if this is relevant for the problem I faced later)
and when I selected from the menu tools--> classify

no cls file was created
error creating cls file

what did I do wrong???
Should the test data set have the same size (in records) as the train data set?

Can I use this tool for really big files about 60.000 records (10 features)? or is it too slow?

thank you in advance
it is really important for me to get the classification of the test data.

Katerina

> Hi Katerina
>
>> ok but does the training test data have to include a column with the
>> class of each sample or is it the same as having a separate .cls
>> file. (in some example .lrn files there is a column for the class of
>> the data) Is it necessary for one of the fields of the .lrn file to be
>> labeled as unique key (9)?
>
> if a *.cls file exists you do not have to create a column in your lrn
> file. But you need a column labeled with 9. This column has to contain
> the unique keys of the datasets.
>
>> - optional: load *.cls with known classification of training data
>> the *.cls file should be loaded before or after the training process.
>> (I suppose before)
>> It is loaded from the tab classes or class mask?
>
> classes! the classes tab shows the classification of the data. the
> classmask tab shows the classification of the neurons.
>
>>> - identify clusters and create class mask (also *.cls)
>> how do I identify clusters?? and create class mask?
>> Do I use the classify selection from the tools menu?
>> for some reason it doesn't seem to work although I press the start
>> button the procedure does not start and no output .cls file is
>> created.
>
> just enable the classmask tab and select regions on the map with the
> polygon. a right mouse click will finish the selection and one class
> will be created and shown in the tab. and so on...
>
> you can save this classmask by saving as *.cls. (button in tab)
>
> There are 2 kinds of *.cls: 1. classification of data points (classes)
> and 2. classification of neurons (classmask)
>
>>> - save newly created *.cls for test data
>> How do I create the new *.cls data
>
> maybe the user guide is not up to date. you can project new data on
> the map. (Tools - Project...) A loaded *.lrn file will be projected on
> the map and a *.bm file will be created. A *.bm file holds the
> information of the datasets' positions on the map.
>
> You can also classify the loaded bestmatches by using a classmask,
> which has been created before. (Tools - Classify..) Every bm (from the
> loaded *.bm) looks in the classmask for its class number and that will
> be written in the new classification. After that you can save the
> classification (file menu).
>
> hope that helps
> mario |
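The classify step described in the quoted reply (every bestmatch looks up its class number in the classmask) amounts to a table lookup. The sketch below only illustrates that idea. It does not read or write real *.bm or *.cls files, and the dictionary layout, the function name and the use of 0 for "no class" are assumptions.

```python
def classify_bestmatches(bestmatches, class_mask, unlabeled=0):
    """
    bestmatches: dict mapping the unique key of a data point -> (row, col) of its best match
    class_mask:  dict mapping (row, col) of a neuron -> class number
    Returns a dict mapping each key to the class of its best match neuron.
    """
    return {key: class_mask.get(pos, unlabeled) for key, pos in bestmatches.items()}

# Hypothetical projection result and class mask.
bestmatches = {101: (0, 0), 102: (3, 4), 103: (7, 7)}
class_mask = {(0, 0): 1, (3, 4): 2}  # neuron (7, 7) lies outside every selected region
print(classify_bestmatches(bestmatches, class_mask))  # {101: 1, 102: 2, 103: 0}
```

In this picture the *.bm file supplies the first dictionary and the classmask *.cls file the second, which is why both have to be loaded before the classify tool can produce a result.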