From: <fa...@in...> - 2005-11-14 10:41:17
I fixed the performance problems with very high dimensional data in the
new version 0.9.5. the dataset with 18 rows and more than 9000 features
ran through with default settings on my system. you still have to adjust
the memory parameter as described below, I'm not aware of a way of
having java use as much memory as available.

bye
fabian

Fabian Mörchen wrote:
> the error on map initialization is most likely the "out of memory"
> error. i will try to make this error message appear more prominently.
>
> it somewhat works with your file on my system if I change the java
> memory setting in the start scripts:
>
> - open the file <ESOM folder>/bin/esomstart[.bat] with a text editor
> - change -Xmx512m to -Xmx1800m (or higher for even larger files)
>
> now I end up with the program running at 100% cpu, but no results after
> a few hours. checking the code I found a bottleneck that we need to
> remove before you try again: the covariance matrix of the data
> is calculated even if it is not needed for the map initialization (it is
> only needed for the PCA initialization), with 9000 variables obviously a
> demanding task. fixing this will take a while as we are all very busy
> with other things, sorry.
>
> bye
> fabian
>
> Yu Shi wrote:
>
>>Thanks Fabian, this is the data file I used. I found the problem might
>>not be the cls file or the integer, because I tried your examples and
>>the same messages appear there; they are just ignored and training
>>starts. My problem is that nothing happens after "initilializing map",
>>the CPU load is 0 after that. I wonder whether this file works on your
>>machine. Thanks for your time and help.
>>
>>BR,
>>
>>Shi
>>
>>On 11/9/05, Fabian Mörchen <fa...@ma...> wrote:
>>
>>>The software should be able to handle this many features, provided you
>>>have enough memory. Recall that even if you have only 18 examples, the
>>>map contains a 9k vector for each neuron.
>>>
>>>I generated a file like you described and did not get the same error, I
>>>only ran out of memory, because there are some other demanding tasks
>>>currently running on my machine.
>>>
>>>Do you have by any chance a "3" somewhere in the 3rd line of the *.lrn
>>>file? That would be the deprecated code for a classification. Or did you
>>>supply a *.cls file with the argument -cls? Otherwise I wouldn't know
>>>why the Java class ClsFile is used at all. Note that the --cls option
>>>is not needed for unsupervised training, it is rather used for debugging
>>>purposes. To find out why your file is not parsed, you could also send
>>>it to me (not the list!) if that is ok.
>>>
>>>bye
>>>fabian
>>>
>>>Yu Shi wrote:
>>>
>>>>Hi. I just downloaded ESOM 0.9.4 today and tried to build a map of 36
>>>>genes with 9,175 features. The data file is correct because the main
>>>>values have been tested in Genecluster2, Genesis and other clustering
>>>>tools. I changed it to the required format as:
>>>>
>>>>% 36
>>>>% 9176
>>>>% 9 1 1 .... (9,172 times of '1') 1
>>>>% Key C1 C2 ............ C9175
>>>>1 0 0.2 ............ 0.03
>>>>...
>>>>36 ................
>>>>
>>>>however, this data file cannot be trained. Each time I start the
>>>>training, the message says "missing input file for type: ClsFile
>>>> is not an integer"
>>>>if I reduce the number of features to a small one, it is ok. So I
>>>>wonder whether the program can handle 9,000 features for clustering.
>>>>If not, how many features can it handle at most? Thanks.
>>>>
>>>>BR,
>>>>
>>>>Shi
>>>>
>>>>_______________________________________________
>>>>Databionic-ESOM-User mailing list
>>>>Dat...@li...
>>>>https://lists.sourceforge.net/lists/listinfo/databionic-esom-user
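
For a rough sense of why the heap setting matters in this thread, the Java sketch below estimates the memory needed for the map weights and for a dense covariance matrix at 9,175 features, and compares both with the heap limit the JVM was started with. The 50 x 82 grid, the 8-byte double storage, and the class and method names are assumptions made for illustration and are not taken from the ESOM source; only Runtime.getRuntime().maxMemory() is a standard Java call.

```java
// Back-of-the-envelope heap estimate; a sketch under stated assumptions,
// not ESOM code. HeapEstimate and its methods are hypothetical names.
class HeapEstimate {
    // Assumption: every value is stored as a 64-bit double.
    static final long BYTES_PER_DOUBLE = 8L;

    /** Bytes for one weight vector per neuron on a rows x cols grid. */
    public static long mapBytes(long rows, long cols, long features) {
        return rows * cols * features * BYTES_PER_DOUBLE;
    }

    /** Bytes for a dense features x features covariance matrix. */
    public static long covarianceBytes(long features) {
        return features * features * BYTES_PER_DOUBLE;
    }

    /** Whole mebibytes, rounded down. */
    public static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        long rows = 50, cols = 82;  // assumed grid size, adjust to your map
        long features = 9175;       // feature count mentioned in the thread

        long map = mapBytes(rows, cols, features);
        long cov = covarianceBytes(features);
        // Heap limit of this JVM, as set by -Xmx (or the JVM default).
        long maxHeap = Runtime.getRuntime().maxMemory();

        System.out.println("map weights:       " + toMiB(map) + " MiB");
        System.out.println("covariance matrix: " + toMiB(cov) + " MiB");
        System.out.println("JVM heap limit:    " + toMiB(maxHeap) + " MiB");
        System.out.println("fits in heap:      " + (map + cov < maxHeap));
    }
}
```

After compiling it, running it as java -Xmx1800m HeapEstimate shows whether a given limit leaves room for both structures. The estimate ignores everything else the program allocates, so treat it as a lower bound.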