From: <al...@de...> - 2007-12-26 14:05:46
Dear Greg Dick,

The U-matrix method, developed by Ultsch [1], enables visualization of the topological relations of the neurons in an organized SOM. A gradient image (2D) or a surface plot is generated by computing distances between adjacent neurons. High values in the U-matrix encode dissimilarities between neurons and correspond to cluster borders.

Some strategies for cluster detection using the U-matrix were proposed by Costa and Netto [2, 3]. Three main algorithms have been presented: map segmentation derived from mathematical morphology [2, 3]; a graph partitioning approach [4]; and contiguity-constrained hierarchical clustering approaches [5, 6]. These algorithms were developed for automatically partitioning and labeling a trained SOM network. The first approach uses image processing algorithms such as the watershed transform to obtain connected regions of neurons representing similar stimuli classes. The second approach uses rules to partition the map by analyzing inconsistent neighboring relations between neurons. Each resulting cluster of neurons is a sub-graph that defines, in the input space, complex and non-parametric geometries which approximately describe the shape of the clusters. Regarding the last approach, Goncalves et al. [5] presented improvements to contiguity-constrained hierarchical clustering approaches using validation indexes.

I addressed this problem almost 10 years ago, in papers [7] and [8] (an extended version of [7]). There are also many recent papers by many authors dealing with this problem.

Refs:

[1] A. Ultsch, "Self-Organizing Neural Networks for Visualization and Classification". In: O. Opitz et al. (Eds.), Information and Classification, pp. 301-306. Springer: Berlin, 1993.

[2] J. A. F. Costa and M. L. de Andrade Netto, "Clustering of complex shaped data sets via Kohonen maps and mathematical morphology". In: Proceedings of the SPIE, Data Mining and Knowledge Discovery, B. Dasarathy (Ed.), Vol. 4384, pp. 16-27, 2001.

[3] J. A. F. Costa and M. L. de Andrade Netto, "A new tree-structured self-organizing map for data analysis". In: Proc. of the Intl. Joint Conf. on Neural Networks, Washington, DC, 2001, pp. 1931-1936.

[4] J. A. F. Costa and M. L. de Andrade Netto, "Segmentação do SOM Baseada em Particionamento de Grafos". In: Proc. of Brazilian Neural Networks Conference, São Paulo, Brazil, June 2003, pp. 301-308 (in Portuguese).

[5] M. Goncalves, M. Netto, J. Zullo, and J. A. F. Costa, "A new method for unsupervised classification of remotely sensed images using Kohonen self-organizing maps and agglomerative hierarchical clustering methods". Intl. Journal of Remote Sensing, 2007 (accepted).

[6] F. Murtagh, "Interpreting the Kohonen self-organizing feature map using contiguity-constrained clustering". Pattern Recognition Letters, vol. 16, pp. 399-408.

[7] J. A. F. Costa and M. L. A. Netto, "An Approach for Estimating the Number of Clusters in Multivariate Data by Self-Organizing Maps". In: Anais do V Simpósio Brasileiro de Redes Neurais (SBRN'98), pp. 33-38, December 1998, Belo Horizonte, MG.

[8] J. A. F. Costa and M. L. A. Netto, "Estimating the Number of Clusters in Multivariate Data by Self-Organizing Maps". International Journal of Neural Systems, Vol. 9, No. 3, pp. 195-202, 1999.

There is also my doctoral thesis of 1999 (in Portuguese):
Costa 1999 - Classificação Automática e Análise de dados por redes neurais auto-organizáveis
http://www.dca.fee.unicamp.br/~marcio/ia004/artigos/Costa_ClassifAutomaticaPorRedesNeuraisAutoOrg_PhD99.zip

With my best regards, and wishing everyone (especially Dr. Ultsch and his group) a wonderful new year,

Jose Alfredo F. Costa, Prof. Dr.
Federal University, UFRN, Brazil
http://www.dee.ufrn.br/~alfredo

----- Original Message -----
From: <dat...@li...>
To: <dat...@li...>
Sent: Friday, December 21, 2007 5:21 PM
Subject: Databionic-ESOM-User Digest, Vol 8, Issue 1

> Message: 1
> Date: Thu, 20 Dec 2007 15:00:31 -0800
> From: "Greg Dick" <gd...@be...>
> Subject: [Databionic-ESOM-User] automated data clustering?
> To: <dat...@li...>
> Message-ID: <025c01c8435c$1f600da0$6500a8c0@GJD>
>
> Dear ESOM users,
>
> I am using ESOM to cluster DNA sequences from environmental
> microorganisms, based on genome wide signatures (tetranucleotide
> frequency). Overall I am very happy with the results and it has proven
> to be an extremely valuable tool for our research group. There are two
> areas that we are hoping to develop further and I am curious if anyone
> has suggestions or comments:
>
> (1) Are there any automated methods for clustering data? The
> boundaries for our clusters range from obvious to questionable. While
> this variable strength of clustering is useful information in itself, we
> would like to develop an automated method for defining clusters in order
> to avoid potential errors in where we draw the lines (it is not always
> entirely clear how to do so).
>
> (2) Are there statistical tools that have been developed or applied to
> ESOM to evaluate the robustness of clustering (ideally on a per-cluster
> basis)? We are interested in such an analysis, which would either be
> based on the U-matrix distance structure and/or an evaluation of the
> accuracy of the clustering (for much of our data we know the true
> cluster affiliations).
>
> Any suggestions or references relevant to these areas would be greatly
> appreciated.
>
> Greg Dick
> Postdoctoral Research
> University of California, Berkeley
>
> End of Databionic-ESOM-User Digest, Vol 8, Issue 1
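
To make the U-matrix description in the reply above concrete, here is a minimal Python sketch (standard library only). It rests on several assumptions that are not in the original message: the map is a rectangular, non-toroidal grid with a 4-neighbourhood; the U-matrix height of a neuron is taken as the mean distance to its grid neighbours, which is only one common variant; and the function names u_matrix and label_regions are hypothetical. The segmentation step is a plain threshold followed by connected-region labelling. It is a much simpler stand-in for the watershed transform used in [2, 3], not the method of those papers.

```python
import math
from collections import deque

# 4-neighbourhood on a rectangular grid (an assumption of this sketch).
NEIGHBOURS = ((-1, 0), (1, 0), (0, -1), (0, 1))


def u_matrix(weights):
    """Return the mean distance from each neuron to its grid neighbours.

    weights[r][c] is the weight vector of the neuron at grid position (r, c).
    """
    rows, cols = len(weights), len(weights[0])
    heights = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            dists = []
            for dr, dc in NEIGHBOURS:
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols:
                    dists.append(math.dist(weights[r][c], weights[nr][nc]))
            heights[r][c] = sum(dists) / len(dists) if dists else 0.0
    return heights


def label_regions(heights, threshold):
    """Label connected regions of neurons whose height is below threshold.

    Neurons at or above the threshold are treated as cluster borders and
    keep label 0. Returns (labels, number_of_regions).
    """
    rows, cols = len(heights), len(heights[0])
    labels = [[0] * cols for _ in range(rows)]
    n_regions = 0
    for r in range(rows):
        for c in range(cols):
            if heights[r][c] >= threshold or labels[r][c]:
                continue
            # Start a new region and flood-fill it breadth-first.
            n_regions += 1
            labels[r][c] = n_regions
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for dr, dc in NEIGHBOURS:
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and labels[nr][nc] == 0
                            and heights[nr][nc] < threshold):
                        labels[nr][nc] = n_regions
                        queue.append((nr, nc))
    return labels, n_regions


if __name__ == "__main__":
    # Toy 4x6 map: the left half sits near (0, 0), the right half near (1, 1).
    toy = [[(0.0, 0.0)] * 3 + [(1.0, 1.0)] * 3 for _ in range(4)]
    labels, n = label_regions(u_matrix(toy), threshold=0.1)
    print(n, "regions")
    for row in labels:
        print(row)
```

The threshold is a free parameter here and has to be chosen by the user, which is exactly the manual step that the watershed, graph partitioning, and contiguity-constrained approaches cited above try to avoid.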