From: Fabian M. <fa...@my...> - 2010-04-25 01:01:56
Peter,

for ESOM (and many other data mining algorithms) you need a vector of numbers of a fixed length and a distance function. Based on your description, a very simple approach would be to use 1 for present, 0 for absent, and Euclidean distance as the distance measure. For each student, a vector of length 5 x (weeks in semester) would need to be generated, i.e. one entry per school day (or 25 x (weeks in semester) if you keep each of the five daily classes as its own entry). For file formats, please refer to the manual.

If you want to capture the reason for absence, it gets a bit more tricky. The letters are categories and not easily encoded in a single number. Here it seems your 5 categories do have an ordering, so you could use 0, ..., 4 or 0, 0.25, ..., 1 for present, ..., unjustified.

Similar vectors, and thus similar attendance patterns, should be close on the map. You can investigate clusters to see what the typical patterns are inside them.

best
fabian

Peter Sim wrote:
> Hi everyone
>
> I am *very new* to Emergent Self-Organizing Maps.
>
> My area of research encompasses school absenteeism. I have a year of
> attendance data for several thousand students on a class-by-class
> basis. The data looks like this (five classes per day):
>
> Student#  Week     Mon    Tue    Wed    Thu    Fri
> 1234      2/2/09   *****  ****T  *****  EEEEE  *****
>           9/2/09   *****  JJJ**  *****  L**LL  *****
>           16/2/09  *****  *****  ***SS  *****  *****
>           ...etc
> 9876      2/2/09   *****  MMMMM  *****  *****  *****
>           9/2/09   *****  *****  *****  *****  QQQQQ
>           16/2/09  DDD**  *****  *****  *****  *****
>           ...etc
>
> An asterisk indicates that the student is present. Letter codes
> indicate the type of absence: D (Doctor/Dentist), E (Explained but
> unjustified), J (Justified), L (present but Late), M (Medical), Q
> (school trip or camp), T (Truant).
>
> I intend coding the attendances/absences into:
> - Present
> - Late
> - Justified (e.g. other school activity)
> - Justified but questionable (e.g. note from home stating student is unwell)
> - Unjustified (E and T)
>
> I want to examine a number of hypotheses about relationships between
> individual students' attendance patterns early in the year, and later on.
>
> My supervisor has asked me to investigate whether ESOMs will be a
> useful analytical tool in my research. I have downloaded and installed
> the Databionic ESOM Tools.
>
> I'm having trouble figuring out how to preprocess my data into a
> format suitable for use with the software.
>
> Can anyone point me in the right direction?
>
> MTIA
>
> Peter Sim
> New Zealand
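
As a rough illustration of the encoding described in the reply above, the Python sketch below turns the weekly attendance strings into fixed-length vectors, one entry per class, and computes the Euclidean distance between two students. The assignment of letter codes to the five categories, the choice to count L (late) as present in the binary coding, and all function names are assumptions made for illustration; only "E and T are unjustified" comes from Peter's message. It does not write any ESOM input file; the manual covers the file formats.

```python
import math

# Assumed mapping from raw codes to the five ordered categories.
# Only "E and T -> unjustified" is stated in the post; the rest is a guess.
CATEGORY = {
    "*": 0,           # present
    "L": 1,           # late
    "J": 2, "Q": 2,   # justified
    "D": 3, "M": 3,   # justified but questionable
    "E": 4, "T": 4,   # unjustified
}
PRESENT = {"*", "L"}  # assumption: late counts as present in the binary coding


def flatten(weeks):
    """Join week rows (each a list of five 5-character day strings) into one string."""
    return "".join(day for week in weeks for day in week)


def encode_binary(weeks):
    """1.0 for present, 0.0 for absent, one entry per class."""
    return [1.0 if code in PRESENT else 0.0 for code in flatten(weeks)]


def encode_ordinal(weeks):
    """0, 0.25, ..., 1 for present, ..., unjustified, one entry per class.

    Raises KeyError for a code that is missing from CATEGORY.
    """
    return [CATEGORY[code] / 4 for code in flatten(weeks)]


def euclidean(a, b):
    """Euclidean distance between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# First two weeks of the two students shown in the post.
student_1234 = [
    ["*****", "****T", "*****", "EEEEE", "*****"],
    ["*****", "JJJ**", "*****", "L**LL", "*****"],
]
student_9876 = [
    ["*****", "MMMMM", "*****", "*****", "*****"],
    ["*****", "*****", "*****", "*****", "QQQQQ"],
]

a = encode_binary(student_1234)
b = encode_binary(student_9876)
print(len(a))           # 2 weeks x 5 days x 5 classes = 50 entries
print(euclidean(a, b))  # distance between the two attendance patterns
```

Every student must be encoded over the same set of weeks so that all vectors have the same length. Codes that have no category yet (for example the S in the sample data, which the message does not define) need a decision before encoding; here they would raise a KeyError in the ordinal coding.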