I am currently having problems getting useful output from GPLAB. I
realise this is due to my understanding of the problem and wondered if
anyone could help me get started on this. I have some useful ideas and
have already classified my data using fuzzy clustering. In addition, I would
like to classify the data using GP, as I have worked on this for the last 4
months and don't really want to say it can't classify the data when in fact
it's down to my understanding and not the strategy behind GP.
The problem:
If I have 30 cases of data, for argument's sake, the first ten (1-10)
represent classification phenomenon 1, the next ten (11-20) represent
classification phenomenon 2 and the last ten (21-30) represent classification
phenomenon 3. In short, I wish to segregate classes 1 to 3 using if,
else, greater-than and less-than rules. I use two text inputs: one
with the data (120*30 matrix) and one with the targets, 1 to 3. I have got
useful rules from tree regression software (Matlab) but not from GP. I
believe the problem lies in my fitness function (the standard GP toolbox one)
and my terminals, which are currently set to nil. The functions are as stated
above (I also implemented my own min, max, kurtosis and stddev, although
these just confused matters).
GPLAB is converging; however, I am having problems interpreting the
output string/tree for classifying the 3 classes of data (it would be useful
to get an output similar to the NN GP output against desired output). Each of
the 30 cases is 120 points in size and each of the classes is significantly
different from the other classes. For example, the first class has its first
30 points larger than the second class, and the third class has points 60-90
larger than in the other classes.
Based on this problem, I would like to use the fitness measure to continually
credit a correct classification and penalize a misclassification. Perhaps a
figure of merit could be associated with it to ensure the best individual is
found.
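To make the idea concrete, here is a minimal sketch of the kind of fitness
measure I have in mind. It is written in Python rather than Matlab purely for
illustration and is not GPLAB's own fitness function; the names
classification_fitness and accuracy, and the penalty parameter, are my own
assumptions. It counts misclassifications (lower is better) and reports the
fraction correct as a simple figure of merit.

```python
def classification_fitness(predicted, targets, penalty=1.0):
    """Return a cost: `penalty` for every misclassified case (0 is perfect).

    `predicted` and `targets` are equal-length sequences of class labels.
    The name and the `penalty` parameter are illustrative assumptions.
    """
    if len(predicted) != len(targets):
        raise ValueError("predicted and targets must have the same length")
    errors = sum(1 for p, t in zip(predicted, targets) if p != t)
    return penalty * errors


def accuracy(predicted, targets):
    """Figure of merit: fraction of cases classified correctly."""
    if len(predicted) != len(targets):
        raise ValueError("predicted and targets must have the same length")
    correct = sum(1 for p, t in zip(predicted, targets) if p == t)
    return correct / len(targets)


# Targets laid out as in my data: cases 1-10, 11-20 and 21-30.
targets = [1] * 10 + [2] * 10 + [3] * 10

# A hypothetical individual that gets one class-2 case wrong.
predicted = list(targets)
predicted[12] = 3

cost = classification_fitness(predicted, targets)
merit = accuracy(predicted, targets)
```

With a cost like this, the best individual is simply the one with the lowest
value, and a cost of zero means all 30 cases were separated correctly.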
Can anyone help me get started with this problem? That is, what functions and
terminals do you think are appropriate? I am looking at le, gr and myif, but
am not sure what terminals to use. The fitness function should check whether
the rules/tree segregate the data, giving the correct classification. To
this end, I believe something similar to the ant demo is required, but not as
complex.
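As an illustration of the sort of rule tree I am hoping GP will evolve, here
is a hand-written sketch, again in Python and not using GPLAB itself. The
semantics I give to le, gr and myif are my own assumptions and may differ from
the toolbox versions. The terminals (mean of points 1-30, mean of points
61-90, and a threshold constant) are only a guess at what might work, and the
three cases are made-up toy values, not my real measurements.

```python
def gr(a, b):
    """Assumed semantics: 1.0 if a > b, else 0.0."""
    return 1.0 if a > b else 0.0


def le(a, b):
    """Assumed semantics: 1.0 if a <= b, else 0.0."""
    return 1.0 if a <= b else 0.0


def myif(cond, then_val, else_val):
    """Assumed semantics: then_val if cond is positive, else else_val."""
    return then_val if cond > 0 else else_val


def mean(values):
    return sum(values) / len(values)


def rule(case, threshold=0.5):
    """Hand-written example tree: myif(gr(x1, c), 1, myif(gr(x2, c), 3, 2)).

    x1 and x2 are candidate terminals computed from one 120-point case.
    """
    x1 = mean(case[0:30])    # points 1-30
    x2 = mean(case[60:90])   # points 61-90
    return myif(gr(x1, threshold), 1, myif(gr(x2, threshold), 3, 2))


# Toy cases (invented for illustration only), 120 points each.
class1_case = [1.0] * 30 + [0.0] * 90
class2_case = [0.0] * 120
class3_case = [0.0] * 60 + [1.0] * 30 + [0.0] * 30

labels = [rule(c) for c in (class1_case, class2_case, class3_case)]
```

The point is that the tree's output is the class label itself, so it can be
compared directly against the target file when computing fitness.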
I would like to express my thanks to Sara for producing such a user-friendly
and powerful toolbox. Once I get into this in greater detail, I would
like to share some ideas I have for enabling your GP toolbox to handle
n-dimensionally huge data, although automating this may take some effort, and
that's of course if you haven't already done this!
Again, I would be grateful for any help the GP community can give, as I
have been mucking around with the toolbox for 2 months now and I can't seem
to replicate John Koza's/Daniel Howard's GP classification ideas (they
classify imagery data, although mine is mechanical measurements and Acoustic
Emission, and a lot smaller in size due to n-dimensional reduction).
