Hi Peter,
I think this was because most of the data sets wound up being split into at
least 20 clusters, even if there were only 2 real cells in the recording.
So even if we started with 2, it just split them up that much anyway.
On this subject: one thing that many people have found useful is to use
ellipsoidal t-distributions instead of Gaussians for the cluster models
(the best reference I know for this is Shy Shoham et al.). I hear this leads
to a big reduction in overclustering, because a lot of the overclustering
occurs when the algorithm tries to fit Gaussians to distributions that are
not truly Gaussian. In this case, starting with 2 to 10 clusters might
actually work. It should be fairly simple to incorporate into KK.
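
To make the Gaussian vs. t-distribution point concrete, here is a minimal
sketch in Python. It is not KlustaKwik code; the function names and the
choice of nu = 5 degrees of freedom are assumptions made only for
illustration. It compares the log-density each model assigns to a spike at a
given squared Mahalanobis distance from a cluster centre, and shows the
standard per-point weight used in EM for t mixtures, which shrinks for
outlying points so they pull less on the cluster mean and covariance.

```python
import math


def gaussian_logpdf(delta, d, logdet=0.0):
    """Log-density of a d-dimensional Gaussian.

    delta  : squared Mahalanobis distance of the point from the cluster mean
    logdet : log-determinant of the cluster covariance matrix
    """
    return -0.5 * (d * math.log(2.0 * math.pi) + logdet + delta)


def student_t_logpdf(delta, d, nu, logdet=0.0):
    """Log-density of a d-dimensional ellipsoidal t-distribution.

    nu is the degrees-of-freedom parameter; smaller nu means heavier tails.
    logdet is the log-determinant of the scale matrix.
    """
    return (
        math.lgamma(0.5 * (nu + d))
        - math.lgamma(0.5 * nu)
        - 0.5 * (d * math.log(nu * math.pi) + logdet)
        - 0.5 * (nu + d) * math.log1p(delta / nu)
    )


def t_weight(delta, d, nu):
    """Per-point weight from the E-step of EM for a t mixture.

    Points far from the centre (large delta) get a small weight, so they
    contribute less to the updated mean and scale matrix.
    """
    return (nu + d) / (nu + delta)


if __name__ == "__main__":
    d, nu = 3, 5.0  # illustrative values only
    for delta in (1.0, 9.0, 100.0):
        print(
            delta,
            gaussian_logpdf(delta, d),
            student_t_logpdf(delta, d, nu),
            t_weight(delta, d, nu),
        )
```

The Gaussian log-density falls off linearly in the squared distance, while
the t log-density falls off only logarithmically. A single heavy-tailed
cluster can therefore absorb its own outliers instead of needing extra
Gaussians to cover them, which is the mechanism behind the reduced
overclustering mentioned above.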
-----Original Message-----
From: Peter N. Steinmetz [mailto:PeterNSteinmetz@...]
Sent: Thursday, June 18, 2009 8:37 PM
To: klustakwik-develop@...
Subject: [Klustakwik-develop] why 20-30 clusters?
Does anyone know why a MinClusters of 20 and a MaxClusters of 30 were
introduced as the default values? This is now the setting in the
SubSetter release, which incorporates the initial subset code from
one of Ken's versions.
Should this be kept or changed back to 2-10, which is what it was in
the initial release?
cheers,
Peter
--
Peter N. Steinmetz, M.D.,Ph.D.
Program Director, Neuroengineering
Barrow Neurological Institute
PeterNSteinmetz@...
602-406-3258
http://steinmetz.org/peter