On Mon, 26 May 2008 01:41:31 +0200, Philipp K. Janert <janert@...> wrote:
> On Saturday 24 May 2008 00:11, you wrote:
>> On Sat, 24 May 2008 01:15:37 +0200, Philipp K. Janert <janert@...>
>> > I just submitted a patch (1970923) which
>> > draws a smooth histogram-like curve
>> > for a random collection of points, using
>> > a Gaussian kernel density estimation
>> > algorithm.
>> > Demos are found here:
>> > http://www.philipp-janert.com/kdensity
>> very interesting.
>> My initial impression on looking at your top left example is that there
>> is a phase shift of +half a box in x, most visible in the 0.01 and 0.05
> The edge effect is actually in the histogram,
> not in the kernel density (yet another advantage
> of k-densities over histograms: the annoying
> bin-placement problem goes away).
Hmm, I have never been much of a fan of bins and histograms, which is
probably why I missed that. They are more for sociologists and economists.
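
For what it's worth, the bin-placement point is easy to see in a few lines.
This is only a sketch in Python, not gnuplot and not the code from the
patch; the sample, the bin width, the two bin origins and the function
names are all made up for illustration.

```python
import math

def histogram(data, width, origin):
    """Count points per bin; bin k covers [origin + k*width, origin + (k+1)*width)."""
    counts = {}
    for x in data:
        k = math.floor((x - origin) / width)
        counts[k] = counts.get(k, 0) + 1
    return counts

def kde(data, x, bandwidth):
    """Gaussian kernel density estimate at x (no bin origin involved)."""
    norm = len(data) * bandwidth * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data) / norm

data = [0.9, 1.1, 1.9, 2.1, 2.9, 3.1]   # made-up sample, symmetric about 2.0

# Same data, same bin width, only the bin origin differs.
hist_a = histogram(data, width=1.0, origin=0.0)
hist_b = histogram(data, width=1.0, origin=0.5)

print("origin 0.0:", [hist_a[k] for k in sorted(hist_a)])
print("origin 0.5:", [hist_b[k] for k in sorted(hist_b)])
print("kde at 2.0:", kde(data, 2.0, bandwidth=0.3))
```

The two histograms come out with different shapes although the data are
identical, whereas the kernel estimate has no origin parameter to choose
(it still depends on the bandwidth, of course).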
> The code does what all current gnuplot
> smoothing algos do: they stop at the min
> and max data point in the sample. I think
> this is reasonable.
Well, I'm not sure that is comparable. IIRC all the "smoothing" algos
(apart from unique) are splines, and these are calculated over 4 data
points. In fact they would require just one point outside the data range
at each end. I have not looked at how those are dealt with, but it is
unlikely to be important for one point.
However, techniques using a kernel require half the kernel width outside
each end of the data range.
I would guess by looking at your examples that the missing data are
initialised as zero. Is that correct?
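
One way to check how much this matters, sketched in Python rather than
gnuplot and not taken from the patch: place a Gaussian kernel on each data
point and compare the area under the curve between the min and max data
point with the area over a range padded by a few kernel widths. The
sample, the bandwidth and the 5-bandwidth padding are my own assumptions,
chosen only for illustration.

```python
import math

def kde(data, x, bandwidth):
    """Gaussian kernel density estimate at x."""
    norm = len(data) * bandwidth * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data) / norm

def integrate(f, a, b, n=2000):
    """Trapezoidal rule on n equal intervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

data = [0.0, 0.2, 0.5, 1.0, 1.3, 2.0]   # made-up sample
bw = 0.3                                # hypothetical bandwidth

def density(x):
    return kde(data, x, bw)

lo, hi = min(data), max(data)

# Area when the curve stops at the extreme data points.
inside = integrate(density, lo, hi)
# Area when the range is extended by 5 bandwidths at each end.
padded = integrate(density, lo - 5 * bw, hi + 5 * bw)

print(f"area over [min, max]:   {inside:.3f}")
print(f"area over padded range: {padded:.3f}")
```

With these numbers the area over [min, max] comes out noticeably below 1,
while the padded range recovers essentially all of it. The difference is
the kernel mass that falls outside the data range, mostly from the kernels
sitting on the two end points.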
Sorry to be a stickler for detail, it must be my rigorous physics
training coming out. As people become less and less aware of what all
these software tools are actually doing for them, it becomes more and more
important that they do not introduce distortions.
Don't think I'm knocking your efforts, I'm pretty impressed overall.
best regards, Peter.