Re: [bug] plot bins does not clip output to plot area


> I'm using it as a way to average over a large number of samples that are
> sampled equidistantly (kernel density is way slower obviously). The
> documentation doesn't really tell how it is implemented or supposed to
> work, so I've experimented a bit to find out.

I take the blame for any failures in either the implementation or
the documentation of the "bins" option, since I wrote both.
I would be happy to amend the code or the description to cover
uses that I didn't anticipate.

Since I envisioned it as a histogramming tool, it did not occur to
me that users would want to plot "with lines" rather than using
boxes or impulses.  Therefore I did not think about or test the
clipping behaviour.

As it happens, the routines that draw boxes or impulses do the clipping as
they draw each one.  The routine that draws "with lines" assumes that the
inrange/outrange/undefined status of each point has already been flagged,
and only considers clipping if a particular line segment changes state.
This is so that successive in-range points are drawn as a single polyline
with smooth joins rather than being split into individual segments.
Normally the inrange/outrange flag is set on data entry, but those flags
apply to the original data points rather than to the binned totals.
I have now added a separate pass to re-check the binned data against
yrange so that clipping works as you expect it to.
(New code added for both 5.4 and 5.5).
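
To illustrate the idea of that extra pass, here is a minimal sketch in
Python.  It is not the gnuplot source (which is C); the names Status and
reflag_binned are hypothetical, and the range test is simplified to a
closed interval on y only.

```python
import math
from enum import Enum


class Status(Enum):
    # Hypothetical stand-ins for the inrange/outrange/undefined flags.
    INRANGE = 0
    OUTRANGE = 1
    UNDEFINED = 2


def reflag_binned(totals, ymin, ymax):
    """Re-check each binned total against [ymin, ymax].

    The flags set on data entry describe the original samples, so the
    binned totals need flags of their own before a line is drawn.
    """
    flags = []
    for value in totals:
        if value is None or math.isnan(value):
            flags.append(Status.UNDEFINED)
        elif ymin <= value <= ymax:
            flags.append(Status.INRANGE)
        else:
            flags.append(Status.OUTRANGE)
    return flags


binned_totals = [0.5, 2.0, 3.5, 1.0]
flags = reflag_binned(binned_totals, ymin=0.0, ymax=3.0)
```

With the flags refreshed, a line-drawing routine that only clips where the
status changes between neighbouring points would see the third total as
out of range and clip the two segments that touch it.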

I remain a little uneasy about the use of negative values in the weighting
column, although I don't have a specific example in mind that will fail.

Thanks for the explanation of what you are using it for.
I agree that in that mode "bins" is similar to a kernel density model
that uses a delta function rather than a Gaussian kernel.

	cheers,

		Ethan

On Friday, 11 September 2020 11:59:28 PDT ASSI wrote:
> Ethan A Merritt writes:
> 
> > If there is a second column of data this is interpreted as a weight.
> 
> Let's assume the sample spacing is 1 and no samples are missing, then
> you'll get the sum over the bin width.  Different sampling density just
> scales the result.  Dividing by the number of samples gives you the
> average.

Correct.
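
A small worked example of that arithmetic, as a sketch only.  The function
bin_sums and its bin-edge convention (bin i covers lo + i*width up to but
not including lo + (i+1)*width) are assumptions made for illustration and
are not taken from the gnuplot code.

```python
def bin_sums(xs, ys, lo, width, nbins):
    """Sum ys into equal-width bins selected by xs; also count samples."""
    sums = [0.0] * nbins
    counts = [0] * nbins
    for x, y in zip(xs, ys):
        i = int((x - lo) // width)
        if 0 <= i < nbins:
            sums[i] += y
            counts[i] += 1
    return sums, counts


# Sample spacing 1, no samples missing, bin width 4.
xs = list(range(12))
ys = [float(x % 4) for x in xs]          # 0, 1, 2, 3 repeating

sums, counts = bin_sums(xs, ys, lo=0, width=4, nbins=3)

# Dividing each bin total by the number of samples in it gives the average.
averages = [s / c for s, c in zip(sums, counts)]
```

Here each bin holds four samples, so each total is 0+1+2+3 = 6 and each
average is 1.5.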

[snip]

> 
> > Your test script provides no "using" specifier, however, so the plot
> > command draws values from column 1 (essentially the numbers 0 to 100)
> > and weights each one by the value in the second column (sin(x)).
> 
> It doesn't do that either or the result would be a linearly rising
> function with a sin riding on top of it.  It reproduces the function if
> I arrange the binwidth to contain just a single sample, so clearly the x
> column isn't used directly.

I may not have phrased that well.  It weights the _contribution_ of each
sample by the value in the second column.  If no second column is
provided, the weight is 1.
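
In other words the first column only selects which bin a sample falls in,
and the second column is what gets added to that bin.  A sketch of this,
with a hypothetical function binned_totals and the same assumed bin-edge
convention as a plain half-open interval per bin:

```python
import math


def binned_totals(xs, weights=None, lo=0.0, width=1.0, nbins=10):
    """Accumulate one contribution per sample into the bin chosen by x."""
    if weights is None:
        # No second column: every sample contributes a weight of 1.
        weights = [1.0] * len(xs)
    totals = [0.0] * nbins
    for x, w in zip(xs, weights):
        i = int((x - lo) // width)
        if 0 <= i < nbins:
            totals[i] += w
    return totals


xs = [float(i) for i in range(10)]
ws = [math.sin(x) for x in xs]

# Unweighted: each bin just counts its samples.
sample_counts = binned_totals(xs)

# One sample per bin: the totals reproduce the weight column itself.
one_per_bin = binned_totals(xs, ws)

# Wider bins: each total is the sum of the weights that fall in the bin.
five_per_bin = binned_totals(xs, ws, width=5.0, nbins=2)
```

The x values never enter the totals, which is why there is no linearly
rising component in the result.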
