The gnuplot 5.2 documentation (page 75) states the following about set fit errors:
A few shorthands for the errors qualifier are available: yerrors (for fits with 1 column of independent variable), and zerrors (for the general case) are all equivalent to errors z, indicating that there is a single extra column with errors of the dependent variable.
However, the three forms do not behave identically, and the difference also seems to affect range limiting (or the latter may be an independent issue, possibly not even a bug).
gnuplot 5.2.5
ubuntu 16.04 LTS
wxt terminal, interactive mode
As per my tests, always.
However, it is possible that the actual data (especially the error estimates, the weight calculation, or the weighted fit itself) produce an unexpectedly bad fit when errors are used, whereas the fit without errors is never bad, even on the same data and from the same initial values. I would mostly expect larger differences only in the asymptotic standard errors, and in the parameters themselves only in a few cases.
(This might be out of interest)
A data file containing 64751 rows (important later) and 3 columns (x, z, delta_z). The x range is [0:1], and there is always a subrange [xmin:xmax] for which z is in [0:1] and x and z are linearly related. delta_z is mostly of the order of 0.1. The actual task is to fit a sigmoidal contrast function (see e.g. here) to those points for which the z data (i.e. $2 in the file) is in the range [0:1]. Filtering this way seems impossible (because the function to be fit is always within [0:1] by definition) unless I know the [xmin:xmax] subrange.
After declaring the function s(x) with parameters a and b, fitting is done against the data in the [xmin:xmax] subrange, which is set with set xrange beforehand. As there are three columns in the file, using might be omitted (however, according to the manual, that is possibly a bad idea).
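For reference, the setup described above can be sketched as follows. The exact form of s(x) is not reproduced in this report, so the logistic function below is only an assumed stand-in; the values of xmin and xmax and the file name 'file' are placeholders as well.

```gnuplot
# Assumed stand-in for the sigmoidal contrast function; the real s(x) may differ.
s(x) = 1.0 / (1.0 + exp(-b * (x - a)))

# Initial values for the fit parameters
a = 0.32
b = 6

# Placeholder limits of the linear subrange; replace with the real values
xmin = 0.2
xmax = 0.4

# Restrict the fit to the subrange, then fit with the errors in column 3
set xrange [xmin:xmax]
fit s(x) 'file' using 1:2:3 zerrors via a,b
```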
I also tested xrange handling with set xrange [xmin:xmax] ; fit s(x) ... and with set xrange [0:1] ; fit [xmin:xmax] s(x) ... in all the following cases, and both seem to ignore xrange every time: the maximal slope (related to parameter b) is severely overestimated, so the sigmoid's slope is much greater than the linear slope; when the data file contains only the points with x in [xmin:xmax], the slopes are nearly equal. The overestimate arises because the points with y outside [0:1] pull the lower and upper tails toward 0 and 1 respectively.
When I leave out errors and use using 1:2 only, the range is no longer ignored, but something is still strange: FIT_NDF is the same in both cases (with and without errors), and after set autoscale xy ; replot (which gives yrange [-0.2:2.6] in my case), fit seems to reset yrange to [0:1], which is actually the range of the function values. Thus it seems unpredictable whether the data are filtered against this yrange or not: FIT_NDF suggests correct filtering, but the results differ so much that it is hard to believe the difference is caused by the errors alone.
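One way to take the guesswork out of the range handling might be to filter explicitly in the using specification, so that the selection no longer depends on xrange or yrange. This is only a sketch: it assumes that fit skips points whose using expression evaluates to an undefined value (1/0), as plot does, and that s(x), a and b are already defined as described above. The helper zin() and the file name 'file' are hypothetical.

```gnuplot
# Hypothetical helper: keep z only if it lies in [0:1], otherwise make it undefined
zin(z) = (z >= 0 && z <= 1) ? z : 1/0

# Points with an undefined z value are assumed to be dropped before fitting
fit s(x) 'file' using 1:(zin($2)):3 zerrors via a,b

# FIT_NDF shows how many points actually entered the fit (points minus parameters)
print "degrees of freedom: ", FIT_NDF
```

If FIT_NDF from this variant differs from the value obtained with set xrange, that would indicate the two filters are not selecting the same points.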
errors z
fit s(x) 'file' using 1:2:3 errors z via a,b
This form works only with using specified. Without using, it stops with the error message Out of memory in fit: too many datapoints (4096)?.
zerrors and yerrors
fit s(x) 'file' using 1:2:3 zerrors via a,b
This form works with and without using, and the results are the same (and the same as the previous form with using). yerrors and zerrors produce the same results, so they really do seem to be equivalent.
Fitting without errors on the whole dataset gives
a=0.302249+-0.0001197(0.03962%) b=8.81983+-0.009261(0.105%) FIT_NDF=33247 !!! FIT_STDFIT=0.026446
so it seems that filtering happens even with unspecified ranges, or after set au xy ; rep. FIT_NDF begins to decrease only if I specify an xrange smaller than [xmin:xmax].
Fitting with errors gives
a=0.256654+-0.0003717(0.1448%) b=21.6521+-0.06377(0.2945%) !!! FIT_NDF=33247 FIT_STDFIT=468.141
thus, seemingly a much worse fit, as can be seen in the attachments (the file names indicate the case). The initial values were the same, a=0.32 and b=6, but testing with widely varying initial values (a from 0.2 to 0.4, b from 3 to 20) returns these same results.
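The comparison can be reproduced with a short script along these lines. It is a sketch, not the exact commands used for the numbers above: it assumes s(x) is already defined, uses 'file' as a placeholder name, and relies on set fit errorvariables to expose the standard errors as a_err and b_err.

```gnuplot
# Store the asymptotic standard errors in a_err and b_err after each fit
set fit errorvariables

# Fit without errors, from the stated initial values
a = 0.32; b = 6
fit s(x) 'file' using 1:2 via a,b
print sprintf("no errors: a=%g+-%g b=%g+-%g NDF=%d STDFIT=%g", a, a_err, b, b_err, FIT_NDF, FIT_STDFIT)

# Fit with errors taken from column 3, restarting from the same initial values
a = 0.32; b = 6
fit s(x) 'file' using 1:2:3 zerrors via a,b
print sprintf("zerrors:   a=%g+-%g b=%g+-%g NDF=%d STDFIT=%g", a, a_err, b, b_err, FIT_NDF, FIT_STDFIT)
```

Resetting a and b before the second fit matters, because fit otherwise starts from the values left by the previous run.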