Patch to add xyerrorbars option for set fit
The attached patch adds the options yerrorbars and xyerrorbars to the set fit command.
If the xyerrorbars is provided the fit command expects a using spec of the type
x:y:sx:sy to fit y=f(x), where sy is the standard deviation of the y-values and sx
the standard deviation of the independend variable x. For the fit the
"effective variance method" (Jay Orear, Am. J. Phys., Vol. 50, No. 10, October 1982)
is then used to incorporate the standard deviation of the x-values in the estimation
of the weights s used to calculate the Chi^2. By default (option yerrorbars) the
weights are simply s = sy (as befor this patch), but with the xyerrorbars option
the weights are calculated using the formula s**2 = (f'(x)*sx)**2 + sy**2, where
f'(x) is the derivative of the function f at x. The results of this new method were
sucessfully compared to results using ROOT, which uses a similar approach.
Nice. I read the paper and its follow-ups, but haven't tried your code yet. Could you provide some test case?
If I interpret your code correctly, Lybanon's concerns (Am. J. Phys. 52) about this method are addressed by varying both, the free parameters and the effective error, in every iteration. Correct?
gnuplot generally supports up to five independent variables, whereas your implementation is currently limited to a single independent variable only. Would it be worth and feasible to extend the code?
Also, I was wondering if there wasn't a better approach to distinguish the meaning of the data columns than the "set fit yerrorbars | xyerrorbars" command introduced by your patch. For the `using` spec in `plot` commands we have e.g. the xticlabels(), timecolumn() and stringcolumn() "modifiers". Wouldn't it be possible to implement in a similar fashion xerrorcolumn() etc. to resolve the ambiguity between x:y:z:s and x:xerrorcolumn():z:s ?
> Nice. I read the paper and its follow-ups, but haven't tried your code yet.
> Could you provide some test case?
I added a demo to the new patch.
> If I interpret your code correctly, Lybanon's concerns (Am. J. Phys. 52)
> about this method are addressed by varying both, the free parameters and
> the effective error, in every iteration. Correct?
Yes. This method is also summarized in the book William R. Leo,
Techniques for Nuclear and Particle Physics Experiments
(http://books.google.de/books?id=8VufE4SD-AkC&lpg=PA108&ots=6a2OeQJ0JP&pg=PA108#v=onepage&q&f=false).
> gnuplot generally supports up to five independent variables, whereas your
> implementation is currently limited to a single independent variable only.
> Would it be worth and feasible to extend the code?
The attached patch adds the support for the other independent variables
by simply adding quadratically a term like (dz/dp)*p_err, with (p=x,y,t,u,v) and p_err the error for the corresponding independent variable.
> Also, I was wondering if there wasn't a better approach to distinguish the
> meaning of the data columns than the "set fit yerrorbars | xyerrorbars"
> command introduced by your patch. For the `using` spec in `plot` commands
> we have e.g. the xticlabels(), timecolumn() and stringcolumn() "modifiers".
> Wouldn't it be possible to implement in a similar fashion xerrorcolumn()
> etc. to resolve the ambiguity between x:y:z:s and x:xerrorcolumn():z:s ?
As you suggested I reimplemented the method by introducing the five
"modifiers" xerrorcolumn(), yerrorcolumn(), terrorcolumn(), uerrorcolumn() and verrorcolumn(). With that the using specifier for the
fit command looks like
z
x:z
x:[:xerr]z:s
x:[:xerr]:y:[:yerr]:z:s
x:[:xerr]:y:[:yerr]:t:[:terr]:z:s
x:[:xerr]:y:[:yerr]:t:[:terr]:u:[:uerr]:z:s
x:[:xerr]:y:[:yerr]:t:[:terr]:u:[:uerr]:v:[:verr]:z:s
Please review the changes in "datafile.c" which were necessary for
this kind of implementation with extreme care, since this is the
first time that I changed something in this very complex parser code.
On second thought my suggested syntax wasn't the best idea. In your implementation, the order of the parameters is fixed, but different from the corresponding plot commands. This is unfortunate. Instead I know propose to extend the syntax of the fit command, see https://sourceforge.net/p/gnuplot/patches/621/ for an detailed explanation.
The patch attached adpopts your code to the proposed syntax. The patch is against current CVS and also includes the code for patch 621.
I'm bumping the priority because we've seen several requests for this feature.
As I understand it, the patch works but the syntax may need revision.
Is that correct?
The syntax of the last patch here conflicts with your changes for more than 5 independent variables. But you can now find an udate to the proposed syntax at
https://sourceforge.net/p/gnuplot/patches/621/
There is one thing I would like to understand before including this patch in any release, though: Cern ROOT uses this algorithm, too. But unlike this implementation for gnuplot, ROOT's solver will find the true optimimum for Orear's testcase, whereas this code will find the same approximate solution as given in Orear's paper.
I just retested the patch with both the test case from Orear and a second one from
Cameron Reed. In both cases the minimizer fails to find the smallest chi square
(WSSR in the list below). If one initializes the parameters with values which are
near to this minimum it converges to the values, which are compatible with the ROOT
results. Since the WSSR values are also the same and correspond to the values
in the two papers, it seems that the calculation of the WSSR is done correctly, but
the gnuplot minimizer fails to find the minimum.
In the table below you can find the values I got from the different tests:
B. Cameron Reed, Am. J. Phys. 57, 642 (1989) - Example 1 (y=m*x+c):
m = 1.167 +- 0.308
c = -0.365 +- 0.291
WSSR = 578
gnuplot (FIT_LIMIT=1.0e-14):
parameters not initialized at start:
m = 0.931057 +- 0.1299/5.08338=0.02555
c = -0.150072 +- 0.1192/5.08338=0.02345
WSSR=646
start parameters m=1.167 and c=-0.365:
m = 1.16633 +- 0.1381/4.80855=0.0287
b = -0.361615 +- 0.1268/4.80855=0.0264
WSSR=578
CERN-ROOT:
m = 1.16684 +- 0.0319997
b = -0.365156 +- 0.0293386
WSSR=578
J. Orear, Am. J. Phys., Vol. 50, No. 10, October 1982 (y=a1*x-a2/x)
exact (Erratum: J. Orear, Am. J. Phys., Vol. 52, No. 3, March 1984):
a1 = 1.0731e-3
a2 = 6.250e5
WSSR = 2.134
gnuplot (FIT_LIMIT=1.0e-14):
parameters not initialized at start:
a1 = (1.0116 +- 0.1702/0.853987=0.1993)e-3
a2 = (5.93675 +- 0.9527/0.853987=1.11557)e5
WSSR = 2.18788
start parameters a1 = 1.0731e-3 and a2 = 6.250e5:
a1 = (1.07257 +- 0.1748/0.84351=0.2072)e-3
a2 = (6.25000 +- 0.9784/0.84351=1.1599)e5
WSSR = 2.13453
CERN-ROOT:
a1 = (1.07277 +- 0.252572)e-3
a2 = (6.24979 +- 1.39958)e5
WSSR = 2.13318
Thanks for the quick response. Below are my findings for Orear's data set, which more or less agree with your analysis. I guess the conclusion is that we just get stuck in a local minimum and the code can thus go to CVS right now. We really should do a chi² plot of the region...
Summary of the fit results:
a1 a2 a1_err a2_err ------------------------------------------------ initial values 1.0000e-003 5.900e+005 dy only 9.8993e-004 5.893e+005 3.8731e-005 2.289e+004 (*)dy and dx 1.0162e-003 5.937e+005 1.9933e-004 1.116e+005 dy and dx, errorscaling 1.0162e-003 5.937e+005 1.7022e-004 9.527e+004 dy and dx, no prescale 1.0096e-003 5.900e+005 1.9840e-004 1.110e+005 ------------------------------------------------ published 1.0163e-003 5.937e+005 exact 1.0731e-003 6.250e+005 ------------------------------------------------ Minuit 1.0727e-003 6.250e+005 2.5254e-004 1.399e+005
Note that our fit result (*) and the result published in J. Orear, Am. J. Phys. 52, 278 (1984) agree.
This is also an example where--due to the large difference in the order of magnitude of the parameters--prescaling is helpful to obtain convergence!
Is there a version of this patch that applies against current CVS?
This patch should apply to current CVS. It includes a demo fitxyerr.dem
that uses Orear's data and Pearson's data with York's weights. The documentation still needs to be reworked.
Thanks. That works nicely other some trivial compiler warnings.
Help me to understand the output of the demo.
Demo 1)
How can it be that both the RMS and CHI2 values are better for the xyerr fit, but the p value is worse?
Demo 2)
"This is also an example where--due to the large difference in
the order of magnitude of the parameters--prescaling is required
to obtain convergence." I have lost track - is patch #507 still relevant or does it implement the same thing as the current "set fit prescale"?
I gather that the fit is not exact even with prescaling, but in the demo the residual error is on the same order of magnitude as the convergence limit. However, if you tighten the convergence limit to "set fit limit 1e-09" then it reaches the same end point and claims an epsilon of e-10. Doesn't this mean there is something wrong with the function evaluation rather than with the convergence?
Demo 1)
How can it be that both the RMS and CHI2 values are better for the xyerr fit, but the p value is worse?
For the y-errors fit I get p = 3.5e-5, whereas for the xy-error fit it is 0.15, which is much better indeed, i.e. closer to 0.5, and in the 'acceptable' range of lets say [0.06:0.94].
Demo 2)
"This is also an example where--due to the large difference in
the order of magnitude of the parameters--prescaling is required
to obtain convergence." I have lost track - is patch #507 still relevant or does it implement the same thing as the current "set fit prescale"?
Patch #507 is included in CVS since 2013-03-12. Question is, if it shouldn't be included in 4.6.5, too.
I gather that the fit is not exact even with prescaling, but in the demo the residual error is on the same order of magnitude as the convergence limit. However, if you tighten the convergence limit to "set fit limit 1e-09" then it reaches the same end point and claims an epsilon of e-10. Doesn't this mean there is something wrong with the function evaluation rather than with the convergence?
The fact that it does not reach the exact solution is why I started the discussion again. Without further investigation (e.g. a real Chi² plot of the region), I can only assume that the algorithm gets stuck in a local minimum. As Alexander pointed out, the calculation of the WSSR seems to be correct. Also, the endpoint is the same as in Orear's paper, which probably tells us that the starting point is just not good enough for both algorithms.
Although the problem looks easy at first glance, it is really very difficult: the correlation between the parameters is very high ( gnuplot says >0.994 - as does a colleagues' Mathematica script ), and -without prescaling- the region of convergence must be very elongated due to the very different scale of the parameters.
The fact that it does not reach the exact solution is why I started the discussion >again. Without further investigation (e.g. a real Chi² plot of the region), I can >only assume that the algorithm gets stuck in a local minimum.
Attached is a Chi² plot of the region. The contour lines equal the minimum Chi²
+1, +0.1 and +0.01. The points show the exact solution and Orears one. From the
plot it is clear, that finding the absolut minimum is not really easy.
Alexander, thanks for the nice Chi² plot. It would be great to be able to do plots like this after a fit with using a chisq()
function in gnuplot. Any ideas how to implement this? Of course we would have to limit it to one or two parameters.
From the plot it seems that it is really the fit algorithm that gets stuck and that it is not a problem of the effective variance calculation per se. An modified version of the patch is now included in CVS. The help text for gnuplot.doc still needs work and is not included yet.