The attached patch adds the options yerrorbars and xyerrorbars to the set fit command.
If the xyerrorbars is provided the fit command expects a using spec of the type
x:y:sx:sy to fit y=f(x), where sy is the standard deviation of the y-values and sx
the standard deviation of the independend variable x. For the fit the
"effective variance method" (Jay Orear, Am. J. Phys., Vol. 50, No. 10, October 1982)
is then used to incorporate the standard deviation of the x-values in the estimation
of the weights s used to calculate the Chi^2. By default (option yerrorbars) the
weights are simply s = sy (as befor this patch), but with the xyerrorbars option
the weights are calculated using the formula s**2 = (f'(x)*sx)**2 + sy**2, where
f'(x) is the derivative of the function f at x. The results of this new method were
sucessfully compared to results using ROOT, which uses a similar approach.
Alexander Täschner
2012-04-27
Patch to add xyerrorbars option for set fit
Bastian Märkisch
2012-05-02
Nice. I read the paper and its follow-ups, but haven't tried your code yet. Could you provide some test case?
If I interpret your code correctly, Lybanon's concerns (Am. J. Phys. 52) about this method are addressed by varying both, the free parameters and the effective error, in every iteration. Correct?
gnuplot generally supports up to five independent variables, whereas your implementation is currently limited to a single independent variable only. Would it be worth and feasible to extend the code?
Also, I was wondering if there wasn't a better approach to distinguish the meaning of the data columns than the "set fit yerrorbars | xyerrorbars" command introduced by your patch. For the `using` spec in `plot` commands we have e.g. the xticlabels(), timecolumn() and stringcolumn() "modifiers". Wouldn't it be possible to implement in a similar fashion xerrorcolumn() etc. to resolve the ambiguity between x:y:z:s and x:xerrorcolumn():z:s ?
Alexander Täschner
2012-05-11
Path to add effective variance method to fit (V2)
Alexander Täschner
2012-05-11
> Nice. I read the paper and its follow-ups, but haven't tried your code yet.
> Could you provide some test case?
I added a demo to the new patch.
> If I interpret your code correctly, Lybanon's concerns (Am. J. Phys. 52)
> about this method are addressed by varying both, the free parameters and
> the effective error, in every iteration. Correct?
Yes. This method is also summarized in the book William R. Leo,
Techniques for Nuclear and Particle Physics Experiments
(http://books.google.de/books?id=8VufE4SD-AkC&lpg=PA108&ots=6a2OeQJ0JP&pg=PA108#v=onepage&q&f=false).
> gnuplot generally supports up to five independent variables, whereas your
> implementation is currently limited to a single independent variable only.
> Would it be worth and feasible to extend the code?
The attached patch adds the support for the other independent variables
by simply adding quadratically a term like (dz/dp)*p_err, with (p=x,y,t,u,v) and p_err the error for the corresponding independent variable.
> Also, I was wondering if there wasn't a better approach to distinguish the
> meaning of the data columns than the "set fit yerrorbars | xyerrorbars"
> command introduced by your patch. For the `using` spec in `plot` commands
> we have e.g. the xticlabels(), timecolumn() and stringcolumn() "modifiers".
> Wouldn't it be possible to implement in a similar fashion xerrorcolumn()
> etc. to resolve the ambiguity between x:y:z:s and x:xerrorcolumn():z:s ?
As you suggested I reimplemented the method by introducing the five
"modifiers" xerrorcolumn(), yerrorcolumn(), terrorcolumn(), uerrorcolumn() and verrorcolumn(). With that the using specifier for the
fit command looks like
z
x:z
x:[:xerr]z:s
x:[:xerr]:y:[:yerr]:z:s
x:[:xerr]:y:[:yerr]:t:[:terr]:z:s
x:[:xerr]:y:[:yerr]:t:[:terr]:u:[:uerr]:z:s
x:[:xerr]:y:[:yerr]:t:[:terr]:u:[:uerr]:v:[:verr]:z:s
Please review the changes in "datafile.c" which were necessary for
this kind of implementation with extreme care, since this is the
first time that I changed something in this very complex parser code.
Bastian Märkisch
2013-05-14
On second thought my suggested syntax wasn't the best idea. In your implementation, the order of the parameters is fixed, but different from the corresponding plot commands. This is unfortunate. Instead I know propose to extend the syntax of the fit command, see https://sourceforge.net/p/gnuplot/patches/621/ for an detailed explanation.
The patch attached adpopts your code to the proposed syntax. The patch is against current CVS and also includes the code for patch 621.
Bastian Märkisch
2013-05-14
Ethan Merritt
2014-02-17
I'm bumping the priority because we've seen several requests for this feature.
As I understand it, the patch works but the syntax may need revision.
Is that correct?
Bastian Märkisch
2014-02-17
The syntax of the last patch here conflicts with your changes for more than 5 independent variables. But you can now find an udate to the proposed syntax at
https://sourceforge.net/p/gnuplot/patches/621/
There is one thing I would like to understand before including this patch in any release, though: Cern ROOT uses this algorithm, too. But unlike this implementation for gnuplot, ROOT's solver will find the true optimimum for Orear's testcase, whereas this code will find the same approximate solution as given in Orear's paper.
Alexander Täschner
2014-02-17
I just retested the patch with both the test case from Orear and a second one from
Cameron Reed. In both cases the minimizer fails to find the smallest chi square
(WSSR in the list below). If one initializes the parameters with values which are
near to this minimum it converges to the values, which are compatible with the ROOT
results. Since the WSSR values are also the same and correspond to the values
in the two papers, it seems that the calculation of the WSSR is done correctly, but
the gnuplot minimizer fails to find the minimum.
In the table below you can find the values I got from the different tests:
B. Cameron Reed, Am. J. Phys. 57, 642 (1989) - Example 1 (y=m*x+c):
m = 1.167 +- 0.308
c = -0.365 +- 0.291
WSSR = 578
gnuplot (FIT_LIMIT=1.0e-14):
parameters not initialized at start:
m = 0.931057 +- 0.1299/5.08338=0.02555
c = -0.150072 +- 0.1192/5.08338=0.02345
WSSR=646
start parameters m=1.167 and c=-0.365:
m = 1.16633 +- 0.1381/4.80855=0.0287
b = -0.361615 +- 0.1268/4.80855=0.0264
WSSR=578
CERN-ROOT:
m = 1.16684 +- 0.0319997
b = -0.365156 +- 0.0293386
WSSR=578
J. Orear, Am. J. Phys., Vol. 50, No. 10, October 1982 (y=a1*x-a2/x)
exact (Erratum: J. Orear, Am. J. Phys., Vol. 52, No. 3, March 1984):
a1 = 1.0731e-3
a2 = 6.250e5
WSSR = 2.134
gnuplot (FIT_LIMIT=1.0e-14):
parameters not initialized at start:
a1 = (1.0116 +- 0.1702/0.853987=0.1993)e-3
a2 = (5.93675 +- 0.9527/0.853987=1.11557)e5
WSSR = 2.18788
start parameters a1 = 1.0731e-3 and a2 = 6.250e5:
a1 = (1.07257 +- 0.1748/0.84351=0.2072)e-3
a2 = (6.25000 +- 0.9784/0.84351=1.1599)e5
WSSR = 2.13453
CERN-ROOT:
a1 = (1.07277 +- 0.252572)e-3
a2 = (6.24979 +- 1.39958)e5
WSSR = 2.13318
Ethan Merritt
2014-02-17