Work at SourceForge, help us to make it a better place! We have an immediate need for a Support Technician in our San Francisco or Denver office.

Close

#585 Add option xyerrorbars to set fit to consider x errors

None
closed-accepted
7
2014-03-04
2012-04-27
No

The attached patch adds the options yerrorbars and xyerrorbars to the set fit command.
If the xyerrorbars is provided the fit command expects a using spec of the type
x:y:sx:sy to fit y=f(x), where sy is the standard deviation of the y-values and sx
the standard deviation of the independend variable x. For the fit the
"effective variance method" (Jay Orear, Am. J. Phys., Vol. 50, No. 10, October 1982)
is then used to incorporate the standard deviation of the x-values in the estimation
of the weights s used to calculate the Chi^2. By default (option yerrorbars) the
weights are simply s = sy (as befor this patch), but with the xyerrorbars option
the weights are calculated using the formula s**2 = (f'(x)*sx)**2 + sy**2, where
f'(x) is the derivative of the function f at x. The results of this new method were
sucessfully compared to results using ROOT, which uses a similar approach.

Discussion

1 2 > >> (Page 1 of 2)
  • Patch to add xyerrorbars option for set fit

     
    Attachments
  • Nice. I read the paper and its follow-ups, but haven't tried your code yet. Could you provide some test case?

    If I interpret your code correctly, Lybanon's concerns (Am. J. Phys. 52) about this method are addressed by varying both, the free parameters and the effective error, in every iteration. Correct?

    gnuplot generally supports up to five independent variables, whereas your implementation is currently limited to a single independent variable only. Would it be worth and feasible to extend the code?

    Also, I was wondering if there wasn't a better approach to distinguish the meaning of the data columns than the "set fit yerrorbars | xyerrorbars" command introduced by your patch. For the `using` spec in `plot` commands we have e.g. the xticlabels(), timecolumn() and stringcolumn() "modifiers". Wouldn't it be possible to implement in a similar fashion xerrorcolumn() etc. to resolve the ambiguity between x:y:z:s and x:xerrorcolumn():z:s ?

     
  • Path to add effective variance method to fit (V2)

     
  • > Nice. I read the paper and its follow-ups, but haven't tried your code yet.
    > Could you provide some test case?

    I added a demo to the new patch.

    > If I interpret your code correctly, Lybanon's concerns (Am. J. Phys. 52)
    > about this method are addressed by varying both, the free parameters and
    > the effective error, in every iteration. Correct?

    Yes. This method is also summarized in the book William R. Leo,
    Techniques for Nuclear and Particle Physics Experiments
    (http://books.google.de/books?id=8VufE4SD-AkC&lpg=PA108&ots=6a2OeQJ0JP&pg=PA108#v=onepage&q&f=false).

    > gnuplot generally supports up to five independent variables, whereas your
    > implementation is currently limited to a single independent variable only.
    > Would it be worth and feasible to extend the code?

    The attached patch adds the support for the other independent variables
    by simply adding quadratically a term like (dz/dp)*p_err, with (p=x,y,t,u,v) and p_err the error for the corresponding independent variable.

    > Also, I was wondering if there wasn't a better approach to distinguish the
    > meaning of the data columns than the "set fit yerrorbars | xyerrorbars"
    > command introduced by your patch. For the `using` spec in `plot` commands
    > we have e.g. the xticlabels(), timecolumn() and stringcolumn() "modifiers".
    > Wouldn't it be possible to implement in a similar fashion xerrorcolumn()
    > etc. to resolve the ambiguity between x:y:z:s and x:xerrorcolumn():z:s ?

    As you suggested I reimplemented the method by introducing the five
    "modifiers" xerrorcolumn(), yerrorcolumn(), terrorcolumn(), uerrorcolumn() and verrorcolumn(). With that the using specifier for the
    fit command looks like

    z
    x:z
    x:[:xerr]z:s
    x:[:xerr]:y:[:yerr]:z:s
    x:[:xerr]:y:[:yerr]:t:[:terr]:z:s
    x:[:xerr]:y:[:yerr]:t:[:terr]:u:[:uerr]:z:s
    x:[:xerr]:y:[:yerr]:t:[:terr]:u:[:uerr]:v:[:verr]:z:s

    Please review the changes in "datafile.c" which were necessary for
    this kind of implementation with extreme care, since this is the
    first time that I changed something in this very complex parser code.

     
  • On second thought my suggested syntax wasn't the best idea. In your implementation, the order of the parameters is fixed, but different from the corresponding plot commands. This is unfortunate. Instead I know propose to extend the syntax of the fit command, see https://sourceforge.net/p/gnuplot/patches/621/ for an detailed explanation.

    The patch attached adpopts your code to the proposed syntax. The patch is against current CVS and also includes the code for patch 621.

     
    Attachments
    • assigned_to: Bastian Märkisch
    • Group: -->
     
  • Ethan Merritt
    Ethan Merritt
    2014-02-17

    I'm bumping the priority because we've seen several requests for this feature.
    As I understand it, the patch works but the syntax may need revision.
    Is that correct?

     
    • The syntax of the last patch here conflicts with your changes for more than 5 independent variables. But you can now find an udate to the proposed syntax at
      https://sourceforge.net/p/gnuplot/patches/621/

      There is one thing I would like to understand before including this patch in any release, though: Cern ROOT uses this algorithm, too. But unlike this implementation for gnuplot, ROOT's solver will find the true optimimum for Orear's testcase, whereas this code will find the same approximate solution as given in Orear's paper.

       
      • I just retested the patch with both the test case from Orear and a second one from
        Cameron Reed. In both cases the minimizer fails to find the smallest chi square
        (WSSR in the list below). If one initializes the parameters with values which are
        near to this minimum it converges to the values, which are compatible with the ROOT
        results. Since the WSSR values are also the same and correspond to the values
        in the two papers, it seems that the calculation of the WSSR is done correctly, but
        the gnuplot minimizer fails to find the minimum.

        In the table below you can find the values I got from the different tests:

        B. Cameron Reed, Am. J. Phys. 57, 642 (1989) - Example 1 (y=m*x+c):
        m = 1.167 +- 0.308
        c = -0.365 +- 0.291
        WSSR = 578

        gnuplot (FIT_LIMIT=1.0e-14):
        parameters not initialized at start:
        m = 0.931057 +- 0.1299/5.08338=0.02555
        c = -0.150072 +- 0.1192/5.08338=0.02345
        WSSR=646

        start parameters m=1.167 and c=-0.365:
        m = 1.16633 +- 0.1381/4.80855=0.0287
        b = -0.361615 +- 0.1268/4.80855=0.0264
        WSSR=578

        CERN-ROOT:
        m = 1.16684 +- 0.0319997
        b = -0.365156 +- 0.0293386
        WSSR=578

        J. Orear, Am. J. Phys., Vol. 50, No. 10, October 1982 (y=a1*x-a2/x)
        exact (Erratum: J. Orear, Am. J. Phys., Vol. 52, No. 3, March 1984):
        a1 = 1.0731e-3
        a2 = 6.250e5
        WSSR = 2.134

        gnuplot (FIT_LIMIT=1.0e-14):
        parameters not initialized at start:
        a1 = (1.0116 +- 0.1702/0.853987=0.1993)e-3
        a2 = (5.93675 +- 0.9527/0.853987=1.11557)e5
        WSSR = 2.18788

        start parameters a1 = 1.0731e-3 and a2 = 6.250e5:
        a1 = (1.07257 +- 0.1748/0.84351=0.2072)e-3
        a2 = (6.25000 +- 0.9784/0.84351=1.1599)e5
        WSSR = 2.13453

        CERN-ROOT:
        a1 = (1.07277 +- 0.252572)e-3
        a2 = (6.24979 +- 1.39958)e5
        WSSR = 2.13318

         
  • Ethan Merritt
    Ethan Merritt
    2014-02-17

    • labels: --> requested feature
    • Priority: 5 --> 7
     
1 2 > >> (Page 1 of 2)