
#2735 [6.0.0] `fit` finicky about the range

None
open
nobody
None
2024-09-18
2024-09-06
No
gnuplot> fit [1e9:1e13] a1+x '+' u 1:($1) via a1
         Read 100 points
         Skipped 100 points outside range [x=1e+09:1e+13]
         No data to fit

(To add insult to injury, sometimes a very similar command would report a different number of points [like 50 or 69] — all skipped.)

It works OK¹⁾ with many ranges, like [1e20:1e23] or [1:1e4].

¹⁾ Within the usual flakiness of `fit`.

Discussion

  • Ethan Merritt

    Ethan Merritt - 2024-09-06

    Second attempt at a response. Just forget about anything I said the first time around.

    I never use in-line ranges myself, so I always get confused when I see them as to exactly what they apply to. Unfortunately this confusion may extend to what the program thinks it means and what the user thinks it means.

    There are several issues here. It looks like you are expecting the program to figure out that it should generate appropriate samples spanning the range of '+'. This doesn't happen for your minimal example because neither the sampling range (trange) nor xrange is specified and no log-scaling is in effect. In the absence of a trange specifier the program will thus generate uniform (equal interval) samples along the default xrange, which is [-10:10]. All of these samples lie outside the fitting range [1.e9:1.e13]. Hence the error message you see.
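
The arithmetic behind that message can be reproduced outside gnuplot. The following is a minimal Python sketch, not gnuplot's own code; it assumes 100 equally spaced samples (the count reported by "Read 100 points") over [-10:10], and the helper name `uniform_samples` is made up for illustration.

```python
def uniform_samples(lo, hi, n):
    """Return n equally spaced values from lo to hi inclusive."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]

# Assumption: 100 samples over the default xrange [-10:10].
samples = uniform_samples(-10.0, 10.0, 100)

# The in-line range given to fit in the original report.
fit_lo, fit_hi = 1e9, 1e13

kept = [x for x in samples if fit_lo <= x <= fit_hi]
skipped = len(samples) - len(kept)
print(f"Read {len(samples)} points, skipped {skipped}, kept {len(kept)}")
```

Every sample is at most 10, so none can reach 1e9 and nothing is left to fit.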

    So the first fix is to make the sampling range match the desired fit range. Usually this would mean set trange [1.e9 : 1.e13], but that isn't going to work here because you can't set log-scaling on the sample axis (no set log t command). So instead you have to leave trange unspecified and set xrange. You also need to specify log-scaling on x.

    gnuplot> set log x
    gnuplot> set xrange [1.e9:1.e13]
    gnuplot> fit [1e9:1e13] a1+x '+' u 1:1 via a1
    iter      chisq       delta/lim  lambda   a1           
       0 9.8000000000e+01   0.00e+00  1.01e+00    1.000000e+00
       1 4.4050066605e-02  -2.22e+08  1.01e-01    2.118430e-02
       2 6.4510639618e-04  -6.73e+06  1.01e-02    2.559278e-03
       3 1.1962247822e-04  -4.39e+05  1.01e-03    1.103984e-03
       4 4.2138106664e-05  -1.84e+05  1.01e-04    6.335908e-04
       5 1.0124258324e-05  -3.16e+05  1.01e-05    3.198809e-04
       6 2.7051376037e-06  -2.74e+05  1.01e-06    1.786574e-04
       7 1.8320206436e-06  -4.77e+04  1.01e-07    1.393422e-04
       8 6.3236075221e-07  -1.90e+05  1.01e-08    8.961254e-05
       9 6.1250511862e-07  -3.24e+03  1.01e-09    8.755145e-05
      10 2.1990126697e-07  -1.79e+05  1.01e-10    5.542007e-05
      11 1.3118756215e-07  -6.76e+04  1.01e-11    4.256261e-05
      12 1.3118756215e-07   0.00e+00  1.01e-12    4.256261e-05
             Singular matrix in Invert_RtR
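
To see why log-scaling matters for the sampling step, here is a rough Python sketch comparing linear and logarithmic spacing over [1e9:1e13]. It assumes that log-scaled sampling means equal spacing in log10(x); the thread does not spell out the exact rule, and both helper names are hypothetical.

```python
import math

def uniform_samples(lo, hi, n):
    """Return n equally spaced values from lo to hi inclusive."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]

def log_samples(lo, hi, n):
    """Return n values equally spaced in log10(x) (assumed sampling rule)."""
    exponents = uniform_samples(math.log10(lo), math.log10(hi), n)
    return [10.0 ** e for e in exponents]

lo, hi, n = 1e9, 1e13, 100
linear = uniform_samples(lo, hi, n)
logarithmic = log_samples(lo, hi, n)

# Count how many samples fall in the lower three decades of the range.
below_linear = sum(1 for x in linear if x < 1e12)
below_log = sum(1 for x in logarithmic if x < 1e12)
print(f"samples below 1e12: linear={below_linear}, log={below_log}")
```

With linear spacing only 10 of the 100 samples fall below 1e12, while logarithmic spacing puts 75 there, so the two choices weight the range very differently.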
    

    No range problems now, but the fit fails because the default initial value of a1 is 1.0 and the function being fit makes no sense. The next step would be to pick a more reasonable starting value for a1. Now you get this:

    gnuplot> set log x
    gnuplot> set xrange [1.e9:1.e13]
    gnuplot> a1 = 0.0
    gnuplot> fit [1e9:1e13] a1+x '+' u 1:1 via a1
    Warning: Initial value of parameter 'a1' is zero.
      Please provide non-zero initial values for the parameters, at least of
      the right order of magnitude. If the expected value is zero, then use
      the magnitude of the expected error. If all else fails, try 1.0
    
    iter      chisq       delta/lim  lambda   a1           
       0 0.0000000000e+00  -nan       0.00e+00    1.000000e-30
             Singular matrix in Givens()
    

    Without a reasonable function to fit I can't take the analysis any further than that.

    Hope that helps.

     

    Last edit: Ethan Merritt 2024-09-06
    • Ilya Zakharevich

      There are several issues here. It looks like you are expecting the program to figure out that it should generate appropriate samples spanning the range of '+'.

      Yes, this was a user’s error. However, many issues triggered by the code above (and by your remarks) still remain. I will address them one by one.¹⁾

      ¹⁾ Here I decided to answer by collecting many unrelated issues in one message. Since I sit on dozens of gnuplot bugs, let me know which format of discussion is more convenient for you in the future.

      In the absence of a trange specifier the program will thus generate uniform (equal interval) samples along the default xrange, which is [-10:10].

      Right now I can see this indeed:

      gnuplot> show trange
      
              set trange [ * : * ] noreverse nowriteback  # (currently [-10.0000:10.0000] )
      

      But a few days ago it would show the “current” range as [-5:5] (on a freshly started gnuplot). Go figure!

      gnuplot> set xrange [1.e9:1.e13]
      gnuplot> fit [1e9:1e13] a1+x '+' u 1:1 via a1

      Why is set xrange needed when the fit command specifies the x-range explicitly?

      Singular matrix in Invert_RtR

      This is a linear fit with coefficient 1. AFAIU, the Jacobian should be a 1×1 identity matrix. Why is there a problem inverting it?!
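
One possible contributing factor, not confirmed anywhere in this thread, is plain double-precision round-off: once a1 becomes small, a1+x can be bit-for-bit equal to x for the larger x values, so the model stops responding to a1 at those points. A small Python sketch of that effect, using the last a1 value from the iteration log (it says nothing about how fit computes its derivatives; needs Python 3.9+ for `math.ulp`):

```python
import math

# Last a1 value reported in the iteration log above.
a1 = 4.256261e-05

for x in (1e9, 1e11, 1e13):
    spacing = math.ulp(x)          # gap between x and the next larger double
    unchanged = (x + a1) == x      # True when a1 is lost to rounding
    print(f"x={x:.0e}  ulp={spacing:.3e}  x+a1==x: {unchanged}")
```

At x = 1e13 the spacing between neighbouring doubles is about 2e-3, far larger than a1, so the sum is rounded straight back to x.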

      the fit fails because the default initial value of a1 is 1.0 and the function being fit makes no sense.

      Which function makes no sense? I fit F(x) ≡ x by f(x,a) ≡ x+a. Which of them do you mean here?
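
For these two functions every residual equals a, so an unweighted sum of squared residuals should be N·a². Below is a rough consistency check in Python against three rows of the iteration log quoted earlier. The point count N = 98 is an assumption, inferred from chisq = 98 at a1 = 1; the thread never states it.

```python
# (a1, chisq) pairs copied from the iteration log earlier in this thread:
# iterations 0, 1 and 12.
log_rows = [
    (1.000000e+00, 9.8000000000e+01),
    (2.118430e-02, 4.4050066605e-02),
    (4.256261e-05, 1.3118756215e-07),
]

# Assumption: 98 points contributed, inferred from chisq = 98 at a1 = 1.
n_points = 98

for a1, chisq in log_rows:
    ideal = n_points * a1 ** 2     # exact-arithmetic value of sum((a1+x-x)^2)
    print(f"a1={a1:.6e}  reported={chisq:.6e}  "
          f"N*a1^2={ideal:.6e}  ratio={chisq / ideal:.3f}")
```

The first two rows agree with N·a² to better than 1%, while the last row is off by roughly a quarter, which would be consistent with rounding in a1+x taking over as a1 shrinks.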

      Without a reasonable function to fit I can't take the analysis any further than that.

      How come a linear function with coefficient 1 is not “reasonable”? Is it related²⁾ to #2643 “The FIT command fails in gnuplot Ver.5.4.8 on Windows 7 64-bit platform”?

      ²⁾ Finally, I was able to make the fit I need in octave. It had similar problems, but it was a bit easier to debug. Judging by this experience, I would try to list the following (conjectural, undocumented) properties of the algorithm used by fit:

      • The independent variables (like x and y) in the datafile should be of approximately the same magnitude (say, within a factor of 1,000).
      • The dependent variable (like z) in the datafile should be of approximately the same magnitude as the independent variables.
      • Same for the via parameters one is looking for.
      • The gradient of the “misfit” w.r.t. the parameter variables should also be of the same magnitude.³⁾

      ³⁾ This part is not clear to me: “in which units” should one measure the gradient? Is it just dT/da? But what is the “target function” T? The square of the ℒ₂-norm of the misfit? Then this changes as one gets closer to the solution… Go figure again!
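
The general least-squares reason behind scaling rules like these can be sketched in Python. This is an illustration under stated assumptions, not a description of what fit does internally: it uses a hypothetical two-parameter model y = a + b·x (the fit in this ticket has only a1), 100 linearly spaced x values over [1e9:1e13], and the condition number of the 2×2 normal matrix JᵀJ as the measure of difficulty.

```python
import math

def normal_matrix_condition(xs):
    """Condition number of J^T J for y = a + b*x (Jacobian columns: 1 and x)."""
    n = float(len(xs))
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    trace = n + sxx
    det = n * sxx - sx * sx
    lam_max = 0.5 * (trace + math.sqrt(trace * trace - 4.0 * det))
    lam_min = det / lam_max        # avoids cancellation in trace - sqrt(...)
    return lam_max / lam_min

lo, hi, count = 1e9, 1e13, 100
xs = [lo + i * (hi - lo) / (count - 1) for i in range(count)]

raw = normal_matrix_condition(xs)
scaled = normal_matrix_condition([x / hi for x in xs])   # x measured in units of 1e13
print(f"raw x:    cond = {raw:.3e}")
print(f"x / 1e13: cond = {scaled:.3e}")
```

With raw x the condition number is far beyond what double precision can resolve, while dividing x by 1e13 brings it down to a modest value. That is one way to read the "same magnitude" rules above.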

      you can't set log-scaling on the sample axis (no set log t command).

      Is there a reason for this?

      ⁜⁜⁜⁜⁜⁜⁜⁜⁜⁜⁜⁜⁜⁜⁜⁜⁜⁜⁜⁜⁜⁜

      Additionally, when working over your answer(s), I found many issues in the docs. But this message gets too long, so I will try to discuss this later.

       
