
Discrepancy in fitting data; gnuplot 4.6.6

2016-04-29
2016-05-01
  • Suraj Semwal

    Suraj Semwal - 2016-04-29

    Hello,

    I am trying to fit the following set of data points (column 1 vs. column 2). I tried the function

    g(x)=p-q*exp(-((x-r)**(2))/(2*s**(2)))
    

    and it fits quite well in the middle region, where I am particularly interested. But looking at the data and the fitting function together, one can intuitively guess that adding a simple

    constant times x**(2)
    

    term can make the fit slightly better on the sides too. I did that and got slightly better results. Lastly, I thought that further adding another simple

    constant times x
    

    might help a bit more. I saved this best-fitting function, which is:

    g(x)=289.050096211176-265.805373922702*exp(-((x+0.0550971088675056)**(2))/(2*(-0.347862087277803)**(2)))+(-8.31769085388354)*x**(2)+(-34.722389941881)*x
    

    The general formula for this function is

    g(x)=p-q*exp(-((x-r)**(2))/(2*s**(2)))+t*x**(2)+u*x
    

    So far everything is perfect, as long as I start with the first function I mentioned at the top, then add an x squared term and finally a linear x term, step by step. But when I launch a new gnuplot window and try to fit the data directly with the general function given just above, I get random curves. No fit at all.

    Could someone please look into this mystery?

         -2.00000      322.000      1589.00      1589.00
         -1.80000      323.000      1552.00      1552.00
         -1.60000      303.000      1567.00      1567.00
         -1.40000      324.000      1478.00      1478.00
         -1.20000      340.000      1573.00      1573.00
         -1.00000      322.000      1433.00      1433.00
        -0.800000      280.000      1433.00      1433.00
        -0.600000      232.000      1244.00      1244.00
        -0.400000      138.000      932.000      932.000
        -0.200000      45.0000      442.000      442.000
     1.49012e-007      38.0000      331.000      331.000
         0.200000      71.0000      495.000      495.000
         0.400000      163.000      949.000      949.000
         0.600000      217.000      1192.00      1192.00
         0.800000      252.000      1279.00      1279.00
          1.00000      229.000      1239.00      1239.00
          1.20000      239.000      1231.00      1231.00
          1.40000      197.000      1088.00      1088.00
          1.60000      213.000      1183.00      1183.00
          1.80000      210.000      1114.00      1114.00
          2.00000      197.000      1094.00      1094.00
    
     

    Last edit: Hans-Bernhard Broeker 2016-04-29
  • Suraj Semwal

    Suraj Semwal - 2016-04-29

    I am attaching the data file for convenience.

     
  • Hans-Bernhard Broeker

    I really don't think that a square term is an intuitive assumption for the shape of that background curve. It's quite a bit closer to being linear. And fit agrees with me on that, too.

     t = 1; u=0 ; fit g(x) 'data.txt' u 1:2 via p,q,r,s,t
    

    gives me a WSSR/ndf of 2372, whereas

    t = 0; u=1 ; fit g(x) 'data.txt' u 1:2 via p,q,r,s,u  
    

    yields 224.4. That's a factor of ten better! Allowing both t and u to vary yields only a slightly better result of 176.7, and a huge error on 't', which tells me that there isn't really much of a parabolic background.
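
    The comparison above can be scripted so that both candidate backgrounds start from the same point. This is only a sketch: the starting values for p, q, r and s are assumptions read off the data by eye, and it relies on fit setting the variables FIT_WSSR and FIT_NDF after each run, which should hold for 4.6 but is worth checking with `show variables FIT`. `using` is spelled out to avoid confusion with the parameter u.

    ```gnuplot
    # General model from the thread
    g(x) = p - q*exp(-((x-r)**2)/(2*s**2)) + t*x**2 + u*x

    # Candidate 1: parabolic background only (u held at 0)
    p = 300; q = 250; r = 0; s = 0.3    # assumed starting values
    t = 1; u = 0
    fit g(x) 'data.txt' using 1:2 via p, q, r, s, t
    wssr_ndf_quad = FIT_WSSR / FIT_NDF

    # Candidate 2: linear background only (t held at 0), same starting point
    p = 300; q = 250; r = 0; s = 0.3
    t = 0; u = 1
    fit g(x) 'data.txt' using 1:2 via p, q, r, s, u
    wssr_ndf_lin = FIT_WSSR / FIT_NDF

    print "WSSR/ndf, quadratic background: ", wssr_ndf_quad
    print "WSSR/ndf, linear background:    ", wssr_ndf_lin
    ```

    The smaller value points to the better background model; the exact numbers depend on the starting values and need not reproduce the ones quoted above.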

    > when I launch a new gnuplot window, and try to fit the data directly with the general function at the end of paragraph above, I get random curves. No fit at all.
    >
    > Could someone please look into this mystery?

    That's no mystery at all. Doing a 6-parameter fit with all the parameters starting way off the final result will hardly ever work. fit needs at least some rough idea where to look. One approach that often works is to fit the background first, then the Gaussian along with it:

    fit p + u*x 'data.txt' u 1:2 via p, u
    t = 0; fit g(x) 'data.txt' u 1:2 via p,q,r,s,u
    

    In the case at hand, that's not enough, either, because the amplitude of the Gaussian isn't found. Pre-loading 'q' to a value of about 100.0 does make it work, though.
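
    Put together, the staged approach might look like the sketch below. Only q = 100.0 comes from the advice above; the starting values for p, u, r and s and the optional third stage are assumptions added for illustration, so adjust them to the data at hand.

    ```gnuplot
    # General model from the thread
    g(x) = p - q*exp(-((x-r)**2)/(2*s**2)) + t*x**2 + u*x

    # Stage 1: fit the linear background alone
    p = 250; u = -30          # assumed rough guesses
    fit p + u*x 'data.txt' using 1:2 via p, u

    # Stage 2: add the Gaussian dip, keeping the quadratic term switched off
    t = 0
    q = 100.0                 # pre-loaded amplitude, as suggested above
    r = 0.0; s = 0.3          # assumed: dip near x = 0, a few tenths wide
    fit g(x) 'data.txt' using 1:2 via p, q, r, s, u

    # Stage 3 (optional): release t as well, starting from the stage 2 result
    fit g(x) 'data.txt' using 1:2 via p, q, r, s, t, u

    # Check the result by eye
    plot 'data.txt' using 1:2 with points, g(x)
    ```

    Each stage starts from the parameters the previous one left behind, which is what gives the final 6-parameter fit a sensible place to start.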

     
  • Suraj Semwal

    Suraj Semwal - 2016-05-01

    Thank you for explaining...:) I shall try to see if I can provide slightly better initial values for the parameters, although that is difficult in my case.
    I really want to keep both t and u: for this data, u seems to be more important than t, but in some other data, t seems to dominate. The script is supposed to be generic, and that is why I want to implement both.
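
    For a generic script, one option is to derive the starting values from the data itself with the stats command. The following is a rough sketch, not a tested recipe: the heuristics (baseline at the largest y, dip depth as the y range, width as a tenth of the x range) are assumptions, and the variable STATS_pos_min_y may not be available in every 4.6 build, so check `show variables STATS` first.

    ```gnuplot
    # General model from the thread
    g(x) = p - q*exp(-((x-r)**2)/(2*s**2)) + t*x**2 + u*x

    # Collect summary statistics for columns 1 and 2 without printing them
    stats 'data.txt' using 1:2 nooutput

    p = STATS_max_y                          # baseline near the largest y value
    q = STATS_max_y - STATS_min_y            # dip depth taken as the y range
    r = STATS_pos_min_y                      # x position of the smallest y value
    s = (STATS_max_x - STATS_min_x) / 10.0   # assumed width: a tenth of the x range
    t = 0                                    # start without curvature
    u = STATS_slope                          # slope of a straight line through all points

    fit g(x) 'data.txt' using 1:2 via p, q, r, s, t, u
    ```

    If the fit still wanders off, fitting in stages (background first, then the Gaussian, then t) from these starting values is the safer route.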

     
