I am trying to fit the following set of data points (columns 1 vs 2). I tried the function
g(x)=p-q*exp(-((x-r)**(2))/(2*s**(2)))
and it fits quite well in the middle region, where I am particularly interested. But looking at the data and the fitting function together, one can intuitively guess that adding a simple
constant times x**(2)
term can make the fit slightly better on the sides too. I did that and got slightly better results. Lastly, I thought that further adding another simple
constant times x
might help a bit more. I saved this best-fitting function, which is:
So far everything is perfect, as long as I start with the first function I mentioned at the top, then add an x squared term and finally a linear x term, step by step. But when I launch a new gnuplot window and try to fit the data directly with the general function at the end of the paragraph above, I get random curves. No fit at all.
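For readers following along, here is a sketch of what the direct attempt presumably looks like. The general function itself is not shown in this post, so the form below is an assumption pieced together from the parameter names p, q, r, s, t, u used in the fit commands elsewhere in this thread, with t on the quadratic term and u on the linear one.

```gnuplot
# Assumed general model (reconstructed, not quoted from the post):
# the Gaussian from the top plus a quadratic and a linear term.
g(x) = p - q*exp(-((x-r)**2)/(2*s**2)) + t*x**2 + u*x

# In a fresh session none of the parameters is defined yet, so gnuplot
# creates each one with an initial value of 1.0 before it starts fitting.
fit g(x) 'data.txt' u 1:2 via p,q,r,s,t,u
```

With all six parameters starting at 1.0, the fit begins far from any sensible solution, which matches the "random curves" described above.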
I really don't think that a square term is an intuitive assumption for the shape of that background curve. It's quite a bit closer to being linear. And fit agrees with me on that, too.
t = 1; u=0 ; fit g(x) 'data.txt' u 1:2 via p,q,r,s,t
gives me a WSSR/ndf of 2372, whereas
t = 0; u=1 ; fit g(x) 'data.txt' u 1:2 via p,q,r,s,u
yields 224.4. That's a factor of ten better! Allowing both t and u to vary yields only a slightly better result of 176.7, and a huge error on 't', which tells me that there isn't really much of a parabolic background.
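A rough sketch of how such a comparison could be scripted, so the three WSSR/ndf values end up side by side. It assumes the general form of g(x) written out below (inferred from the parameter names, not quoted from the question) and that p, q, r, s already hold reasonable starting values. FIT_WSSR and FIT_NDF are the variables gnuplot sets after each fit in reasonably recent versions.

```gnuplot
# Assumed model: Gaussian plus quadratic (t) and linear (u) background.
g(x) = p - q*exp(-((x-r)**2)/(2*s**2)) + t*x**2 + u*x

# Quadratic background only: u stays fixed at 0 because it is not in the via list.
t = 1; u = 0
fit g(x) 'data.txt' u 1:2 via p,q,r,s,t
print "quadratic only: WSSR/ndf = ", FIT_WSSR/FIT_NDF

# Linear background only: t stays fixed at 0.
# Note that p,q,r,s carry over from the previous fit as starting values.
t = 0; u = 1
fit g(x) 'data.txt' u 1:2 via p,q,r,s,u
print "linear only:    WSSR/ndf = ", FIT_WSSR/FIT_NDF

# Both terms free; t gets a small non-zero start (arbitrary choice),
# since gnuplot complains about parameters that start at exactly zero.
t = 1e-6
fit g(x) 'data.txt' u 1:2 via p,q,r,s,t,u
print "both terms:     WSSR/ndf = ", FIT_WSSR/FIT_NDF
```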
> when I launch a new gnuplot window, and try to fit the data directly with the general function at the end of paragraph above, I get random curves. No fit at all.
> Could someone please look into this mystery?
That's no mystery at all. Doing a 6-parameter fit with all the parameters starting way off the final result will hardly ever work. fit needs at least some rough idea where to look. One approach that often works is to fit the background first, then the Gaussian along with it:
fit p + u*x 'data.txt' u 1:2 via p, u
t = 0; fit g(x) 'data.txt' u 1:2 via p,q,r,s,u
In the case at hand, that's not enough, either, because the amplitude of the Gaussian isn't found. Pre-loading 'q' to a value of about 100.0 does make it work, though.
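Put together, the staged approach might look like the sketch below. The definition of g(x) is an assumption reconstructed from the parameter names in this thread. The final step that releases t is an optional extra, and the tiny starting value for t is an arbitrary choice, not something taken from the data.

```gnuplot
# Assumed model: Gaussian plus quadratic (t) and linear (u) background.
g(x) = p - q*exp(-((x-r)**2)/(2*s**2)) + t*x**2 + u*x

# Step 1: fit only the linear background to get rough values for p and u.
fit p + u*x 'data.txt' u 1:2 via p,u

# Step 2: pre-load the Gaussian amplitude and keep the quadratic term off.
# r and s are still undefined here, so gnuplot starts them at 1.0.
q = 100.0
t = 0
fit g(x) 'data.txt' u 1:2 via p,q,r,s,u

# Step 3 (optional): release t too, now that everything else is close.
# gnuplot warns about parameters starting at exactly zero, hence the tiny value.
t = 1e-6
fit g(x) 'data.txt' u 1:2 via p,q,r,s,t,u
```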
Thank you for explaining...:) I shall try to see if I can provide slightly better initial values of parameters, although it is difficult for my case.
I really want to keep both t and u. For this data, u seems to be more important than t, but in some other data, t seems to dominate. The script is supposed to be generic and that is why I want to implement both.
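For a generic script, one possible way to get rough starting values without hand-tuning is gnuplot's stats command. This is only a sketch under several assumptions: g(x) has the general form written below (inferred from the parameter names), the feature is a dip (so it sits near the minimum of the data), and a tenth of the x range is an acceptable first guess for the width. None of these come from the data file itself.

```gnuplot
# Assumed model: Gaussian plus quadratic (t) and linear (u) background.
g(x) = p - q*exp(-((x-r)**2)/(2*s**2)) + t*x**2 + u*x

# Collect summary statistics of columns 1 and 2 without printing them.
stats 'data.txt' u 1:2 nooutput

# Linear background guess from the built-in linear regression.
p = STATS_intercept
u = STATS_slope

# Gaussian guesses: centre at the x of the lowest point, amplitude from the
# y range, width as a fraction of the x range (the factor 10 is arbitrary).
r = STATS_pos_min_y
q = STATS_max_y - STATS_min_y
s = (STATS_max_x - STATS_min_x)/10.0

# Fit with the linear background first, then release the quadratic term.
t = 0
fit g(x) 'data.txt' u 1:2 via p,q,r,s,u
t = 1e-6
fit g(x) 'data.txt' u 1:2 via p,q,r,s,t,u
```

If the feature is a peak rather than a dip, STATS_pos_max_y and a negative q would be the corresponding guesses.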
Hello,
The general formula for this function is
Last edit: Hans-Bernhard Broeker 2016-04-29
I am attaching the data file for convenience.