From: Daniel J S. <dan...@ie...> - 2011-05-06 22:56:48
On 05/06/2011 04:10 PM, Thomas Mattison wrote:
> I stand corrected about two things:
>
> 1. Gnuplot fits have always done the "right" thing and divided
> chisquare by degrees of freedom rather than measurements. Sorry, I
> haven't looked at what gets printed out in a while; I mostly use my
> own fitting programs these days.
>
> 2. Gnuplot will indeed refuse to fit a constant function to a single
> data point, or a line to two points. Yes, I hadn't done the
> experiment. Sorry.
>
> However I regard item 2 as a bug not a feature. It is perfectly well
> defined to ask the question, what is the chisquare for agreement
> between data and model, as a function of the parameter values, for
> these cases. There is a perfectly well defined answer to that
> question. There is a perfectly well defined point of minimum
> chisquare, which gives the best fit parameters. And the curvature of
> the chisquare curve or surface gives the fit errors. Nothing
> anomalous happens in the case where the number of data points equals
> the number of parameters.

It seems like it should. From H.B.B.'s last post, if there are only two points and the curve to fit is a line (two parameters, first order equation**) then the fit is exact in the sense that the residuals are zero...if no other information is given about the data. But that doesn't say anything about the measurement "error" in the fit. (I sort of prefer the word "uncertainty", but that's just me.) We can only glean information about the uncertainty of the fit (i.e., the statistics of the original data) if there are additional data points beyond the number of parameters.

** Polynomial order of the fitting curve is important. For example, exp() has infinite polynomial order, which has ramifications on uniqueness.

> I agree that in those two particular cases, it's not necessary to "do
> a fit" to find the parameters; it can easily be done by hand. But
> mathematically the fit is still well-defined.
OK, so I'm beginning to see there is this vital piece of information about the quality of fit, reflecting the underlying statistics or the uncertainty in the data (i.e., how noisy it is). That's what is meant by error, right? In this discussion

http://www.erikburd.org/projects/pitfall/evaluate.html

there is a column for "standard error". Is that what is being referred to, the standard error typically used in statistical analysis? (There are the underlying actual measurement errors, but we can't know what those are because there is uncertainty in the fit.)

> I hope we can all agree that when fitting a line to two data points
> that have errors, the slope and intercept DO have errors.

Yes, but I don't see how one can estimate the size of that error without additional data beyond two data points, unless some information is known a priori.

> Is it really so much to ask to have both raw and rescaled errors
> printed?

I'm fine with that idea. But I'm surprised that no well-defined, commonly understood terminology has developed for what seems like an important idea in the numerical analysis community.

Bastian points out that:

* Minuit does not scale
* Origin offers both options, the default being version dependent
* SAS and Mathematica default to scaling of errors

That inconsistency is the problem.

Dan
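
P.S. To make the raw-versus-rescaled distinction concrete, here is a minimal sketch in Python/NumPy. It is not how gnuplot implements its fit; the function name line_fit and the data values are made up for illustration, and it assumes the per-point measurement errors (sigma) are known a priori. With two points and two parameters the raw errors come straight from the curvature of chisquare, while the rescaled errors need chisquare divided by the degrees of freedom, which is undefined when there are none.

```python
import numpy as np

def line_fit(x, y, sigma):
    """Weighted least-squares fit of y = a*x + b (hypothetical helper).

    sigma holds the a priori measurement error of each point.
    Returns (params, raw_err, scaled_err, chi2, dof).
    """
    x, y, sigma = (np.asarray(v, dtype=float) for v in (x, y, sigma))
    # Design matrix and data, each row divided by that point's sigma
    A = np.column_stack([x, np.ones_like(x)]) / sigma[:, None]
    b = y / sigma
    cov = np.linalg.inv(A.T @ A)      # inverse curvature of chisquare
    params = cov @ (A.T @ b)          # best-fit slope and intercept
    chi2 = float(np.sum((A @ params - b) ** 2))
    dof = len(x) - len(params)
    raw_err = np.sqrt(np.diag(cov))   # "raw" errors, rely on sigma only
    # "Rescaled" errors multiply by sqrt(chi2/dof): undefined for dof == 0
    scaled_err = raw_err * np.sqrt(chi2 / dof) if dof > 0 else None
    return params, raw_err, scaled_err, chi2, dof

# Two points, two parameters: exact fit, zero degrees of freedom
p2, raw2, scaled2, chi2_2, dof2 = line_fit([0, 1], [1, 3], [0.1, 0.1])

# Three points: one degree of freedom, so rescaling becomes possible
p3, raw3, scaled3, chi2_3, dof3 = line_fit([0, 1, 2], [1.0, 3.1, 4.9],
                                           [0.1, 0.1, 0.1])
```

In the two-point case raw2 is still a finite pair of numbers (it only depends on sigma and the x values), but scaled2 is None, which is the situation where a program that only reports rescaled errors has nothing to print.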