From: Thomas Mattison <mattison@ph...>  2011-05-06 23:50:48

On 2011-05-06, at 3:56 PM, Daniel J Sebald wrote:

> On 05/06/2011 04:10 PM, Thomas Mattison wrote:
>>
>> 2. Gnuplot will indeed refuse to fit a constant function to a single
>> data point, or a line to two points. Yes, I hadn't done the
>> experiment. Sorry.
>>
>> However I regard item 2 as a bug, not a feature. It is perfectly well
>> defined to ask the question: what is the chi-square for agreement
>> between data and model, as a function of the parameter values, for
>> these cases? There is a perfectly well defined answer to that
>> question. There is a perfectly well defined point of minimum
>> chi-square, which gives the best-fit parameters. And the curvature of
>> the chi-square curve or surface gives the fit errors. Nothing
>> anomalous happens in the case where the number of data points equals
>> the number of parameters.
>
> It seems like it should.

But it doesn't, from a mathematical point of view.

> From H.B.B.'s last post, if there are only two points and the curve to
> fit is a line (two parameters, first-order equation**) then the fit is
> exact in the sense that the residuals are zero... if no other
> information is given about the data. But that doesn't say anything
> about the measurement "error" in the fit. (I sort of prefer the word
> "uncertainty", but that's just me.) We can only glean information
> about the uncertainty of the fit (i.e., the statistics of the original
> data) if there are additional data points beyond the number of
> parameters.

You are right that there need to be more measurements than parameters in
order to use the chi-square per degree of freedom to infer something
about the errors of the fit INPUTS from the quality of the fit. But in
many cases, the user ALREADY KNOWS what the input data measurement
errors are. He knows the precision of the markings on his ruler, or how
many decimal places his voltmeter has, or square-root-of-N for the bin
contents of a histogram, etc. [I am not discussing "systematic" errors
here, like whether the calibration of your voltmeter is correct.] In
those cases, the "raw" fit errors are the appropriate ones, and the
"rescaled" errors are less appropriate.

> OK, so I'm beginning to see there is this vital piece of information
> about the quality of fit, reflecting the underlying statistics or the
> uncertainty in the data (i.e., how noisy it is). That's what is meant
> by error, right?

The way I would phrase it is: the noise in the input data propagates to
noise in the fit parameters. The "standard errors" of the fit parameters
are the standard deviation of the input data (assumed to be known)
propagated to the standard deviation of the fit parameters.

>> I hope we can all agree that when fitting a line to two data points
>> that have errors, the slope and intercept DO have errors.
>
> Yes, but I don't see how one can estimate the size of that error
> without additional data beyond two data points, unless some
> information is known a priori.

Sure, but when the user has supplied meaningful y-errors on his data, we
have precisely the a priori information that your intuition is telling
you is needed.

> But I'm surprised that no well defined, commonly understood
> terminology has developed for what seems like an important idea in
> the numerical analysis community. Bastian points out that:
>
> * Minuit does not scale
> * Origin offers both options, the default being version dependent
> * SAS and Mathematica default to scaling of errors
>
> That inconsistency is the problem.

To me, that looks like most programs give you a choice. Mathematica may
default to scaling, but it looks like it's optional. I don't have much
knowledge about SAS, but I would expect a big package like that to have
an option buried somewhere.

Minuit comes from the physics community, nuclear and particle physics in
particular. In physics, we tend to have pretty good a priori estimates
for the errors of the measurements going into a fit. We use the
chi-square per DOF as a measure of the (statistical) fit quality.
Physicists would seldom rescale fit parameter errors by sqrt(chisq/DOF).
And since Minuit is normally used in an expert-programming context
instead of an ignorant-GUI-user context, the authors would expect the
user to be able to multiply by sqrt(chisq/DOF) himself.

In bioscience, and even more so in medical and social science, there's
often next to nothing known about the errors of the input measurements a
priori. In such cases, there's not much the user can do except set all
the errors equal (or not tell gnuplot any errors, which has the same
result). Then the "raw" errors from the fit aren't very meaningful, but
the "rescaled" errors have at least some utility. So defaulting to
rescaled makes some sense.

My guess would be that there are more gnuplot fit users who don't know
much about their input data errors than those who know a great deal
about them. So if I were forced to present only one error, I'd probably
present the rescaled error. It will be the "right thing" for users who
don't give errors or give uniform and arbitrary errors. And for the
people who supply accurate errors, if the model really reflects the
data, then chisq/DOF will be close to 1, and the "rescaled" errors will
not be very different, very often, from the "raw" errors. But why not
give both?

Cheers

Prof. Thomas Mattison
Hennings 276, Dept. of Physics and Astronomy
University of British Columbia
6224 Agricultural Road, Vancouver BC V6T 1Z1 CANADA
mattison@...  phone: 604-822-9690  fax: 604-822-5324
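[Editor's illustration] Mattison's "propagated noise" picture can be sketched numerically. The snippet below is not gnuplot's code; it is a minimal weighted least-squares line fit in Python that reports both the "raw" parameter errors, taken from the covariance matrix built with the supplied sigmas, and the "rescaled" errors, multiplied by sqrt(chisq/DOF) in the way the thread describes gnuplot's behaviour.

```python
import numpy as np

def weighted_line_fit(x, y, sigma):
    """Fit y = a + b*x with known per-point errors sigma.
    Returns (params, raw errors, rescaled errors, chisq, dof)."""
    w = 1.0 / sigma**2
    A = np.vstack([np.ones_like(x), x]).T          # design matrix for (a, b)
    N = A.T @ (w[:, None] * A)                     # weighted normal matrix
    cov_raw = np.linalg.inv(N)                     # "raw" covariance matrix
    p = cov_raw @ (A.T @ (w * y))                  # best-fit parameters
    resid = y - A @ p
    chisq = np.sum(w * resid**2)
    dof = len(x) - len(p)                          # measurements - parameters
    err_raw = np.sqrt(np.diag(cov_raw))            # uses supplied sigma as-is
    scale = np.sqrt(chisq / dof) if dof > 0 else np.nan
    err_rescaled = err_raw * scale                 # gnuplot-style rescaling
    return p, err_raw, err_rescaled, chisq, dof
```

For data lying exactly on a line, chisq is zero, so the rescaled errors vanish while the raw errors stay finite; that is precisely the behaviour the thread is arguing about.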
From: Daniel J Sebald <daniel.sebald@ie...>  2011-05-06 22:56:48

On 05/06/2011 04:10 PM, Thomas Mattison wrote:

> I stand corrected about two things:
>
> 1. Gnuplot fits have always done the "right" thing and divided
> chi-square by degrees of freedom rather than measurements. Sorry, I
> haven't looked at what gets printed out in a while; I mostly use my
> own fitting programs these days.
>
> 2. Gnuplot will indeed refuse to fit a constant function to a single
> data point, or a line to two points. Yes, I hadn't done the
> experiment. Sorry.
>
> However I regard item 2 as a bug, not a feature. It is perfectly well
> defined to ask the question: what is the chi-square for agreement
> between data and model, as a function of the parameter values, for
> these cases? There is a perfectly well defined answer to that
> question. There is a perfectly well defined point of minimum
> chi-square, which gives the best-fit parameters. And the curvature of
> the chi-square curve or surface gives the fit errors. Nothing
> anomalous happens in the case where the number of data points equals
> the number of parameters.

It seems like it should. From H.B.B.'s last post, if there are only two
points and the curve to fit is a line (two parameters, first-order
equation**) then the fit is exact in the sense that the residuals are
zero... if no other information is given about the data. But that
doesn't say anything about the measurement "error" in the fit. (I sort
of prefer the word "uncertainty", but that's just me.) We can only
glean information about the uncertainty of the fit (i.e., the
statistics of the original data) if there are additional data points
beyond the number of parameters.

** Polynomial order of the fitting curve is important. For example,
exp() has infinite polynomial order, which has ramifications on
uniqueness.

> I agree that in those two particular cases, it's not necessary to "do
> a fit" to find the parameters; it can easily be done by hand. But
> mathematically the fit is still well-defined.

OK, so I'm beginning to see there is this vital piece of information
about the quality of fit, reflecting the underlying statistics or the
uncertainty in the data (i.e., how noisy it is). That's what is meant
by error, right? In this discussion

http://www.erikburd.org/projects/pitfall/evaluate.html

there is a column for "standard error". Is that what is being referred
to, the standard error typically used in statistical analysis? (There
are the underlying actual measurement errors, but we can't know what
those are because there is uncertainty in the fit.)

> I hope we can all agree that when fitting a line to two data points
> that have errors, the slope and intercept DO have errors.

Yes, but I don't see how one can estimate the size of that error
without additional data beyond two data points, unless some information
is known a priori.

> Is it really so much to ask to have both raw and rescaled errors
> printed?

I'm fine with that idea. But I'm surprised that no well defined,
commonly understood terminology has developed for what seems like an
important idea in the numerical analysis community. Bastian points out
that:

* Minuit does not scale
* Origin offers both options, the default being version dependent
* SAS and Mathematica default to scaling of errors

That inconsistency is the problem.

Dan
From: Thomas Mattison <mattison@ph...>  2011-05-06 21:10:45

I stand corrected about two things:

1. Gnuplot fits have always done the "right" thing and divided
chi-square by degrees of freedom rather than measurements. Sorry, I
haven't looked at what gets printed out in a while; I mostly use my own
fitting programs these days.

2. Gnuplot will indeed refuse to fit a constant function to a single
data point, or a line to two points. Yes, I hadn't done the experiment.
Sorry.

However I regard item 2 as a bug, not a feature. It is perfectly well
defined to ask the question: what is the chi-square for agreement
between data and model, as a function of the parameter values, for
these cases? There is a perfectly well defined answer to that question.
There is a perfectly well defined point of minimum chi-square, which
gives the best-fit parameters. And the curvature of the chi-square
curve or surface gives the fit errors. Nothing anomalous happens in the
case where the number of data points equals the number of parameters.

I agree that in those two particular cases, it's not necessary to "do a
fit" to find the parameters; it can easily be done by hand. But
mathematically the fit is still well-defined.

I hope we can all agree that when fitting a line to two data points
that have errors, the slope and intercept DO have errors. But
calculating them by hand is not as trivial as calculating the slope and
intercept by hand. The easiest way to do that in practice is to run a
fitting program. The "raw" errors from the fit give the answer. But
gnuplot refuses to do that. And the only reason I can see for the
refusal is that the "rescaled" errors wouldn't make sense in that case
(while the raw errors would make sense, and are useful).

Let's say I have several equal-size data sets, described by the same
model, with the same amount of random noise in each. If I fit each data
set, the parameters will vary (within the errors), but they all have
equal amounts of information, so they SHOULD all give the same fit
errors. The "raw" fit errors WILL be the same (neglecting effects from
nonlinearities, which are usually small). But the "rescaled" fit errors
will vary randomly, depending on the chi-square of the fit. For a case
where someone has worked hard to make the input errors meaningful, it's
wrong for the parameter error output to depend on randomness in the
data.

Is it really so much to ask to have both raw and rescaled errors
printed?

Cheers

Prof. Thomas Mattison
University of British Columbia, Dept. of Physics and Astronomy
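[Editor's illustration] The claim that raw errors are reproducible across equal-size data sets while rescaled errors fluctuate can be checked with a small Monte Carlo. The sketch below is not gnuplot code; the fitted "model" is just a constant, whose weighted best fit is the sample mean, so both error flavours have closed forms.

```python
import numpy as np

def constant_fit_errors(n_sets=5, n=10, sigma=1.0, seed=0):
    """Fit a constant C to several equal-size data sets with identical,
    known noise; collect the 'raw' and 'rescaled' errors on C."""
    rng = np.random.default_rng(seed)
    raw, rescaled = [], []
    for _ in range(n_sets):
        y = rng.normal(5.0, sigma, n)             # same model, same noise
        C = y.mean()                              # best-fit constant
        chisq = np.sum((y - C)**2 / sigma**2)
        r = sigma / np.sqrt(n)                    # raw error: same every set
        raw.append(r)
        rescaled.append(r * np.sqrt(chisq / (n - 1)))  # varies with the data
    return raw, rescaled
```

Every data set carries the same amount of information, and the raw error reflects that; the rescaled error inherits the randomness of each set's chi-square, which is Mattison's objection.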
From: Hans-Bernhard Bröker <HBBroeker@t...>  2011-05-06 20:03:50

On 06.05.2011 19:48, Thomas Mattison wrote:

> Consider the case where the "function" to be fit is just a constant:
> we think the y-value of all data points should be the same,
> independent of x-value. Let's call this constant C. Surely we want
> gnuplot to be able to do this "fit" and give a meaningful error on C.

Yes.

> Then consider the special case where there is only a single data
> point: an x-value, a y-value, and a y-error. The fit result should be
> C=y, and the error should be sigma_C = sigma_y.

It shouldn't. The problem here is that by going down to a single data
point, you've constructed a problem with zero degrees of freedom.
That's not a fit; it's just an equation to be evaluated. Using fit for
that is tool abuse.

> The fit should be "perfect" with a chi-square of zero. The chi-square
> divided by the number of measurements is still zero.

But chisq is never divided by the number of measurements. It's divided
by the number of degrees of freedom, i.e. (number of measurements) -
(number of parameters). In the case at hand that's zero, so STDFIT
would be zero divided by zero.

> Internally, gnuplot's fit algorithm knows a number equivalent to
> sigma_C = sigma_y,

No, it doesn't, because it never even starts to work in that case:

gnuplot> fit a '-' u 1:2:3 via a
input data ('e' ends) > 0 10 2
input data ('e' ends) > e
Read 1 points
No data to fit

> but the reported error is that multiplied by chi-square/measurement,
> so the result is zero.

No, it's not. See above.

> It is totally unreasonable to claim that "fitting" the single data
> point gives C with infinite precision, when the data point has a
> finite error.

It's totally unreasonable to use fit on a single data point, period.

> One can make a similar argument about the case of fitting a line (two
> parameters, slope and intercept) to two x-y-yerror data points.

Same problem. Zero degrees of freedom means 'fit' is the wrong tool for
the job.

> Gnuplot's fit will go exactly through both data points, with zero
> chi-square per measurement, so the reported errors on the slope and
> intercept will both be zero.

No, they won't. Because gnuplot won't report _any_ errors in that case.

> But clearly there is a range of slopes and range of intercepts that
> are "within one sigma" of both data points. And internally gnuplot's
> fit algorithm knows those errors on slope and intercept, before it
> multiplies them by zero.

No, it doesn't.

> I would make one small additional change: chi-square should be
> divided by "degrees of freedom" == (measurements minus parameters),
> not just measurements. Do this both for reporting STDFIT and
> rescaling of errors.

What made you believe that's not already what's happening?

> user that the rescaled errors are meaningless. This is much more
> sensible than what gnuplot does now, which is report the errors are
> zero.

Your view of the world might profit from an actual experiment.
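[Editor's illustration] Both sides agree that for two points the line itself can be computed by hand; Mattison's point is that the error propagation can be done by hand too. A sketch of that hand calculation in Python (illustrative only, not gnuplot code): the slope and intercept are exact, and their uncertainties follow from standard propagation of the two y-errors.

```python
import math

def line_through_two_points(x1, y1, s1, x2, y2, s2):
    """Exact line y = a + b*x through two points with y-errors s1, s2,
    plus propagated 1-sigma errors on slope b and intercept a."""
    dx = x2 - x1
    b = (y2 - y1) / dx                     # slope, exact
    a = y1 - b * x1                        # intercept, exact
    sb = math.hypot(s1, s2) / abs(dx)      # from db/dy1 = -1/dx, db/dy2 = 1/dx
    # a = (y1*x2 - y2*x1)/dx, so da/dy1 = x2/dx and da/dy2 = -x1/dx
    sa = math.hypot(s1 * x2 / dx, s2 * x1 / dx)
    return a, b, sa, sb
```

These are exactly the "raw" errors a weighted fit would report for this zero-degrees-of-freedom case; the "rescaled" errors would involve chisq/DOF = 0/0 and are undefined.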
From: Hans-Bernhard Bröker <HBBroeker@t...>  2011-05-06 19:45:18

On 06.05.2011 05:08, Daniel J Sebald wrote:

> On 05/05/2011 04:10 PM, Bastian Märkisch wrote:
> One thing I notice is that Hans-Bernhard uses the term "residuals".

Yes, but only for what's always called that: the difference between
fitted model and data.

> Now, "residuals" is a fairly common definition in fitting. Maybe it
> would have been better in the first place to use "_res" extensions to
> variable names for the unscaled errors and "_err" for the scaled
> errors

I rather much doubt that. The term residual is not applicable to
parameter errors in any meaningful way.

> The argument is made in the post that one is derived from the other
> with a simple scaling;

That argument would be wrong. Parameter errors scale in parallel with
the data errors (or weights, if you prefer), not with the residuals.

> avoid ambiguity. If "error" has some ambiguity in the field, whereas
> "residual" is much less ambiguous, then go with the latter.

"Residual" is unambiguous primarily because it is used for exactly one
purpose. Calling something else by the same name would only break that
unambiguity. That wouldn't be particularly helpful.
From: Daniel J Sebald <daniel.sebald@ie...>  2011-05-06 17:55:51

On 05/05/2011 06:30 PM, Hans-Bernhard Bröker wrote:

> On 05.05.2011 10:52, Daniel J Sebald wrote: [snip]
>> As for the original bug report, unless this is something obvious,
>> perhaps there is a way to illustrate the error with a test case, to
>> ensure the fit is solved correctly.
>
> The demo is dead simple. Pick any fit from the demos or wherever, and
> repeat it with the data errors multiplied by a fixed factor, i.e.
> replace
>
>     fit f(x) 'foo.dat' u 1:2:3 via ...
>
> by
>
>     fit f(x) 'foo.dat' u 1:2:($3*20) via ...
>
> 'fit' will report the same parameter errors, both in the printed
> output and in the saved *_err variables. Only the chisq and STDFIT
> will have shrunk by a factor of 20.
>
> People thinking I made a bad decision here say that the errors on the
> parameters should become 20 times as large in the second case.

Well, this is certainly a valid approach. Often some statistical or
algebraic quantity is independent of scale to reflect its quality or
fundamental nature. Could we create an additional set of variables
"_res" that is scaled as others might want (i.e., residuals)?

Dan
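[Editor's illustration] Bröker's factor-of-20 demo is easy to reproduce outside gnuplot. The Python sketch below (illustrative only) repeats a weighted straight-line fit with the data errors inflated by 20: the "raw" slope error grows by that factor, while the rescaled error, multiplied by sqrt(chisq/DOF), is unchanged, which is the behaviour being discussed.

```python
import numpy as np

def slope_errors(x, y, sigma):
    """Weighted line fit; return (raw, rescaled) 1-sigma error on the slope."""
    w = 1.0 / sigma**2
    A = np.vstack([np.ones_like(x), x]).T
    cov = np.linalg.inv(A.T @ (w[:, None] * A))   # raw covariance matrix
    p = cov @ (A.T @ (w * y))                     # best-fit (intercept, slope)
    chisq = np.sum(w * (y - A @ p)**2)
    raw = np.sqrt(cov[1, 1])
    return raw, raw * np.sqrt(chisq / (len(x) - 2))

x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([0.1, 1.2, 1.9, 3.2, 3.9])
raw1, resc1 = slope_errors(x, y, 0.2 * np.ones(5))         # original errors
raw20, resc20 = slope_errors(x, y, 20 * 0.2 * np.ones(5))  # errors times 20
# raw20 is 20 * raw1; resc20 equals resc1 (chisq shrank by 400)
```

The fitted parameters themselves are identical in both runs, since uniform weights cancel out of the normal equations; only the error estimates respond to the scaling.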
From: Thomas Mattison <mattison@ph...>  2011-05-06 17:48:58

Hi

Consider the case where the "function" to be fit is just a constant: we
think the y-value of all data points should be the same, independent of
x-value. Let's call this constant C. Surely we want gnuplot to be able
to do this "fit" and give a meaningful error on C.

Then consider the special case where there is only a single data point:
an x-value, a y-value, and a y-error. The fit result should be C=y, and
the error should be sigma_C = sigma_y. The fit should be "perfect" with
a chi-square of zero. The chi-square divided by the number of
measurements is still zero. Internally, gnuplot's fit algorithm knows a
number equivalent to sigma_C = sigma_y, but the reported error is that
multiplied by chi-square/measurement, so the result is zero. It is
totally unreasonable to claim that "fitting" the single data point
gives C with infinite precision, when the data point has a finite
error.

One can make a similar argument about the case of fitting a line (two
parameters, slope and intercept) to two x-y-yerror data points.
Gnuplot's fit will go exactly through both data points, with zero
chi-square per measurement, so the reported errors on the slope and
intercept will both be zero. But clearly there is a range of slopes and
a range of intercepts that are "within one sigma" of both data points.
And internally gnuplot's fit algorithm knows those errors on slope and
intercept, before it multiplies them by zero.

Every other fitting program that I have ever used, which takes into
account y-errors, just reports the "internal" error and doesn't divide
it by chi-square/measurement the way gnuplot does. If the y-errors are
known and are "correct", the points will scatter around the fit curve
such that the residual (measurement minus curve) divided by the error
has a Gaussian distribution with mean of zero and sigma of 1.0. If the
"real" y-errors are twice as large as the "assumed" y-errors, the
Gaussian will have sigma of about 2.0, and the chi-square per
measurement will be about 4.0.

However, the user is not required to provide y-errors to gnuplot. In
that case, the errors of the fit parameters are more problematic.
Gnuplot does what every other no-y-errors fitting program does: treat
all the y-errors as being 1.0. The "internal" error calculation is
still done, but one does not expect it to really describe the errors of
the parameters. And one does not expect the "chi-square" per
measurement to be approximately 1.0. But we can use the "chi-square"
per measurement to estimate the y-error, assuming that it's the same
for all measurements. The result is that the y-error is just
sqrt(chi-square/measurement). If we then repeated the fit using those
y-errors, we do expect that the "internal" parameter errors will
describe our knowledge of the parameters. In fact, it's not necessary
to actually repeat the fit. The parameter values would come out exactly
the same, and the parameter errors would just be multiplied by
sqrt(chi-square/measurement). This is in fact what gnuplot's fit
algorithm does. And for the case where no user errors are provided, I
think it's (almost) the most sensible thing to do.

So what should be done? If it were me, I'd report both the "raw" and
the "rescaled" parameter errors on the console. The documentation
should state that if the user has provided errors, the "raw" errors are
the most meaningful, and if the user has not provided errors, the
"rescaled" errors are the most meaningful. This requires essentially no
algorithm changes, just some print statement changes.

Perhaps at the end of the fit, one could print a recommendation to the
user on which error to use, based on whether or not he provided errors,
and on the chi-square per measurement. If no errors were provided, only
the rescaled errors are meaningful. If errors were provided and the
chi-square per measurement is reasonable (less than 2, perhaps), the
raw errors are meaningful. If errors were provided and the chi-square
per measurement is not reasonable, then the raw errors are questionable
and the rescaled errors might be more meaningful.

I would make one small additional change: chi-square should be divided
by "degrees of freedom" == (measurements minus parameters), not just
measurements. Do this both for reporting STDFIT and rescaling of
errors. Dividing by degrees of freedom instead of measurements is
totally standard statistical practice. In the one-parameter fit to one
data point and line-fit to two data points examples I give above,
measurements minus parameters is zero. Ideally the chi-square is also
zero, so ideally the rescaling factor is 0/0 = NaN, which gives the
"correct" result that the rescaled errors are meaningless (in these
cases). In practice the chi-square won't be exactly zero, so the
rescaling factor will be INF, still telling the user that the rescaled
errors are meaningless. This is much more sensible than what gnuplot
does now, which is to report that the errors are zero.

If we were fitting a line to 3 points instead of 2, with errors
supplied, we expect the average raw chi-square to be 1, not 3. If
errors are not supplied, we expect the raw "chi-square" to be not
3*yerror^2, but only 1*yerror^2. Dividing by (measurements -
parameters) gives a significantly different (and more correct) estimate
of the y-errors, and thus of the parameter errors.

Cheers

Prof. Thomas Mattison
University of British Columbia, Dept. of Physics and Astronomy
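[Editor's illustration] The "1, not 3" claim for a line fit to 3 points can be verified by simulation: with correct unit y-errors, the expected chi-square equals the degrees of freedom (measurements minus parameters), here 3 - 2 = 1, not the number of measurements. A quick Monte Carlo sketch in Python (not gnuplot code; model and seed are arbitrary):

```python
import numpy as np

def mean_chisq(n=3, trials=4000, seed=1):
    """Fit a line to n points with unit y-errors, many times over;
    return the average chi-square of the fits."""
    rng = np.random.default_rng(seed)
    x = np.arange(n, dtype=float)
    A = np.vstack([np.ones(n), x]).T        # design matrix for (a, b)
    total = 0.0
    for _ in range(trials):
        y = 2.0 + 0.5 * x + rng.normal(0.0, 1.0, n)   # true line + noise
        p, *_ = np.linalg.lstsq(A, y, rcond=None)      # unweighted = unit errors
        total += np.sum((y - A @ p)**2)
    return total / trials
```

The average comes out near 1.0 for n = 3, so estimating the y-error as sqrt(chisq/measurements) rather than sqrt(chisq/DOF) would understate it by a factor of sqrt(3) in this example.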
From: <plotter@pi...>  2011-05-06 13:55:34

On 05/05/11 17:40, sfeam (Ethan Merritt) wrote:

> On Thursday, 05 May 2011, plotter@... wrote:
>> Any reason that is not in the install instructions in INSTALL?
>
> The CVS source tree contains a file INSTALL for use with the eventual
> distributed release package. The instructions in INSTALL are correct
> for the release package. But if you are building directly from the
> CVS source tree you need to first run the files through autoconf.
> There is a script "./prepare" that does this for you.
>
> Ethan

Allin Cottrell wrote:

> You need to read README.1ST ;)

Hi,

Allin is of course correct. Thanks for pointing to that file:

    If your source is from the CVS repository rather than from a
    release package, the order of commands is

        ./prepare; ./configure; make

So the point is that INSTALL, despite its length, does not tell you how
to install in all cases. There are two files giving install notes with
a paragraph headed "Installation from sources", the two with differing
information.

Is there any pressing reason for this simple phrase about how to
install not being in the file called INSTALL? Why is there information
in README.1ST under 'Installation from sources' that is not in INSTALL
under 'Installation from sources'? Surely the two should be
synchronised, or the paragraph from README.1ST should be merged into
INSTALL.

Thanks for the replies.

regards, Peter.
From: Bastian Märkisch <bmaerkisch@we...>  2011-05-06 10:58:32

For what it's worth, here's the result of a little survey on data
analysis packages:

* Minuit does not scale
* Origin offers both options, the default being version dependent
* SAS and Mathematica default to scaling of errors

The default choice of how to treat errors is somewhat arbitrary and
depends on the problem at hand; I am not arguing about that. For a
given problem only one choice is correct, though. The default for
gnuplot was chosen long ago. All I am proposing is to make this default
user-changeable. Actual name suggestions for that setting are more than
welcome!

IMHO there's no point in printing both numbers, let alone creating
another error variable. As pointed out earlier, the other value can
still be easily obtained by multiplying/dividing by the variable
FIT_STDFIT.

Finally, here's an example of why it is important _not_ to scale fit
errors in combination with "real" data errors: Consider NDF = 10.
According to the ChiSq distribution, ChiSq = 6.74 corresponds to
P = 0.75, with STDFIT = 0.82, whereas ChiSq = 12.55 corresponds to
P = 0.25, with STDFIT = 1.12. It is equally probable to obtain either
ChiSq value, but scaling would tell us the errors should differ by 30%!

In physics, a probability P of the fit below 0.05 or above 0.95 is
typically considered an indication of "something being wrong", i.e.
there's some problem with the data, errors or model. But within that
range the fit is accepted. For NDF = 10, Bernhard's example of
STDFIT = 10 would correspond to P < 10^-200.

Bastian
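[Editor's illustration] Bastian's numbers can be reproduced from the chi-square distribution, reading P as the fit probability Prob(ChiSq > observed), i.e. the survival function. A sketch in Python, assuming SciPy is available (this is not part of gnuplot):

```python
import math
from scipy.stats import chi2

ndf = 10
for P in (0.75, 0.25):
    # isf is the inverse survival function: the ChiSq value exceeded
    # with probability P for ndf degrees of freedom
    chisq = chi2.isf(P, ndf)
    stdfit = math.sqrt(chisq / ndf)
    print(f"P={P}: ChiSq={chisq:.2f}, STDFIT={stdfit:.2f}")
```

Both ChiSq values are equally probable outcomes of a correct fit, yet their STDFIT values (0.82 vs 1.12) differ by roughly 30%, which is the size of the spurious variation that rescaling would inject into the reported errors.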
From: Daniel J Sebald <daniel.sebald@ie...>  2011-05-06 03:09:19

On 05/05/2011 04:10 PM, Bastian Märkisch wrote:

>> But if it is some nuanced detail that initially could be seen as a
>> mistake in coding, then I'd say backward compatibility isn't so much
>> an issue.
>
> I am pretty sure that this was a deliberate choice. The reasoning
> being that as long as the fit is good, FIT_STDFIT is somehow close to
> 1. So it wouldn't hurt too much if "real" errors were given. See
> http://article.gmane.org/gmane.comp.graphics.gnuplot.devel/3737

I don't know about the argument that once the fit isn't so good, it
doesn't matter whether the errors are normalized or not.

One thing I notice is that Hans-Bernhard uses the term "residuals".
Now, "residuals" is a fairly common definition in fitting. Maybe it
would have been better in the first place to use "_res" extensions to
variable names for the unscaled errors and "_err" for the scaled errors
(or maybe "_ferr" for fitting error, with the inherent meaning that
fitting errors are always scaled). The argument is made in the post
that one is derived from the other with a simple scaling; a scaling
which is made available to the user, so redundant information is given.
True, but I don't know if minimal representation is that important. If
we're talking minimal basis or some linear algebra concept, sure. But
my main point in all this is to avoid ambiguity. If "error" has some
ambiguity in the field, whereas "residual" is much less ambiguous, then
go with the latter.

>> My fear with this is that a user could run the fit, get the results
>> and significantly misinterpret what they mean by assuming errors
>> were expressed as scaled or unscaled.
>
> That is already the case. Most physicists I know incorrectly assume
> that gnuplot reports "unscaled" errors.

That's not good.

> Why not give them means to get what they expect?

Yes, naturally. But the most coherent way to do that is the question,
right? We want gnuplot to be easy to use, not arcane.

Dan