From: Charles R H. <cha...@gm...> - 2006-10-12 20:51:09
|
On 10/12/06, Greg Willden <gre...@gm...> wrote: > > Hi All, > I'm using numpy.polyfit and it is giving me some really strange results > when evaluated near 0. So I compared it with polyfit in matplotlib and the > code and docstrings are nearly identical. However the slight differences in > the code make a huge difference on my data. > > Here is the Numpy version with line numbers added: > 1 x = NX.asarray(x)+0. > 2 y = NX.asarray(y)+0. > 3 y = NX.reshape(y, (len(y), 1)) > 4 X = vander(x, N+1) > 5 c, resids, rank, s = _lstsq(X, y) > 6 c.shape = (N+1,) > 7 return c > > > And here is the MPL version with line numbers added: > 1 x = asarray(x)+0. > 2 y = asarray(y)+0. > 3 y = reshape(y, (len(y),1)) > 4 X = Matrix(vander(x, N+1)) > 5 Xt = Matrix(transpose(X)) > 6 c = array(linear_algebra.inverse(Xt*X)*Xt*y) # convert back to array > 7 c.shape = (N+1,) > 8 return c > > > So lines 1-4 are basically the same. > The MPL version produces a much more stable representation of my data > > Can someone please comment on the differences? > Should these two be exactly the same? > If not, what are the advantages to each algorithm? I'm guessing that the rcond number in the lstsq version (default 1e-10) is the difference. Generally the lstsq version should work better than the MPL version because at*a is not as well conditioned and vandermonde matrices are notoriously ill conditioned anyway for higher degree fits. It would help if you attached the data files in ascii form unless they happen to contain thousands of data items. Just the x will suffice, zip it up if you have to. Chuck |