From: Alexandre <Ale...@lo...> - 2002-04-29 13:19:48
On Mon, Apr 29, 2002 at 03:13:44AM -0700, Jasper Phillips wrote:
> I'm helping my wife with programming for her economics thesis, which needs
> to calculate a "Multiple Linear Regression" on her data.
>
> Does anyone know of any (preferably though not necessarily free) software
> that can do this? I'm working in Python, but not limited to it as I
> can relatively freely access other languages.
>
> I'm still looking for a library written in Python, but haven't had any luck.

I'm helping my wife with her History PhD, and have to deal with similar
stuff. I found R to be a very useful environment for statistical
computations. R is a free software clone of S-plus, which is to statistics
what Matlab is to linear algebra and automation.

Pros:
- programming environment, with a high level programming language
- extensive statistical and linalg library (using C and FORTRAN code)
- lots of third party code available, covering a very wide range of
  situations
- Python bindings available if you don't want to learn the Scheme-like
  language
- Tons of documentation available
- Excellent support through the mailing lists
- GPL'd
- Tons of ways to import data (ranging from CSV files to ODBC queries)
- 2 printed books available, at Springer Verlag
- postscript, png, wmf, X outputs, with precise control of the layout of
  the graphs and figures available for a nice colourful thesis

Cons:
- the language can be a bit weird at times (it took me some time to get
  used to '.' being used instead of '_' and vice versa in the scoping and
  variable naming), but you can use Python to script R, thanks to RPython
- it's quite a big piece of code, with a rather steep learning curve, and
  you need time to get inside it
- the documentation is aimed at professional statisticians. I had to dig
  back into my statistics courses and buy a couple of books on the topic
  for the software to become really useful. Asking newbie statistician
  questions on the r-help mailing list is off-topic
- the Springer Verlag books are very expensive (Modern Applied Statistics
  with S-plus costs something like 70 euros), but they are great

So you have a powerful tool available at your fingertips, designed to do
precisely what you need. I think it's worth taking the time to look at it
carefully. The more I get to understand the topic, the more ideas I get for
new ways of exploring the data of my wife's PhD.

Alexandre Fayolle
--
LOGILAB, Paris (France).
http://www.logilab.com http://www.logilab.fr http://www.logilab.org
Narval, the first software agent available as free software (GPL).