From: David M. C. <co...@ph...> - 2006-06-15 03:13:30
|
After working with them for a while, I'm going to go on record and say that I prefer the long names from Numeric and numarray (like linear_least_squares, inverse_real_fft, etc.), as opposed to the short names now used by default in numpy (lstsq, irefft, etc.). I know you can get the long names from numpy.dft.old, numpy.linalg.old, etc., but I think the long names are better defaults. Abbreviations aren't necessarily unique (quick! what does eig() return by default?), and aren't necessarily obvious. A Google search for irfft vs. irefft for instance turns up only the numpy code as (English) matches for irefft, while irfft is much more common. Also, Numeric and numarray compatibility is increased by using the long names: those two don't have the short ones. Fitting names into 6 characters went out of style decades ago. (I think MS-BASIC running under CP/M on my Rainbow 100 had a restriction like that!) My 2 cents... -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |co...@ph... |
From: Scott R. <sr...@nr...> - 2006-06-15 03:21:05
|
I'll add my 2 cents to this and agree with David. Arguments about how short names are important for interactive work are pretty bogus given the beauty of modern tab-completion. And I'm not sure what other arguments there are... Scott -- Scott M. Ransom Address: NRAO Phone: (434) 296-0320 520 Edgemont Rd. email: sr...@nr... Charlottesville, VA 22903 USA GPG Fingerprint: 06A9 9553 78BE 16DB 407B FFCA 9BFA B6FF FFD3 2989 |
From: Sasha <nd...@ma...> - 2006-06-15 03:46:30
|
On 6/14/06, David M. Cooke <co...@ph...> wrote: > After working with them for a while, I'm going to go on record and say that I > prefer the long names from Numeric and numarray (like linear_least_squares, > inverse_real_fft, etc.), as opposed to the short names now used by default in > numpy (lstsq, irefft, etc.). I know you can get the long names from > numpy.dft.old, numpy.linalg.old, etc., but I think the long names are better > defaults. > I agree in spirit, but note that inverse_real_fft is still short for inverse_real_fast_fourier_transform. Presumably, fft is a proper noun in many people's vocabularies, but so may be lstsq depending on who you ask. > Abbreviations aren't necessarily unique (quick! what does eig() return by > default?), and aren't necessarily obvious. A Google search for irfft vs. > irefft for instance turns up only the numpy code as (English) matches for > irefft, while irfft is much more common. > Short names have one important advantage in scientific languages: they look good in expressions. What is easier to understand: hyperbolic_tangent(x) = hyperbolic_sinus(x)/hyperbolic_cosinus(x) or tanh(x) = sinh(x)/cosh(x) ? I am playing devil's advocate here a little because personally, I always recommend the following as a compromise: sinh = hyperbolic_sinus ... tanh(x) = sinh(x)/cosh(x) But the next question is where to put "sinh = hyperbolic_sinus": right before the expression using sinh? at the top of the module (import hyperbolic_sinus as sinh)? in the math library? If you pick the last option, do you need hyperbolic_sinus to begin with? If you pick any other option, how do you prevent others from writing sh = hyperbolic_sinus instead of sinh? > Also, Numeric and numarray compatibility is increased by using the long > names: those two don't have the short ones. > > Fitting names into 6 characters went out of style decades ago. (I think > MS-BASIC running under CP/M on my Rainbow 100 had a restriction like that!) > Short names are still popular in scientific programming: <http://www.nsl.com/papers/style.pdf>. I am still +1 for keeping linear_least_squares and inverse_real_fft, but not just because abbreviations are bad as such - if an established acronym such as fft exists we should be free to use it. |
From: David M. C. <co...@ph...> - 2006-06-16 05:54:41
|
On Wed, Jun 14, 2006 at 11:46:27PM -0400, Sasha wrote: > On 6/14/06, David M. Cooke <co...@ph...> wrote: > > After working with them for a while, I'm going to go on record and say that I > > prefer the long names from Numeric and numarray (like linear_least_squares, > > inverse_real_fft, etc.), as opposed to the short names now used by default in > > numpy (lstsq, irefft, etc.). I know you can get the long names from > > numpy.dft.old, numpy.linalg.old, etc., but I think the long names are better > > defaults. > > > > I agree in spirit, but note that inverse_real_fft is still short for > inverse_real_fast_fourier_transform. Presumably, fft is a proper noun > in many people's vocabularies, but so may be lstsq depending on who you > ask. I say "FFT", but I don't say "lstsq". I can find "FFT" in the index of a book of algorithms, but not "lstsq" (unless it was a specific implementation). Those are my two guiding ideas for what makes a good name here. > I am playing devil's advocate here a little because personally, I > always recommend the following as a compromise: > > sinh = hyperbolic_sinus > ... > tanh(x) = sinh(x)/cosh(x) > > But the next question is where to put "sinh = hyperbolic_sinus": right > before the expression using sinh? at the top of the module (import > hyperbolic_sinus as sinh)? in the math library? If you pick the last > option, do you need hyperbolic_sinus to begin with? If you pick any > other option, how do you prevent others from writing sh = > hyperbolic_sinus instead of sinh? Pish. By the same reasoning, we don't need the number 2: we can write it as the successor of the successor of the additive identity :-) > > Also, Numeric and numarray compatibility is increased by using the long > > names: those two don't have the short ones. > > > > Fitting names into 6 characters went out of style decades ago. (I think > > MS-BASIC running under CP/M on my Rainbow 100 had a restriction like that!) > > > Short names are still popular in scientific programming: > <http://www.nsl.com/papers/style.pdf>. That's 11 years old. The web was only a few years old at that time! There's been much work done on what makes a good programming style (Steve McConnell's "Code Complete" for instance is a good start). > I am still +1 for keeping linear_least_squares and inverse_real_fft, > but not just because abbreviations are bad as such - if an established > acronym such as fft exists we should be free to use it. Ok, in summary, I'm seeing a bunch of "yes, long names please", but only your devil's advocate stance for no (and +1 for real). I see that Travis fixed the real fft names back to 'irfft' and friends. So, concrete proposal time: - go back to the long names in numpy.linalg (linear_least_squares, eigenvalues, etc. -- those defined in numpy.linalg.old) - of the new names, I could see keeping 'det' and 'svd': those are commonly used, although maybe 'SVD' instead? - anybody got a better name than Heigenvalues? That H looks weird at the beginning. - for numpy.dft, use the old names again. I could probably be persuaded that 'rfft' is ok. 'hfft' for the Hermite FFT is right out. - numpy.random is the other "old package replacement", but it's fine (and better). -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |co...@ph... |
From: Paul D. <pfd...@gm...> - 2006-06-15 04:47:24
|
Bertrand Meyer has pointed out that abbreviations are usually a bad idea. The problem is that abbreviations are not unique so you can't guess what they are. Whereas (modulo some library-wide conventions about names) linearLeastSquares or the like is unique. At the very least you're more likely to get it right. Any python user can abbreviate anything they like any way they like for interactive work. And yes, I think FFT is a name. (:-> Exception for that. |
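The point above, that any Python user can abbreviate locally, can be sketched as follows, reusing the sinh example from earlier in the thread. This is only an illustration: the long names hyperbolic_sinus and hyperbolic_cosinus are hypothetical (neither Python's math module nor numpy provides them) and are defined here just so the snippet runs.

```python
import math


# Hypothetical long names, standing in for what a library might export.
def hyperbolic_sinus(x):
    return math.sinh(x)


def hyperbolic_cosinus(x):
    return math.cosh(x)


# User-side abbreviations: bound once, near the top of the script or session.
sinh = hyperbolic_sinus
cosh = hyperbolic_cosinus


def tanh(x):
    # Reads like the textbook identity tanh(x) = sinh(x)/cosh(x).
    return sinh(x) / cosh(x)
```

The aliases are plain name bindings, so sinh and hyperbolic_sinus refer to the same function object; nothing is copied or wrapped.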
From: Scott R. <sr...@nr...> - 2006-06-15 04:53:06
|
On Wed, Jun 14, 2006 at 09:47:20PM -0700, Paul Dubois wrote: > And yes, I think FFT is a name. (:-> Exception for that. I agree. As are sinh, cosh, tanh, sinc, exp, log10 and various other very commonly used (and not only in programming) names. lstsq, eig, irefft, etc are not. Scott -- Scott M. Ransom Address: NRAO Phone: (434) 296-0320 520 Edgemont Rd. email: sr...@nr... Charlottesville, VA 22903 USA GPG Fingerprint: 06A9 9553 78BE 16DB 407B FFCA 9BFA B6FF FFD3 2989 |
From: Alexander B. <ale...@gm...> - 2006-06-15 13:15:58
|
On 6/15/06, Paul Dubois <pfd...@gm...> wrote: > And yes, I think FFT is a name. (:-> Exception for that. There are more exceptions that Numeric is not taking advantage of: equal, less, greater, ... -> eq, lt, gt, ... inverse, generalized_inverse -> inv, pinv. In my view it is more important that code is easy to read rather than easy to write. Interactive users will disagree, but in programming you write once and read/edit forever :). Again, there is no defense for abbreviating linear_least_squares because it is unlikely to appear in an expression and waste valuable horizontal space. Contracting generalized_inverse is appropriate and numpy does the right thing in this case. The eig.., svd and cholesky choice of names is unfortunate because three different abbreviation schemes are used: first syllable, acronym and first word. I would say: when in doubt spell it in full. |
From: Sven S. <sve...@gm...> - 2006-06-16 08:44:01
|
Alexander Belopolsky wrote: > In my view it is more important that code is easy to read rather than > easy to write. Interactive users will disagree, but in programming you > write once and read/edit forever :). The insight about this disagreement imho suggests a compromise (or call it a dual solution): Have verbose names, but also have good default abbreviations for those who prefer them. It would be unfortunate if numpy users were required to cook up their own abbreviations if they wanted to, because 1. it adds overhead, and 2. it would make other people's code more difficult to read. > > Again, there is no defense for abbreviating linear_least_squares > because it is unlikely to appear in an expression and waste valuable > horizontal space. not true imho; btw, I would suggest "ols" (ordinary least squares), which is in every textbook. Cheers, Sven |
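A rough sketch of the "dual solution" described above, in which the library itself ships one verbose name plus one recommended abbreviation. The function and its pure-Python straight-line fit are illustrative assumptions, not numpy's actual API; ordinary_least_squares and ols are hypothetical names taken from the suggestion in the message.

```python
def ordinary_least_squares(xs, ys):
    """Fit y = intercept + slope * x; return (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-deviations about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# The single recommended abbreviation, defined next to the long name
# so that users do not each invent their own.
ols = ordinary_least_squares
```

Because the alias is the same function object, ols.__name__ and help(ols) still report ordinary_least_squares, so tracebacks and documentation point readers back to the verbose name.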
From: Alexandre F. <ale...@lo...> - 2006-06-16 12:11:26
|
On Fri, Jun 16, 2006 at 10:43:42AM +0200, Sven Schreiber wrote: > > Again, there is no defense for abbreviating linear_least_squares > > because it is unlikely to appear in an expression and waste valuable > > horizontal space. > > not true imho; btw, I would suggest "ols" (ordinary least squares), > which is in every textbook. Please, keep the zen of python in mind: Explicit is better than implicit. -- Alexandre Fayolle LOGILAB, Paris (France) Formations Python, Zope, Plone, Debian: http://www.logilab.fr/formations Développement logiciel sur mesure: http://www.logilab.fr/services Informatique scientifique: http://www.logilab.fr/science |
From: Sven S. <sve...@gm...> - 2006-06-16 12:49:20
|
Alexandre Fayolle wrote: > On Fri, Jun 16, 2006 at 10:43:42AM +0200, Sven Schreiber wrote: >>> Again, there is no defense for abbreviating linear_least_squares >>> because it is unlikely to appear in an expression and waste valuable >>> horizontal space. >> not true imho; btw, I would suggest "ols" (ordinary least squares), >> which is in every textbook. > > Please, keep the zen of python in mind : Explicit is better than > implicit. > > True, but horizontal space *is* valuable (copied from above), and some of the suggested long names were a bit too long for my taste. Abbreviations will emerge anyway, the question is merely: Will numpy provide/recommend them (in addition to having long names maybe), or will it have to be done by somebody else, possibly resulting in many different sets of abbreviations for the same purpose. Thanks, Sven |
From: Tim H. <tim...@co...> - 2006-06-16 13:00:05
|
I don't have anything constructive to add at the moment, so I'll just throw out an unelucidated opinion: +1 for longish names. -1 for two sets of names. -tim |
From: Alan G I. <ai...@am...> - 2006-06-16 15:29:47
|
On Fri, 16 Jun 2006, Sven Schreiber apparently wrote: > Abbreviations will emerge anyway, the question is merely: > Will numpy provide/recommend them (in addition to having > long names maybe), or will it have to be done by somebody > else, possibly resulting in many different sets of > abbreviations for the same purpose. Agreed. Cheers, Alan Isaac |
From: Sasha <nd...@ma...> - 2006-06-16 13:48:14
|
On 6/16/06, Sven Schreiber <sve...@gm...> wrote: > .... > Abbreviations will emerge anyway, the question is merely: Will numpy > provide/recommend them (in addition to having long names maybe), or will > it have to be done by somebody else, possibly resulting in many > different sets of abbreviations for the same purpose. > This is a valid point. In my experience ad hoc abbreviations are more popular among scientists who are not used to writing large programs. They either use numpy interactively or write short throw-away scripts that are rarely reused. Programmers who write reusable code almost universally hate ad hoc abbreviations. (There are exceptions: <http://www.kuro5hin.org/story/2002/8/30/175531/763>.) If numpy is going to compete with MATLAB, we should not ignore the non-programmer user base. I like the idea of providing recommended abbreviations. There is a precedent for doing that: GNU command line utilities provide long/short alternatives for most options. Long options are recommended for use in scripts while short are indispensable at the command line. I would like to suggest the following guidelines: 1. Numpy should never invent abbreviations, but may reuse abbreviations used in the art. 2. When alternative names are made available, there should be one simple rule for reducing the long name to short. For example, use of acronyms may provide one such rule: singular_value_decomposition -> svd. Unfortunately that would mean linear_least_squares -> lls, not ols and conflict with rule #1 (rename lstsq -> ordinary_least_squares?). The second guideline may be hard to follow, but it is very important. Without a rule like this, there will be confusion on whether linear_least_squares and lstsq are the same or just "similar". |
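The acronym rule from guideline 2 can be sketched in a few lines. The helper name acronym is hypothetical and is not part of numpy; it only illustrates what "one simple rule for reducing the long name to short" might look like when the long names are underscore-separated.

```python
def acronym(long_name):
    """Reduce an underscore-separated name to the first letter of each word."""
    return "".join(word[0] for word in long_name.split("_") if word)


for name in ("singular_value_decomposition",
             "linear_least_squares",
             "ordinary_least_squares"):
    print(name, "->", acronym(name))
```

Applied mechanically, the rule maps singular_value_decomposition to svd and linear_least_squares to lls, which is the conflict with guideline 1 noted in the message; it also turns inverse_real_fft into irf rather than irfft, so names that already embed an acronym would need separate handling.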
From: Tim H. <tim...@co...> - 2006-06-16 16:23:28
|
Sasha wrote: >On 6/16/06, Sven Schreiber <sve...@gm...> wrote: > > >>.... >>Abbreviations will emerge anyway, the question is merely: Will numpy >>provide/recommend them (in addition to having long names maybe), or will >>it have to be done by somebody else, possibly resulting in many >>different sets of abbreviations for the same purpose. >> >> >> >This is a valid point. In my experience ad hoc abbreviations are more >popular among scientists who are not used to writing large programs. >They use numpy either interactively or write short throw-away scripts >that are rarely reused. Programmers who write reusable code almost >universally hate ad hoc abbreviations. (There are exceptions: ><http://www.kuro5hin.org/story/2002/8/30/175531/763>.) > >If numpy is going to compete with MATLAB, we should not ignore >non-programmer user base. I like the idea of providing recommended >abbreviations. There is a precedent for doing that: GNU command line >utilities provide long/short alternatives for most options. Long >options are recommended for use in scripts while short are >indispensable at the command line. > > Unless the abbreviations are obvious, adding a second set of names will make it more difficult to read others' code. In particular, it will make it harder to answer questions on the newsgroup. Particularly since I suspect that most of the more experienced users will end up using long names while the new users coming from MATLAB or whatever will use the shorter names. >I would like to suggest the following guidelines: > >1. Numpy should never invent abbreviations, but may reuse >abbreviations used in the art. > > Let me add a couple of cents here. There are widespread terms of the art and there are terms of art that are specific to a certain field. At the top level, I would like to see only widespread terms of the art. Thus, 'cos', 'sin', 'exp', etc are perfectly fine. However, something like 'dft' is not so good. Perversely, I consider 'fft' a widespread term of the art, but the more general 'dft' is somehow not. These narrower terms would be perfectly fine if segregated into appropriate packages. For example, I would consider it more sensible to have the current package 'dft' renamed to 'fourier' and the routine 'fft' renamed to 'dft' (since that's what it is). As another example, linear_algebra.svd is perfectly clear, but numpy.svd would be opaque. >2. When alternative names are made available, there should be one >simple rule for reducing the long name to short. For example, use of >acronyms may provide one such rule: singular_value_decomposition -> >svd. > Svd is already a term of the art I believe, so linalg.svd seems like a fine name for singular_value_decomposition. > Unfortunately that would mean linear_least_squares -> lls, not >ols and conflict with rule #1 (rename lstsq -> >ordinary_least_squares?). > > Before you consider this I suggest that you google 'linear algebra lls' and 'linear algebra ols'. The results may surprise you... While you're at it, google 'linear algebra svd'. >The second guideline may be hard to follow, but it is very important. >Without a rule like this, there will be confusion on whether >linear_least_squares and lstsq are the same or just "similar". > > Can I just reiterate a hearty blech! for having two sets of names. The horizontal space argument is mostly bogus in my opinion -- functions that tend to be used in complicated expressions already have short, widely used abbreviations that we can steal. The typing argument is also mostly bogus: a decent editor will do tab completion (I use a pretty much no frills editor, SciTE, and even it does tab completion) and there's IPython if you want tab completion in interactive mode. -tim |