Thread: [Numpy-discussion] Don't like the short names like lstsq and irefft

[Numpy-discussion] Don't like the short names like lstsq and irefft

From: David M. C. <co...@ph...> - 2006-06-15 03:13:30

After working with them for a while, I'm going to go on record and say that I
prefer the long names from Numeric and numarray (like linear_least_squares,
inverse_real_fft, etc.), as opposed to the short names now used by default in
numpy (lstsq, irefft, etc.). I know you can get the long names from
numpy.dft.old, numpy.linalg.old, etc., but I think the long names are better
defaults.

Abbreviations aren't necessarily unique (quick! what does eig() return by
default?), and aren't necessarily obvious. A Google search for irfft vs.
irefft for instance turns up only the numpy code as (English) matches for
irefft, while irfft is much more common.

Also, Numeric and numarray compatibility is increased by using the long
names: those two don't have the short ones.

Fitting names into 6 characters went out of style decades ago. (I think
MS-BASIC running under CP/M on my Rainbow 100 had a restriction like that!)

My 2 cents...

-- 
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke                      http://arbutus.physics.mcmaster.ca/dmc/
|co...@ph...

Re: [Numpy-discussion] Don't like the short names like lstsq and irefft

From: Scott R. <sr...@nr...> - 2006-06-15 03:21:05

I'll add my 2 cents to this and agree with David. Arguments
about how short names are important for interactive work are pretty
bogus given the beauty of modern tab-completion. And I'm not sure
what other arguments there are...

Scott

On Wed, Jun 14, 2006 at 11:13:25PM -0400, David M. Cooke wrote:
> After working with them for a while, I'm going to go on record and say that I
> prefer the long names from Numeric and numarray (like linear_least_squares,
> inverse_real_fft, etc.), as opposed to the short names now used by default in
> numpy (lstsq, irefft, etc.). I know you can get the long names from
> numpy.dft.old, numpy.linalg.old, etc., but I think the long names are better
> defaults.
> 
> Abbreviations aren't necessarily unique (quick! what does eig() return by
> default?), and aren't necessarily obvious. A Google search for irfft vs.
> irefft for instance turns up only the numpy code as (English) matches for
> irefft, while irfft is much more common.
> 
> Also, Numeric and numarray compatibility is increased by using the long
> names: those two don't have the short ones.
> 
> Fitting names into 6 characters went out of style decades ago. (I think
> MS-BASIC running under CP/M on my Rainbow 100 had a restriction like that!)
> 
> My 2 cents...
> 
> -- 
> |>|\/|<
> /--------------------------------------------------------------------------\
> |David M. Cooke                      http://arbutus.physics.mcmaster.ca/dmc/
> |co...@ph...

-- 
Scott M. Ransom            Address:  NRAO
Phone:  (434) 296-0320               520 Edgemont Rd.
email:  sr...@nr...             Charlottesville, VA 22903 USA
GPG Fingerprint: 06A9 9553 78BE 16DB 407B  FFCA 9BFA B6FF FFD3 2989

Re: [Numpy-discussion] Don't like the short names like lstsq and irefft

From: Sasha <nd...@ma...> - 2006-06-15 03:46:30

On 6/14/06, David M. Cooke <co...@ph...> wrote:
> After working with them for a while, I'm going to go on record and say that I
> prefer the long names from Numeric and numarray (like linear_least_squares,
> inverse_real_fft, etc.), as opposed to the short names now used by default in
> numpy (lstsq, irefft, etc.). I know you can get the long names from
> numpy.dft.old, numpy.linalg.old, etc., but I think the long names are better
> defaults.
>

I agree in spirit, but note that inverse_real_fft is still short for
inverse_real_fast_fourier_transform.  Presumably, fft is a proper noun
in many people's vocabularies, but so may be lstsq depending on who you
ask.

> Abbreviations aren't necessarily unique (quick! what does eig() return by
> default?), and aren't necessarily obvious. A Google search for irfft vs.
> irefft for instance turns up only the numpy code as (English) matches for
> irefft, while irfft is much more common.
>
Short names have one important advantage in scientific languages: they
look good in expressions.

What is easier to understand:

  hyperbolic_tangent(x) = hyperbolic_sinus(x)/hyperbolic_cosinus(x)

or

  tanh(x) = sinh(x)/cosh(x)

?

I am playing devil's advocate here a little because personally, I
always recommend the following as a compromise:

sinh = hyperbolic_sinus
...
tanh(x) = sinh(x)/cosh(x)

But the next question is where to put "sinh = hyperbolic_sinus": right
before the expression using sinh?  at the top of the module (import
hyperbolic_sinus as sinh)? in the math library? If you pick the last
option, do you need hyperbolic_sinus to begin with?  If you pick any
other option, how do you prevent others from writing sh =
hyperbolic_sinus instead of sinh?
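
As a rough illustration of the "top of the module" option (a sketch only, not code from numpy): the long names hyperbolic_sinus and hyperbolic_cosinus below are hypothetical stand-ins, defined here on top of Python's math module so the example runs on its own.

```python
import math

# Hypothetical verbose names, standing in for a library that ships only long names.
def hyperbolic_sinus(x):
    return math.sinh(x)

def hyperbolic_cosinus(x):
    return math.cosh(x)

# Module-level aliases: defined once, near the imports, where readers can find them.
sinh = hyperbolic_sinus
cosh = hyperbolic_cosinus

def tanh(x):
    # The expression stays compact because it uses the short aliases.
    return sinh(x) / cosh(x)
```

Nothing here stops another module from binding sh = hyperbolic_sinus instead, which is exactly the consistency problem raised above.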

> Also, Numeric and numarray compatibility is increased by using the long
> names: those two don't have the short ones.
>
> Fitting names into 6 characters went out of style decades ago. (I think
> MS-BASIC running under CP/M on my Rainbow 100 had a restriction like that!)
>
Short names are still popular in scientific programming:
<http://www.nsl.com/papers/style.pdf>.

I am still +1 for keeping linear_least_squares and inverse_real_fft,
but not just because abbreviations are bad as such - if an established
acronym such as fft exists we should be free to use it.

Re: [Numpy-discussion] Don't like the short names like lstsq and irefft

From: David M. C. <co...@ph...> - 2006-06-16 05:54:41

On Wed, Jun 14, 2006 at 11:46:27PM -0400, Sasha wrote:
> On 6/14/06, David M. Cooke <co...@ph...> wrote:
> > After working with them for a while, I'm going to go on record and say that I
> > prefer the long names from Numeric and numarray (like linear_least_squares,
> > inverse_real_fft, etc.), as opposed to the short names now used by default in
> > numpy (lstsq, irefft, etc.). I know you can get the long names from
> > numpy.dft.old, numpy.linalg.old, etc., but I think the long names are better
> > defaults.
> >
> 
> I agree in spirit, but note that inverse_real_fft is still short for
> inverse_real_fast_fourier_transform.  Presumably, fft is a proper noun
> in many people's vocabularies, but so may be lstsq depending on who you
> ask.

I say "FFT", but I don't say "lstsq". I can find "FFT" in the index of a
book of algorithms, but not "lstsq" (unless it was a specific
implementation). Those are my two guiding ideas for what makes a good
name here.

> I am playing devil's advocate here a little because personally, I
> always recommend the following as a compromise:
> 
> sinh = hyperbolic_sinus
> ...
> tanh(x) = sinh(x)/cosh(x)
> 
> But the next question is where to put "sinh = hyperbolic_sinus": right
> before the expression using sinh?  at the top of the module (import
> hyperbolic_sinus as sinh)? in the math library? If you pick the last
> option, do you need hyperbolic_sinus to begin with?  If you pick any
> other option, how do you prevent others from writing sh =
> hyperbolic_sinus instead of sinh?

Pish. By the same reasoning, we don't need the number 2: we can write it
as the successor of the successor of the additive identity :-)

> > Also, Numeric and numarray compatibility is increased by using the long
> > names: those two don't have the short ones.
> >
> > Fitting names into 6 characters went out of style decades ago. (I think
> > MS-BASIC running under CP/M on my Rainbow 100 had a restriction like that!)
> >
> Short names are still popular in scientific programming:
> <http://www.nsl.com/papers/style.pdf>.

That's 11 years old. The web was only a few years old at that time!
There's been much work done on what makes a good programming style
(Steve McConnell's "Code Complete" for instance is a good start).

> I am still +1 for keeping linear_least_squares and inverse_real_fft,
> but not just because abbreviations are bad as such - if an established
> acronym such as fft exists we should be free to use it.

Ok, in summary, I'm seeing a bunch of "yes, long names please",
but only your devil's advocate stance for no (and +1 for real).

I see that Travis fixed the real fft names back to 'irfft' and friends.

So, concrete proposal time:

- go back to the long names in numpy.linalg (linear_least_squares,
  eigenvalues, etc. -- those defined in numpy.linalg.old)
  - of the new names, I could see keeping 'det' and 'svd': those are
    commonly used, although maybe 'SVD' instead?
  - anybody got a better name than Heigenvalues? That H looks weird
    at the beginning.

- for numpy.dft, use the old names again. I could probably be persuaded
  that 'rfft' is ok. 'hfft' for the Hermite FFT is right out.

- numpy.random is the other "old package replacement", but it's fine (and
  better).

-- 
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke                      http://arbutus.physics.mcmaster.ca/dmc/
|co...@ph...

Re: [Numpy-discussion] Don't like the short names like lstsq and irefft

From: Paul D. <pfd...@gm...> - 2006-06-15 04:47:24

Bertrand Meyer has pointed out that abbreviations are usually a bad idea.
The problem is that abbreviations are not unique so you can't guess what
they are. Whereas (modulo some library-wide conventions about names)
linearLeastSquares or the like is unique. At the very least you're more
likely to get it right.

Any python user can abbreviate anything they like any way they like for
interactive work.

And yes, I think FFT is a name. (:->  Exception for that.

Re: [Numpy-discussion] Don't like the short names like lstsq and irefft

From: Scott R. <sr...@nr...> - 2006-06-15 04:53:06

On Wed, Jun 14, 2006 at 09:47:20PM -0700, Paul Dubois wrote:
> And yes, I think FFT is a name. (:->  Exception for that.

I agree.  As are sinh, cosh, tanh, sinc, exp, log10 and various
other very commonly used (and not only in programming) names.
lstsq, eig, irefft, etc are not.

Scott

-- 
Scott M. Ransom            Address:  NRAO
Phone:  (434) 296-0320               520 Edgemont Rd.
email:  sr...@nr...             Charlottesville, VA 22903 USA
GPG Fingerprint: 06A9 9553 78BE 16DB 407B  FFCA 9BFA B6FF FFD3 2989

Re: [Numpy-discussion] Don't like the short names like lstsq and irefft

From: Alexander B. <ale...@gm...> - 2006-06-15 13:15:58

On 6/15/06, Paul Dubois <pfd...@gm...> wrote:
> And yes, I think FFT is a name. (:->  Exception for that.


There are more exceptions that Numeric is not taking advantage of:

equal, less, greater, ... -> eq, lt, gt, ...
inverse, generalized_inverse -> inv, pinv

In my view it is more important that code is easy to read rather than
easy to write. Interactive users will disagree, but in programming you
write once and read/edit forever :).

Again, there is no defense for abbreviating linear_least_squares
because it is unlikely to appear in an expression and waste valuable
horizontal space.  Contracting generalized_inverse is appropriate and
numpy does the right thing in this case.

The eig.., svd and cholesky choice of names is unfortunate because
three different abbreviation schemes are used: first syllable,
acronym and first word.  I would say: when in doubt spell it in full.

Re: [Numpy-discussion] Don't like the short names like lstsq and irefft

From: Sven S. <sve...@gm...> - 2006-06-16 08:44:01

Alexander Belopolsky wrote:

> In my view it is more important that code is easy to read rather than
> easy to write. Interactive users will disagree, but in programming you
> write once and read/edit forever :).

The insight about this disagreement imho suggests a compromise (or call
it a dual solution): Have verbose names, but also have good default
abbreviations for those who prefer them.

It would be unfortunate if numpy users were required to cook up their
own abbreviations if they wanted to, because 1. it adds overhead, and 2.
it would make other people's code more difficult to read.
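
A minimal sketch of what such a dual solution could look like, assuming the library itself binds the short name to the very same function object. The function below is a toy straight-line fit written for this example, not numpy's implementation, and its signature is an assumption.

```python
def linear_least_squares(xs, ys):
    """Toy example: fit y = slope * x + intercept by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Library-provided abbreviation: a second name for the same function object,
# so users never have to invent their own.
lstsq = linear_least_squares

slope, intercept = lstsq([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```

Because lstsq is linear_least_squares evaluates to True, a reader can check that the two names are the same routine rather than merely similar ones.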

> Again, there is no defense for abbreviating linear_least_squares
> because it is unlikely to appear in an expression and waste valuable
> horizontal space.

not true imho; btw, I would suggest "ols" (ordinary least squares),
which is in every textbook.

Cheers,
Sven

Re: [Numpy-discussion] Don't like the short names like lstsq and irefft

From: Alexandre F. <ale...@lo...> - 2006-06-16 12:11:26

On Fri, Jun 16, 2006 at 10:43:42AM +0200, Sven Schreiber wrote:
> > Again, there is no defense for abbreviating linear_least_squares
> > because it is unlikely to appear in an expression and waste valuable
> > horizontal space.
>
> not true imho; btw, I would suggest "ols" (ordinary least squares),
> which is in every textbook.

Please, keep the zen of python in mind: Explicit is better than
implicit.

-- 
Alexandre Fayolle                              LOGILAB, Paris (France)
Formations Python, Zope, Plone, Debian:  http://www.logilab.fr/formations
Développement logiciel sur mesure:       http://www.logilab.fr/services
Informatique scientifique:               http://www.logilab.fr/science

Re: [Numpy-discussion] Don't like the short names like lstsq and irefft

From: Sven S. <sve...@gm...> - 2006-06-16 12:49:20

Alexandre Fayolle wrote:
> On Fri, Jun 16, 2006 at 10:43:42AM +0200, Sven Schreiber wrote:
>>> Again, there is no defense for abbreviating linear_least_squares
>>> because it is unlikely to appear in an expression and waste valuable
>>> horizontal space.
>> not true imho; btw, I would suggest "ols" (ordinary least squares),
>> which is in every textbook.
>
> Please, keep the zen of python in mind: Explicit is better than
> implicit.

True, but horizontal space *is* valuable (copied from above), and some
of the suggested long names were a bit too long for my taste.

Abbreviations will emerge anyway, the question is merely: Will numpy
provide/recommend them (in addition to having long names maybe), or will
it have to be done by somebody else, possibly resulting in many
different sets of abbreviations for the same purpose.

Thanks,
Sven

Re: [Numpy-discussion] Don't like the short names like lstsq and irefft

From: Tim H. <tim...@co...> - 2006-06-16 13:00:05

I don't have anything constructive to add at the moment, so I'll just 
throw out an unelucidated opinion:

+1 for longish names.
-1 for two sets of names.

-tim

Re: [Numpy-discussion] Don't like the short names like lstsq and irefft

From: Alan G I. <ai...@am...> - 2006-06-16 15:29:47

On Fri, 16 Jun 2006, Sven Schreiber apparently wrote:
> Abbreviations will emerge anyway, the question is merely:
> Will numpy provide/recommend them (in addition to having
> long names maybe), or will it have to be done by somebody
> else, possibly resulting in many different sets of
> abbreviations for the same purpose.

Agreed.
Cheers,
Alan Isaac

Re: [Numpy-discussion] Don't like the short names like lstsq and irefft

From: Sasha <nd...@ma...> - 2006-06-16 13:48:14

On 6/16/06, Sven Schreiber <sve...@gm...> wrote:
> ....
> Abbreviations will emerge anyway, the question is merely: Will numpy
> provide/recommend them (in addition to having long names maybe), or will
> it have to be done by somebody else, possibly resulting in many
> different sets of abbreviations for the same purpose.
>
This is a valid point.  In my experience ad hoc abbreviations are more
popular among scientists who are not used to writing large programs.
They use numpy either interactively or write short throw-away scripts
that are rarely reused.  Programmers who write reusable code almost
universally hate ad hoc abbreviations. (There are exceptions:
<http://www.kuro5hin.org/story/2002/8/30/175531/763>.)

If numpy is going to compete with MATLAB, we should not ignore the
non-programmer user base.  I like the idea of providing recommended
abbreviations.   There is a precedent for doing that: GNU command line
utilities provide long/short alternatives for most options.  Long
options are recommended for use in scripts while short are
indispensable at the command line.

I would like to suggest the following guidelines:

1. Numpy should never invent abbreviations, but may reuse
abbreviations used in the art.

2. When alternative names are made available, there should be one
simple rule for reducing the long name to short.  For example, use of
acronyms may provide one such rule: singular_value_decomposition ->
svd.  Unfortunately that would mean linear_least_squares -> lls, not
ols and conflict with rule #1 (rename lstsq ->
ordinary_least_squares?).

The second guideline may be hard to follow, but it is very important.
Without a rule like this, there will be confusion on whether
linear_least_squares and lstsq are the same or just "similar".
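
To make the acronym rule in guideline 2 concrete, here is a small sketch. The helper name acronym is made up for this example, and it assumes the long names are words separated by underscores.

```python
def acronym(long_name):
    """Reduce an underscore-separated name to the initials of its words."""
    return "".join(word[0] for word in long_name.split("_") if word)

print(acronym("singular_value_decomposition"))  # svd
print(acronym("linear_least_squares"))          # lls
```

The rule is mechanical, which is its appeal, but it yields lls rather than the textbook ols, which is the conflict with guideline 1 noted above.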

Re: [Numpy-discussion] Don't like the short names like lstsq and irefft

From: Tim H. <tim...@co...> - 2006-06-16 16:23:28

Sasha wrote:

>On 6/16/06, Sven Schreiber <sve...@gm...> wrote:
>>....
>>Abbreviations will emerge anyway, the question is merely: Will numpy
>>provide/recommend them (in addition to having long names maybe), or will
>>it have to be done by somebody else, possibly resulting in many
>>different sets of abbreviations for the same purpose.
>>
>This is a valid point.  In my experience ad hoc abbreviations are more
>popular among scientists who are not used to writing large programs.
>They use numpy either interactively or write short throw-away scripts
>that are rarely reused.  Programmers who write reusable code almost
>universally hate ad hoc abbreviations. (There are exceptions:
><http://www.kuro5hin.org/story/2002/8/30/175531/763>.)
>
>If numpy is going to compete with MATLAB, we should not ignore the
>non-programmer user base.  I like the idea of providing recommended
>abbreviations.   There is a precedent for doing that: GNU command line
>utilities provide long/short alternatives for most options.  Long
>options are recommended for use in scripts while short are
>indispensable at the command line.

Unless the abbreviations are obvious, adding a second set of names will
make it more difficult to read others' code. In particular, it will make
it harder to answer questions on the newsgroup, particularly since I
suspect that most of the more experienced users will end up using long
names while the new users coming from MATLAB or whatever will use the
shorter names.

>I would like to suggest the following guidelines:
>
>1. Numpy should never invent abbreviations, but may reuse
>abbreviations used in the art.

Let me add a couple of cents here. There are widespread terms of the art 
and there are terms of art that are specific to a certain field. At the 
top level, I would like to see only widespread terms of the art. Thus, 
'cos', 'sin', 'exp', etc are perfectly fine. However, something like 
'dft' is not so good. Perversely, I consider 'fft' a widespread term of 
the art, but the more general 'dft' is somehow not.

These narrower terms would be perfectly fine if segregated into 
appropriate packages. For example, I would consider it more sensible to 
have the current package 'dft' renamed to 'fourier' and the routine 
'fft' renamed to 'dft' (since that's what it is).  As another example, 
linear_algebra.svd is perfectly clear, but numpy.svd would be opaque.

>2. When alternative names are made available, there should be one
>simple rule for reducing the long name to short.  For example, use of
>acronyms may provide one such rule: singular_value_decomposition ->
>svd.
>
Svd is already a term of the art I believe, so linalg.svd seems like a 
fine name for singular_value_decomposition.

>  Unfortunately that would mean linear_least_squares -> lls, not
>ols and conflict with rule #1 (rename lstsq ->
>ordinary_least_squares?).

Before you consider this I suggest that you google 'linear algebra lls'
and 'linear algebra ols'. The results may surprise you...

While you're at it google 'linear algebra svd'

>The second guideline may be hard to follow, but it is very important.
>Without a rule like this, there will be confusion on whether
>linear_least_squares and lstsq are the same or just "similar".

Can I just reiterate a hearty blech! for having two sets of names. The
horizontal space argument is mostly bogus in my opinion -- functions
that tend to be used in complicated expressions already have short,
widely used abbreviations that we can steal. The typing argument is also
mostly bogus: a decent editor will do tab completion (I use a pretty
much no frills editor, SciTE, and even it does tab completion) and
there's IPython if you want tab completion in interactive mode.
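
For what it's worth, the completion argument can be tried with nothing but the standard library. This sketch uses Python's rlcompleter module (the machinery behind the plain interactive prompt's completion, not IPython); the linear_least_squares function here is an empty placeholder invented for the demo.

```python
import rlcompleter

def linear_least_squares(a, b):
    """Placeholder with a long name; the body is irrelevant here."""
    return None

# Complete against a namespace that contains only the long name.
completer = rlcompleter.Completer({"linear_least_squares": linear_least_squares})

# state=0 asks for the first match for the typed prefix.
first_match = completer.complete("linear_l", 0)
print(first_match)
```

A few typed characters are enough to recover the full name, so the long name costs the reader nothing and the typist very little.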

-tim