[Tcl-bugs] [ tcl-Bugs-1767293 ] Bad performance using ** operator with integer operands

SourceForge Headquarters 225 Broadway Suite 1600 San Diego, CA 92101 +1 (858) 454-5900

Bugs item #1767293, was opened at 2007-08-03 15:53
Message generated for change (Comment added) made by kennykb
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=110894&aid=1767293&group_id=10894

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: 46. Bytecode Compiler
Group: current: 8.5a6
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Emiliano (egavilan)
Assigned to: Kevin B KENNY (kennykb)
Summary: Bad performance using ** operator with integer operands

Initial Comment:
When using integers with ** operand, the speed is considerably slower than if using at least one float argument.

After a few runs, I get consistenly these results:

% time {expr {5.0**2}} 10000
3.6763 microseconds per iteration
% time {expr {5**2.0}} 10000
3.6791 microseconds per iteration
% time {expr {double(5)**2}} 10000
5.4605 microseconds per iteration
% time {expr {5**double(2)}} 10000
5.693 microseconds per iteration
% time {expr {5**2}} 10000
38.9746 microseconds per iteration

just to compare with the other "traditional" ways to obtain 5**2

% time {expr {5*5}} 10000
2.7841 microseconds per iteration
% time {expr {pow(5,2)}} 10000
7.7954 microseconds per iteration

% info patchlevel
8.5a6

----------------------------------------------------------------------

>Comment By: Kevin B KENNY (kennykb)
Date: 2007-08-07 16:07

Message:
Logged In: YES 
user_id=99768
Originator: NO

Patch #5 - Added a table for the powers (exponent > 16) of small integers
that fit in a 64-bit  word.  This patch should have just about all of the
possible optimizations done.  It presumes that a Tcl_WideInt is always at
least 64 bits; if that is not the case, then we need to figure out a way to
test for that and condition out the 64-bit code and tables.

File Added: power.patch

----------------------------------------------------------------------

Comment By: Kevin B KENNY (kennykb)
Date: 2007-08-07 13:16

Message:
Logged In: YES 
user_id=99768
Originator: NO

Updated patch #4 to fix a bug with first powers (I'd misread the earlier
code as having a special case to handle them. I was mistaken.)
File Added: power.patch

----------------------------------------------------------------------

Comment By: Don Porter (dgp)
Date: 2007-08-07 12:58

Message:
Logged In: YES 
user_id=80530
Originator: NO

failing tests mathop-(20.3,25.6,25.13)

----------------------------------------------------------------------

Comment By: Kevin B KENNY (kennykb)
Date: 2007-08-06 23:35

Message:
Logged In: YES 
user_id=99768
Originator: NO

One more try - powers of -2 were still not right.
File Added: power.patch

----------------------------------------------------------------------

Comment By: Kevin B KENNY (kennykb)
Date: 2007-08-06 23:26

Message:
Logged In: YES 
user_id=99768
Originator: NO

Updated patch:
   - handles negative exponents correctly (first one had a bug)
   - handles powers up to and including x**16 with 'wide' results
   - does not handle higher powers with 'wide' results.

File Added: power.patch

----------------------------------------------------------------------

Comment By: Kevin B KENNY (kennykb)
Date: 2007-08-06 18:36

Message:
Logged In: YES 
user_id=99768
Originator: NO

On Win32 (mingw), the patch is observed to fix the performance anomaly:

% time {expr {5*5}} 100000
0.63991 microseconds per iteration
% time {expr {5**2}} 100000
0.62671 microseconds per iteration
% time {expr {pow(5,2)}} 100000
1.68862 microseconds per iteration
% time {expr {5*5*5*5*5}} 100000
0.91689 microseconds per iteration
% time {expr {5**5}} 100000
0.72139 microseconds per iteration
% time {expr {pow(5,5)}} 100000
1.65471 microseconds per iteration

The performance penalty is still there when quantities overflow from a
32-bit integer; we're still working on that one:

% time {expr {3**19}} 100000; # this fits in a 32-bit word
0.61131 microseconds per iteration
% time {expr {3**20}} 100000; # this doesn't
5.84269 microseconds per iteration

----------------------------------------------------------------------

Comment By: Kevin B KENNY (kennykb)
Date: 2007-08-06 18:20

Message:
Logged In: YES 
user_id=99768
Originator: NO

The attached patch is working toward a solution.

Observation #1: Powers of 2 are the same as shifts of 1, and are handled
thus.

Observation #2: For 32-bit integers, a very efficient exponent can be
computed by doing special-case straight-line code for raising to powers
3-8, and then using table lookups for higher powers (up to 3**19 and up to
10**10). 

Observation #3: For 64-bit integers, a similar computation can be done
with straight-line code for raising to powers 3-16, and then table lookup
for higher powers (up to 3**39
and up to 14**17).  

I haven't tried to implement #3, because I'm none too certain how to
initialize
Tcl_WideInts at compile time with values that don't fit in an int.

The patch should be commitable as it stands (resulting in performance
gains only on 32-bit machines), but obviously #3 is important, too.
File Added: power.patch

----------------------------------------------------------------------

Comment By: Don Porter (dgp)
Date: 2007-08-06 11:24

Message:
Logged In: YES 
user_id=80530
Originator: NO

Aha!  Spoke too soon.

Root cause of this one is found at
line 5240 of Revision 1.309 of
tclExecute.c:

/* TODO: Perform those computations that fit in native types */

Since code to detect and perform those
computations that fit in native types isn't
done yet, all non-trivial integer ** operations
get done as bignums.

Patch still welcome.  Normal priority restored.

----------------------------------------------------------------------

Comment By: Don Porter (dgp)
Date: 2007-08-06 11:19

Message:
Logged In: YES 
user_id=80530
Originator: NO

When at least one of the arguments is double,
the operation turns into a pow() call at the
C level, so it's comparable performance to
[expr {pow(...)}] minus Tcl function call overhead.

When both are integer, the current implementation
is an iterative one.  It's worth looking for improvements,
but only to the extent that the current performance
isn't "fast enough".  Patches welcome; otherwise I'll
get to this when I get to it.

----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=110894&aid=1767293&group_id=10894

[Tcl-bugs] [ tcl-Bugs-1767293 ] Bad performance using ** operator with integer operands

The Tool Command Language implementation

[Tcl-bugs] [ tcl-Bugs-1767293 ] Bad performance using ** operator with integer operands