From: David C. <da...@ar...> - 2006-12-12 12:41:13
|
Hi, I am a regular user of matplotlib since I moved from matlab to python/numpy/scipy. Even though I find matplotlib to be a real help during the transition from matlab to python, I must confess I found it the most disappointing compared to the other packages (essentially numpy/scipy/ipython). This is not a rant; I want to know whether this slowness comes from my lack of matplotlib knowledge or not; I apologize in advance if the following hurts anyone's feelings :)

First, I must admit that whereas I took a significant amount of time to study numpy and scipy, I didn't take that same time for matplotlib. So this disappointment may just be a consequence of that laziness.

My main problem with matplotlib is speed: I find it really annoying to use interactively. For example, when I need to display some 2d information, such as a spectrogram or correlogram, this takes 1 or 2 seconds for a small signal (~4500 frames of 256 samples). My function correlogram (similar to specgram, but computing correlation instead of log spectrum) uses imshow, and this function takes 20 times longer than matlab's imagesc for the same size.

Also, I find changing the size of the matplotlib window really 'annoying to the eye': I compared with matlab, and this may be because the whole window is redrawn in matplotlib, including the toolbar, whereas in matlab the top toolbar is not redrawn.

Finally, plotting a lot of data (using plot(X, Y) with X and Y around 1000/10000 samples) is 'slow' (the quotes are because I don't know much about computer graphics, and I understand that slowness in rendering is often just a perception).

So, is this a current limitation of matplotlib, is matplotlib optimized for good rendering for publication rather than for interactive use, or am I just misguided in my use of matplotlib?
Config info:
- ubuntu edgy on a dual Xeon 3.2 GHz with 2 GB of RAM
- numpy SVN (post 1.0)
- matplotlib 0.87.7
- matplotlibrc: uses numpy for numerix, GTK as a backend (or GTKAgg for anti-aliasing, but this makes the problem worse).
Cheers, David |
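For reference, the imshow case described above can be reduced to a self-contained timing sketch like the following. This is a sketch using the modern pyplot API and the Agg backend, not what David actually ran; the array size is taken from the message above, everything else is an assumption:

```python
import time

import matplotlib
matplotlib.use('Agg')  # headless raster backend: isolates drawing from any GUI
import numpy as np
import matplotlib.pyplot as plt

# ~4500 frames of 256 samples, as in the message above
data = np.random.randn(4500, 256)

fig, ax = plt.subplots()
t0 = time.time()
ax.imshow(data, aspect='auto')
fig.canvas.draw()  # force the actual rasterization, not just artist creation
print('imshow + draw: %.2f s' % (time.time() - t0))
```

The explicit canvas.draw() matters: imshow alone only creates the image artist, and the colormapping/rendering cost shows up at draw time.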
From: John H. <jdh...@ac...> - 2006-12-12 17:05:12
|
>>>>> "David" == David Cournapeau <da...@ar...> writes:

David> Hi, I am a regular user of matplotlib since I moved from
David> matlab to python/numpy/scipy. Even if I find matplotlib to
David> be a real help during the transition from matlab to python,
David> I must confess I found it the most disappointing compare
David> other packages ( essentially numpy/scipy/ipython). This is

Meatloaf: Now don't be sad, cause two out of three ain't bad

If you consider the fact that matplotlib was originally an ipython patch that was rejected, you can see why we are such a bastard child of the scientific python world. There is a seed of truth in this; Numeric, scipy and ipython were all mature packages in widespread use before the first line of matplotlib code was written. So they are farther along in terms of maturity, documentation, usability, etc... than matplotlib is. But we've achieved a lot in a comparatively short time. When I started working on matplotlib there were probably two dozen plotting packages that people used and recommended. Now we are down to 5 or 6, with matplotlib doing most of what most people need.

I've focused on making something that does most of what people (and I) need rather than doing it the fastest, so it is too slow for some purposes but fast enough for most. When we get a well-defined, important test case that is too slow, we typically try to optimize it, sometimes with dramatic results (eg 25-fold speedups); more on this below.

A consequence of trying to support most of the needs of most users is this: we run on all major operating systems and all major GUIs with all major array packages. Consider the combinatorial problem -- 5 graphical user interfaces, with two or more versions in the wild, across 3 operating systems -- and you will get a feel for the support problem we have. This is not an academic point. Most of the GUI maintainers for *a single backend* burn out in short order.
Most graphics packages *solve* this problem by supporting a single output format (PyX) or GUI (chaco), which is a damned fine and admirable solution. But the consequence of this is plotting fragmentation: people who need GTK cannot use Chaco, people who need SVG cannot use PyX, and so on, and so they'll write their own plotting library for their own GUI or output format (the situation before matplotlib). You can certainly get closer to bare-metal speed by reducing choices and focusing on a single target -- part of the performance price we pay is in our abstraction layers, part is in trying to support features that may be rarely used but cost something (masked array support, rotated text with newlines), and part is because we need to get to work and optimize the slow parts.

David> not a rant; I want to know if this slowness is coming from
David> my lack of matplotlib knowledge or not; I apologize in
David> advance if the following hurts anyone feelings :)

Meatloaf: But -- there ain't no way I'm ever gonna love you

OK, I'll stop now.

David> First, I must admit that whereas I took a significant
David> amount of time to study numpy and scipy, I didn't take that
David> same time for matplotlib. So this disappointment may just
David> be a consequences of this laziness.

I suspect this is partly true; see below.

David> My main problem with matplotlib is speed: I find it
David> really annoying to use in an interactive manner. For
David> example, when I need to display some 2d information, such
David> as spectrogramm or correlogram, this take 1 or 2 seconds
David> for a small signal (~4500 frames of 256 samples). My
David> function correlogram (similar to specgram, but compute
David> correlation instead of log spectrum) uses imshow, and this
David> function takes 20 times more time than imagesc of matlab
David> for the same size. Also, I found changing the size of the

This is where you can help us.
Saying specgram is slow is only marginally more useful than saying matplotlib is slow or python is slow. What is helpful is to post a complete, free-standing script that we can run, with some attached performance numbers. For starters, just run it with the Agg backend so we can isolate matplotlib from the respective GUIs. Show us how the performance scales with the specgram parameters (frames and samples). specgram is divided into two parts: if you look at Axes.specgram you will see that it calls matplotlib.mlab.specgram to do the computation and Axes.imshow to visualize it. Which part is slow: the mlab.specgram computation, the visualization (imshow) part, or both? You can paste this function into your own python file and start timing different parts. The most helpful "this is slow" posts come with profiler output so we can see where the bottlenecks are. Such a post by Fernando Perez on "plot" with markers yielded performance boosts of 25x for large numbers of points, when he showed we were making about one hundred thousand function calls per plot.

David> matplotlib window really 'annoying to the eye': I compared
David> to matlab, and this may be due to the fact that the whole
David> window is redrawn with matplotlib, including the toolbar,
David> whereas in matlab, the top toolbar is not redrawn.

It would be nice if we exposed the underlying GTK widgets to you so you could customize the "expand" and "fill" properties of the gtk toolbar, but this gets us into the multiple-GUI, multiple-version problem discussed above. Providing an abstract interface to such details that works across the mpl backends is a lot of work that takes us away from our core incompetency -- plotting. What we do is enable you to write your own widgets and embed mpl in them; see examples/embedding_in_gtk2.py, which shows you how to do this for GTK/GTKAgg. You can then customize the toolbar to your heart's content.
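The computation/rendering split John describes can be timed separately. A sketch under stated assumptions: the spectrogram power is computed directly with numpy as a hypothetical stand-in for matplotlib.mlab.specgram (it requires numpy >= 1.20 for sliding_window_view), and only the imshow/draw step is counted as rendering:

```python
import time

import numpy as np
import matplotlib
matplotlib.use('Agg')  # no GUI, so we time only matplotlib itself
import matplotlib.pyplot as plt

nfft, hop = 256, 128                    # window and hop sizes from the thread
x = np.random.randn(4500 * hop + nfft)  # enough samples for ~4500 frames

# Part 1: the spectrogram computation (stand-in for mlab.specgram).
t0 = time.time()
frames = np.lib.stride_tricks.sliding_window_view(x, nfft)[::hop]
pxx = np.abs(np.fft.rfft(frames * np.hanning(nfft), axis=1)) ** 2
t_compute = time.time() - t0

# Part 2: the visualization (imshow) plus an explicit draw.
fig, ax = plt.subplots()
t0 = time.time()
ax.imshow(10 * np.log10(pxx + 1e-12).T, aspect='auto', origin='lower')
fig.canvas.draw()
t_render = time.time() - t0

print('compute %.3f s, render %.3f s' % (t_compute, t_render))
```

Whichever of the two numbers dominates tells you whether the numeric code or the rendering pipeline needs attention.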
David> Finally, plotting many data (using plot(X, Y) with X and Y
David> around 1000/10000 samples) is 'slow' (the '' are because I
David> don't know much about computer graphics, and I understand
David> that slow in the rendering is often just a perception)

This shouldn't be slow -- again, a test script with some performance numbers would help so we can compare what we are getting. One thought: make sure you are using the numerix layer properly -- ie, if you are creating arrays with numpy, make sure you have numerix set to numpy (I see below that you set numerix to numpy, but --verbose-helpful will confirm the setting). A good way to start is to write a demonstration script that you find too slow which makes a call to savefig, and run it with

    > time myscript.py --verbose-helpful -dAgg

and post the output and script. Then we might be able to help.

David> So, is this a current limitation of matplotlib, is
David> matplotlib optimized for good rendering for publication,
David> and not for interactive use, or I am just misguided in my
David> use of matplotlib ?

Many people use it interactively, but a number of power users find it slow.

JDH |
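In the spirit of that suggestion, a hypothetical myscript.py might look like the following sketch (Agg backend forced in code, savefig into an in-memory buffer so no GUI or disk timing is mixed in; the point count is in the range David mentions, and all names are made up for illustration):

```python
import io
import time

import matplotlib
matplotlib.use('Agg')  # equivalent of running with -dAgg
import numpy as np
import matplotlib.pyplot as plt

n = 10000                      # number of points to plot
x = np.linspace(0.0, 1.0, n)
y = np.random.randn(n)

t0 = time.time()
plt.plot(x, y)
buf = io.BytesIO()
plt.savefig(buf, format='png')  # forces a full render without touching disk
print('plot + savefig of %d points: %.2f s' % (n, time.time() - t0))
```

Running it as `time python myscript.py` and posting both the script and the numbers gives the list something concrete to reproduce.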
From: Fernando P. <fpe...@gm...> - 2006-12-12 17:26:10
|
On 12/12/06, John Hunter <jdh...@ac...> wrote:
> --verbose-helpful will confirm the setting). A good way to start is
> to write a demonstration script that you find too slow which makes a
> call to savefig, and run it with
>
>     > time myscript.py --verbose-helpful -dAgg

It may be worth mentioning here this little utility (Linux only, unfortunately): http://amath.colorado.edu/faculty/fperez/python/profiling/

For profiling more complex code, it's really a godsend. And note that the generated cachegrind files are typically small and can be sent to others for analysis, so you can run it locally (if, for example, the run depends on data you can't share) and then send the generated profile to the list. Anyone with KCachegrind will then be able to load your profile info and study it in detail. Cheers, f |
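With today's standard library the same workflow can be sketched with cProfile/pstats (hotshot, used later in this thread, has since been removed from Python); third-party converters such as pyprof2calltree can then turn the stats into a KCachegrind-loadable file. The work() function here is a made-up stand-in for whatever slow code is being profiled:

```python
import cProfile
import io
import pstats

def work():
    # stand-in for the slow code under investigation
    total = 0
    for i in range(100000):
        total += i * i
    return total

pr = cProfile.Profile()
pr.enable()
total = work()
pr.disable()

# Print the top entries sorted by cumulative time, like the thread's tables.
s = io.StringIO()
pstats.Stats(pr, stream=s).sort_stats('cumulative').print_stats(5)
print(s.getvalue())
```

Dumping the stats with pr.dump_stats('out.prof') gives a file that others can load and inspect without rerunning your code.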
From: David C. <da...@ar...> - 2006-12-13 07:07:22
Attachments:
slowmatplotlib.tbz2
|
John Hunter wrote:
> This is where you can help us. Saying specgram is slow is only
> marginally more useful than saying matplotlib is slow or python is
> slow. What is helpful is to post a complete, free-standing script
> that we can run, with some attached performance numbers. For
> starters, just run it with the Agg backend so we can isolate
> matplotlib from the respective GUIs. Show us how the performance
> scales with the specgram parameters (frames and samples). specgram is
> divided into two parts (if you look at the Axes.specgram you will see
> that it calls matplotlib.mlab.specgram to do the computation and
> Axes.imshow to visualize it). Which part is slow: the mlab.specgram
> computation or the visualization (imshow) part or both? You can paste
> this function into your own python file and start timing different
> parts. The most helpful "this is slow" posts come with profiler
> output so we can see where the bottlenecks are.

(sorry for double posting)

Ok, here we go: I believe the rendering of the figure returned by imshow is what is slow. For example, say I have a 2 minute signal at an 8 kHz sampling rate, with windows of 256 samples and 50% overlap. That is around 64 frames/second, i.e. ~8000 frames of 256 samples. So for benchmark purposes, we can just send random data of shape 8000x256 to imshow. In ipython, this takes a long time (around 2 seconds for imshow(data), where data = random(8000, 256)).
Now, a small script to get a better idea:

    import numpy as N
    import pylab as P

    def generate_data_2d(fr, nwin, hop, len):
        nframes = int(1.0 * fr / hop * len)
        return N.random.randn(nframes, nwin)

    def bench_imshow(fr, nwin, hop, len, show = True):
        data = generate_data_2d(fr, nwin, hop, len)
        P.imshow(data)
        if show:
            P.show()

    if __name__ == '__main__':
        # 2 minutes (120 sec) of sound @ 8 kHz, 256-sample windows, 50 % overlap
        bench_imshow(8000, 256, 128, 120, show = False)

Now, I have a problem, because I don't know how to benchmark when show is True (I have to close the figure manually). If I run the above script with time, I get 1.5 seconds with show = False (after several trials, to be sure the matplotlib files are in the system cache: this matters because my home dir is on NFS). If I set show = True and close the figure by hand once it is plotted, I get 4.5 sec instead.

If I run the above script with -dAgg --verbose-helpful (I was looking for this one to check that numerix is correctly set to numpy :) ), with show = False:

    matplotlib data path /home/david/local/lib/python2.4/site-packages/matplotlib/mpl-data
    $HOME=/home/david
    CONFIGDIR=/home/david/.matplotlib
    loaded rc file /home/david/.matplotlib/matplotlibrc
    matplotlib version 0.87.7
    verbose.level helpful
    interactive is False
    platform is linux2
    numerix numpy 1.0.2.dev3484
    font search path ['/home/david/local/lib/python2.4/site-packages/matplotlib/mpl-data']
    loaded ttfcache file /home/david/.matplotlib/ttffont.cache
    backend Agg version v2.2

    real 0m1.185s
    user 0m0.808s
    sys 0m0.224s

with show = True (same startup output):

    real 0m1.193s
    user 0m0.848s
    sys 0m0.192s

So the problem is in the rendering, right? (I'm not sure I understand exactly what the Agg backend is doing.)

Now, using hotshot (kcachegrind profiles attached to the email), for the noshow case:

    ncalls tottime percall cumtime percall filename:lineno(function)
    1 0.001 0.001 0.839 0.839 slowmatplotlib.py:181(bench_imshow_noshow)
    1 0.000 0.000 0.837 0.837 slowmatplotlib.py:163(bench_imshow)
    1 0.000 0.000 0.586 0.586 /home/david/local/lib/python2.4/site-packages/matplotlib/pylab.py:1894(imshow)
    3 0.000 0.000 0.510 0.170 /home/david/local/lib/python2.4/site-packages/matplotlib/pylab.py:883(gca)
    1 0.000 0.000 0.509 0.509 /home/david/local/lib/python2.4/site-packages/matplotlib/pylab.py:950(ishold)
    4 0.000 0.000 0.409 0.102 /home/david/local/lib/python2.4/site-packages/matplotlib/pylab.py:903(gcf)
    1 0.000 0.000 0.409 0.409 /home/david/local/lib/python2.4/site-packages/matplotlib/pylab.py:818(figure)
    1 0.000 0.000 0.408 0.408 /home/david/local/lib/python2.4/site-packages/matplotlib/backends/backend_gtkagg.py:36(new_figure_manager)
    1 0.003 0.003 0.400 0.400 /home/david/local/lib/python2.4/site-packages/matplotlib/backends/backend_gtk.py:401(__init__)
    1 0.000 0.000 0.397 0.397 /home/david/local/lib/python2.4/site-packages/matplotlib/backends/backend_gtkagg.py:25(_get_toolbar)
    1 0.001 0.001 0.397 0.397 /home/david/local/lib/python2.4/site-packages/matplotlib/backends/backend_gtk.py:496(__init__)
    1 0.000 0.000 0.396 0.396 /home/david/local/lib/python2.4/site-packages/matplotlib/backend_bases.py:1112(__init__)
    1 0.000 0.000 0.396 0.396 /home/david/local/lib/python2.4/site-packages/matplotlib/backends/backend_gtk.py:557(_init_toolbar)
    1 0.008 0.008 0.396 0.396 /home/david/local/lib/python2.4/site-packages/matplotlib/backends/backend_gtk.py:595(_init_toolbar2_4)
    1 0.388 0.388 0.388 0.388 /home/david/local/lib/python2.4/site-packages/matplotlib/backends/backend_gtk.py:967(__init__)
    1 0.251 0.251 0.251 0.251 slowmatplotlib.py:155(generate_data_2d)
    3 0.000 0.000 0.101 0.034 /home/david/local/lib/python2.4/site-packages/matplotlib/figure.py:629(gca)
    1 0.000 0.000 0.101 0.101 /home/david/local/lib/python2.4/site-packages/matplotlib/figure.py:449(add_subplot)
    1 0.000 0.000 0.100 0.100 /home/david/local/lib/python2.4/site-packages/matplotlib/axes.py:4523(__init__)
    1 0.000 0.000 0.100 0.100 /home/david/local/lib/python2.4/site-packages/matplotlib/axes.py:337(__init__)

But the show case is more interesting:

    ncalls tottime percall cumtime percall filename:lineno(function)
    1 0.002 0.002 3.886 3.886 slowmatplotlib.py:177(bench_imshow_show)
    1 0.000 0.000 3.884 3.884 slowmatplotlib.py:163(bench_imshow)
    1 0.698 0.698 3.003 3.003 /home/david/local/lib/python2.4/site-packages/matplotlib/backends/backend_gtk.py:70(show)
    2 0.000 0.000 2.266 1.133 /home/david/local/lib/python2.4/site-packages/matplotlib/backends/backend_gtk.py:275(expose_event)
    1 0.009 0.009 2.266 2.266 /home/david/local/lib/python2.4/site-packages/matplotlib/backends/backend_gtkagg.py:71(_render_figure)
    1 0.000 0.000 2.256 2.256 /home/david/local/lib/python2.4/site-packages/matplotlib/backends/backend_agg.py:385(draw)
    1 0.000 0.000 2.253 2.253 /home/david/local/lib/python2.4/site-packages/matplotlib/figure.py:510(draw)
    1 0.000 0.000 2.251 2.251 /home/david/local/lib/python2.4/site-packages/matplotlib/axes.py:994(draw)
    1 0.005 0.005 1.951 1.951 /home/david/local/lib/python2.4/site-packages/matplotlib/image.py:173(draw)
    1 0.096 0.096 1.946 1.946 /home/david/local/lib/python2.4/site-packages/matplotlib/image.py:109(make_image)
    1 0.002 0.002 1.850 1.850 /home/david/local/lib/python2.4/site-packages/matplotlib/cm.py:50(to_rgba)
    1 0.001 0.001 0.949 0.949 /home/david/local/lib/python2.4/site-packages/matplotlib/colors.py:735(__call__)
    1 0.097 0.097 0.899 0.899 /home/david/local/lib/python2.4/site-packages/matplotlib/colors.py:568(__call__)
    325 0.050 0.000 0.671 0.002 /home/david/local/lib/python2.4/site-packages/numpy/core/ma.py:533(__init__)
    1 0.600 0.600 0.600 0.600 /home/david/local/lib/python2.4/site-packages/numpy/core/fromnumeric.py:282(resize)
    1 0.000 0.000 0.596 0.596 /home/david/local/lib/python2.4/site-packages/matplotlib/pylab.py:1894(imshow)
    10 0.570 0.057 0.570 0.057 /home/david/local/lib/python2.4/site-packages/numpy/oldnumeric/functions.py:117(where)
    3 0.000 0.000 0.513 0.171 /home/david/local/lib/python2.4/site-packages/matplotlib/pylab.py:883(gca)
    1 0.000 0.000 0.513 0.513 /home/david/local/lib/python2.4/site-packages/matplotlib/pylab.py:950(ishold)
    4 0.000 0.000 0.408 0.102 /home/david/local/lib/python2.4/site-packages/matplotlib/pylab.py:903(gcf)

For more details, see the .kc files, which are in the tbz2 archive along with the script for generating profiles for kcachegrind. I will post another email for the other problem (with several subplots). cheers, David |
From: David C. <da...@ar...> - 2006-12-13 08:37:36
|
David Cournapeau wrote:
> But the show case is more interesting:
>
> ncalls tottime percall cumtime percall filename:lineno(function)
> 1 0.002 0.002 3.886 3.886 slowmatplotlib.py:177(bench_imshow_show)
> 1 0.000 0.000 3.884 3.884 slowmatplotlib.py:163(bench_imshow)
> 1 0.698 0.698 3.003 3.003 /home/david/local/lib/python2.4/site-packages/matplotlib/backends/backend_gtk.py:70(show)
> 2 0.000 0.000 2.266 1.133 /home/david/local/lib/python2.4/site-packages/matplotlib/backends/backend_gtk.py:275(expose_event)
> 1 0.009 0.009 2.266 2.266 /home/david/local/lib/python2.4/site-packages/matplotlib/backends/backend_gtkagg.py:71(_render_figure)
> 1 0.000 0.000 2.256 2.256 /home/david/local/lib/python2.4/site-packages/matplotlib/backends/backend_agg.py:385(draw)
> 1 0.000 0.000 2.253 2.253 /home/david/local/lib/python2.4/site-packages/matplotlib/figure.py:510(draw)
> 1 0.000 0.000 2.251 2.251 /home/david/local/lib/python2.4/site-packages/matplotlib/axes.py:994(draw)
> 1 0.005 0.005 1.951 1.951 /home/david/local/lib/python2.4/site-packages/matplotlib/image.py:173(draw)
> 1 0.096 0.096 1.946 1.946 /home/david/local/lib/python2.4/site-packages/matplotlib/image.py:109(make_image)
> 1 0.002 0.002 1.850 1.850 /home/david/local/lib/python2.4/site-packages/matplotlib/cm.py:50(to_rgba)
> 1 0.001 0.001 0.949 0.949 /home/david/local/lib/python2.4/site-packages/matplotlib/colors.py:735(__call__)
> 1 0.097 0.097 0.899 0.899 /home/david/local/lib/python2.4/site-packages/matplotlib/colors.py:568(__call__)
> 325 0.050 0.000 0.671 0.002 /home/david/local/lib/python2.4/site-packages/numpy/core/ma.py:533(__init__)
> 1 0.600 0.600 0.600 0.600 /home/david/local/lib/python2.4/site-packages/numpy/core/fromnumeric.py:282(resize)
> 1 0.000 0.000 0.596 0.596 /home/david/local/lib/python2.4/site-packages/matplotlib/pylab.py:1894(imshow)
> 10 0.570 0.057 0.570 0.057 /home/david/local/lib/python2.4/site-packages/numpy/oldnumeric/functions.py:117(where)
> 3 0.000 0.000 0.513 0.171 /home/david/local/lib/python2.4/site-packages/matplotlib/pylab.py:883(gca)
> 1 0.000 0.000 0.513 0.513 /home/david/local/lib/python2.4/site-packages/matplotlib/pylab.py:950(ishold)
> 4 0.000 0.000 0.408 0.102 /home/david/local/lib/python2.4/site-packages/matplotlib/pylab.py:903(gcf)
>
> For more details, see the .kc files which are in the tbz2 archive,
> with the script for generating profiles for kcachegrind,

Here is some of what I tried:

- First, we can see that in expose_event (one call is expensive, the other negligible, from my understanding), two calls are pretty expensive: the __call__ at line 735 (the normalize functor) and the __call__ at line 568 (the colormap functor).
- For the normalize functor, one line is expensive: val = ma.array(clip(val.filled(vmax), vmin, vmax), mask=mask). If I add a test on mask for the case where mask is None (which it is in my case), the function becomes negligible.
- For the colormap functor, the 3 where calls are expensive. I am not sure I understand in which cases they are useful; if I understand correctly, one tries to avoid values out of the range (0, N) and forces out-of-range values to be clipped. Isn't there an easier way than using where?

If I remove the where calls in the colormap functor, I get a 4x speed increase for the to_rgba function. After that, it becomes a bit trickier to change things for someone like me who has no knowledge of matplotlib internals. Cheers, David |
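The 4x observation about the where calls can be illustrated in isolation: each where() call allocates a full new array, while in-place boolean assignment touches only the offending entries. A sketch with current numpy (plain clipping of lookup indices, not the full colormap logic; the sizes are assumptions matching the thread's 8000x256 case):

```python
import numpy as np

N = 256
# Scaled values, many of them outside [0, N-1]
xa = (np.random.randn(8000, 256) * 200).astype(int)

# where(): builds a complete new array on every call...
out1 = np.where(xa > N - 1, N - 1, xa)
out1 = np.where(out1 < 0, 0, out1)

# ...while in-place boolean assignment only writes the out-of-range entries.
out2 = xa.copy()
out2[out2 > N - 1] = N - 1
out2[out2 < 0] = 0
```

Both produce identical results; timing them (e.g. with timeit) shows the in-place version doing far less allocation and copying for mostly in-range data.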
From: Eric F. <ef...@ha...> - 2006-12-13 18:30:01
|
David, > - first, we can see that in expose_event (one is expensive, the other > negligeable, from my understanding), two calls are pretty expensive: > the __call__ at line 735 (for normalize functor) and one for __call__ > at line 568 (for colormap functor). > - for normalize functor, one line is expensive: val = > ma.array(clip(val.filled(vmax), vmin, vmax), mask=mask). If I put a test > on mask when mask is None (which it is in my case), then the function > becomes negligeable. > - for colormap functor, the 3 where calls are expensive. I am not > sure to understand in which case they are useful; if I understand > correctly, one tries to avoid > values out of range (0, N), and force out of range values to be clipped. > Isn't there an easier way than using where ? > > If I remove the where in the colormap functor, I have a 4x speed > increase for the to_rgba function. After that, it becomes a bit more > tricky to change things for someone like me who have no knowledge about > matplotlib internals. The things you have identified were added by me to support masked array bad values and special colors for regions above or below the mapped range of values. I will be happy to make changes to speed them up. Regarding the clip line, I think that your test for mask is None is not the right solution because it knocks out the clipping operation, but the clipping is intended regardless of the state of the mask. I had expected it to be a very fast operation, so I am surprised it is a bottleneck; in any case I can take a look to see how it can be sped up, or whether it can be bypassed in some cases. Maybe it is also using "where" internally. Now I recall very recent discussion explaining why "where" is slow compared to indexing with a boolean, so I know I can speed it up with numpy. Unfortunately Numeric does not support this, so maybe what will be needed is numerix functions that take advantage of numpy when available. 
This is one of those times when I really wish we could drop Numeric and numarray support *now* and start taking full advantage of numpy. In any case, thanks for pointing out the slowdowns--I will fix them as best I can--and keep at it. I share your interest in speeding up interactive use of matplotlib, along with fixing bugs, filling holes in functionality, and smoothing rough edges. There is a lot to be done. As John noted, though, there will always be tradeoffs among flexibility, code simplicity, generality, and speed. Eric |
From: David C. <da...@ar...> - 2006-12-14 03:09:35
|
Eric Firing wrote:
> Regarding the clip line, I think that your test for mask is None is
> not the right solution because it knocks out the clipping operation,
> but the clipping is intended regardless of the state of the mask. I
> had expected it to be a very fast operation, so I am surprised it is a
> bottleneck; in any case I can take a look to see how it can be sped
> up, or whether it can be bypassed in some cases. Maybe it is also
> using "where" internally.

(again, sorry for the double posting, I always forget that some MLs do not reply automatically to the ML)

My wording was vague at best :) The clipping operation is *not* removed, and it was not the culprit (it becomes a bottleneck once you get the 4x speedup, though). What I did was:

    if self.clip:
        mask = ma.getmaskorNone(val)
        if mask == None:
            val = ma.array(clip(val.filled(vmax), vmin, vmax))
        else:
            val = ma.array(clip(val.filled(vmax), vmin, vmax),
                           mask=mask)

Actually, the problem is in ma.array: with a mask value of None, it should not make a difference between mask = None and no mask argument, right? I didn't change ma.array to keep my change as local as possible. Changing only this operation as above gives a speedup from 1.8 s to ~1.0 s for to_rgba, which means calling show goes from ~2.2 s to ~1.4 s. I also changed

    result = (val-vmin)/float(vmax-vmin)

to

    invcache = 1.0 / (vmax - vmin)
    result = (val-vmin) * invcache

which gives a moderate speedup (around 100 ms for an 8000x256 point array; still in the 5-10 % range of the whole cost, though, and not likely to cause any hidden bug). Once you make both of those changes, the clip call is by far the most expensive operation in the normalize functor, but the functor is not really expensive anymore compared to the rest, so this is not where I looked next.

For the where calls in the Colormap functor, I was wondering if they are necessary in all cases: some of those calls seem redundant, and it may be possible to detect that before calling them. This should be both easier and faster, at least in this case, than having a fast where?

I understand that support for multiple array backends and for masked arrays has cost consequences. But it looks like it may be possible to speed things up for cases where an array has only meaningful values/no mask. cheers, David |
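The ma.array behaviour in question can be seen in a small sketch with current numpy.ma, where the "no mask" sentinel is ma.nomask rather than None (the array size is an assumption; the clip pattern mirrors the normalize code quoted above):

```python
import numpy as np
import numpy.ma as ma

val = ma.asarray(np.random.randn(100, 100))  # no masked entries

# For an unmasked array, getmask() returns the ma.nomask sentinel, not None.
mask = ma.getmask(val)
print(mask is ma.nomask)  # -> True

# Passing that sentinel straight through keeps ma.array on its fast path,
# with no per-element mask handling.
clipped = ma.array(np.clip(val.filled(1.0), -1.0, 1.0), mask=mask)
```

The distinction between None and nomask is exactly what turns out to matter in the follow-up messages.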
From: Eric F. <ef...@ha...> - 2006-12-14 18:14:49
|
David, I have made some changes in svn that address all but one of the points you made:

[....]

> if self.clip:
>     mask = ma.getmaskorNone(val)
>     if mask == None:
>         val = ma.array(clip(val.filled(vmax), vmin, vmax))
>     else:
>         val = ma.array(clip(val.filled(vmax), vmin, vmax),
>                        mask=mask)

The real problem here is that I should not have been using getmaskorNone(). In numpy.ma, we need nomask, not None, so we want an ordinary getmask() call. ma.array(...., mask=ma.nomask) is very fast, so the problem goes away.

> Actually, the problem is in ma.array: with a value of mask to None, it
> should not make a difference between mask = None or no mask arg, right ?

But it does, because for numpy it needs to be nomask; it does something with None, but whatever it is, it is very slow.

> I didn't change ma.array to keep my change as local as possible. To
> change only this operation as above gives a speed up from 1.8 s to ~ 1.0
> s for to_rgba, which means calling show goes from ~ 2.2 s to ~1.4 s. I
> also changed
>
>     result = (val-vmin)/float(vmax-vmin)
>
> to
>
>     invcache = 1.0 / (vmax - vmin)
>     result = (val-vmin) * invcache

This is the one I did not address. I don't understand how this could be making much difference, and some testing using ipython and %prun with 1-line operations showed little difference with variations on this theme. The fastest would appear to be (and logically should be, I think) result = (val-vmin)*(1.0/(vmax-vmin)), but I don't think it makes much difference--it looks to me like maybe 10-20 msec, not 100, on my Pentium M 1.6 GHz. Maybe still worthwhile, so I may yet make the change after more careful testing.

> which gives a moderate speed up (around 100 ms for a 8000x256 points
> array). Once you make both those changes, the clip call is by far the
> most expensive operation in normalize functor, but the functor is not
> really expensive anymore compared to the rest, so this is not where I
> looked at.
>
> For the where calls in Colormap functor, I was wondering if they are
> necessary in all cases: some of those calls seem redundant, and it may
> be possible to detect that before calling them. This should be both
> easier and faster, at least in this case, than having a fast where ?

You hit the nail squarely: where() is the wrong function to use, and I have eliminated it from colors.py. The much faster replacement is putmask, which does as well as direct indexing with a Boolean but works with all three numerical packages. I think that using the fast putmask is better than trying to figure out special cases in which there would be nothing to put, although I could be convinced otherwise.

> I understand that support of multiple array backend, support of mask
> arrays have cost consequences. But it looks like it may be possible to
> speed things up for cases where an array has only meaningful values/no
> mask.

The big gains here were essentially bug fixes--picking the appropriate function (getmask versus getmaskorNone, and putmask versus where).
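The ordering point behind the putmask change (over-range before under-range) can be demonstrated with a toy sketch; the sentinel indices N and N+1 are assumptions standing in for the _i_over/_i_under lookup-table slots:

```python
import numpy as np

N = 256
i_under, i_over = N, N + 1        # hypothetical sentinel slots past the LUT

xa = np.array([-3, 0, 100, 255, 300, 999])

good = xa.copy()
np.putmask(good, good > N - 1, i_over)   # over-range first...
np.putmask(good, good < 0, i_under)      # ...then under-range

bad = xa.copy()
np.putmask(bad, bad < 0, i_under)        # wrong order: i_under == N
np.putmask(bad, bad > N - 1, i_over)     # ...is itself > N-1, so it is clobbered

print(good)   # [256   0 100 255 257 257]
print(bad)    # [257   0 100 255 257 257]
```

In the wrong order the under-range value -3 ends up with the over-range sentinel, which is exactly the bug the comment in the diff warns about.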
Here is the colors.py diff:

    --- trunk/matplotlib/lib/matplotlib/colors.py    2006/12/03 21:54:38    2906
    +++ trunk/matplotlib/lib/matplotlib/colors.py    2006/12/14 08:27:04    2923
    @@ -30,9 +30,9 @@
     """
     import re
    -from numerix import array, arange, take, put, Float, Int, where, \
    +from numerix import array, arange, take, put, Float, Int, putmask, \
         zeros, asarray, sort, searchsorted, sometrue, ravel, divide,\
    -    ones, typecode, typecodes, alltrue
    +    ones, typecode, typecodes, alltrue, clip
     from numerix.mlab import amin, amax
     import numerix.ma as ma
     import numerix as nx
    @@ -536,8 +536,9 @@
             lut[0] = y1[0]
             lut[-1] = y0[-1]
             # ensure that the lut is confined to values between 0 and 1 by clipping it
    -        lut = where(lut > 1., 1., lut)
    -        lut = where(lut < 0., 0., lut)
    +        clip(lut, 0.0, 1.0)
    +        #lut = where(lut > 1., 1., lut)
    +        #lut = where(lut < 0., 0., lut)
             return lut
    @@ -588,16 +589,16 @@
                 vtype = 'array'
                 xma = ma.asarray(X)
                 xa = xma.filled(0)
    -            mask_bad = ma.getmaskorNone(xma)
    +            mask_bad = ma.getmask(xma)
             if typecode(xa) in typecodes['Float']:
    -            xa = where(xa == 1.0, 0.9999999, xa) # Tweak so 1.0 is in range.
    +            putmask(xa, xa==1.0, 0.9999999) #Treat 1.0 as slightly less than 1.
             xa = (xa * self.N).astype(Int)
    -        mask_under = xa < 0
    -        mask_over = xa > self.N-1
    -        xa = where(mask_under, self._i_under, xa)
    -        xa = where(mask_over, self._i_over, xa)
    -        if mask_bad is not None: # and sometrue(mask_bad):
    -            xa = where(mask_bad, self._i_bad, xa)
    +        # Set the over-range indices before the under-range;
    +        # otherwise the under-range values get converted to over-range.
    +        putmask(xa, xa>self.N-1, self._i_over)
    +        putmask(xa, xa<0, self._i_under)
    +        if mask_bad is not None and mask_bad.shape == xa.shape:
    +            putmask(xa, mask_bad, self._i_bad)
             rgba = take(self._lut, xa)
             if vtype == 'scalar':
                 rgba = tuple(rgba[0,:])
    @@ -752,7 +753,7 @@
                 return 0.*value
             else:
                 if clip:
    -                mask = ma.getmaskorNone(val)
    +                mask = ma.getmask(val)
                     val = ma.array(nx.clip(val.filled(vmax), vmin, vmax),
                                    mask=mask)
                 result = (val-vmin)/float(vmax-vmin)
    @@ -804,7 +805,7 @@
                 return 0.*value
             else:
                 if clip:
    -                mask = ma.getmaskorNone(val)
    +                mask = ma.getmask(val)
                     val = ma.array(nx.clip(val.filled(vmax), vmin, vmax),
                                    mask=mask)
                 result = (ma.log(val)-nx.log(vmin))/(nx.log(vmax)-nx.log(vmin))

Eric |
From: Simson G. <si...@ac...> - 2006-12-15 03:37:07
|
Hi. I want to have just horizontal grid lines. Is there any way to do this? Thanks! |
From: David C. <da...@ar...> - 2006-12-18 05:41:57
|
Eric Firing wrote:
> David,
>
> I have made some changes in svn that address all but one of the points
> you made:
>
> [....]
>>     if self.clip:
>>         mask = ma.getmaskorNone(val)
>>         if mask == None:
>>             val = ma.array(clip(val.filled(vmax), vmin, vmax))
>>         else:
>>             val = ma.array(clip(val.filled(vmax), vmin, vmax),
>>                            mask=mask)
>
> The real problem here is that I should not have been using
> getmaskorNone(). In numpy.ma, we need nomask, not None, so we want an
> ordinary getmask() call. ma.array(...., mask=ma.nomask) is very fast,
> so the problem goes away.
>
>> Actually, the problem is in ma.array: with a value of mask to None,
>> it should not make a difference between mask = None or no mask arg,
>> right ?
>
> But it does, because for numpy it needs to be nomask; it does
> something with None, but whatever it is, it is very slow.
>
>> I didn't change ma.array to keep my change as local as possible. To
>> change only this operation as above gives a speed up from 1.8 s to
>> ~1.0 s for to_rgba, which means calling show goes from ~2.2 s to
>> ~1.4 s. I also changed
>>
>>     result = (val-vmin)/float(vmax-vmin)
>>
>> to
>>
>>     invcache = 1.0 / (vmax - vmin)
>>     result = (val-vmin) * invcache
>
> This is the one I did not address. I don't understand how this could
> be making much difference, and some testing using ipython and %prun
> with 1-line operations showed little difference with variations on
> this theme. The fastest would appear to be (and logically should be,
> I think) result = (val-vmin)*(1.0/(vmax-vmin)), but I don't think it
> makes much difference--it looks to me like maybe 10-20 msec, not 100,
> on my Pentium M 1.6 Ghz. Maybe still worthwhile, so I may yet make
> the change after more careful testing.
>
>> which gives a moderate speed up (around 100 ms for a 8000x256 points
>> array). Once you make both those changes, the clip call is by far the
>> most expensive operation in the normalize functor, but the functor is
>> not really expensive anymore compared to the rest, so this is not
>> where I looked at.
>>
>> For the where calls in the Colormap functor, I was wondering if they
>> are necessary in all cases: some of those calls seem redundant, and
>> it may be possible to detect that before calling them. This should be
>> both easier and faster, at least in this case, than having a fast
>> where ?
>
> You hit the nail squarely: where() is the wrong function to use, and I
> have eliminated it from colors.py. The much faster replacement is
> putmask, which does as well as direct indexing with a Boolean but
> works with all three numerical packages. I think that using the fast
> putmask is better than trying to figure out special cases in which
> there would be nothing to put, although I could be convinced
> otherwise.
>
>> I understand that support of multiple array backends and support of
>> masked arrays have cost consequences. But it looks like it may be
>> possible to speed things up for cases where an array has only
>> meaningful values/no mask.
>
> The big gains here were essentially bug fixes--picking the appropriate
> function (getmask versus getmaskorNone and putmask versus where).

Ok, I've installed the latest svn, and now there is still one function
which is much slower than a direct numpy implementation, so I would like
to know whether this is inherent to the multiple-backend nature of
matplotlib or not. The Normalize functor uses the clip function, and a
direct numpy version would be 3 times faster (giving the show call a
20 % speedup in my really limited benchmarks):

    if clip:
        mask = ma.getmask(val)
        #val = ma.array(nx.clip(val.filled(vmax), vmin, vmax),
        #               mask=mask)
        def myclip(a, m, M):
            a[a<m] = m
            a[a>M] = M
            return a
        val = ma.array(myclip(val.filled(vmax), vmin, vmax), mask=mask)

I am a bit lost in the matplotlib code trying to see where clip is
implemented (is it in numerix, and as such using the numpy clip
function?). Still, I must confess that all this looks quite good,
because it was possible to speed things up quite considerably without
too much effort,

cheers,

David |
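David's comparison can be reproduced standalone roughly like this (a sketch; the 8000x256 size matches his benchmark, but absolute timings depend heavily on numpy version and hardware, and on modern numpy the built-in clip is typically competitive or faster):

```python
import timeit
import numpy as np

def myclip(a, m, M):
    # In-place clipping via boolean-mask assignment, as in the snippet above.
    a[a < m] = m
    a[a > M] = M
    return a

a = np.random.randn(8000, 256)

# Sanity check: both versions must agree before timing them.
assert np.array_equal(np.clip(a, -1.0, 1.0), myclip(a.copy(), -1.0, 1.0))

# a.copy() keeps the source array intact (myclip mutates its argument),
# so the in-place version's timing includes one copy per call.
t_np = timeit.timeit(lambda: np.clip(a, -1.0, 1.0), number=10) / 10
t_my = timeit.timeit(lambda: myclip(a.copy(), -1.0, 1.0), number=10) / 10
print("np.clip: %.1f ms   myclip: %.1f ms" % (t_np * 1e3, t_my * 1e3))
```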
From: Eric F. <ef...@ha...> - 2006-12-18 06:53:40
|
David Cournapeau wrote:
[...]
> Ok, I've installed last svn, and now, there is still one function
> which is much slower than a direct numpy implementation, so I would
> like to know if this is inherent to the multiple backend nature of
> matplotlib or not. The functor Normalize uses the clip function, and
> a direct numpy would be 3 times faster (giving the show call a 20 %
> speed in my really limited benchmarks):
>
>     if clip:
>         mask = ma.getmask(val)
>         #val = ma.array(nx.clip(val.filled(vmax), vmin, vmax),
>         #               mask=mask)
>         def myclip(a, m, M):
>             a[a<m] = m
>             a[a>M] = M
>             return a
>         val = ma.array(myclip(val.filled(vmax), vmin, vmax), mask=mask)
>
> I am a bit lost in the matplotlib code to see where clip is
> implemented (is it in numerix and as such using the numpy function
> clip ?).

There is a clip function in all three numeric packages, so a native clip
is being used.

If numpy.clip is actually slower than your version, that sounds like a
problem with the implementation in numpy. By all logic a single clip
function should either be the same (if it is implemented like yours) or
faster (if it is a single loop in C code, as I would expect). This
warrants a little more investigation before changing the mpl code. The
best thing would be if you could make a simple standalone numpy test
case profiling both versions and post the results as a question to the
numpy-discussion list. Many such questions in the past have resulted in
big speedups in numpy.

One more thought: it is possible that the difference is because myclip
operates on the array in place while clip generates a new array. If
this is the cause of the difference, then changing your last line to
"return a.copy()" probably would slow it down to the numpy clip speed or
slower.

Eric |
From: David C. <da...@ar...> - 2006-12-18 07:10:15
|
Eric Firing wrote:
>
> There is a clip function in all three numeric packages, so a native
> clip is being used.
>
> If numpy.clip is actually slower than your version, that sounds like a
> problem with the implementation in numpy. By all logic a single clip
> function should either be the same (if it is implemented like yours)
> or faster (if it is a single loop in C-code, as I would expect). This
> warrants a little more investigation before changing the mpl code.
> The best thing would be if you could make a simple standalone numpy
> test case profiling both versions and post the results as a question
> to the numpy-discussion list. Many such questions in the past have
> resulted in big speedups in numpy.

I am much more familiar with internal numpy code than matplotlib's, so
this is much easier for me, too :)

> One more thought: it is possible that the difference is because myclip
> operates on the array in place while clip generates a new array. If
> this is the cause of the difference then changing your last line to
> "return a.copy()" probably would slow it down to the numpy clip speed
> or slower.

It would be scary if a copy of an 8008x256 array of doubles took
100 ms... Fortunately it does not, so this does not seem to be the
problem.

cheers,

David |
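David's back-of-the-envelope point is easy to check: an 8008x256 float64 array is about 16 MB, and copying it is essentially a memcpy, far below 100 ms on any recent machine (a quick sketch; the exact figure depends on memory bandwidth):

```python
import timeit
import numpy as np

a = np.zeros((8008, 256))  # ~16 MB of float64

# Average the cost of ndarray.copy() over many calls.
t = timeit.timeit(a.copy, number=100) / 100
print("copy per call: %.2f ms" % (t * 1e3))
```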
From: John H. <jdh...@ac...> - 2006-12-19 14:59:39
|
>>>>> "David" == David Cournapeau <da...@ar...> writes:

    David> In make_image, most of the time is taken into to_rgba:
    David> almost half of it is taken in by the take call in the
    David> Colormap.__call__. Almost 200 ms to get colors from the
    David> indexes seems quite a lot (this means 280 cycles / pixel on
    David> average !). I can reproduce this number by using a small
    David> numpy test.

    David> On my laptop (pentium M, 1.2 Ghz), make_image takes almost
    David> 85 % of the time, which seems to imply that this is where
    David> one should focus if one wants to improve the speed,

This may have been lost in the longer thread above, but what
interpolation are you using? You may see a good performance boost by
using interpolation='nearest'. Also, with your clip changes and with
Eric's changes, is it still painfully slow for you -- how much have
these changes helped?

Of the time spent in make_image, how much is _image.fromarray,
ScalarMappable.to_rgba and _image.resize? |
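John's suggestion, in today's pyplot API, looks like the following (a sketch using the off-screen Agg backend so it runs without a GUI; the speedup applies to the resampling step and will vary by image size and backend):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # off-screen backend; no GUI needed
import matplotlib.pyplot as plt

z = np.random.rand(512, 512)
fig, ax = plt.subplots()
# 'nearest' skips the interpolation filter and just replicates pixels,
# typically the cheapest way to draw a large array.
im = ax.imshow(z, interpolation="nearest")
fig.canvas.draw()
```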
From: David C. <da...@ar...> - 2006-12-20 08:09:20
|
John Hunter wrote:
>
>     David> In make_image, most of the time is taken into to_rgba:
>     David> almost half of it is taken in by the take call in the
>     David> Colormap.__call__. Almost 200 ms to get colors from the
>     David> indexes seems quite a lot (this means 280 cycles / pixel
>     David> on average !). I can reproduce this number by using a
>     David> small numpy test.
>
>     David> On my laptop (pentium M, 1.2 Ghz), make_image takes almost
>     David> 85 % of the time, which seems to imply that this is where
>     David> one should focus if one wants to improve the speed,
>
> This may have been lost in the longer thread above,

I am a bit lost myself between the numpy and mpl mailing lists, sorry
for the inconvenience.

> but what interpolation are you using? You may see a good performance
> boost by using interpolation='nearest'.

At what point is interpolation used?

> Also, with your clip changes and with Eric's changes is it still
> painfully slow for you

Painfully is a strong word :) It is still 10 to 15 times slower than
matlab on the same computer: the show call takes around 800 ms instead
of 70 ms with matlab, and matlab's image is actually equivalent to the
imshow + show calls. Matlab, having only one toolkit, obviously has an
advantage, but I don't think the problem is on the GUI side anyway.

> -- how much have these changes helped?

With the original profiling, it took a bit more than 2100 ms for a show
call after an imshow call for an 8000x256 array, according to a saved
kcachegrind profile. Now it is around 800 ms, which is already much
better, and with minimal changes (eg without using a special fast path
more prone to bugs). I estimate that squeezing it to a bit less than
500 ms should be easily possible by improving things on the numpy side
(clip, float-to-int conversion and the take function), which has the
nice effect of improving mpl without touching one line of it, and
improving numpy at the same time :) The last 500 ms would be much more
difficult to squeeze out: half of it is used to 'launch' the figure
anyway. And below a few hundred ms, it becomes unnoticeable in
interactive use (whereas the change from 2.1 s to 0.8 s is; on my
laptop it is even more noticeable, because its CPU is kind of slow).

David |
From: Christopher B. <Chr...@no...> - 2006-12-13 19:54:41
|
Eric Firing wrote:
> Regarding the clip line, I think that your test for mask is None is
> not the right solution because it knocks out the clipping operation,
> but the clipping is intended regardless of the state of the mask. I
> had expected it to be a very fast operation,

for what it's worth, a few years ago I wrote a "fast_clip" C extension
that did clip without making nearly as many temporary arrays as the
Numeric one -- I don't know what numpy does, I haven't needed a fast
clip recently. I'd be glad to send the code to anyone interested.

> Now I recall very recent discussion explaining why "where" is slow
> compared to indexing with a boolean, so I know I can speed it up with
> numpy. Unfortunately Numeric does not support this, so maybe what
> will be needed is numerix functions that take advantage of numpy when
> available.

good idea.

> This is one of those times when I really wish we could drop Numeric
> and numarray support *now* and start taking full advantage of numpy.

I'd love that too. Maybe your proposal is a good one, though -- make
numerix functions that are optimized for numpy. I think that's a good
way to transition.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959  voice
7600 Sand Point Way NE   (206) 526-6329  fax
Seattle, WA 98115        (206) 526-6317  main reception

Chr...@no... |
From: Eric F. <ef...@ha...> - 2006-12-15 07:50:03
|
Simson Garfinkel wrote:
> Hi. I want to have just horizontal grid lines. Is there any way to do
> this? Thanks!

gca().yaxis.grid(True)
gca().xaxis.grid(False)

Here is the grid method docstring:

    def grid(self, b=None, which='major', **kwargs):
        """
        Set the axis grid on or off; b is a boolean. Use which =
        'major' | 'minor' to set the grid for major or minor ticks.

        If b is None and len(kwargs)==0, toggle the grid state. If
        kwargs are supplied, it is assumed you want the grid on and b
        will be set to True.

        kwargs are used to set the line properties of the grids, eg,

            xax.grid(color='r', linestyle='-', linewidth=2)
        """

Eric |
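A complete, runnable version of Eric's two lines might look like this (a sketch with made-up data, using the object-oriented API and the off-screen Agg backend instead of gca()):

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend; no GUI needed
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2, 3], [0, 1, 4, 9])
# Horizontal grid lines come from the y-axis ticks;
# keep the x-axis (vertical) grid off.
ax.yaxis.grid(True)
ax.xaxis.grid(False)
fig.savefig("hgrid.png")
```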
From: Simson G. <si...@ac...> - 2006-12-15 23:56:18
|
Looks like I need to read *all* of the docstrings. I wish there were an
easy way to search them....

On Dec 15, 2006, at 2:49 AM, Eric Firing wrote:

> Simson Garfinkel wrote:
>> Hi. I want to have just horizontal grid lines. Is there any way to
>> do this? Thanks!
>
> gca().yaxis.grid(True)
> gca().xaxis.grid(False)
>
> Here is the grid method docstring:
>
>     def grid(self, b=None, which='major', **kwargs):
>         """
>         Set the axis grid on or off; b is a boolean. Use which =
>         'major' | 'minor' to set the grid for major or minor ticks.
>
>         If b is None and len(kwargs)==0, toggle the grid state. If
>         kwargs are supplied, it is assumed you want the grid on and b
>         will be set to True.
>
>         kwargs are used to set the line properties of the grids, eg,
>
>             xax.grid(color='r', linestyle='-', linewidth=2)
>
> Eric |
From: David C. <da...@ar...> - 2006-12-19 07:14:13
|
David Cournapeau wrote:
> Eric Firing wrote:
>> There is a clip function in all three numeric packages, so a native
>> clip is being used.
>>
>> If numpy.clip is actually slower than your version, that sounds like
>> a problem with the implementation in numpy. By all logic a single
>> clip function should either be the same (if it is implemented like
>> yours) or faster (if it is a single loop in C-code, as I would
>> expect). This warrants a little more investigation before changing
>> the mpl code. The best thing would be if you could make a simple
>> standalone numpy test case profiling both versions and post the
>> results as a question to the numpy-discussion list. Many such
>> questions in the past have resulted in big speedups in numpy.
>
> I am much more familiar with internal numpy code than matplotlib's, so
> this is much easier for me, too :)
>
>> One more thought: it is possible that the difference is because
>> myclip operates on the array in place while clip generates a new
>> array. If this is the cause of the difference then changing your
>> last line to "return a.copy()" probably would slow it down to the
>> numpy clip speed or slower.
>
> It would be scary if a copy of an 8008x256 array of doubles took
> 100 ms... Fortunately, it does not, this does not seem to be the
> problem.
>
> cheers,
>
> David

Ok, so now, with my clip function, still for an 8000x256 double array:
we have show() after imshow taking around 760 ms. 3/5 of that is in
make_image, 2/5 in the function blop, which is just an alias I put in to
measure the difference between axes.py:1043(draw) and
image.py:173(draw) in the function Axes.draw (file axes.py):

    def blop(dsu):
        for zorder, i, a in dsu:
            a.draw(renderer)
    blop(dsu)

In make_image, most of the time is taken in to_rgba: almost half of it
is taken by the take call in Colormap.__call__. Almost 200 ms to get
colors from the indexes seems quite a lot (this means 280 cycles/pixel
on average!). I can reproduce this number by using a small numpy test.

On my laptop (Pentium M, 1.2 GHz), make_image takes almost 85 % of the
time, which seems to imply that this is where one should focus if one
wants to improve the speed,

cheers,

David |
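The take measurement is easy to reproduce outside mpl (a sketch; the 256-entry RGBA table and 8000x256 index count mirror the sizes discussed in the thread, and a modern machine should come in well under the 200 ms reported here):

```python
import timeit
import numpy as np

# A colormap lookup table: 256 RGBA rows, plus three extra rows
# standing in for the under/over/bad sentinel entries.
lut = np.random.rand(256 + 3, 4)
# One lookup index per pixel of an 8000x256 image.
xa = np.random.randint(0, 256, size=8000 * 256)

rgba = np.take(lut, xa, axis=0)  # the operation being profiled
t = timeit.timeit(lambda: np.take(lut, xa, axis=0), number=5) / 5
print("take: %.1f ms per call, output shape %s" % (t * 1e3, rgba.shape))
```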