Thread: [pure-lang-users] Compilation times

pure-lang-users

[pure-lang-users] Compilation times

From: Albert G. <Dr....@t-...> - 2008-07-07 20:42:41

Hi Jiri,

>> In return I promise to investigate why the dict and set modules take
>> so long to compile. :)
> 
> It would be nice, it is really slow.

Ok, I did some profiling now, and it seems that the lion's share (over
70%) of the time is spent within LLVM, only a small fraction (not
exceeding 15%) goes to the parser and all my own semantic routines.
There goes my theory that it might be related to my own environment
code... (Actually I would have preferred that, if it's my own code then
I can fix it. But I checked specifically those routines and they don't
even make a blip on the radar.)

I still have to analyze the profiles more thoroughly, there might be
some hidden inefficiencies in my semantic routines which make it spend a
lot of time in STL routines (which make up for the remaining 15%). But
in any case it looks like LLVM is to blame, which is a major bummer. :(
I would have suspected that it's the optimization passes, but disabling
these doesn't speed it up much either, so it seems that most of those
70% of the compilation time the darn thing just generates LLVM IR.

I guess I'll have to try LLVM 2.3 and see whether it's any faster.
There's still hope since they do have a new IR Builder class in 2.3...

Albert

-- 
Dr. Albert Gr"af
Dept. of Music-Informatics, University of Mainz, Germany
Email:  Dr....@t-..., ag...@mu...
WWW:    http://www.musikinformatik.uni-mainz.de/ag

[pure-lang-users] Patch to compile Pure with LLVM upstream

From: Rooslan S. K. <kh...@gm...> - 2008-07-07 21:37:46

Attachments: pure-for-llvm23.patch

Albert Graef wrote:

> I guess I'll have to try LLVM 2.3 and see whether it's any faster.
> There's still hope since they do have a new IR Builder class in 2.3...

Here is a patch (created by mostly mechanical substitution) to compile
Pure with LLVM trunk. I guess it should work with 2.3 as well. It passes
all tests on my Linux x86 box. I've never tried Pure with 2.2, so can't
really compare these performance wise. Compiling any of supplied library
modules or sources in examples/ folder takes less than 1 sec on a not so
modern hardware (single core AMD 3200+).

-- 
Best regards,
Rooslan

Re: [pure-lang-users] Patch to compile Pure with LLVM upstream

From: Albert G. <Dr....@t-...> - 2008-07-07 22:10:50

Hi Rooslan, welcome to the list!

Rooslan S. Khayrov wrote:
> Here is a patch (created by mostly mechanical substitution) to compile
> Pure with LLVM trunk. I guess it should work with 2.3 as well. It passes
> all tests on my Linux x86 box.

Thanks a bunch, that's heaven-sent! :) Will try it immediately.

> I've never tried Pure with 2.2, so can't
> really compare these performance wise. Compiling any of supplied library
> modules or sources in examples/ folder takes less than 1 sec on a not so
> modern hardware (single core AMD 3200+).

Did you run these with -i? Otherwise the interpreter is not really
compiling anything (it's all done lazily). To actually force the Pure
interpreter to generate the IR for a module when you run it in batch
mode and you're not actually computing anything, you can use a command
like the following:

pure -i set.pure </dev/null

This will force it to think it's in interactive mode, in which case it
compiles all pending definitions before it enters the read-eval-print loop.

Cheers,
Albert

-- 
Dr. Albert Gr"af
Dept. of Music-Informatics, University of Mainz, Germany
Email:  Dr....@t-..., ag...@mu...
WWW:    http://www.musikinformatik.uni-mainz.de/ag

Re: [pure-lang-users] Patch to compile Pure with LLVM upstream

From: Rooslan S. K. <kh...@gm...> - 2008-07-07 22:24:31

Albert Graef wrote:

> Did you run these with -i? Otherwise the interpreter is not really
> compiling anything (it's all done lazily).

Uh, no. (Unfortunately I don't have time to closely follow Pure
development, but Q loveliness made me to try out its successor :-) )

So, real compilation time for sets and dict is 3-4 secs, contrary to
1-1.5 for other modules.

-- 
Best regards,
Rooslan

Re: [pure-lang-users] Patch to compile Pure with LLVM upstream

From: Albert G. <Dr....@t-...> - 2008-07-07 22:57:17

Rooslan S. Khayrov wrote:
> Uh, no. (Unfortunately I don't have time to closely follow Pure
> development, but Q loveliness made me to try out its successor :-) )

Once you tried it there's no going back. ;-) No, seriously, if you give
it a whirl, let me know what you think.

> So, real compilation time for sets and dict is 3-4 secs, contrary to
> 1-1.5 for other modules.

So the new builder class isn't any faster, that's bad news. :(

Jiri, I think we might have to work around this for a while. I'll have a
look at those babies tomorrow and see how I can massage some of the
definitions so that the matching automata don't get that big.

Good night,
Albert

-- 
Dr. Albert Gr"af
Dept. of Music-Informatics, University of Mainz, Germany
Email:  Dr....@t-..., ag...@mu...
WWW:    http://www.musikinformatik.uni-mainz.de/ag

Re: [pure-lang-users] Patch to compile Pure with LLVM upstream

From: Jiri S. <jir...@bl...> - 2008-07-07 23:19:56

Albert Graef wrote:
> So the new builder class isn't any faster, that's bad news. :(
> 
> Jiri, I think we might have to work around this for a while. I'll have a
> look at those babies tomorrow and see how I can massage some of the
> definitions so that the matching automata don't get that big.
> 
Maybe we could try the original Bird/Wadler algo and look at the 
compilation/execution speed trade-off. The executions speed in Pure vs. 
Q is often inscrutable. Most operation with dict run some 5 times faster 
with Pure than with Q but the members operation runs about 10 times 
slower with Pure than with Q. Maybe the difference between that two 
implementations of AVLT will not be that big as with Q and the 
compilation will be surely much faster.

Jiri

Re: [pure-lang-users] Patch to compile Pure with LLVM upstream

From: Albert G. <Dr....@t-...> - 2008-07-07 23:54:11

Jiri Spitz wrote:
> Maybe we could try the original Bird/Wadler algo and look at the 
> compilation/execution speed trade-off.

Ok, will try that tomorrow.

> Most operation with dict run some 5 times faster 
> with Pure than with Q but the members operation runs about 10 times 
> slower with Pure than with Q.

10 times *slower*?? I don't have this over here.

> let d = dict $ zipwith (=>) (1..100000) (1..100000);
> stats
> #members d;
100000
0.86s

==> def D = dict $ zip [1..100000] [1..100000]

==> #members D; stats
100000
0.81 secs, 300002 reductions, 265536 cells

Still the Pure version seems awfully slow given that Pure is supposed to
run a lot *faster* than Q. Hmm, and it gets slower each time I run the
members function. Something's wrong there. I will have to see whether
the B/W implementation exhibits the same behaviour.

Albert

-- 
Dr. Albert Gr"af
Dept. of Music-Informatics, University of Mainz, Germany
Email:  Dr....@t-..., ag...@mu...
WWW:    http://www.musikinformatik.uni-mainz.de/ag

Re: [pure-lang-users] Patch to compile Pure with LLVM upstream

From: Jiří S. <jir...@bl...> - 2008-07-08 12:12:45

Albert Graef wrote:
> Added (examples/avltree.pure). It's a fairly simplistic implementation,
> but it compiles much faster than set.pure. OTOH, insert and delete take
> about twice as long compared to your implementation, which IIRC is
> consistent with the Q benchmarks that you did earlier. So I really
> wouldn't want to replace your version with it. ;-)
> 
Albert I'll have time to play with it in the evening. For now only one 
remark. The type annotations improve the execution speed quite a lot, so 
it is not upright to compare implementations without and with them.

Jiri

Re: [pure-lang-users] Patch to compile Pure with LLVM upstream

From: Albert G. <Dr....@t-...> - 2008-07-08 13:38:49

Jir(í Spitz wrote:
> Albert I'll have time to play with it in the evening. For now only one 
> remark. The type annotations improve the execution speed quite a lot, so 
> it is not upright to compare implementations without and with them.

All I can say that I added them to avltree.pure, and they didn't make
much of a difference in the examples I tried. The worst example I saw
was that deleting all 100000 (int) members from a tree took some 9%
longer than with the ::int tags.

-- 
Dr. Albert Gr"af
Dept. of Music-Informatics, University of Mainz, Germany
Email:  Dr....@t-..., ag...@mu...
WWW:    http://www.musikinformatik.uni-mainz.de/ag

Re: [pure-lang-users] Patch to compile Pure with LLVM upstream

From: Albert G. <Dr....@t-...> - 2008-07-08 11:49:48

Hi Jiri,

Albert Graef wrote:
> Jiri Spitz wrote:
>> Maybe we could try the original Bird/Wadler algo and look at the 
>> compilation/execution speed trade-off.
> 
> Ok, will try that tomorrow.

Added (examples/avltree.pure). It's a fairly simplistic implementation,
but it compiles much faster than set.pure. OTOH, insert and delete take
about twice as long compared to your implementation, which IIRC is
consistent with the Q benchmarks that you did earlier. So I really
wouldn't want to replace your version with it. ;-)

I found that adding special case rules to avltree.pure only yields
marginal improvements (<10%), so you might be able to remove them from
your implementation, too, without taking a major performance hit. That
alone will make your modules load much faster, and there's some other
refactoring that we can do (like checking the balance factor in a guard
instead of matching it on the lhs) that will further reduce the number
of overlapping rules.

> Still the Pure version seems awfully slow given that Pure is supposed to
> run a lot *faster* than Q.

Rubbish, who said that? Oops, that was me. :)

In fact, all operations consistently outperform the corresponding Q
definitions by a factor of 5-7, if they're just plain Q definitions. The
only cases where Q is able to keep up with Pure is when operations are
involved that are actually implemented in C, like creating enumerations
and # on lists. Another factor is that Q has a more compact internal
list representation which makes some of these operations even faster.

For a bytecode interpreter Q indeed performs quite well on this stuff
(compared to the recursive benchmark where it sucks big time), but
that's because these definitions don't *do* much beyond pattern matching
and tree construction. I already mentioned in another thread that
there's still room for improvements in Pure there, in particular
inlining the runtime calls which construct symbol and application nodes
will surely help.

> Hmm, and it gets slower each time I run the
> members function. Something's wrong there. I will have to see whether
> the B/W implementation exhibits the same behaviour.

It's the same with my implementation. Will have to investigate.

Cheers,
Albert

-- 
Dr. Albert Gr"af
Dept. of Music-Informatics, University of Mainz, Germany
Email:  Dr....@t-..., ag...@mu...
WWW:    http://www.musikinformatik.uni-mainz.de/ag

Re: [pure-lang-users] Patch to compile Pure with LLVM upstream

From: Jiri S. <jir...@bl...> - 2008-07-08 17:53:29

Albert Graef wrote:
> I found that adding special case rules to avltree.pure only yields
> marginal improvements (<10%), so you might be able to remove them from
> your implementation, too, without taking a major performance hit. That
> alone will make your modules load much faster, and there's some other
> refactoring that we can do (like checking the balance factor in a guard
> instead of matching it on the lhs) that will further reduce the number
> of overlapping rules.
> 
OK. Testing set is now in 'examples/set_test.pure'.

Removed type annotations don't hurt.

I changed pattern matches of constants to guards where possible. The 
result does not seem to compile any faster (at least on my PC) and the 
execution is substantially slower :-( .

Do you see any other possible rewrites of the code?

I looked at other similar implementation of AVL trees (I already 
mentioned it somewhere in the Q discussion). It doesn't use those 
'table' and 'table2' to calculate the new balances, there is some 
div/mod/&/| magic instead. The rest of the code is almost the same. This 
variant of the algorithm was only by about 30 % faster than Bird/Wadler 
in Q. I don't think it would compile much faster than the current one.

Jiri

Re: [pure-lang-users] Patch to compile Pure with LLVM upstream

From: Albert G. <Dr....@t-...> - 2008-07-11 03:04:57

Rooslan S. Khayrov wrote:
> Here is a patch (created by mostly mechanical substitution) to compile
> Pure with LLVM trunk. I guess it should work with 2.3 as well.

Ok, I committed this now, thanks a lot for the patch.

This means that starting with r434 Pure really needs LLVM 2.3 now. The
LLVM 2.3 version of Cyrille Berger's 64 bit -fPIC patch is available
under the following URL:
http://pure-lang.sf.net/X86JITInfo.cpp.pic.2.3.patch

Cheers,
Albert

-- 
Dr. Albert Gr"af
Dept. of Music-Informatics, University of Mainz, Germany
Email:  Dr....@t-..., ag...@mu...
WWW:    http://www.musikinformatik.uni-mainz.de/ag

Re: [pure-lang-users] Patch to compile Pure with LLVM upstream

From: Ryan S. <rya...@us...> - 2008-07-18 10:02:44

On Jul 10, 2008, at 22:05, Albert Graef wrote:

> Rooslan S. Khayrov wrote:
>
>> Here is a patch (created by mostly mechanical substitution) to  
>> compile
>> Pure with LLVM trunk. I guess it should work with 2.3 as well.
>
> Ok, I committed this now, thanks a lot for the patch.
>
> This means that starting with r434 Pure really needs LLVM 2.3 now.

Now that llvm in MacPorts is updated to 2.3 I tried updating pure- 
devel to r438 but half of the tests fail (with Mac OS X 10.4.11 and  
Xcode 2.5 on Intel Core 2 Duo):

Running tests.
prelude.pure: FAILED
test001.pure: passed
test002.pure: FAILED
test003.pure: FAILED
test004.pure: FAILED
test005.pure: passed
test006.pure: passed
test007.pure: passed
test008.pure: passed
test009.pure: passed
test010.pure: FAILED
test011.pure: FAILED
test012.pure: passed
test013.pure: passed
test014.pure: FAILED
test015.pure: FAILED

> The LLVM 2.3 version of Cyrille Berger's 64 bit -fPIC patch is
> available under the following URL:
> http://pure-lang.sf.net/X86JITInfo.cpp.pic.2.3.patch

Do we need to add this patch to the llvm port?

Re: [pure-lang-users] Patch to compile Pure with LLVM upstream

From: Albert G. <Dr....@t-...> - 2008-07-18 18:29:43

Ryan Schmidt wrote:
> Now that llvm in MacPorts is updated to 2.3 I tried updating pure- 
> devel to r438 but half of the tests fail (with Mac OS X 10.4.11 and  
> Xcode 2.5 on Intel Core 2 Duo):

Latest svn? Can you please do a 'make cleanlogs logs' and send me your 
logs please (just zip up the test directory)? Afterwards you can run 
another make cleanlogs and svn up to get the golden logs again.

>> The LLVM 2.3 version of Cyrille Berger's 64 bit -fPIC patch is
>> available under the following URL:
>> http://pure-lang.sf.net/X86JITInfo.cpp.pic.2.3.patch
> 
> Do we need to add this patch to the llvm port?

Don't think so, otherwise you'd presumably get a linkage error anyway.

Albert

-- 
Dr. Albert Gr"af
Dept. of Music-Informatics, University of Mainz, Germany
Email:  Dr....@t-..., ag...@mu...
WWW:    http://www.musikinformatik.uni-mainz.de/ag

Re: [pure-lang-users] Patch to compile Pure with LLVM upstream

From: Ryan S. <rya...@us...> - 2008-07-18 22:41:58

On Jul 18, 2008, at 15:15, Ryan Schmidt wrote:

> On Jul 18, 2008, at 13:29, Albert Graef wrote:
>
>> Ryan Schmidt wrote:
>>
>>> Now that llvm in MacPorts is updated to 2.3 I tried updating pure-
>>> devel to r438 but half of the tests fail (with Mac OS X 10.4.11 and
>>> Xcode 2.5 on Intel Core 2 Duo):
>>
>> Latest svn?
>
> I'm using r438 of pure which is the latest at this time. Or did you  
> mean latest svn of llvm? If so, then no; the llvm port provides the  
> 2.3 release of llvm. Since yesterday there is also a new llvm-devel  
> port providing a later development version (53722); I could try that.

Ok, pure r438 works with llvm-devel at 53722 so that's fine. The pure- 
devel port is updated!

Re: [pure-lang-users] Patch to compile Pure with LLVM upstream

From: Albert G. <Dr....@t-...> - 2008-07-21 03:28:52

Ryan Schmidt wrote:
> Ok, pure r438 works with llvm-devel at 53722 so that's fine. The pure- 
> devel port is updated!

Thanks. So it seems that there are still some OSX bugs in LLVM 2.3 which 
are fixed in the latest LLVM from svn? I should probably mention that in 
the INSTALL file somewhere.

Albert

-- 
Dr. Albert Gr"af
Dept. of Music-Informatics, University of Mainz, Germany
Email:  Dr....@t-..., ag...@mu...
WWW:    http://www.musikinformatik.uni-mainz.de/ag

Re: [pure-lang-users] Compilation times

From: Albert G. <Dr....@t-...> - 2008-07-07 21:41:26

Albert Graef wrote:
>>> In return I promise to investigate why the dict and set modules take
>>> so long to compile. :)
>> It would be nice, it is really slow.
> 
> Ok, I did some profiling now, and it seems that the lion's share (over
> 70%) of the time is spent within LLVM, only a small fraction (not
> exceeding 15%) goes to the parser and all my own semantic routines.

I think I pinned that one down now. Noticing that only set.pure and
dict.pure are affected, but not the other container data structures
whose code is in principle quite similar, I took a closer look at the
generated pattern matching code of these modules. Lo and behold, there
are in fact some functions in those two modules which have a lot of
overlapping rules producing automata with some 260-300 states. So I
removed some the special case rules in those definitions, and that alone
made the modules load thrice as fast.

Mind you, 260-300 states isn't big at all, my construction algorithm can
easily deal with much bigger automata, but of course the automaton then
also yields a routine in LLVM IR which includes a decision tree (nested
switch statements) with a total of 260-300 target labels.

My guess is that the LLVM IR builder gets really slow when dealing with
"big" assembler routines like this (maybe some badly coded algorithms
with quadratic complexity in there, I haven't looked at the code).

So now I have to get Pure to compile with LLVM 2.3 to see whether they
fixed those horrible inefficiencies with the new builder class.
If not then as an intermediate measure some refactoring of those
overlapping rules in dict.pure and set.pure will help.

Cheers,
Albert

-- 
Dr. Albert Gr"af
Dept. of Music-Informatics, University of Mainz, Germany
Email:  Dr....@t-..., ag...@mu...
WWW:    http://www.musikinformatik.uni-mainz.de/ag