Thread: [pure-lang-users] llvm 2.3 - good news

[pure-lang-users] llvm 2.3 - good news

From: Jiri S. <jir...@bl...> - 2008-07-10 13:50:42

Hello,

I compiled Pure with LLVM 2.3 under MinGW without any problems. I only
used the replacements from Rooslan's patch and added one more.

Best of all, test #15 now runs in 12 s instead of the original 55 s
(both under MinGW, of course).

:-)

Jiri

Re: [pure-lang-users] llvm 2.3 - good news

From: Jiri S. <jir...@bl...> - 2008-07-10 14:21:57

Jiri Spitz wrote:
> Best of all, test #15 now runs in 12 s instead of the original 55 s
> (both under MinGW, of course).
> 
And under Linux:

$ time make check
Running tests.
prelude.pure: passed
test001.pure: passed
test002.pure: passed
test003.pure: passed
test004.pure: passed
test005.pure: passed
test006.pure: passed
test007.pure: passed
test008.pure: passed
test009.pure: passed
test010.pure: passed
test011.pure: passed
test012.pure: passed
test013.pure: passed
test014.pure: passed
test015.pure: passed

real    0m18.424s
user    0m17.970s
sys     0m0.394s
$

:-) :-) :-)

Jiri

Re: [pure-lang-users] llvm 2.3 - good news

From: Libor S. <li...@gm...> - 2008-07-10 19:17:02

Excellent! I look forward to this, or even LLVM 2.4, becoming the default Pure setup.
Libor

On Thu, 10 Jul 2008 15:21:59 +0100, Jiri Spitz <jir...@bl...> wrote:

> Jiri Spitz wrote:
>> Best of all, test #15 now runs in 12 s instead of the original 55 s
>> (both under MinGW, of course).
>>
> And under Linux:
>
> $ time make check
> Running tests.
> prelude.pure: passed
> test001.pure: passed
> test002.pure: passed
> test003.pure: passed
> test004.pure: passed
> test005.pure: passed
> test006.pure: passed
> test007.pure: passed
> test008.pure: passed
> test009.pure: passed
> test010.pure: passed
> test011.pure: passed
> test012.pure: passed
> test013.pure: passed
> test014.pure: passed
> test015.pure: passed
>
> real    0m18.424s
> user    0m17.970s
> sys     0m0.394s
> $
>
> :-) :-) :-)
>
> Jiri

Re: [pure-lang-users] llvm 2.3 - good news

From: Albert G. <Dr....@t-...> - 2008-07-10 20:13:33

Libor Spacek wrote:
> Excellent! I look forward to this, or even LLVM 2.4, becoming the default Pure setup.

LLVM 2.3 *is* the current official release, so that's what I'm going to
target to make things easier, especially for the package maintainers.
But Rooslan's patches are for LLVM trunk anyway, so you should be able
to use that if you prefer.

Note that for 64 bit we still need Cyrille Berger's patch (fortunately
this has been updated for LLVM 2.3 already, but I don't think it's in
LLVM svn yet).

I'm going to commit the necessary changes tomorrow.

Albert

-- 
Dr. Albert Gr"af
Dept. of Music-Informatics, University of Mainz, Germany
Email:  Dr....@t-..., ag...@mu...
WWW:    http://www.musikinformatik.uni-mainz.de/ag

Re: [pure-lang-users] llvm 2.3 - good news

From: Albert G. <Dr....@t-...> - 2008-07-11 02:37:14

Jiri Spitz wrote:
>> Best of all, test #15 now runs in 12 s instead of the original 55 s
>> (both under MinGW, of course).

Yep, the 'let' statement at the beginning of your test module compiles
in 7.6 secs now, versus 96.4 secs before. (That includes the startup
time of the interpreter and loading of the prelude, which takes about
half a second on my AMD32.)

But it's still too slow. 7 secs to initialize a constant list of just
1000 elements? That's ridiculous, Q does that in a heartbeat. And almost
all that time is still spent in the JIT. What is it doing there?

So I'll still have to optimize for that case.

Nevertheless, it does seem that the JIT has improved a lot, so requiring
LLVM 2.3 seems sensible.

Albert

Re: [pure-lang-users] llvm 2.3 - good news

From: Libor S. <li...@gm...> - 2008-07-11 11:00:18

Pure 434 + LLVM 2.3 run happily here, thanks!
L.

Re: [pure-lang-users] llvm 2.3 - good news

From: Albert G. <Dr....@t-...> - 2008-07-13 11:29:13

Albert Graef wrote:
> Yep, the 'let' statement at the beginning of your test module compiles
> in 7.6 secs now, versus 96.4 secs before. (That includes the startup
> time of the interpreter and loading of the prelude, which takes about
> half a second on my AMD32.)
> 
> But it's still too slow. 7 secs to initialize a constant list of just
> 1000 elements? That's ridiculous, Q does that in a heartbeat. And almost
> all that time is still spent in the JIT. What is it doing there?
> 
> So I'll still have to optimize for that case.

Ok, I've done that now, putting the pure_listl and pure_tuplel runtime
routines to good use there. Besides the code to generate the element
expressions, a list or tuple expression now needs just three additional
runtime calls, with a flat call graph. That speeds up the JIT
considerably. I'm down to some 0.8 secs for compiling the 'let'
statement at the beginning of the test015 module now, and there doesn't
seem to be any way to make the code still more "digestible".

This example clearly shows that there are some severe performance
bottlenecks in the JIT (even in LLVM 2.3). The JIT doesn't scale well
with code size at all. For the example at hand, assigning a 1000-element
list to a variable takes 0.82s on my system, of which 0.01s is spent in
IR code generation including all optimization passes, 0.81s in the
JIT(!), and 0.00s (zilch, up to rounding) in actually executing the
code. I got these figures using clock(), so they should be pretty accurate.
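
A per-phase breakdown of this kind can be obtained by reading clock()
before and after each phase. The following is only a minimal C sketch of
that bookkeeping, not the interpreter's actual instrumentation;
busy_phase() is a hypothetical stand-in for a real phase such as IR
generation or JIT compilation, and time_phase() is an illustrative name.

```c
#include <assert.h>
#include <time.h>

/* Convert the difference of two clock() readings to CPU seconds. */
static double cpu_seconds(clock_t start, clock_t end)
{
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Hypothetical stand-in for one phase of work (e.g. codegen or JIT). */
static unsigned long busy_phase(unsigned long n)
{
    volatile unsigned long acc = 0;  /* volatile keeps the loop from being optimized away */
    for (unsigned long i = 0; i < n; i++)
        acc += i;
    return acc;
}

/* Time one phase; returns CPU seconds and stores the phase result in *out. */
static double time_phase(unsigned long n, unsigned long *out)
{
    clock_t t0 = clock();
    assert(t0 != (clock_t)-1);  /* clock() returns (clock_t)-1 if CPU time is unavailable */
    *out = busy_phase(n);
    return cpu_seconds(t0, clock());
}
```

Calling time_phase() once per phase and printing the results gives a
table like the figures above. A phase that finishes within one tick of
clock() may be reported as 0.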

One further avenue of working around LLVM's deficiencies there would be
to optimize the case that the expression to be evaluated is a constant
(number, string or list/tuple of constants), in which case I could just
skip the compilation step and directly convert the compile time
expression to a pure_expr* instead. I'll try that tomorrow.

Have a nice Sunday,
Albert

Re: [pure-lang-users] llvm 2.3 - good news

From: Jiri S. <jir...@bl...> - 2008-07-13 20:07:26

Albert Graef wrote:
> 
> Ok, I've done that now, putting the pure_listl and pure_tuplel runtime
> routines to good use there. Besides the code to generate the element
> expressions, a list or tuple expression now needs just three additional
> runtime calls, with a flat call graph. That speeds up the JIT
> considerably. I'm down to some 0.8 secs for compiling the 'let'
> statement at the beginning of the test015 module now, and there doesn't
> seem to be any way to make the code still more "digestible".
> 
Hello Albert,
The code compiles much faster now. However, your latest changes made
execution memory-hungry, and my favourite test 'set (1..1000000)' made
my PC swap itself to death. It seems the code isn't tail recursive anymore.

Jiri

Re: [pure-lang-users] llvm 2.3 - good news

From: Albert G. <Dr....@t-...> - 2008-07-13 22:47:52

Jiri Spitz wrote:
> The code compiles much faster now. However, your latest changes made
> execution memory-hungry, and my favourite test 'set (1..1000000)' made
> my PC swap itself to death. It seems the code isn't tail recursive anymore.

Sorry, I can't test right now, because I just upgraded my system and I'm
still in the process of getting up and running again. But it sounds like
I introduced a memory leak with the latest change. (If TCO didn't work
any more, you'd get stack overflows instead.) I will have a look asap.

Albert

Re: [pure-lang-users] llvm 2.3 - good news

From: Albert G. <Dr....@t-...> - 2008-08-10 08:40:40

Jiri Spitz wrote:
> The code compiles much faster now. However, your latest changes made
> execution memory-hungry, and my favourite test 'set (1..1000000)' made
> my PC swap itself to death. It seems the code isn't tail recursive anymore.

Fixed (r459).

Re: [pure-lang-users] llvm 2.3 - good news

From: Jiri S. <jir...@bl...> - 2008-08-10 14:46:05

Albert Graef wrote:
>> The code compiles much faster now. However, your latest changes made
>> execution memory-hungry, and my favourite test 'set (1..1000000)' made
>> my PC swap itself to death. It seems the code isn't tail recursive anymore.
> 
> Fixed (r459).
> 
Thanks, but I am still not happy. The memory consumption is OK now, but 
my 1 M set example runs two times slower than before :-( .

Regards,

Jiri

Re: [pure-lang-users] llvm 2.3 - good news

From: Albert G. <Dr....@t-...> - 2008-08-10 21:38:46

Jiri Spitz wrote:
> Thanks, but I am still not happy. The memory consumption is OK now, but 
> my 1 M set example runs two times slower than before :-( .

Right, the new code is faster for JIT compilation, but slower on 
execution for small list values. I worked around that now by adding a 
minimum bound for the size of lists/tuples to which the new list 
generation code is applied. Please check whether it's ok for you now.

Using #set(1..1000000) as a test example, over here r462 still seems to 
be a tad slower than r436, but that's probably due to some other, 
unrelated fixes I did to the environment-handling code, which also incur 
some (small) runtime cost; I'll have another look at that tomorrow.

Albert

Re: [pure-lang-users] llvm 2.3 - good news

From: Jiri S. <jir...@bl...> - 2008-08-10 22:06:18

Albert Graef wrote:
> Right, the new code is faster for JIT compilation, but slower on 
> execution for small list values. I worked around that now by adding a 
> minimum bound for the size of lists/tuples to which the new list 
> generation code is applied. Please check whether it's ok for you now.
> 
The speed is back to what it was before the fixes.

> Using #set(1..1000000) as a test example, over here r462 still seems to 
> be a tad slower than r436, but that's probably due to some other, 
> unrelated fixes I did to the environment-handling code, which also incur 
> some (small) runtime cost; I'll have another look at that tomorrow.
I do not see any measurable slowdown now.

Thanks,

Jiri

Re: [pure-lang-users] llvm 2.3 - good news

From: Albert G. <Dr....@t-...> - 2008-08-11 06:35:29

Jiri Spitz wrote:
> I do not see any measurable slowdown now.

Great, then I consider this fixed. :)

Re: [pure-lang-users] llvm 2.3 - good news

From: Albert G. <Dr....@t-...> - 2008-08-10 22:04:08

Albert Graef wrote:
> One further avenue of working around LLVM's deficiencies there would be
> to optimize the case that the expression to be evaluated is a constant
> (number, string or list/tuple of constants), in which case I could just
> skip the compilation step and directly convert the compile time
> expression to a pure_expr* instead. I'll try that tomorrow.

This is now implemented as well. In most cases, constant expressions at 
the toplevel aren't compiled any more but are directly converted to the 
runtime expression data structure. That makes assigning a big constant 
list to a global variable much faster.
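
The test that decides whether compilation can be skipped amounts to a
recursive walk over the expression. The following C sketch illustrates
the idea with a made-up expression type; struct expr, its tags and
is_constant() are hypothetical and do not reflect Pure's actual internal
data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical compile-time expression node (not Pure's real representation). */
enum expr_kind { EXPR_NUMBER, EXPR_STRING, EXPR_LIST, EXPR_TUPLE, EXPR_VAR, EXPR_APPLY };

struct expr {
    enum expr_kind kind;
    size_t n_items;                   /* number of children (lists, tuples, applications) */
    const struct expr *const *items;  /* child expressions, or NULL if there are none */
};

/* Return 1 if e is a number, a string, or a list/tuple built only from
   constants, i.e. a value that could be converted directly without the JIT. */
static int is_constant(const struct expr *e)
{
    assert(e != NULL);
    switch (e->kind) {
    case EXPR_NUMBER:
    case EXPR_STRING:
        return 1;
    case EXPR_LIST:
    case EXPR_TUPLE:
        for (size_t i = 0; i < e->n_items; i++)
            if (!is_constant(e->items[i]))
                return 0;
        return 1;
    default:  /* variables and applications still have to be compiled */
        return 0;
    }
}
```

Anything for which such a check fails would go through the normal
compile-and-JIT path, so the shortcut only affects plain data.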

Albert

Re: [pure-lang-users] llvm 2.3 - good news

From: Albert G. <Dr....@t-...> - 2008-07-10 19:41:32

Jiri Spitz wrote:
> I compiled Pure with LLVM 2.3 under MinGW without any problems. I only
> used the replacements from Rooslan's patch and added one more.

Can you please post your additional change?

> Best of all, test #15 now runs in 12 s instead of the original 55 s
> (both under MinGW, of course).

That's good news indeed. :) So I guess it's time to switch to LLVM 2.3
now. I can commit the necessary changes tomorrow. Everybody ready to
take the plunge?

Albert

Re: [pure-lang-users] llvm 2.3 - good news

From: Ryan S. <rya...@us...> - 2008-07-10 19:51:32

On Jul 10, 2008, at 14:41, Albert Graef wrote:

> Jiri Spitz wrote:
>> I compiled Pure with LLVM 2.3 under MinGW without any problems. I only
>> used the replacements from Rooslan's patch and added one more.
>
> Can you please post your additional change?
>
>> Best of all, test #15 now runs in 12 s instead of the original 55 s
>> (both under MinGW, of course).
>
> That's good news indeed. :) So I guess it's time to switch to LLVM 2.3
> now. I can commit the necessary changes tomorrow. Everybody ready to
> take the plunge?

MacPorts has llvm 2.2 right now. I could ask its maintainer to update  
it to 2.3.

Will pure still work with llvm 2.2 or will llvm 2.3 be required now?

Re: [pure-lang-users] llvm 2.3 - good news

From: Albert G. <Dr....@t-...> - 2008-07-10 20:21:53

Ryan Schmidt wrote:
> MacPorts has llvm 2.2 right now. I could ask its maintainer to update  
> it to 2.3.

That would be nice.

> Will pure still work with llvm 2.2 or will llvm 2.3 be required now?

I'd prefer the latter, because of all the quirks we see with the LLVM
2.2 JIT. But maybe you should first test with LLVM 2.3 on OSX after I've
committed the patches, before we decide on that.

Albert

Re: [pure-lang-users] llvm 2.3 - good news

From: Jiri S. <jir...@bl...> - 2008-07-10 19:58:08

Attachments: mypure.patch.gz

Albert Graef wrote:
> 
> Can you please post your additional change?
> 
My patch against rev. 432 is enclosed.

Jiri

Re: [pure-lang-users] llvm 2.3 - good news

From: Eddie R. <er...@bm...> - 2008-07-10 20:14:39

On Thu, 2008-07-10 at 21:41 +0200, Albert Graef wrote:
> 
> That's good news indeed. :) So I guess it's time to switch to LLVM 2.3
> now. I can commit the necessary changes tomorrow. Everybody ready to
> take the plunge?

NO, I'm never going to use 2.3!!! Just kidding ;=) I already have llvm
2.3 installed but I haven't gotten pure to compile yet. I need Jiri's
changes.

e.r.

Re: [pure-lang-users] llvm 2.3 - good news

From: Libor S. <li...@gm...> - 2008-07-13 11:50:34

On Sun, 13 Jul 2008 12:29:12 +0100, Albert Graef <Dr....@t-...> wrote:

> This example clearly shows that there are some severe performance
> bottlenecks in the JIT (even in LLVM 2.3). The JIT doesn't scale well
> with code size at all. For the example at hand, assigning a 1000-element
> list to a variable takes 0.82s on my system, of which 0.01s is spent in
> IR code generation including all optimization passes, 0.81s in the
> JIT(!), and 0.00s (zilch, up to rounding) in actually executing the
> code. I got these figures using clock(), so they should be pretty accurate.

Nice work! Incidentally, I find clock() to be a pretty blunt tool with its
resolution of 10ms, which is far too coarse to measure most executions
on modern machines, apart from pretty massive tasks. For example, in real-time
image processing, you would process a whole image in no more than five ticks
of clock(). gettimeofday is a lot more accurate, but it measures elapsed
(wall-clock) time. I am not even sure whether clock() does any rounding
internally; from what I can see, it is just a discrete counter running in
10ms units, meaning that CPU time of as much as 9ms can register as 0 (zilch).
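
One way to check the effective granularity on a given system is to spin
until clock() returns a new value and look at the size of the jump. This
is a rough C sketch; observed_clock_step() is a made-up name, and the
result depends on the C library and kernel in use.

```c
#include <assert.h>
#include <time.h>

/* Spin until clock() changes and return the size of that step in seconds.
   The process is busy while spinning, so its CPU time keeps advancing. */
static double observed_clock_step(void)
{
    clock_t t0 = clock();
    assert(t0 != (clock_t)-1);  /* (clock_t)-1 means CPU time is unavailable */
    clock_t t1;
    do {
        t1 = clock();
    } while (t1 == t0);
    return (double)(t1 - t0) / CLOCKS_PER_SEC;
}
```

A result of 0.01 would correspond to the 10ms units described above; a
much smaller value would mean clock() is backed by a finer-grained timer.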

L.

Re: [pure-lang-users] llvm 2.3 - good news

From: Albert G. <Dr....@t-...> - 2008-07-13 17:17:29

Libor Spacek wrote:
> I am not even sure whether clock() does any rounding internally; from what
> I can see, it is just a discrete counter running in 10ms units, meaning that
> CPU time of as much as 9ms can register as 0 (zilch).

That's just what Linux's default HZ value of 100 will give you. They
changed HZ to 1000 for a while, but then changed it back because it was
eating too much energy on laptops. The latest kernels have tickless
timers (not always enabled by default, so you might have to build your
own kernel to get those), which enables you to get any resolution that
you want with the POSIX highres timers (up to what the hardware
provides, which is typically ~1 microsec on current systems).

I'm not sure where gettimeofday gets the microsec ticks on Linux,
probably it directly reads some hardware timer, but that's not
guaranteed by POSIX. And since it's wallclock time it's useless for
measuring performance anyway.

The proper solution is to use the POSIX highres timers on systems that
have them (recent Linux 2.6.x versions do). Have a look at the
clock_gettime manual page, it should be easy to wrap this in Pure, or
write a little C module for that purpose. (There's also code in Q's
system module which could easily be ported to Pure -- see the nanotime
et al routines in modules/clib/system.c in the Q sources.)
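
As a starting point for such a little C module, here is a minimal sketch
around clock_gettime(). It assumes a POSIX system that provides
CLOCK_PROCESS_CPUTIME_ID; the function names are illustrative and are not
taken from Pure or from Q's system module. Older glibc versions also need
-lrt when linking.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L  /* request the POSIX.1b clock_gettime() declarations */
#endif
#include <assert.h>
#include <time.h>

/* CPU time consumed by this process so far, in seconds, or -1.0 on error. */
static double process_cpu_seconds(void)
{
    struct timespec ts;
    if (clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &ts) != 0)
        return -1.0;
    assert(ts.tv_nsec >= 0 && ts.tv_nsec < 1000000000L);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* Resolution the system reports for that clock, in nanoseconds, or -1 on error. */
static long long process_cpu_resolution_ns(void)
{
    struct timespec res;
    if (clock_getres(CLOCK_PROCESS_CPUTIME_ID, &res) != 0)
        return -1;
    return (long long)res.tv_sec * 1000000000LL + res.tv_nsec;
}
```

Taking the difference of two process_cpu_seconds() readings around a
piece of code gives its CPU time, in the same way as with clock(), but at
whatever resolution process_cpu_resolution_ns() reports.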

HTH,
Albert

Re: [pure-lang-users] llvm 2.3 - good news

From: Jiri S. <jir...@bl...> - 2008-07-13 20:13:09

Libor Spacek wrote:
> Nice work! Incidentally, I find clock() to be a pretty blunt tool with its
> resolution of 10ms, which is far too coarse to measure most executions
> on modern machines, apart from pretty massive tasks. For example, in real-time
> image processing, you would process a whole image in no more than five ticks
> of clock(). gettimeofday is a lot more accurate, but it measures elapsed
> (wall-clock) time. I am not even sure whether clock() does any rounding
> internally; from what I can see, it is just a discrete counter running in
> 10ms units, meaning that CPU time of as much as 9ms can register as 0 (zilch).
> 
Hi Libor,
I am not sure but I think you mentioned somewhere you are using Ubuntu 
8.04. If so, then you can try to install the 'RT' (real time) Linux 
image. It contains timers with higher resolution.

Jiri