Thread: [Sbcl-devel] x86 inline alloc patches

This is a report of benchmarking x86 inline allocation following up on
Juho's great improvement of the same on the x86-64.

There were three versions tested, all of them based on 0.9.5.85
(= 0.9.6).

* pristine: this one has inline allocation disabled

* xor-swap: use the xor swap trick instead of XCHG

* stack-temporary: use a stack temporary and mov instead of XCHG
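For readers unfamiliar with the two swap idioms named above, here is a
minimal C sketch. It only illustrates the general techniques; it is not
the code from the patches, which operate at the x86 instruction level
inside the compiler's allocation sequence. The function names are made
up for this illustration.

```c
#include <stdint.h>

/* Swap through a temporary, analogous in spirit to the
 * stack-temporary variant (hypothetical helper, not SBCL code). */
static void swap_with_temporary(uintptr_t *a, uintptr_t *b)
{
    uintptr_t tmp = *a;
    *a = *b;
    *b = tmp;
}

/* XOR swap: exchanges two words without a third storage location.
 * If both pointers alias the same word, plain XOR swapping would
 * zero it, so that case is skipped explicitly. */
static void swap_with_xor(uintptr_t *a, uintptr_t *b)
{
    if (a == b)
        return;
    *a ^= *b;   /* a = a0 ^ b0 */
    *b ^= *a;   /* b = b0 ^ a0 ^ b0 = a0 */
    *a ^= *b;   /* a = a0 ^ b0 ^ a0 = b0 */
}
```

Both functions leave the two words exchanged; the XOR form trades the
extra storage location for three dependent operations.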

Each was tested on unithreaded and multithreaded builds, so that's six
configurations.

Oh, in the comparison matrix there is one more just to see how bad it
was:

* inline-alloc: this one simply has inline allocation enabled on a
multithreaded build

Executive summary: inline-alloc is slow, xor-swap and stack-temporary
are significantly faster in some tests, the only difference between
them being the strange peaks in the
multithread-stack-temporary column (puzzle, string-concat, ...).

I'm leaning towards merging xor-swap if you don't have better solutions
in mind.

Cheers, Gábor

MULTITHREAD
---------

* pristine

** core file size: 26406912

** cl-bench total (two runs)

real    3m49.590s
user    3m5.146s
sys     0m29.147s

real    3m48.000s
user    3m4.493s
sys     0m29.094s

* xor-swap

** core file size: 26804224 (+ 1.5%)

** cl-bench total (two runs)

real    3m41.404s
user    3m1.428s
sys     0m28.740s

real    3m41.538s
user    3m0.900s
sys     0m28.973s

* stack-temporary

** core file size: 26746880 (+ 1.3%)

** cl-bench total (two runs)

real    3m48.753s
user    3m5.587s
sys     0m29.175s

real    3m46.859s
user    3m5.970s
sys     0m28.845s

UNITHREAD
---------

* pristine

** core file size: 25272320

** cl-bench total

real    3m40.151s
user    2m58.790s
sys     0m29.081s

* xor-swap

** core file size: 25612288 (+ 1.3%)

** cl-bench total

real    3m56.109s
user    2m56.367s
sys     0m29.022s

* stack-temporary

** core file size: 25563136 (+ 1.1%)

** cl-bench total

real    3m43.939s
user    2m57.387s
sys     0m29.103s
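
To compare the totals above numerically, the time(1)-style figures can
be converted to seconds and expressed as a percentage change against
the pristine build. The following is a small C sketch of that
arithmetic; the helper names are hypothetical and were not part of the
benchmark run.

```c
#include <stdio.h>

/* Convert a time(1)-style string such as "3m49.590s" to seconds.
 * Returns -1.0 if the string does not match that shape. */
static double parse_time_seconds(const char *s)
{
    int minutes;
    double seconds;
    if (sscanf(s, "%dm%lfs", &minutes, &seconds) != 2)
        return -1.0;
    return minutes * 60.0 + seconds;
}

/* Relative change of candidate against baseline, in percent.
 * Negative values mean the candidate took less time. */
static double percent_change(double baseline, double candidate)
{
    return (candidate - baseline) / baseline * 100.0;
}
```

For example, percent_change(parse_time_seconds("3m5.146s"),
parse_time_seconds("3m1.428s")) compares the first multithread user
times of pristine and xor-swap.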
