sbcl Log


Commit Date  
[946037] by Stas Boukarev

Clean up and micro-optimize list checking in some x86-64 VOPs.

In length/list and values-list, instead of manually checking for LIST,
call %test-lowtag, which produces more compact code.

2013-08-19 23:20:04
[f61f97] by Stas Boukarev

Micro-optimize copy-more-arg on x86-64.

Instead of copying RCX into RBX, then modifying RCX and later
restoring RCX from RBX, modify RBX instead.

2013-08-19 22:32:18
[002a37] by Stas Boukarev

Clean up listify-rest-args VOP on x86-64.

It no longer uses loop instructions, so remove STD and CLD.

2013-08-19 22:29:01
[6581fa] by Stas Boukarev

Apply a recent optimization more widely.

FOREIGN-SYMBOL-SAP was still missing the change from
LEA REG, [#xADDRESS]
to
MOV REG, #xADDRESS

2013-08-19 16:56:22
[348d1b] by Stas Boukarev

Add a memory barrier inside pseudo-atomic on PPC.

Solves problems with allocation and multiple threads.

2013-08-15 18:02:54
[1d9fe1] by Stas Boukarev

Set up alien stack correctly on non-x86oids.

It's assumed that the C stack grows upward everywhere but X86oids,
which is not true. Define two new conditions,
ALIEN_STACK_GROWS_DOWNWARD and ALIEN_STACK_GROWS_UPWARD.

This fixes FFI issues on PPC.

2013-08-15 17:52:24
[2b69e4] by Stas Boukarev

create_os_thread: put pthread stack inside alien-stack.

On !LISP_FEATURE_C_STACK_IS_CONTROL_STACK set pthread stack to
alien_stack, not control_stack.

2013-08-15 17:00:06
[13bf11] by Stas Boukarev

Warn when defining a setf-function together with a setf-expander.

Patch by Douglas Katzman.

2013-08-15 14:40:51
[9e7a18] by Stas Boukarev

Throw errors on malformed FUNCTION.

(funcall (function X junk)) didn't throw an error in the presence of a
compiler-macro for X.

Patch by Douglas Katzman.

2013-08-15 13:43:13
[076d38] by Stas Boukarev

Optimize calling asm routines and static foreign functions on x86-64.

Instead of loading the address using
LEA REG, [#xADDRESS]
use
MOV REG, #xADDRESS

This saves 2 bytes.

2013-08-15 13:21:04
[1540c1] by Stas Boukarev

Fix undefined function errors on PPC and MIPS.

undefined_tramp hardcodes the register in which FDEFN resides, but the
format was recently changed (f69e89d..).

Other platforms can be susceptible to this.
A proper fix would avoid hardcoding this by exporting
sc-offset-scn-byte/sc-offset-offset-byte, and register offsets.

Thanks to the GCC Compile Farm project for providing machines for
testing and uncovering this.

2013-08-06 17:11:16
[d5520a] by Stas Boukarev

Microoptimize (signed-byte 64) type test on x86-64.

Similar to the (unsigned-byte 64) one:
TEST CL, 3
MOV EAX, ECX
=>
MOV EAX, ECX
TEST AL, 3

Also add tests/run-tests-* to .gitignore.

2013-08-01 17:51:55
[eca54d] by Christophe Rhodes

fix manual build under texinfo 5

Texinfo 5 is more assertive about its syntax: macros with
non-alphanumerics have never actually been allowed, but we used to be
able to get away with @& to escape an ampersand under @iftex, and
defining @&key macros under @ifnottex. Nuh-uh, not any more. (fixes
lp#1189146)

The details of the indexes, particularly in html format, differ slightly
under texinfo 4 and 5 (related to the trickery around hiding package
prefixes for decent alphabetization). It might be nice to sort this out
Once And For All, eventually.

2013-07-31 13:06:43
[d62278] by Stas Boukarev

Microoptimize comparisons with 0 on x86oids.

Implement the common idiom of using TEST REG, REG in place of CMP REG,
0, saving 1 byte, for fast-if->/< VOPs.
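
As an illustration only (not part of the commit), the sketch below models why the two instructions are interchangeable for signed comparisons against zero: both leave the same ZF, SF, CF and OF. The 64-bit width and all function names are assumptions made for this example.

```python
WIDTH = 64                      # assumed register width for this sketch
MASK = (1 << WIDTH) - 1
SIGN_BIT = 1 << (WIDTH - 1)


def flags_after_cmp_zero(reg: int) -> dict:
    """Model CMP REG, 0: flags come from REG - 0, which never borrows or overflows."""
    result = (reg - 0) & MASK
    return {"ZF": result == 0, "SF": bool(result & SIGN_BIT), "CF": False, "OF": False}


def flags_after_test_self(reg: int) -> dict:
    """Model TEST REG, REG: flags come from REG AND REG; TEST always clears CF and OF."""
    result = (reg & reg) & MASK
    return {"ZF": result == 0, "SF": bool(result & SIGN_BIT), "CF": False, "OF": False}


def is_less(flags: dict) -> bool:
    """Condition used by a signed 'less than' jump: SF != OF."""
    return flags["SF"] != flags["OF"]


def is_greater(flags: dict) -> bool:
    """Condition used by a signed 'greater than' jump: ZF clear and SF == OF."""
    return (not flags["ZF"]) and flags["SF"] == flags["OF"]
```

Register values are passed as unsigned bit patterns, so `MASK` stands for -1.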

2013-07-28 18:26:18
[49ab16] by Stas Boukarev

Optimize (unsigned-byte 32/64) type tests on x86oids.

Instead of doing
TEST CL, 3
MOV EAX, ECX

do
MOV EAX, ECX
TEST AL, 3

AL has a shorter encoding and can save 1 byte with ECX or 4 bytes with
ESI, which doesn't have SIL on x86.

Also revert a part of the previous commit which used untagged
pointers, which can cause problems with the GC.

2013-07-28 16:41:58
[205611] by Stas Boukarev

Microoptimize type-tests on x86oids.

On x86-64 in %test-lowtag, instead of doing

MOV EAX, ECX
AND AL, 15
CMP AL, 15

do

LEA EAX, [RCX-15]
TEST AL, 15

This saves one byte.

On x86 this optimization is already applied, but since LEA loads a
32-bit integer, EAX can be later used as an already untagged pointer
in %test-headers: MOV EAX, [ECX-7] => MOV EAX, [EAX], which takes one
byte less to encode.
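
As an illustration only (not part of the commit), the following sketch checks that subtracting the tag and testing the low bits gives the same answer as masking and comparing. The mask of 15 follows the example above; the function names are hypothetical.

```python
LOWTAG_MASK = 15  # low four bits, as in the example above


def has_lowtag_and_cmp(word: int, lowtag: int) -> bool:
    """MOV / AND / CMP form: mask the low bits, then compare with the tag."""
    return (word & LOWTAG_MASK) == lowtag


def has_lowtag_lea_test(word: int, lowtag: int) -> bool:
    """LEA / TEST form: subtract the tag, then test that the low bits are zero."""
    return ((word - lowtag) & LOWTAG_MASK) == 0


def forms_agree(lowtag: int = 15, limit: int = 1 << 12) -> bool:
    """Compare both forms over a small range of words (exhaustive modulo 16)."""
    return all(
        has_lowtag_and_cmp(word, lowtag) == has_lowtag_lea_test(word, lowtag)
        for word in range(limit)
    )
```

The two forms agree because `(word - lowtag)` has zero low bits exactly when `word` and `lowtag` are congruent modulo 16.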

2013-07-28 15:46:09
[9ff18d] (sbcl-1.1.10) by Christophe Rhodes

1.1.10: will be tagged as "sbcl-1.1.10"

2013-07-28 14:14:11
[dacd3f] by Paul Khuong

Modular integer %NEGATE on x86oids

Forms like (logand (- word) word) now compute the negation in modular
arithmetic, without consing an intermediate bignum, just like integer
addition, multiplication and subtraction.
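
As an illustration only (not part of the commit), this sketch shows why the negation can stay within a machine word: masking the negated value to the word width does not change the final result of the logand. The 64-bit word size and the function names are assumptions for the example.

```python
WORD_BITS = 64                      # assumed machine word size
WORD_MASK = (1 << WORD_BITS) - 1


def negate_mod(word: int) -> int:
    """Negation in modular arithmetic: the result always fits in one word."""
    return (-word) & WORD_MASK


def lowest_set_bit_modular(word: int) -> int:
    """(logand (- word) word) computed with a word-sized negation."""
    return negate_mod(word) & word


def lowest_set_bit_unbounded(word: int) -> int:
    """The same expression with an unbounded (bignum-style) negation."""
    return (-word) & word
```

For any word-sized input, both functions return the same value, namely the lowest set bit of `word` (or 0 when `word` is 0).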

The VOPs are trivial and should be easy to add on all other
platforms; I just don't have access to build hosts.

2013-07-18 21:04:13
[ba39d1] by Paul Khuong

Pack (mostly) stack TNs according to lexical scope information

Packing TNs from shallow scopes before more deeply nested ones
is a perfect elimination order when the live ranges span the
full scope (the interference graph is a comparability graph).
Use that as a heuristic, and do that for TNs that are known
to have such simple live ranges before the rest: this ensures
that bad TNs don't mess everything up.
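
As an illustration only (not SBCL's packer), here is a toy model of the heuristic. It assumes every TN is live for the whole of its lexical scope, represents a scope as a path from the root, and packs greedily from shallow to deep. All names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TN:
    name: str
    scope: tuple  # path from the root scope, e.g. (0, 1) is child 1 of scope (0,)


def interferes(a: TN, b: TN) -> bool:
    """Full-scope live ranges overlap iff one scope encloses (or equals) the other."""
    shorter, longer = sorted((a.scope, b.scope), key=len)
    return longer[: len(shorter)] == shorter


def pack(tns: list) -> dict:
    """Greedy slot assignment, visiting TNs from shallow scopes to deep ones."""
    slots = {}
    for tn in sorted(tns, key=lambda t: len(t.scope)):
        taken = {
            slots[other.name]
            for other in tns
            if other.name in slots and interferes(tn, other)
        }
        slot = 0
        while slot in taken:
            slot += 1
        slots[tn.name] = slot
    return slots


example = [
    TN("a", (0,)),
    TN("b", (0, 0)),
    TN("c", (0, 1)),
    TN("d", (0, 1, 0)),
]
```

In this model, `pack(example)` lets the sibling scopes of `b` and `c` share a slot, and the number of slots used equals the deepest nesting level.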

The result is much tighter stack allocation (most of the effect
comes from initialising stack frames at a smaller size, and growing
less aggressively), and fewer long-lived stray references.

Incidentally: fix catch block packing on win32, solving lp#1072739

2013-07-18 21:02:28
[a21899] by Paul Khuong

Grow regalloc datastructures geometrically for unbounded SCs

2013-07-18 20:17:30
[3b98d3] by Paul Khuong

Smaller stack frames on x86oids

Start at 4 slots (for some reason, it seems that 3 isn't really
the minimum), and grow by one slot at a time.

2013-07-18 20:17:30
[44fa19] by Paul Khuong

Disentangle storage base initial size from growth increments

Before, an initial stack frame size of 8 meant that the stack frame
always grew in increments of 8. Not only is a large initial size bad
for GC (it leaves more dead references untouched), but a large increment
is even worse.
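
As an illustration only (a simplified model, not SBCL's allocator), the sketch below shows how separating the two parameters changes the final frame size. The function name is hypothetical; the 4-slot start and 1-slot increment are the values mentioned in the x86oid commit above.

```python
def frame_size(slots_needed: int, initial: int, increment: int) -> int:
    """Grow a frame from `initial` slots in steps of `increment` until it fits."""
    size = initial
    while size < slots_needed:
        size += increment
    return size


# Old behaviour: the initial size doubles as the growth increment.
old_small = frame_size(3, initial=8, increment=8)
old_large = frame_size(9, initial=8, increment=8)

# New behaviour: small initial size, independent small increment.
new_small = frame_size(3, initial=4, increment=1)
new_large = frame_size(9, initial=4, increment=1)
```

Under this model, a function needing 9 slots gets a 16-slot frame with the coupled settings and a 9-slot frame with the decoupled ones, which leaves fewer unused slots holding dead references.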

2013-07-18 20:17:30
[df2d63] by Paul Khuong

Insert explicit cut to width when needed

When modular arithmetic operations are replaced with specialised
modular variants, the result's bitwidth is determined by the variant,
and might be wider than expected. If necessary, insert an explicit
cut to the exact bitwidth before returning a value in a non-modular
context.
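
As an illustration only (not the compiler's code), this sketch shows the situation described: a 32-bit modular addition carried out by a 64-bit modular variant can leave bits above bit 31, so an explicit cut is needed before the value escapes. The widths and function names are assumptions for the example.

```python
def add_mod64(a: int, b: int) -> int:
    """A specialised modular variant: addition modulo 2**64."""
    return (a + b) & ((1 << 64) - 1)


def cut_to_width(value: int, width: int) -> int:
    """Explicit cut: keep only the low `width` bits."""
    return value & ((1 << width) - 1)


def add_mod32_via_64(a: int, b: int) -> int:
    """Requested width is 32 bits, but the chosen variant is 64 bits wide."""
    wide = add_mod64(a, b)          # may have bits set above bit 31
    return cut_to_width(wide, 32)   # cut before returning to a non-modular context
```

For example, `add_mod64(0xFFFFFFFF, 1)` is `0x100000000`, which does not fit in 32 bits, while `add_mod32_via_64(0xFFFFFFFF, 1)` is 0.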

Spotted by pfdietz's random tester.

Fixes lp#1199428.

2013-07-18 19:43:24
[be3993] by Paul Khuong

Avoid uselessly re-scanning modular arithmetic expressions

When modular arithmetic transforms have already fired for a
subexpression, and that subexpression's width is at most as wide
as the bitwidth we're cutting to, there is no need to re-traverse
the subexpression.

There was already some code to detect that case. Make it more general,
and, more importantly, sound.

2013-07-18 19:43:24
[e24061] by Paul Khuong

No more destructive MERGE of shared data in best-modular-version

The old code worked by accident: few, if any, platforms implement
untagged signed modular arithmetic VOPs.

The new code handles that common case to avoid consing a fresh list
when the MERGE will be an identity.

2013-07-18 19:43:24