sbcl Log


Commit Date  
[3031b2] by Paul Khuong

Back end work for short vector SIMD packs

* Platform-agnostic changes:
- Declare type testing/checking routines.
- Define three primitive types: simd-pack-double for packs
of doubles, simd-pack-single for packs of singles, and
simd-pack-int for packs of integer/unknown.
- Define a heap-representation for 128-bit SIMD packs,
along with reserving a widetag and filling the corresponding
entries in gencgc's tables.
- Make the simd-pack class definition fully concrete.
- Teach IR1 how to expand SIMD-PACK type checks.
- IR2-conversion maps SIMD-PACK types to the right primitive type.
- Increase the limit on the number of storage classes: SIMD packs
went way past the previous (arbitrary?) limit of 40.

* Platform-specific changes, in src/compiler/target/simd-pack:
- Create new storage classes (that are backed by the float-reg [i.e. SSE]
storage base): one for each of double, single and integer sse packs.
- Also create the corresponding immediate-constant and stack storage
classes.
- Teach the assembler and the inline constant code about these new kinds
of registers/constants, and how to map constant SIMD-PACKs to the right SC.
- Define movement/conversion VOPs for SSE packs, along with VOP routines
needed for basic creation/manipulation of SSE packs.
- The type-checking VOP in generic/late-type-vops is extremely
x86-64-specific... IIRC, there are ordering issues I do not
want to tangle with.

* Implementation idiosyncrasy: while type *tests* (i.e. TYPEP calls) consider
the element type, type *checks* (e.g. THE or DECLARE) only check for
SIMD-PACKness, without looking at the element type. This is allowed by the
standard, is similar to what Python (SBCL's compiler) does for FUNCTION types, and helps
code remain efficient even when type checks can't be fully elided.
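
The distinction between tests and checks can be modelled with a small sketch. This is only an illustration in Python, not SBCL code: the `SimdPack` class and the `typep` and `the` helpers are hypothetical stand-ins for the Lisp operators of the same names.

```python
from dataclasses import dataclass


@dataclass
class SimdPack:
    """Hypothetical stand-in for a SIMD pack; eltype is 'double', 'single' or 'int'."""
    eltype: str


def typep(obj, eltype="*"):
    # A type *test* considers the element type ('*' matches any element type).
    return isinstance(obj, SimdPack) and eltype in ("*", obj.eltype)


def the(obj, eltype="*"):
    # A type *check* only verifies SIMD-PACKness; eltype is deliberately ignored.
    if not isinstance(obj, SimdPack):
        raise TypeError("not a SIMD pack")
    return obj
```

Under this model, a pack of doubles fails `typep(p, "single")` but passes `the(p, "single")`, which is the cheaper check described above.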

The vast majority of the code is verbatim or heavily inspired by Alexander
Gavrilov's branch.

2013-05-21 19:11:26
[b38f10] by Paul Khuong

Front end infrastructure for short vector SIMD packs

* new feature, sb-simd-pack.

* define a new IR1 type for SIMD packs:
- (SB!KERNEL:SIMD-PACK [eltype]), where [eltype] is a subtype
of the platform-specific SIMD element type universe, or * (default),
the union of all these possibilities;
- Element types are always upgraded to the platform's element type
(small) universe, so we can easily manipulate unions of SIMD-PACK
types by working in terms of the element types.

* immediately specify the universe of SIMD pack element types
(sb!kernel:*simd-pack-element-types*) for x86-64, to ensure
#!+sb-simd-pack buildability.

* declare basic functions to create/manipulate SIMD packs:
- simd-pack-p is the basic type predicate;
- %simd-pack-tag returns a fixnum tag associated with each SIMD-PACK;
currently, we suppose it only encodes the element type, as the
position of the element type in *simd-pack-element-types*;
- %make-simd-pack creates a 128-bit SIMD pack from a tag and two
64 bit integers;
- %make-simd-pack-double creates an appropriately-tagged pack from
two double floats;
- %make-simd-pack-single creates a tagged pack from four single
floats;
- %make-simd-pack-ub{32,64} creates a tagged pack from four 32 bit
or two 64 bit integers;
- %simd-pack-{low,high} returns the low/high integer half of a
128 bit pack;
- %simd-pack-ub{32,64}s returns the four integer quarters or two
integer halves of a 128 bit pack;
- %simd-pack-singles returns the four singles in a 128 bit pack;
- %simd-pack-doubles returns the two doubles in a 128 bit pack.
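
To show how these accessors relate to one another, here is a rough Python model of a 128-bit pack stored as a tag plus two unsigned 64-bit halves. It is a sketch under assumptions, not SBCL's implementation: the class, its method names and the default tag value are hypothetical, and little-endian lane order (as on x86-64) is assumed.

```python
import struct

MASK64 = (1 << 64) - 1


class SimdPack:
    """Toy model of a 128-bit pack: a tag plus two unsigned 64-bit halves."""

    def __init__(self, tag, low, high):
        self.tag = tag
        self.low = low & MASK64
        self.high = high & MASK64

    def _bytes(self):
        # Little-endian layout: the low half occupies the first 8 bytes.
        return struct.pack("<2Q", self.low, self.high)

    def ub64s(self):
        return (self.low, self.high)

    def ub32s(self):
        return struct.unpack("<4I", self._bytes())

    def singles(self):
        return struct.unpack("<4f", self._bytes())

    def doubles(self):
        return struct.unpack("<2d", self._bytes())


def make_pack_double(x, y, tag=0):
    # Reinterpret the two doubles as raw 64-bit integers; tag=0 is a placeholder.
    low, high = struct.unpack("<2Q", struct.pack("<2d", x, y))
    return SimdPack(tag, low, high)
```

All accessors read the same 128 bits; only the interpretation changes, which is why the tag is needed to remember the intended element type.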

Alexander Gavrilov kept a branch alive for the last couple of years. The
creation/manipulation primitives are largely taken from that branch,
or informed by the branch's usage.

2013-05-21 19:10:50
[d142a5] by Stas Boukarev

Fix foreign-symbol-address transform on +sb-dynamic-core.

A badly placed ` was producing a wrong result.

2013-05-21 11:05:19
[729ce5] by Paul Khuong

Make some instances of IF/IF conversion more direct

When faced with CFGs that look like (if (if ...) ...), we duplicate
the outer NULL test forward in the branches (and jump to the correct
branch, so very little code is duplicated). However, this transform
depends on later ir1 optimisation to handle patterns like
(if (if ... nil t) ...). Try and get them right with a specialised
rewrite to get good code even when ir1opt doesn't run until fixpoint.

Also, refactored the code a bit while working on it.

2013-05-21 00:02:04
[9ce27b] by Paul Khuong

Exploit specialised VOPs for EQL of anything/constant fixnum

By swapping constant arguments to the right ourselves before
strength reducing EQL into EQ, rather than erroneously using
commutative-arg-swap.

Spotted by Douglas Katzman.

2013-05-20 22:14:43
[679437] by Paul Khuong

More efficient integer=>word conversion and fixnump tests on x86-64

* Special-case on 63-bit fixnums to detect non-zero fixnum tag bits
with a shift right when converting fixnum-or-bignum to ub64.

* In fixnump/unsigned-byte-64, use MOVE to avoid useless mov x, x.

* In fixnump/signed-byte-64, use the conversion's left shift to
detect overflows.

* Based on a patch by Douglas Katzman.
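
The shift-based tests can be sketched as follows. This Python model assumes 63-bit fixnums carry a single low tag bit that is zero; the helper names are hypothetical and the "carry" is simulated, so treat it as an illustration of the idea rather than the actual VOP code.

```python
MASK64 = (1 << 64) - 1


def to_signed64(word):
    """Interpret a 64-bit word as a two's-complement signed integer."""
    word &= MASK64
    return word - (1 << 64) if word >> 63 else word


def untag_fixnum(word):
    """Return (value, is_fixnum) using one arithmetic shift right."""
    carry = word & 1  # the bit shifted out, as the carry flag would hold it
    value = to_signed64(word) >> 1
    return value, carry == 0


def tag_fixnum(n):
    """Return (word, overflowed) when tagging a signed 64-bit n with a left shift."""
    word = (n << 1) & MASK64
    # If shifting back does not give n, the top bit was lost: n is not a fixnum.
    overflowed = (to_signed64(word) >> 1) != n
    return word, overflowed
```

The point is that the shift needed for the conversion already reveals the answer to the type test, so no separate tag comparison is required.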

2013-05-20 21:38:19
[39117f] by Paul Khuong

Cleverer handling of medium (32 < bit width <= 64) constants on x86-64

* Exploit sign-extension for large unsigned constants.

* Always force the remaining operand and the result into a register:
in the worst case, we use a RIP-relative unboxed constant.

* Based on a patch by Douglas Katzman.
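
As an illustration of the sign-extension trick: most x86-64 instructions accept a 32-bit immediate that the processor sign-extends to 64 bits, so a large unsigned constant whose upper 33 bits are all ones can still be encoded as a negative immediate. The Python helper below is a hypothetical sketch of that test, not the code from the patch.

```python
MASK64 = (1 << 64) - 1


def imm32_encoding(constant):
    """Return the signed 32-bit immediate whose sign-extension equals the
    64-bit constant, or None if no such immediate exists."""
    word = constant & MASK64
    signed = word - (1 << 64) if word >> 63 else word
    if -(1 << 31) <= signed < (1 << 31):
        return signed
    return None
```

For example, `0xFFFFFFFFFFFFFFF0` maps to the immediate `-16`, whereas `0x80000000` has no such encoding and would fall back to another strategy, such as the RIP-relative unboxed constant mentioned above.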

2013-05-20 20:58:30
[aae8dd] by Paul Khuong

POPCNT instruction on x86-64

Patch by Douglas Katzman.

2013-05-20 19:26:44
[b67554] by Paul Khuong

Fix disassembly for BT* instructions on x86oids

* A dedicated instruction format gets the details right.

* Patch by Douglas Katzman.

2013-05-20 19:17:36
[044fd6] by Paul Khuong

Annotate disassembly with unboxed constant values

* Only on x86-64, for qword-sized values.

* Patch by Douglas Katzman.

2013-05-20 19:02:45
[ff68ef] by Paul Khuong

Improved local call analysis for inlined higher-order functions

Locall analysis greatly benefits from forwarding function arguments to
their use site. Do that in locall and hopefully trigger further rewrites,
rather than waiting for a separate ir1opt phase to do its magic.

2013-05-20 18:49:33
[f25039] by Paul Khuong

Constant-fold backquote of constant expressions

* There is no guarantee that backquote expressions cons up fresh
storage, so we are free to allocate (sub)lists or vectors at
compile-time. In addition to regular constant-folding, perform
part of LIST/LIST*/APPEND at compile-time.

* Fix one instance of CL:SORT of now-literal data.

* Implement SB!IMPL:PROPER-LIST-P because BACKQ-APPEND needed that.

* Based on a patch by James Y Knight; closes lp#1026439.

2013-05-20 18:11:48
[f21e0f] by Paul Khuong

Enable (type-directed) constant folding for LOGTEST on x86oids and PPC

* COMBINATION-IMPLEMENTATION-STYLE can return :maybe. Like :default,
it enables transforms, but transforms can call C-I-S themselves to
selectively disable rewrites.

* Implement type-directed constant folding for LOGTEST. Platforms other
than x86oids and PPC get that for free via inlining.

* Use :maybe to enable all LOGTEST transforms except inlining.

2013-05-20 16:19:27
[0d8a5f] by Paul Khuong

Exploit associativity to fold more constants

* Implement transforms for logand, logior, logxor and logtest to
detect patterns like (f (f x k1) k2) => (f x (f k1 k2)).

* Same for + and * of rational values.

* Similar logic for mask-signed-field: we only need to keep the
narrowest width.
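
The rewrite pattern can be sketched on a toy expression tree. The Python below is an illustrative model, not the SBCL transforms: expressions are hypothetical `(op, argument, constant)` tuples, only integer constants are handled (the commit also covers rationals and LOGTEST), and mask-signed-field is represented with the width in the constant slot.

```python
import operator

# How to combine the two constants for each operator.
FOLDABLE = {
    "logand": operator.and_,
    "logior": operator.or_,
    "logxor": operator.xor,
    "+": operator.add,
    "*": operator.mul,
    "mask-signed-field": min,  # keep only the narrowest width
}


def fold_assoc(expr):
    """Rewrite (f (f x k1) k2) into (f x (f k1 k2)) for integer constants."""
    if not isinstance(expr, tuple):
        return expr
    op, arg, k2 = expr
    arg = fold_assoc(arg)  # fold inner expressions first
    if (op in FOLDABLE and isinstance(k2, int)
            and isinstance(arg, tuple) and arg[0] == op
            and isinstance(arg[2], int)):
        return (op, arg[1], FOLDABLE[op](arg[2], k2))
    return (op, arg, k2)
```

Because the inner expression is folded first, a chain such as `(+ (+ (+ x 1) 2) 3)` collapses into a single addition of the variable and one constant.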

2013-05-20 15:36:21
[09c781] by Alastair Bridgewater, pushed by Alastair Bridgewater

room: Fix reconstituting CONS cells with unbound-marker in the CAR.

* When I originally rewrote ROOM in terms of RECONSTITUTE-OBJECT,
I looked at what constitutes a valid CONS according to the runtime.
I noticed that one of the immediate types was an unbound marker
and said to myself "nobody's going to put one of those in a list".
This turned out to be a mistake.

* x86 systems (and plausibly not any others) put unbound-markers
in lists when loading FASLs. I have no real idea how or why, but
they do. This would lead to an error, "Unrecognized widetag #x4A
in reconstitute-object".

* Fix, by recording unbound-marker-widetag as being valid as the
first word of a CONS cell.

* Issue reported by "scymtym" on #sbcl.

2013-05-20 19:43:19
[c6aa07] by Alastair Bridgewater, pushed by Alastair Bridgewater

gencgc: Decide earlier about pinning large object pages.

* The old logic here called maybe_adjust_large_object(), and
then re-checked the pointer to preserve for validity. This is
non-optimal, as it means that maybe_adjust_large_object can't
promote pages to newspace directly, it instead merely adjusts the
page allocation to fit the possibly-shrunken object.

* It turns out that large_object pages can contain bignums,
vectors, code-objects, or in unusual cases instances. Neither
bignums, vectors, nor instances can contain embedded objects.
Code-objects can contain only functions or LRAs. None of these
objects have list-pointer-lowtag on their references. The "tail"
of a shrunken object consists of conses with both cells as
fixnum zero. The minor catch is that we allow untagged pointers
to pin code-allocated pages, but the saving grace here is that
code-objects don't shrink.

* Alter preserve_pointer() to test the lowtag and page type to
check for invalid pointers to large-object pages before calling
maybe_adjust_large_object() instead of bounds-checking the pointer
after the fact.

2013-05-19 17:00:52
[443808] by Alastair Bridgewater, pushed by Alastair Bridgewater

gencgc: Fix potential out-of-bounds access in page_ends_contiguous_block_p().

* If we're testing to see if the LAST page in dynamic space is
the end of a contiguous block, and it is a full page (bytes_used
is GENCGC_CARD_BYTES), we turn around and start investigating the
next page table entry... but there isn't one, it's beyond the end
of the allocation.

* Fix, by bounds-testing the page index against the index of the
high-water mark for dynamic space. This is guaranteed to be no
more than the total maximum for the page table, and is slightly
more micro-efficient than using the actual maximum, as any page
after the high-water mark will be page_free_p().
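
A simplified model of the fix, written in Python rather than the runtime's C, may help. The `Page` fields mirror the names used in this log, but the constant value, the `high_water_mark` parameter name and the reduced set of conditions are assumptions made for illustration only.

```python
from dataclasses import dataclass

CARD_BYTES = 32768  # stand-in for GENCGC_CARD_BYTES; the real value varies


@dataclass
class Page:
    bytes_used: int
    scan_start_offset: int


def page_ends_contiguous_block_p(page_table, index, high_water_mark):
    page = page_table[index]
    # A page that is not full always ends its block.
    if page.bytes_used < CARD_BYTES:
        return True
    # Bounds test: never look at an entry at or past the high-water mark.
    if index + 1 >= high_water_mark:
        return True
    nxt = page_table[index + 1]
    # The next page is empty, or it starts a block of its own.
    return nxt.bytes_used == 0 or nxt.scan_start_offset == 0
```

Without the second test, asking about a full last page would index one entry past the end of the table, which is the out-of-bounds access this commit removes.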

2013-05-14 22:45:30
[28b584] by Alastair Bridgewater, pushed by Alastair Bridgewater

gencgc: Introduce a new predicate, page_ends_contiguous_block_p().

* There are a number of places in gencgc where a number of
attributes of a page and possibly the subsequent page are tested
for various values. Invariably, this is actually testing to see
if a page ends a contiguous block.

* Extract the various tests to a new inlined predicate function,
page_ends_contiguous_block_p(), thus revealing the intent of
what's going on far better than the bare tests, and coalescing the
code to a single copy to make it easier to fix if there is a bug
in it (and there is, but this is a refactoring commit, not a
behavior change commit).

2013-05-14 22:39:06
[78a953] by Alastair Bridgewater, pushed by Alastair Bridgewater

gencgc: Introduce a new predicate, page_starts_contiguous_block_p().

* There are a number of places in gencgc where scan_start_offset
for a page is tested for zero. Invariably, this is actually
testing to see if a page starts a contiguous block... Or starts
on an object boundary.

* Extract the various tests for a zero scan_start_offset to a
new inlined predicate function, page_starts_contiguous_block_p(),
thus revealing the intent of what's going on far better than the
bare test.

2013-05-14 00:57:03
[920989] by Alastair Bridgewater, pushed by Alastair Bridgewater

gencgc: Rename page_table field region_start_offset to scan_start_offset.

* Let's call it what it is: The offset from where to start any
scan through the page to the start of the page. The only relation
this field has to an alloc_region is the way it is initialized.

2013-05-13 23:19:43
[8e6b74] by Alastair Bridgewater, pushed by Alastair Bridgewater

gencgc: Commentary fix for struct page, field region_start_offset.

* Simply describing region_start_offset as being related to an
allocation region which contains a page is disingenuous at best,
and misleading at worst. Its relation with an alloc_region is due
to its initialization strategy, and has nothing to do with what
the value is for.

* Say it like it is: it's an offset to a known object boundary,
from where we can start a call to gc_search_space() or scavenge().
That's what it's for, not for keeping track of alloc_regions.

2013-05-13 22:41:11
[8c2a72] by Alastair Bridgewater, pushed by Alastair Bridgewater

gencgc: Defer moving pinned pages to newspace as late as possible.

* Rather than moving pinned pages to newspace immediately, defer
moving them until just before we start to scavenge (evacuate) all
of the oldspace pages.

* This, in theory, makes it easier to move pages to newspace if
they are mostly-live, rather than having to allocate new pages for
the data (increasing peak address-space use during GC), assuming
that we know that some page meets such criteria.

* While we're here, commentary updates also replace an "XX I'd
rather not do this but the GC logic can't cope with not doing it"
with an actual explanation of WHY it needs to be done. In fact,
commentary updates explain it twice, in two different locations.

2013-05-12 14:49:55
[379e3d] by Alastair Bridgewater, pushed by Alastair Bridgewater

gencgc: Fix commentary for page table allocation field.

* The commentary for the page table allocation field was
misleading, presumably not updated when the definitions for the
constants used for its actual contents were last changed, and cost
me a bit of surprise and time spent trying to figure out why core
file saving and loading worked at all.

* Updated the commentary on the allocation field to match
current reality, and added cross-references between the field
itself and the definitions for its contents, so that a future
desync between commentary and reality is less likely.

2013-05-12 15:43:09
[5b63b0] by Paul Khuong

More robust function-name testing in CUT-TO-WIDTH

Let's use lvar-fun-name instead of replicating half the logic; as
a bonus, modularity transforms now heed NOTINLINE.

2013-05-20 14:40:00
[22c592] by Paul Khuong

Fix (CONCATENATE 'null ...) for generic sequences

* (CONCATENATE 'NULL SEQUENCE1 SEQUENCE2 ...) ensures that SEQUENCE1,
SEQUENCE2, ... are empty, but only did so for lists and
vectors. Instead, use new function EMPTYP which works for all
sequences. EMPTYP is not exported.

* Add generic function SEQUENCE:EMPTYP to which EMPTYP dispatches for
generic sequences. Methods for lists, vectors and generic sequences
use NULL or (ZEROP (LENGTH ...)).

* Test cases in seq.impure.lisp.

* Patch by Jan Moringen; fixes lp#1162301.

2013-05-20 05:03:01