[Owners of 64-bit machines should mentally replace UB32 with UB64 in
the discussion that follows.]
Christophe and anybody else who cares about boinkmarks performance will
be happy to know that I have spent a bit of time improving the code
for SB-KERNEL:UB32-BASH-COPY, which underpins optimized REPLACE on
Unicode character strings. The patch is attached and results in a
compiled UB32-BASH-COPY that is roughly 20% shorter than the one resulting
from the current code. In particular, the new version does not have
fixnum arithmetic overflow checking.
However, there is a price to pay for this improvement. Readers of the
patch will note the liberal use of TRULY-THE, which is responsible for
the elimination of the aforementioned fixnum arithmetic overflow
checking. This makes UB32-BASH-COPY a very dangerous function if called
with the wrong arguments (e.g. it could scribble on memory off the end
of arrays, silently corrupting user data structures).
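To make the trade-off concrete, here is a minimal sketch of the difference between THE and TRULY-THE. It is not code from the patch. ADD-CHECKED and ADD-TRUSTED are hypothetical names, and the sketch assumes SBCL, where TRULY-THE is exported from SB-EXT.

```lisp
;; Illustrative only: ADD-CHECKED and ADD-TRUSTED are hypothetical
;; names, not functions from the patch.  Assumes SBCL.

(defun add-checked (a b)
  (declare (type fixnum a b))
  ;; Under SBCL's usual policy, THE is treated as an assertion, so the
  ;; compiler emits code to verify that the sum is still a fixnum.
  (the fixnum (+ a b)))

(defun add-trusted (a b)
  (declare (type fixnum a b))
  ;; TRULY-THE is believed without verification: no overflow check is
  ;; emitted, and the result is meaningless if the sum does not fit.
  (sb-ext:truly-the fixnum (+ a b)))
```

For in-range arguments the two functions return the same value. The difference only shows up when the sum does not fit in a fixnum, which is exactly the case the caller must rule out before using the trusted version.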
I would like to commit this, but the safety implications deserve some
discussion first. Points in favor of its inclusion:
1) UB32-BASH-COPY is in SB-KERNEL; functions in internal packages are
not guaranteed to be "safe" (e.g. %RAW-BITS in the same package);
2) UB32-BASH-COPY is only used inside SBCL itself as a replacement for
REPLACE on specialized arrays where the width of the elements is
equal to the word size. The REPLACE transform generates appropriate
checking code (assuming SAFETY > 0) on the indices of the REPLACE
call. Once this checking is done, it is guaranteed that the fixnum
arithmetic inside of UB32-BASH-COPY could *never* overflow; hence,
the checking that was being done was superfluous.
3) Users can and will use functions in SB-KERNEL--this is Lisp, after
all, and we trust our programmers. CLX comes immediately to mind and
there are probably others. I think such users generally use the
UB8-BASH-* family of functions, which are compiled with checking
of the consistency of their arguments. Users who use such internal
functions are also generally knowledgeable about the risks they
take by doing so.
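The division of labor described in point 2 can be sketched roughly as follows. This is not the actual REPLACE transform or the real UB32-BASH-COPY. CHECKED-WORD-REPLACE and UNCHECKED-WORD-COPY are hypothetical stand-ins, the sketch assumes SBCL for SB-EXT:TRULY-THE, and it ignores overlapping source and destination ranges.

```lisp
;; Illustrative only: both functions are hypothetical stand-ins and
;; are not taken from the patch.  Assumes SBCL.

(defun unchecked-word-copy (src src-start dst dst-start length)
  (declare (type (simple-array (unsigned-byte 32) (*)) src dst)
           (type fixnum src-start dst-start length))
  ;; The caller has already validated the bounds, so the index sums
  ;; cannot overflow and are marked as trusted.
  (dotimes (i length dst)
    (setf (aref dst (sb-ext:truly-the fixnum (+ dst-start i)))
          (aref src (sb-ext:truly-the fixnum (+ src-start i))))))

(defun checked-word-replace (dst src &key (start1 0) end1 (start2 0) end2)
  ;; All bounds checking happens here, once, before the copy loop.
  (let ((end1 (or end1 (length dst)))
        (end2 (or end2 (length src))))
    (unless (<= 0 start1 end1 (length dst))
      (error "Bad destination bounds: ~D..~D" start1 end1))
    (unless (<= 0 start2 end2 (length src))
      (error "Bad source bounds: ~D..~D" start2 end2))
    (unchecked-word-copy src start2 dst start1
                         (min (- end1 start1) (- end2 start2)))))
```

Once the wrapper has established that every index lies within an existing array, each index sum is bounded by an array length, so a second overflow check inside the loop would add nothing. Calling the inner function directly skips that guarantee, which is the risk discussed above.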
If you have any thoughts, pro or con, about the patch, please voice them;
otherwise, this patch will be committed early in the 0.9.2.x cycle.
Nathan | From Man's effeminate slackness it begins. --Paradise Lost
The last good thing written in C was Franz Schubert's Symphony Number 9.