Alexey Dejneka writes:
> Hello,
>
> Christophe Rhodes writes:
>
>> OK, I think I understand this -- each +/mod29 must clear the top bit,
>> whereas +/smod30 need not; this is because for +/smod30 the entire
>> 30-bit effective register is always valid (the two tag bits need to be
>> cleared always, of course, though they remain clear automatically for
>> +) while +/mod29 needs special treatment of the top bit, because the
>> return value from that function must be non-negative.
>
> My point is that our separation of signed and unsigned modular
> arithmetic is wrong: the sign bit is important only for the outer
> operation. And integers of fixnum width can only be signed :-(

Right. Here's an implementation of a (probably imperfect) workaround:
when optimizing logand (and related unsigned friends), prefer a
suitable signed implementation of a smaller width to an unsigned
implementation with larger width. Really the distinction is between
tagged and untagged, preferring the tagged case. With this change,
Nathan's canonical test case

    (defun foom (x y)
      (declare (fixnum x y))
      (logand (logxor x y) most-positive-fixnum))

now compiles to

; 0A6D5AAE:       8BD0             MOV EDX, EAX
;       B0:       31FA             XOR EDX, EDI
;       B2:       81E2FCFFFF7F     AND EDX, 2147483644

which I think is right, and better than the previous output:

; 0A7D320A:       8BD0             MOV EDX, EAX
;       0C:       C1FA02           SAR EDX, 2
;       0F:       8BCF             MOV ECX, EDI
;       11:       C1F902           SAR ECX, 2
;       14:       31CA             XOR EDX, ECX
;       16:       81E2FFFFFF1F     AND EDX, 536870911
;       1C:       C1E202           SHL EDX, 2

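For anyone who wants to convince themselves that the two sequences
agree, here is a rough model in portable Common Lisp. It assumes the
32-bit x86 layout implied by the constants above (30-bit fixnums
stored shifted left over two zero tag bits). The names tag, untag,
foom-untagged, foom-tagged and +mpf30+ are made up for illustration
and are not SBCL internals.

```lisp
;; Hypothetical model: a fixnum N is held in a register as N shifted
;; left by two zero tag bits.  Nothing here is SBCL's actual code.
(defconstant +tag-bits+ 2)
(defconstant +mpf30+ (1- (expt 2 29)))   ; 536870911, the 30-bit
                                         ; most-positive-fixnum

(defun tag (n)
  "Integer -> tagged word (models SHL reg, 2)."
  (ash n +tag-bits+))

(defun untag (word)
  "Tagged word -> integer (models SAR reg, 2)."
  (ash word (- +tag-bits+)))

(defun foom-untagged (x-word y-word)
  "Old sequence: untag both arguments, XOR, AND 536870911, retag."
  (tag (logand (logxor (untag x-word) (untag y-word)) +mpf30+)))

(defun foom-tagged (x-word y-word)
  "New sequence: XOR the tagged words, then AND with the tagged mask."
  (logand (logxor x-word y-word) (tag +mpf30+)))

(defun sequences-agree-p (values)
  "True if both sequences give the same word for every pair in VALUES."
  (loop for x in values
        always (loop for y in values
                     always (= (foom-untagged (tag x) (tag y))
                               (foom-tagged (tag x) (tag y))))))
```

Here (tag +mpf30+) is 2147483644, the constant in the new AND. The
two versions agree because XOR and AND act bit by bit, so shifting
before or after the operation makes no difference, and the two tag
bits stay zero throughout.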
No test cases fail (on x86), and in the process I found a bug in
smod30 handling of ash, which I think I've fixed.

This doesn't begin to address Nathan's point that we can also usefully
do mod32 arithmetic on x86-64, and that this should probably happen
only when a width of /exactly/ 32 is asked for, but I think it is
better than what we currently have.