From: Jeremy F. <je...@go...> - 2002-12-09 21:51:40
|
On Mon, 2002-12-09 at 11:32, Julian Seward wrote:
> [...]
> > That said, it has been a long while since I looked at that patch in
> > detail, so maybe there's some simple improvements. In particular, I
> > think it leaves some dead code, so that should be cleaned up.
>
> I'm just a bit disinclined to have two mechanisms for integer
> multiplication (the helper fns _and_ direct ucode). If the direct route
> covered all the bases, I'd take it. Not only does it allow scope for better
> instrumentation, the generated code is surely better too.
Well, there are really two kinds of multiply: the NxN->2N set, and the
NxN->N set. The latter has a UInstr, but the former are done with
helpers. Since the 2N forms are slower instructions which stomp
specific registers, they're not really desirable to generate all the
time; it seems to me that supporting inline code generation for the 2N
forms pretty much requires separate opcodes, which leads to 4 being
used for multiply (though perhaps a flag can be used to distinguish
either N from 2N or signed from unsigned, though all the unsigned
multiplies are 2N).
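To make the two families concrete, here is a minimal C sketch. The function names are illustrative only and are not Valgrind UInstrs or helpers. It shows that the low N bits of a product are the same whether the operands are treated as signed or unsigned, so signedness only matters for the upper half of a 2N result.

```c
#include <assert.h>
#include <stdint.h>

/* NxN->N: only the low 32 bits of the product are kept. */
uint32_t mul_n(uint32_t a, uint32_t b) {
    return a * b;                      /* unsigned arithmetic wraps modulo 2^32 */
}

/* NxN->2N, unsigned: full 64-bit product of two 32-bit values. */
uint64_t umul_2n(uint32_t a, uint32_t b) {
    return (uint64_t)a * (uint64_t)b;
}

/* NxN->2N, signed: operands are sign-extended before multiplying. */
int64_t smul_2n(int32_t a, int32_t b) {
    return (int64_t)a * (int64_t)b;
}

/* Self-check: the low halves agree, only the high halves differ. */
void check_low_halves(void) {
    uint64_t u = umul_2n(0xFFFFFFFFu, 2u);
    uint64_t s = (uint64_t)smul_2n(-1, 2);
    assert((uint32_t)u == (uint32_t)s);
    assert((uint32_t)u == mul_n(0xFFFFFFFFu, 2u));
    assert((uint32_t)(u >> 32) != (uint32_t)(s >> 32));
}
```

Calling `check_low_halves()` exercises the case 0xFFFFFFFF x 2, where the same bit pattern is read once as a large unsigned value and once as -1.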
I don't think the quality of the generated code is all that important
since the helpers aren't that expensive to call (push and pop are
cheap), and 2N forms are hardly ever used in code I've tested. Also,
making sure that everything is in the right register would kill a lot of
the potential efficiency gains (unless the regalloc can be changed to
make sure that specific temps end up in specific registers so that the
rearrangement happens at compile time rather than runtime - but that
sounds even more complex).
J
|
|
From: Julian S. <js...@ac...> - 2002-12-09 19:25:04
|
[...]
> That said, it has been a long while since I looked at that patch in
> detail, so maybe there's some simple improvements. In particular, I
> think it leaves some dead code, so that should be cleaned up.
I'm just a bit disinclined to have two mechanisms for integer
multiplication (the helper fns _and_ direct ucode). If the direct route
covered all the bases, I'd take it. Not only does it allow scope for better
instrumentation, the generated code is surely better too.
> > I've started to fix various end-user reported bugs, as part of
> > stabilisation efforts, as you'll see from the cvs mail.
>
> Yes. As you can see I've started making the attempt at packaging
> everything up. I think we should push out another dev snapshot soon so
> that we can get more eager testers.
I'll try building the current head on various distros, and if that looks
promising, I'll try and emit a 1.9.1 snapshot this evening.
J
|
|
From: Jeremy F. <je...@go...> - 2002-12-09 17:26:57
|
I notice you implemented the rest of the jccs. It struck me that a more
efficient pattern for the SF == OF and SF != OF (jnl/jl) tests would be:
testl $EFlagS|EFlagO, EFLAGS(%ebp)
j[n]p true
The ones which involve Z as well could use 2 jumps:
testl $EFlagZ, EFLAGS(%ebp)
j[n]z true
testl $EFlagS|EFlagO, EFLAGS(%ebp)
j[n]p true
I've tested the simple case - seems to work fine (69-simple-jlo). I
have no idea if two jumps is better or worse than the shifts and bitops,
but it does require a different code structure, so it isn't quite such a
simple patch.
J
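The logic behind the parity idea can be modelled in C as below. This is a sketch, not the generated code: the flag values are the architectural EFLAGS bit positions, the names mirror the ones in the post, and the helper functions are hypothetical. One point worth checking before relying on the `testl`/`jp` form: the x86 parity flag is computed from the low byte of the result only, and O sits at bit 11, so the hardware PF may not see it. The sketch therefore computes parity in software over both bits and adds a helper that models the low-byte-only PF for comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Architectural x86 EFLAGS bit positions. */
#define EFlagZ 0x0040u
#define EFlagS 0x0080u
#define EFlagO 0x0800u

/* True when an odd number of bits are set. */
static bool odd_parity(uint32_t bits) {
    bool odd = false;
    while (bits) { odd = !odd; bits &= bits - 1u; }
    return odd;
}

/* jl: SF != OF, i.e. exactly one of S and O is set. */
bool cond_l(uint32_t eflags) {
    return odd_parity(eflags & (EFlagS | EFlagO));
}

/* jle: ZF set, or SF != OF (the two-test form). */
bool cond_le(uint32_t eflags) {
    if (eflags & EFlagZ) return true;
    return cond_l(eflags);
}

/* Models the hardware PF after a test: set when the LOW BYTE has even parity. */
bool hw_pf_after_test(uint32_t eflags, uint32_t mask) {
    return !odd_parity((eflags & mask) & 0xFFu);
}

/* Self-check over the four S/O combinations. */
void check_cond_l(void) {
    assert(!cond_l(0));
    assert( cond_l(EFlagS));
    assert( cond_l(EFlagO));
    assert(!cond_l(EFlagS | EFlagO));
}
```

Comparing `cond_l(EFlagO)` with `hw_pf_after_test(EFlagO, EFlagS | EFlagO)` shows the case where the software parity and the low-byte PF disagree.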
|
|
From: Jeremy F. <je...@go...> - 2002-12-09 05:29:29
|
On Sat, 2002-12-07 at 05:43, Julian Seward wrote:
> -- work out the fiddly details of extending this to accesses of
> sizes 1, 2 and 8 (treat as 2 x 4 ?).
1 and 2 should be easy; just round the address down to a multiple
of 4 and probe (since presence in the hash means that bytes N to N+3 are
valid). 8 should also be easy, as two probes, but probably isn't common
enough to spend lots of effort on. Misalignment where the access can
cross a multiple of 4 boundary is irritating, but see below:
> -- make sure that the working out gives correct, slow-case behaviour
> for all possible cases of misaligned addresses.
Misaligned accesses are the tricky bit. Detecting a mis-aligned access
is going to complicate the test site somewhat (another test and
conditional jump). You could make it fall into the slow path, but
testing the next hash entry up would be just as simple (and a hack to
make this slightly quicker: if you have a hash with 2^N entries, then
make the array 2^N+1 entries long, with the last entry always being a
copy of the first entry - that way you can always probe the next one up
without worrying about wrapping). On the other hand, there probably
aren't enough misaligned accesses to make it worth complicating the
inline fastpath.
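A minimal C sketch of the extra-entry trick, assuming a direct-mapped cache whose slots hold 4-aligned addresses. The table size, the names, and the use of 1 as an "empty" marker are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define N_BITS  4u                    /* hypothetical: 16-entry cache */
#define N_SLOTS (1u << N_BITS)
#define N_MASK  (N_SLOTS - 1u)
#define EMPTY   1u                    /* not 4-aligned, so it never matches a probe */

/* 2^N + 1 entries: the last one always mirrors slot 0. */
static uint32_t cache[N_SLOTS + 1u];

static uint32_t slot_of(uint32_t word_addr) { return (word_addr >> 2) & N_MASK; }

void cache_reset(void) {
    for (uint32_t i = 0; i <= N_SLOTS; i++) cache[i] = EMPTY;
}

void cache_insert(uint32_t word_addr) {
    assert((word_addr & 3u) == 0);
    uint32_t i = slot_of(word_addr);
    cache[i] = word_addr;
    if (i == 0) cache[N_SLOTS] = word_addr;   /* keep the mirror in step */
}

/* Access at `a` that spans two words: probe slot i and i + 1 with no wrap test. */
bool probe_two_words(uint32_t a) {
    uint32_t w = a & ~3u;
    uint32_t i = slot_of(w);
    return cache[i] == w && cache[i + 1u] == w + 4u;
}
```

The cost of the trick is the extra store in `cache_insert` whenever slot 0 changes; the probe itself stays branch-free with respect to wrapping.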
Hm, so the details:
For size == 4, the access is aligned iff (a & 3) == 0. So testing for
that is easy.
For size == 2, the access is aligned (as in not crossing a multiple-of-4
address) if (a & 3) < 3, which can be tested with:
testl $3, %addr
jz aligned // addr is ....00
jnp aligned // addr is ....10 or ....01
call slowpath
jmp done
aligned:
// fastpath
done: ...
And fortunately, size==1 can't be misaligned.
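The three alignment rules can be checked with a short C sketch. The function names are hypothetical; the second function mirrors the zero-or-odd-parity test that the size == 2 assembly performs on the low two address bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when `size` bytes starting at `a` stay inside one aligned 4-byte word. */
bool within_one_word(uint32_t a, uint32_t size) {
    assert(size == 1 || size == 2 || size == 4);
    return (a & 3u) + size <= 4u;
}

/* The size == 2 test done the way the assembly does it:
   low bits 00 (zero), or 01 / 10 (odd parity) are fine; 11 is not. */
bool within_one_word_2_parity(uint32_t a) {
    uint32_t low = a & 3u;
    bool zero = (low == 0);
    bool odd  = ((low ^ (low >> 1)) & 1u) != 0;
    return zero || odd;
}
```

For size 4 the general test reduces to (a & 3) == 0, for size 2 to (a & 3) < 3, and for size 1 it is always true.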
So I guess the full code for size = 4 would be:
testl $3, %a
jnz slow
movl %a, %r
andl $MASK, %r
cmpl cache(%r), %a
jz done
slow: call slow-path
done:
For size == 2:
testl $3, %a
jz fast
jp slow
fast: movl %a, %r
andl $MASK, %r
cmpl cache(%r), %a
jz done
slow: call slow-path
done:
For size == 1:
> movl %a, %r -- %r := %a
> andl $(N_MASK << 2), %r -- %r := sizeof(cache-slot) *
> index(a)
> cmpl cache(%r), %a -- Z flag set iff cache hit
> jz fast-case-continuation
>
> call slow-case-helper
>
> fast-case-continuation:
> -- figure out how the cache interacts with its backing store,
> ie the existing sparse array
> -- sanity check the entire story, including that about invalidating
> cache entries (I think my story is ok, but not 100% sure)
You mean putting ~a into index(a). Seems reasonable to me; it could
only be a problem if (a & mask) can ever equal (~a & mask); the simple
case is where mask = ~0: can a == ~a?
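A small C sketch of invalidating a slot by storing ~a, assuming a direct-mapped cache indexed by the word-address bits. The table size and function names are assumptions. With any nonzero index mask, ~a maps to a different slot than a, so the stored value cannot produce a hit in the slot it occupies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define N_MASK 0xFFu                  /* hypothetical: 256 slots */
static uint32_t cache[N_MASK + 1u];

static uint32_t slot_of(uint32_t a) { return (a >> 2) & N_MASK; }

/* Fill every slot with a value that belongs to a different slot. */
void cache_clear(void) {
    for (uint32_t i = 0; i <= N_MASK; i++) cache[i] = ~(i << 2);
}

void mark_valid(uint32_t a) {
    assert((a & 3u) == 0);
    cache[slot_of(a)] = a;
}

/* Invalidate by writing ~a into index(a). */
void invalidate(uint32_t a) {
    assert((a & 3u) == 0);
    cache[slot_of(a)] = ~a;
}

bool hit(uint32_t a) { return cache[slot_of(a)] == a; }
```

Invalidating also evicts any other address that happened to share the slot, which only costs a trip through the slow path.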
> -- figure out how this impacts set_address_range_perms(), since that
> is a frequently-called function (every time the simulated machine's
> %esp changes!)
Shouldn't be too hard to write an efficient cache-stomper.
J
|
|
From: Jeremy F. <je...@go...> - 2002-12-09 03:48:02
|
On Sun, 2002-12-08 at 16:25, Julian Seward wrote:
> Have considered 01-partial-mul but am somewhat put off by the fact that it
> doesn't cover all smul+umul cases and therefore only patchily achieves its
> aim. How about modifying the UMUL/SMUL uinstrs so that they do a
> NxN -> 2N multiply for N=8/16/32 bits, taking two TempRegs, which are
> read as operands, and then have the double-length result written to them
> both? This simplifies the code generation too since you can just generate
> the NxN -> 2N x86 insn (IIRC; not sure if it is available for insns and
> signedness)?
>
> Or perhaps it's not worth the effort.
Well, in terms of frequency, I didn't find any of the other multiply
forms being used in real code. gcc can be convinced to use the 8 bit
multiply, but partial results don't matter there (at least, I haven't
found any uses of multiply which expect partial results from partial
arguments at the bit level).
In particular, I didn't find any uses of unsigned multiply, so I'm
really unsure about whether it's worth adding a new UMUL UInstr just for
its sake. (I know I reserved an opcode for it, but there's no other
support for it.)
That said, it has been a long while since I looked at that patch in
detail, so maybe there's some simple improvements. In particular, I
think it leaves some dead code, so that should be cleaned up.
> I've started to fix various end-user reported bugs, as part of
> stabilisation efforts, as you'll see from the cvs mail.
Yes. As you can see I've started making the attempt at packaging
everything up. I think we should push out another dev snapshot soon so
that we can get more eager testers.
J
|
|
From: Julian S. <js...@ac...> - 2002-12-09 00:17:47
|
Hi. I merged
61-special-d
62-lazy-eflags
67-dist
65-fix-ldt
55-ac-clientreq
Thanks as ever for them.
Have considered 01-partial-mul but am somewhat put off by the fact that it
doesn't cover all smul+umul cases and therefore only patchily achieves its
aim. How about modifying the UMUL/SMUL uinstrs so that they do a
NxN -> 2N multiply for N=8/16/32 bits, taking two TempRegs, which are
read as operands, and then have the double-length result written to them
both? This simplifies the code generation too since you can just generate
the NxN -> 2N x86 insn (IIRC; not sure if it is available for insns and
signedness)?
Or perhaps it's not worth the effort.
I've started to fix various end-user reported bugs, as part of
stabilisation efforts, as you'll see from the cvs mail.
Thx for your msg re meaning of new_emit, which just arrived.
J
|