From: Andreas R. <and...@gm...> - 2004-04-06 21:29:27
|
Hi Dan,

> There are a couple of cleanups like the size of the numbered
> primitive field and the layout of the format field in classes.
> I am assuming we'll just do those, and that they'll take a day
> or two each. Any problems with Tim's suggestion to drop the
> limit to 511?

Fine with me. One thing to keep in mind here: when the "split primitive
index" was introduced, we lost one bit in the method header which would
have allowed us to use up to 31 args for methods. The bytecodes to support
it are actually there, but the method header doesn't have that extra bit.
I'm not saying that we should necessarily use it, but we should be Really
Careful before assigning it to something else and later realizing that we
need it.

> Compact classes
> I put these in way back, and they do save almost a word per
> object in small kernel images. They complicate things a bit,
> but the code is already written, and it has been stable for
> nearly a decade now.
>
> On the other hand, the complexity is an inhibition to trying
> out various other cool hacks. Moreover if we dropped the compact
> class feature, we could pick up 5 more bits of object hash, which
> would definitely be nice. With regard to compactness, my experience
> is that the savings due to this feature is less than the difference
> between a well-designed kernel and a poor one, so I don't consider
> it a big issue. Even PDA's have plenty of space these days (yes,
> there's an upside to the Microsoft world ;-), and for extreme
> compactness issues (like web downloads), there are all sorts of
> other compactness strategies.
>
> I'm inclined to drop them and give the bits back to the hash,
> but it's only a mild inclination. How do you guys feel?

No strong opinions here. If we agree that the savings in size aren't worth
it, we should go with whatever is best for speed (both current and future,
such as a JIT, which might benefit from "tagging classes" via their cc
index).

> Immediate objects (tagging)
> We *could* do lots of things here -- enough so I start to get
> worried about doing too much. For instance, we could have
> immediate Floats of, say, 30 bits, and immediate Characters
> of 30 bits, in addition to our current immediate SmallIntegers.
>
> I'm inclined not to do any of these, but how do you feel?

A very clear "NO" to the idea of taking away a bit from SmallInteger.
30-bit floats are almost useless - if I had understood the trouble I was
getting into when I first encountered them in VW I would never have used
them ;-) Even 32-bit floats are almost always useless except in very
limited situations (say 3D stuff where accuracy really doesn't count all
that much).

Here is an alternative that I considered a while ago: instead of using
"another tag" bit we could declare a tag pattern which looks like:

    00 - OOP
    10 - Immediate
    x1 - SmallInteger

with "Immediate" being described by 6 further bits which specify the index
into the "immediate classes array" (precisely the same as compact classes).
This would leave 24 bits, for which I can think of four uses off the top of
my head:

* Colors (8 bits each)
* Immediate Points (signed 12 bits x/y)
* Wide characters (Unicode is defined as 21 bit; but Yoshiki really needs
  24)
* Scaled decimals (16.8 or 12.12)

The only problem with using the 10 tag is that the garbage collector
currently uses it - but I believe this is fixable. In any case, there is a
clear objection from my end against taking away a bit from SmallIntegers.

> Form bits modulus
> I want to change the modulus of Form bitmaps to 64 bits
> in both 32- and 64-bit images. It will simplify things
> for them to be the same, and I think we can convert old ones
> on the fly for compatibility (ie, when reading in a project,
> or an old '.form' file. Do you see any problems with this?

A key question is whether the "bits" of a 64-bit form would store 32-bit
quantities or 64-bit ones. If the latter, I'd say this is going to get
problematic. If the former, i.e., we merely pad the form bits so that pixel
rows are always 64-bit aligned, that should be fine.

> CompiledMethods
> A major question to decide is whether to split up bytecodes
> and literals (and source pointer) as in Tim's NewCompiledMethod
> work. This adds a little space overhead, but it certainly makes
> things simpler.
>
> And what about native methods support? Is there something simple
> that would suffice, or something better? This is a time when we
> could open some possibilities. If we had some idea, then at least
> we could put the right hooks in the image, even if we need to do
> some VM work to use it later.

I don't have a good understanding of the tradeoffs involved (Ian should
comment on this), so I'll just throw in a general thought: if we can (i.e.,
without being overly wasteful in terms of space), we should leave some
"unused" slot. Much of the experimentation we saw in the past was only
possible because such "unused" places existed (say, MethodContext's
receiverMap).

> Closures
> I'm just assuming we would load some form of Anthony's work,
> which probably also means using his compiler. Ian, are you
> happy with the structures he chose, or should we make some
> changes before loading them up? I'm assuming this is separable
> from the contiguous stack work, but maybe not. We probably
> want to start with something that doesn't change the existing
> context structures much.

I guess a key issue here is whether any changes in V4 could have a
long-term benefit for the closure work. As it stands, Anthony's stuff works
out of the box with a VM that has a few extra primitives, so it doesn't
seem to be intrinsically related to image format changes.

> Immutability bit
> This sounds intriguing. Has anyone here thought seriously
> about what is needed to support it?

Not very much. Obviously you need to put an "immutable object" into some
place where you have a cheap write barrier or at least don't have to check
on every store (so clearly it can't reside in young space). But other than
that, no.

> Monitor bit
> This was a concern of Alan's way back. Andreas - is this
> needed for any of the Croquet synchronization work?

Uh... what is a monitor bit? This is the first time I've heard the term ;-)

Cheers,
  - Andreas
|
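The 00 / 10 / x1 tag pattern proposed above can be sketched in C roughly as follows. This is only an illustration under stated assumptions: 32-bit words, the 6-bit class index placed directly above the two tag bits, and the 24-bit payload above that. The message does not fix the bit positions, and all names here are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout of a 32-bit word (bit positions are an assumption):
 *   ......00  object pointer (OOP)
 *   ......10  immediate: bits 2-7 = class index, bits 8-31 = payload
 *   .......1  SmallInteger: 31-bit signed value in bits 1-31
 */
typedef uint32_t oop_t;

typedef enum { KIND_OOP, KIND_IMMEDIATE, KIND_SMALL_INTEGER } oop_kind;

static oop_kind kind_of(oop_t w) {
    if (w & 1u) return KIND_SMALL_INTEGER;        /* x1 */
    return (w & 2u) ? KIND_IMMEDIATE : KIND_OOP;  /* 10 or 00 */
}

/* Build an immediate from a 6-bit class index and a 24-bit payload. */
static oop_t make_immediate(uint32_t class_index, uint32_t payload) {
    assert(class_index < 64u && payload < (1u << 24));
    return (payload << 8) | (class_index << 2) | 2u;
}

static uint32_t immediate_class_index(oop_t w) { return (w >> 2) & 0x3Fu; }
static uint32_t immediate_payload(oop_t w)     { return w >> 8; }

static oop_t make_small_integer(int32_t v) {
    return ((uint32_t)v << 1) | 1u;
}

static int32_t small_integer_value(oop_t w) {
    /* Right-shifting a negative value is implementation-defined in C;
       an arithmetic shift is assumed here. */
    return (int32_t)w >> 1;
}
```

Note that SmallInteger keeps its full 31 bits in this layout, which is the point of the proposal: the extra immediates are carved out of the pointer half of the tag space instead.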
From: Yoshiki O. <Yos...@ac...> - 2004-04-06 21:50:33
|
Dan and Andreas,

> > Compact classes
> > I put these in way back, and they do save almost a word per
> > object in small kernel images. They complicate things a bit,
> > but the code is already written, and it has been stable for
> > nearly a decade now.
> >
> > On the other hand, the complexity is an inhibition to trying
> > out various other cool hacks. Moreover if we dropped the compact
> > class feature, we could pick up 5 more bits of object hash, which
> > would definitely be nice. With regard to compactness, my experience
> > is that the savings due to this feature is less than the difference
> > between a well-designed kernel and a poor one, so I don't consider
> > it a big issue. Even PDA's have plenty of space these days (yes,
> > there's an upside to the Microsoft world ;-), and for extreme
> > compactness issues (like web downloads), there are all sorts of
> > other compactness strategies.
> >
> > I'm inclined to drop them and give the bits back to the hash,
> > but it's only a mild inclination. How do you guys feel?
>
> No strong opinions here. If we agree that the savings in size aren't worth
> it, we should go with whatever is best for speed (both, current and future
> such as a JIT - it might have a benefit to "tag classes" via their cc
> index).

I agree with this. A 1.5MB image with compact classes and a 2MB image
without compact classes (a made-up example) are pretty much in the same
ballpark.

> > Immediate objects (tagging)
> > We *could* do lots of things here -- enough so I start to get
> > worried about doing too much. For instance, we could have
> > immediate Floats of, say, 30 bits, and immediate Characters
> > of 30 bits, in addition to our current immediate SmallIntegers.
> >
> > I'm inclined not to do any of these, but how do you feel?
>
> A very clear "NO" to the idea of taking away a bit from SmallInteger. 30bit
> floats are almost useless - if I had understood the trouble I'm getting into
> when I first encountered them in VW I would've never used them ;-) Even
> 32bit floats are almost always useless except in very limited situations
> (say 3D stuff where accuracy really doesn't count all that much).

Yes. 30-bit (or 62-bit) floats would not save us much.

> Here is an alternative that I considered a while ago: Instead of using
> "another tag" bit we could declare a tag pattern which looks like:
>
> 00 - OOP
> 10 - Immediate
> x1 - SmallInteger
>
> With "Immediate" being described by 6 further bits which specify the index
> into the "immediate classes array" (precisely the same as compact classes).
> This would leave 24 bits which I could (out of my mind) think of 4
> situations in which they would be useful:
> * Colors (8 bits each)
> * Immediate Points (signed 12 bits x/y)
> * Wide characters (Unicode is defined as 21 bit; but Yoshiki really needs
> 24)
> * Scaled decimals (16.8 or 12.12)

I'd like to have 30 bits for a char.

I think Tim once suggested this, but how about using "01" for OOP and "00"
for SmallInteger? Some processors' addressing modes let us access
word-aligned memory with such a pointer, while "no tag" for SmallInteger
may save some bit operations.

(Well, I admit that the saving here is small compared to the value of
keeping compatibility with the 32-bit image format, so I won't push this
idea too much.)

-- Yoshiki
|
From: Ian P. <ian...@in...> - 2004-04-06 22:11:37
|
Hi Yoshiki,

> how about using "01" for OOP and "00" for SmallInteger? Some
> processor's addressing mode let us
> access the word-aligned memory with such pointer, while "no-tag" for
> SmallInteger may save some bit-operations.

The effectiveness of this depends on whether you can keep -1 around
permanently in a register to use as an index (or add it in as a constant)
on all pointer dereferencing operations. The G5 can do that (either of
them). I don't believe the Itanium can, however.

Cheers,
Ian
|
From: Andreas R. <and...@gm...> - 2004-04-06 22:35:50
|
Yoshiki,

> I'd like to have 30 bit for a char.

The scheme I proposed wouldn't get us there unless we reserved "10" for
characters, and that (for me) doesn't seem worth the hassle of dealing
with the "extra immediates". The point of the proposal was that, similar to
compact classes, there may be a number of immediate classes, and we would
definitely reserve some space so people would be able to define their own.
So there would always be "a few bits" taken away, and 24 made sense to me
because it is divisible by two as well as three and four (none of the
values above are), so you would have the option of using either 24, 2x12,
3x8, or 4x6 bits, which seems worthwhile if you take it together with the
(potentially) 63 immediate classes you could define.

> I think once Tim suggested this, but how about using "01" for OOP
> and "00" for SmallInteger? Some processor's addressing mode let us
> access the word-aligned memory with such pointer, while "no-tag" for
> SmallInteger may save some bit-operations.

I don't think that's a good idea. For one thing, the places where we use
tag checks today would still require them (no advantage there). For
another, if a processor doesn't have a way of ignoring the low bits you pay
significant penalties (disadvantage here). Lastly, I don't really see where
this would help any integer operations - most of them can simply drop the
low bits and would have to do this regardless of whether there's a one or a
zero in it (no clear advantage here). So the balance seems to tip toward
"don't do it if you are uncertain whether all processors can do it
efficiently".

Cheers,
  - Andreas
|
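To illustrate why a 24-bit payload divides up conveniently, here is a small C sketch of the 2x12 and 3x8 splits mentioned above (an immediate Point with signed 12-bit coordinates, and an 8-bit-per-channel color). The field order and the function names are assumptions made for the example, not part of the proposal.

```c
#include <assert.h>
#include <stdint.h>

/* 2 x 12 bits: pack two signed coordinates (-2048..2047) into 24 bits. */
static uint32_t pack_point(int32_t x, int32_t y) {
    assert(x >= -2048 && x <= 2047 && y >= -2048 && y <= 2047);
    return (((uint32_t)x & 0xFFFu) << 12) | ((uint32_t)y & 0xFFFu);
}

/* Sign-extend the low 12 bits of v to a full int32_t. */
static int32_t sign_extend_12(uint32_t v) {
    v &= 0xFFFu;
    return (int32_t)(v ^ 0x800u) - 0x800;
}

static int32_t point_x(uint32_t payload) { return sign_extend_12(payload >> 12); }
static int32_t point_y(uint32_t payload) { return sign_extend_12(payload); }

/* 3 x 8 bits: pack an RGB color into 24 bits. */
static uint32_t pack_rgb(uint32_t r, uint32_t g, uint32_t b) {
    assert(r < 256u && g < 256u && b < 256u);
    return (r << 16) | (g << 8) | b;
}
```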
From: Ian P. <ian...@in...> - 2004-04-06 22:41:01
|
On 07 Apr 2004, at 00:35, Andreas Raab wrote:

> Lastly, I don't really see where this would really help any integer
> operations - most of them can simply drop the low bits and would have
> to do this regardless of whether there's a one or a zero in it (no
> clear advantage here).

If you can do double-word products and then correct for the 4-bit "drift",
it saves you a cycle or two. Otherwise I agree.

Ian
|
From: Ian P. <ian...@in...> - 2004-04-06 22:48:48
|
On 06 Apr 2004, at 23:50, Yoshiki Ohshima wrote:

> how about using "01" for OOP
> and "00" for SmallInteger? Some processor's addressing mode let us
> access the word-aligned memory with such pointer, while "no-tag" for
> SmallInteger may save some bit-operations.

Just another data point: some Smalltalk implementations put the
SmallInteger tag in the topmost bit. This makes SI tag and overflow checks
after arithmetic simpler: addition and subtraction work in-place, plus you
can just look at the sign flag after the operation instead of "mask +
test-zero" or "shift + xor + sign-test".

On architectures where you can set the sign flag during move this can also
often eliminate any need to mask and test on the tag bit; after a move you
can "trap" immediately on (non-)SI oops.

Cheers,
Ian
|
From: Andreas R. <and...@gm...> - 2004-04-06 23:06:45
|
> Just another data point: some Smalltalk implementations put the
> SmallInteger tag in the topmost bit. This makes SI tag and overflow
> checks after arithmetic simpler: addition and subtraction work
> in-place, plus you can just look at the sign flag after the operation
> instead of "mask + test-zero" or "shift + xor + sign-test".
>
> On architectures where you can set the sign flag during move this can
> also often eliminate any need to mask and test on the tag bit; after a
> move you can "trap" immediately on (non-)SI oops.

Hm... if we'd accept cutting the address range in half (which isn't that
big of a deal) that is actually a pretty interesting idea. It would allow
us to use the low bits as "other tags" similar to immediates, say 00->oop,
01->GC flag, 10->Character, 11->Undefined, without too much bother.

My only concern is whether this would have any effect on machines that
"like to give you the upper half of the address space" (i.e., all addresses
have that "tag bit" set).

Cheers,
  - Andreas
|
From: Ian P. <ian...@in...> - 2004-04-06 23:20:22
|
On 07 Apr 2004, at 01:06, Andreas Raab wrote:

> The only objection I'd have is whether this would have any effects on
> machines that "like to give you the upper half of the address space"
> (e.g., all addresses must have that "tag bit" set).

I've never met one of these (outside of 32-bit address spaces). This would
only be a serious problem on machines that weren't consistent about whether
the top bit was set (consider: top N bits of virtual address tell you what
kind of segment the memory is allocated in).

Given that we've only (realistically) got 3 s/w architectures to worry
about, the experiment would be trivial to perform... (I'll do so as soon as
I get the chance.)

Cheers,
Ian
|
From: Tim R. <ti...@su...> - 2004-04-06 23:28:25
|
In message <F8D...@in...>
Ian Piumarta <ian...@in...> wrote:

> On 07 Apr 2004, at 01:06, Andreas Raab wrote:
>
> > The only objection I'd have is whether this would have any effects on
> > machines that "like to give you the upper half of the address space"
> > (e.g., all addresses must have that "tag bit" set).
>
> I've never met one of these (outside of 32-bit address spaces). This
> would only be serious problem on machines that weren't consistent about
> whether the top bit was set (consider: top N bits of virtual address
> tell you what kind of segment the memory is allocated in).
>
> Given that we've only (realistically) got 3 s/w architectures to worry
> about, the experiment would be trivial to perform... (I'll do so as
> soon as I get the chance.)

RISC OS uses the upper half of memory quite happily. Mess up the VM for my
OS and I'll be popping around for a vigorous discussion...

Windows, Mac and *nix are not the only systems around, and even if they
were, the chance of none of them ever changing to use top-bit-set memory
addresses seems to be zero.

tim
--
Tim Rowledge, ti...@su..., http://sumeru.stanford.edu/tim
Strange OpCodes: SEXI: Sign EXtend Integer
|
From: Ian P. <ian...@in...> - 2004-04-06 23:37:33
|
On 07 Apr 2004, at 01:27, Tim Rowledge wrote:

> In message <F8D...@in...>
> Ian Piumarta <ian...@in...> wrote:
>>
>> I've never met one of these (outside of 32-bit address spaces).
>
> Windows, Mac and *nix are not the only systems around

Oh, just go install NetBSD on your ARM and never look back. ;)

> and even if they
> were the chances of none of them ever changing to use top-bit-set
> memory addresses seems zero.

I had accidentally slipped into 64-bit mode. (The clue was inside the
parentheses.)

Besides, with a little healthy abstraction here and there, the tag bit(s)
can go anywhere the target wants them the most...

Ian
|
From: Ned K. <ne...@bi...> - 2004-04-07 22:55:00
|
On Tuesday 06 April 2004 3:48 pm, Ian Piumarta wrote:

> On 06 Apr 2004, at 23:50, Yoshiki Ohshima wrote:
> > how about using "01" for OOP
> > and "00" for SmallInteger? Some processor's addressing mode let us
> > access the word-aligned memory with such pointer, while "no-tag" for
> > SmallInteger may save some bit-operations.
>
> Just another data point: some Smalltalk implementations put the
> SmallInteger tag in the topmost bit. This makes SI tag and overflow
> checks after arithmetic simpler: addition and subtraction work
> in-place, plus you can just look at the sign flag after the operation
> instead of "mask + test-zero" or "shift + xor + sign-test".
>
> On architectures where you can set the sign flag during move this can
> also often eliminate any need to mask and test on the tag bit; after a
> move you can "trap" immediately on (non-)SI oops.

On a related note, does it seem wasteful to anyone but me that we do the
following in primBytecodeAdd:

    int rcvr;
    int arg;
    rcvr = longAt(localSP - (1 * 4));
    arg = longAt(localSP - (0 * 4));
    if (((rcvr & arg) & 1) != 0) /* areIntegers: rcvr and: arg */
    {
        result = ((rcvr >> 1)) + ((arg >> 1));
        if ((result ^ (result << 1)) >= 0) /* isIntegerValue: result */
        {
            /* begin internalPop:thenPush: */
            longAtput(localSP -= (2 - 1) * 4, ((result << 1) | 1));
            /* begin fetchNextBytecode */
            currentBytecode = byteAt(++localIP);
            goto l9;
        }
    }
    else
    {
        /* Try to add them as a float */
        /* If success, we're done. Get the next bytecode and loop */
    }
    /* otherwise, do a normal send */

For a total operation count (not counting the stack load/store) of:

    1 AND
    2 bit tests
    2 right shifts
    2 left shifts
    1 OR
    1 XOR
    1 ADD

when for the majority of additions (those that don't overflow 31 bits) we
only have to add the top two values (resetting one low bit first so that we
don't get a carry from B0 to B1).

Seems like we could save the shifts in most cases by looking at the top two
bits of the receiver and argument; if the sign bits are different or the
high bits (B30) are both the same as the sign bits we aren't going to get
any overflow.

--
Ned Konz
http://bike-nomad.com
GPG key ID: BEEA7EFE
|
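The two approaches can be compared in a self-contained C sketch. It assumes 32-bit tagged words with the SmallInteger tag in bit 0, as in the code above; the helper names are made up for the example and this is not the interpreter's actual code. The in-place variant uses the pre-check described in the message, which is conservative: some sums that would fit (for example 2^29 + 1) are still sent down the slower path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef int32_t oop_t;  /* SmallInteger v is stored as (v << 1) | 1 */

static oop_t tag_int(int32_t v) {
    return (oop_t)(((uint32_t)v << 1) | 1u);
}
static int32_t untag_int(oop_t o) {
    return o >> 1;  /* assumes an arithmetic right shift */
}

/* True when bit 31 (sign) and bit 30 of the word are equal. */
static bool top_two_bits_equal(oop_t o) {
    return (int32_t)((uint32_t)o ^ ((uint32_t)o << 1)) >= 0;
}

/* Shift-based path: untag, add, range-check the result, retag. */
static bool add_with_shifts(oop_t rcvr, oop_t arg, oop_t *out) {
    if (((rcvr & arg) & 1) == 0) return false;  /* not both SmallIntegers */
    int32_t result = (rcvr >> 1) + (arg >> 1);
    if (!top_two_bits_equal(result)) return false;  /* needs > 31 bits */
    *out = tag_int(result);
    return true;
}

/* In-place path: check the operands' top bits, then add the tagged words. */
static bool add_in_place(oop_t rcvr, oop_t arg, oop_t *out) {
    if (((rcvr & arg) & 1) == 0) return false;
    bool signs_differ = (rcvr ^ arg) < 0;
    if (!signs_differ &&
        !(top_two_bits_equal(rcvr) && top_two_bits_equal(arg)))
        return false;  /* might overflow: caller takes the slower path */
    /* (2x+1) + (2y+1) - 1 == 2(x+y) + 1, so the tag bit survives. */
    *out = (oop_t)((uint32_t)rcvr + (uint32_t)arg - 1u);
    return true;
}
```

Both functions return false where the interpreter would fall through to the float attempt or the normal send.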
From: Ian P. <ian...@in...> - 2004-04-08 00:27:01
|
On 08 Apr 2004, at 00:54, Ned Konz wrote:

> On a related note, does it seem wasteful to anyone but me that we do
> the following in primBytecodeAdd:

Probably not. ;) You introduce an additional branch into the critical
path, by checking both operands for overflow instead of checking just the
result.

> Seems like we could save the shifts in most cases by looking at the
> top two bits of the receiver and argument; if the sign bits are
> different or the high bits (B30) are both the same as the sign bits we
> aren't going to get any overflow.

You end up with exactly the same number of instructions anyway.

Current version:

    lwz     r3,0xfffc(r27)
    lwz     r4,0(r27)
    and     r28,r3,r4
    andi.   r9,r28,0x1
    beq     <fail>
    srawi   r5,r3,1
    srawi   r0,r4,1
    add     r4,r5,r0
    rlwinm  r2,r4,1,0,30
    xor.    r9,r4,r2
    blt     <fail>
    ori     r6,r2,0x1
    stwu    r6,0xfffc(r27)
    <dispatch>

Nedified version:

    lwz     r3,0xfffc(r27)
    lwz     r4,0(r27)
    xor.    r0,r3,r4
    blt     <fail>
    rlwinm  r5,r3,1,0,30
    xor.    r2,r5,r3
    blt     <fail>
    rlwinm  r6,r4,1,0,30
    xor.    r2,r6,r4
    blt     <fail>
    add     r7,r3,r4
    addi    r3,r7,0xffff
    stwu    r3,0xfffc(r27)
    <dispatch>

While it probably won't impact speed on a decent implementation of the CPU
(the additional branch will be predicted correctly) it won't increase speed
either (you haven't reduced the overall number of data hazards the pipeline
has to deal with).

Cheers,
Ian
|
From: Ian P. <ian...@in...> - 2004-04-08 01:02:44
|
On 08 Apr 2004, at 02:39, Andreas Raab wrote:

> Which reminds of something else we were talking about in the past:
> Passing primitive arguments as C arguments instead of the Smalltalk
> stack.

Which reminds me of something else Dan & I talked about in the past:
evaluating arguments from right to left. Saves an awful lot of tedious
peeking into the middle of the stack to pick up the receiver. (Combined
with the above, potentially wins Really Big for 386 too. OTOH, the
tradeoffs for register architectures are a little more complex.)

Cheers,
Ian
|
From: Andreas R. <and...@gm...> - 2004-04-08 01:46:30
|
Which reminds me of something totally unrelated but potentially *hugely*
helpful:

How about if we disable GC in primitives?

This idea came back recently when we were talking about chasing GC
problems - I don't even want to know how many places we have that aren't GC
safe. And I wonder if it's even worthwhile to do GC in primitives. If it
is, we could still have a flag that basically "turns GC back on" (and this
could be the default for quick-indexed primitives). Or maybe we just turn
it off for any kind of named primitives.

Thoughts?

  - Andreas
|
From: Tim R. <ti...@su...> - 2004-04-08 02:12:50
|
In message <066701c41d0b$4c545ee0$b2d0fea9@R22>
"Andreas Raab" <and...@gm...> wrote:

> Which reminds about something totally unrelated but potentially *hugely*
> helpful:
>
> How about if we disable GC in primitives?

From my recent audit to find places that needed looking at for messing
with the 'interrupt check right now', I found that very few numbered prims
can trigger a GC. Most of those that could ought to be rewritten to fail
and let the image work it out. The nastiest case I can remember is when
sending a message and trying to allocate a context; run out of memory there
and there probably isn't much that can be done. Perhaps automatically send
an email to RamChipsRUs.com?

I'd suggest that the VM changes are pretty small but ought to include the
ability to pass back an error value (fortunately I have ancient code to do
that sitting somewhere) so the image knows what the problem was and does
the smart thing.

tim
--
Tim Rowledge, ti...@su..., http://sumeru.stanford.edu/tim
Old programmers never die; they just branch to a new address.
|
From: Andreas R. <and...@gm...> - 2004-04-08 10:08:21
|
> I'd suggest that the VM changes are pretty small but ought to include
> the ability to pass back an error value (fortunately I have ancient
> code to do that sitting somewhere) so the image knows what the problem
> was and does the smart thing.

Yes, that is a "must do" in my understanding. Thanks for reminding. I think
we might just extend #primitiveFail to include an "error reason", so you
would use it via:

    interpreterProxy->primitiveFail(ERROR_BAD_ARGUMENT);

or somesuch.

Cheers,
  - Andreas
|
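A minimal sketch of what a failure call with an "error reason" could look like from the primitive's side. Everything here is a stand-in: the struct is not the real interpreter proxy, the error codes other than the ERROR_BAD_ARGUMENT name used above are invented, and how the reason would reach the image is left open.

```c
#include <assert.h>

/* Hypothetical reason codes; only ERROR_BAD_ARGUMENT is named in the thread. */
enum { PRIM_OK = 0, ERROR_BAD_ARGUMENT = 1, ERROR_NO_MEMORY = 2 };

/* Minimal stand-in for the interpreter proxy (not the real Squeak struct). */
typedef struct {
    int primFailCode;                  /* PRIM_OK means "did not fail" */
    int (*primitiveFail)(int reason);  /* extended with a reason argument */
} InterpreterProxy;

static InterpreterProxy proxy;
static InterpreterProxy *interpreterProxy = &proxy;

/* Record why the primitive failed so the image could inspect it later. */
static int primitiveFailWithReason(int reason) {
    assert(reason != PRIM_OK);
    interpreterProxy->primFailCode = reason;
    return 0;
}

static void initProxy(void) {
    proxy.primFailCode = PRIM_OK;
    proxy.primitiveFail = primitiveFailWithReason;
}

/* Example primitive: integer division that reports a bad divisor. */
static int primitiveDivide(int numerator, int divisor) {
    if (divisor == 0)
        return interpreterProxy->primitiveFail(ERROR_BAD_ARGUMENT);
    return numerator / divisor;
}
```

The fallback Smalltalk code could then branch on the recorded reason instead of guessing why the primitive failed.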
From: Tim R. <ti...@su...> - 2004-04-08 02:21:50
|
In message <6F7...@in...>
Ian Piumarta <ian...@in...> wrote:

> On 08 Apr 2004, at 02:39, Andreas Raab wrote:
>
> > Which reminds of something else we were talking about in the past:
> > Passing primitive arguments as C arguments instead of the Smalltalk
> > stack.

That could save time in the primitive code, but what would it cost in the
calling code? Outside of a translator, how would we load up the registers
(and think of the platform differences in which registers and how many
etc)? Wouldn't it end up with primitiveResponse looking like

    switch(numArgs) {
    case 1: (prim)(*sp); break;
    case 2: (prim)(*sp, *(sp - 1)); break;
    etc

which surely wouldn't net much benefit?

> Which reminds me of something else Dan & I talked about in the past:
> evaluating arguments from right to left.

You mean having rcvr as TOS at prim call time? What benefit does that
have? Is there something special about the TOS value on x86?

tim
--
Tim Rowledge, ti...@su..., http://sumeru.stanford.edu/tim
Useful Latin Phrases:- Utinam coniurati te in foro interficiant! = May
conspirators assassinate you in the mall!
|
From: Ian P. <ian...@in...> - 2004-04-08 02:32:21
|
On 08 Apr 2004, at 04:20, Tim Rowledge wrote:

> In message <6F7...@in...>
> Ian Piumarta <ian...@in...> wrote:
>
>> On 08 Apr 2004, at 02:39, Andreas Raab wrote:
>>
>>> Which reminds of something else we were talking about in the past:
>>> Passing primitive arguments as C arguments instead of the Smalltalk
>>> stack.
>
> That could save time in the primitive code, but what would it cost in
> the calling code? Outside of a translator, how would we load up the
> registers (and think of the platform differences in which registers and
> how many etc)?

Why did I say: "the tradeoffs for register architectures are more complex"?

> You mean having rcvr as TOS at prim call time? What benefit does that
> have ?

You don't need to know how many arguments are passed to know where the
receiver is. If you want to do anything more interesting than dumb
interpretation, chances are you end up having the receiver right where you
need it just when you discover it's time to dynamic bind. Makes inlining
statically-bound sends, not to mention deleting that useless frame pointer,
a whole lot easier. Too lazy to think of any more excuses.

(No, nothing special about TOS on x86. The only thing there was R-2-L eval
+ C ABI args gives you zero-copy callout to prims.)

Ian
|
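The point about not needing the argument count can be shown with a toy stack in C. The sketch assumes a stack that grows upward with `sp` pointing at the top element; the function names are invented for the illustration.

```c
#include <assert.h>

typedef int oop_t;

/* Left-to-right evaluation: the receiver is pushed first, so it sits
   below the arguments and finding it requires the argument count. */
static oop_t receiver_left_to_right(const oop_t *sp, int numArgs) {
    assert(numArgs >= 0);
    return sp[-numArgs];
}

/* Right-to-left evaluation: the last argument is pushed first and the
   receiver last, so the receiver is always the top of stack. */
static oop_t receiver_right_to_left(const oop_t *sp) {
    return sp[0];
}
```

For a two-argument send, a left-to-right stack holds `rcvr, arg1, arg2` from bottom to top, while a right-to-left stack holds `arg2, arg1, rcvr`.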
From: Andreas R. <and...@gm...> - 2004-04-08 22:47:21
|
Tim,

> That could save time in the primitive code, but what would it cost in
> the calling code? Outside of a translator, how would we load up the
> registers (and think of the platform differences in which registers and
> how many etc)? Wouldn't it end up with primitiveResponse looking like
>
> switch(numArgs) {
> case 1: (prim)(*sp); break;
> case 2: (prim)(*sp, *s--p); break;
> etc
> which surely wouldn't net much benfit?

The idea was that you would specify primitives with arguments (like FFI
and/or SmartInterpreterplugin) and then glue gets generated for the
interpreter. So it would actually look somewhat along the lines of:

    int gluePrimitiveAt() {
        return primitiveAt(stackTop());
    }

    int primitiveAt(oop index) {
        /* yaddaya */
    }

Cheers,
  - Andreas
|
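Here is a slightly fuller, self-contained sketch of the glue idea, with a toy operand stack standing in for the Smalltalk stack. The stack helpers and the `primitiveAdd` example are assumptions made for illustration; the real generator and its output are not shown in the thread.

```c
#include <assert.h>

typedef int oop;

/* Toy operand stack standing in for the Smalltalk stack. */
static oop stack[16];
static int sp = -1;  /* index of the top element, -1 when empty */

static void push(oop v) {
    assert(sp < 15);
    stack[++sp] = v;
}

/* stackValue(0) is the top of stack, stackValue(1) the one below it. */
static oop stackValue(int n) {
    assert(n >= 0 && n <= sp);
    return stack[sp - n];
}

static void popThenPush(int n, oop v) {
    assert(n >= 0 && n <= sp + 1);
    sp -= n;
    stack[++sp] = v;
}

/* The primitive proper only sees plain C arguments. */
static oop primitiveAdd(oop rcvr, oop arg) {
    return rcvr + arg;
}

/* Glue of the kind a generator might emit: fetch the operands from the
   stack, call the primitive, and replace them with the result. */
static void gluePrimitiveAdd(void) {
    oop result = primitiveAdd(stackValue(1), stackValue(0));
    popThenPush(2, result);
}
```

With this split, the stack bookkeeping lives in one generated place per primitive, and the hand-written primitive body never touches the stack pointer.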
From: John M M. <jo...@sm...> - 2004-04-08 04:25:56
|
Well, we only trigger a GC because of allocation count, or some other
condition, or if in fact we've run out of memory. Certainly I think you
could change that to allow growth of the image if the VM supports it, but
not to trigger GC activity. If growth fails and we can't find the memory
then exit to the shell, I'd guess. I seem to recall we aren't very good at
post-checking object allocation in primitives and handling failure cases,
so failure (write the stack to stdout) is ok.

We could then get rid of the remapping logic, I'd guess, that handles the
current messy details of oops moving during allocation in prims.

On Apr 7, 2004, at 6:46 PM, Andreas Raab wrote:

> How about if we disable GC in primitives?

--
========================================================================
===
John M. McIntosh <jo...@sm...> 1-800-477-2659
Corporate Smalltalk Consulting Ltd. http://www.smalltalkconsulting.com
========================================================================
===
|
From: John M M. <jo...@sm...> - 2004-04-08 04:40:53
|
Tim just posted a note about the Processor yield prim call on the mailing
list, which reminded me of a change I did a few years back to collect
Process dispatch time as part of the VM scheduler. The current processor
watcher can only estimate that value and it requires quite a bit of
overhead, whereas it could be performed at little cost within the VM yield
logic.

--
========================================================================
===
John M. McIntosh <jo...@sm...> 1-800-477-2659
Corporate Smalltalk Consulting Ltd. http://www.smalltalkconsulting.com
========================================================================
===
|
From: <gor...@bl...> - 2004-04-08 07:01:18
|
John M McIntosh <jo...@sm...> wrote:

> Could then get rid of the remapping logic I'd guess that handles the
> current messy details of oops moving during allocation in prim.

Aargh! And I just learned how to use that stuff! :) :)

(My GtkPlugin creates an Array of ByteArrays for my little callback
mechanism etc, it sure took me a while to get all those pushes and pops in
the right order...)

regards, Göran
|
From: Andreas R. <and...@gm...> - 2004-04-08 09:46:57
|
John,

Yup, that's pretty much what I had in mind, e.g.,

* when we go into a primitive we set a disableGC flag
* if there is allocation we fall through in sufficientSpaceToAllocate:
  (possibly growing mem if needed but not GCing)
* upon primitive return we do the allocation check and GC if necessary

That's all. We would effectively continue to run everything else as it is
today, and the red zone we have for signaling low-space as well as the
ability to grow/shrink will be able to deal with the remaining situations.

Effectively, all we need to do is to make sure we have "enough headroom for
the primitive", and I would be surprised if we *ever* had more than 1k
allocation per primitive except in #new: - and those primitives might be
marked "gc-safe" to begin with (e.g., resetting the disableGC flag and
dealing with remapping).

This would add an extra check at the end of primitive returns but to me,
this is acceptable if I consider all the potential, as yet undiscovered GC
hazards we have right now. And heck, we might be able to hack this right
away... really there is no need to wait for V4 to get this going.

And yes, the point would be to get away from the messy remappings - I was
recently reviewing some primitive code and not surprisingly I found three
potential GC problems in the one method I was looking at. I think GC
problems are the single biggest issue we have for writing prims, followed
by argument passing and stack imbalance.

Cheers,
  - Andreas

----- Original Message -----
From: "John M McIntosh" <jo...@sm...>
To: <squ...@li...>
Sent: Thursday, April 08, 2004 6:25 AM
Subject: Re: [Squeak-VMdev] Versiojn 4 changes

> Well we only trigger a GC because of allocation count, or some other
> condition, or if in fact we've run out of memory.
> Certainly I think you could change that to allow growth of the image if
> the VM supports it, but not to trigger GC activity.
> If growth fails and we can't find the memory then exit to shell I'd
> guess.
> I seem to recall we aren't very good at post checking object
> allocation in primitives and handling failure cases so failure (write
> the stack to stdout) is ok.
>
> Could then get rid of the remapping logic I'd guess that handles the
> current messy details of oops moving during allocation in prim.
>
> On Apr 7, 2004, at 6:46 PM, Andreas Raab wrote:
>
> > Which reminds me about something totally unrelated but potentially
> > *hugely* helpful:
> >
> > How about if we disable GC in primitives?
> >
> > This idea came back recently when we were talking about chasing GC
> > problems - I don't even want to know how many places we have that
> > aren't GC safe. And I wonder if it's even worthwhile to do this in
> > primitives. If it is, we could still have a flag that basically
> > "turns GC back on" (and this could be the default for quick-indexed
> > primitives). Or maybe we just turn it off for any kind of named
> > primitives.
> >
> > Thoughts?
> >
> > - Andreas
> >
> > ----- Original Message -----
> > From: "Ian Piumarta" <ian...@in...>
> > To: "Andreas Raab" <and...@gm...>
> > Cc: <squ...@li...>
> > Sent: Thursday, April 08, 2004 3:02 AM
> > Subject: Re: [Squeak-VMdev] Versiojn 4 changes
> >
> >> On 08 Apr 2004, at 02:39, Andreas Raab wrote:
> >>
> >>> Which reminds me of something else we were talking about in the
> >>> past: Passing primitive arguments as C arguments instead of the
> >>> Smalltalk stack.
> >>
> >> Which reminds me of something else Dan & I talked about in the past:
> >> evaluating arguments from right to left. Saves an awful lot of
> >> tedious peeking into the middle of the stack to pick up the receiver.
> >> (Combined with the above, potentially wins Really Big for 386 too.
> >> OTOH, the tradeoffs for register architectures are a little more
> >> complex.)
> >>
> >> Cheers,
> >> Ian
> >>
> >
> > _______________________________________________
> > Squeak-VMdev mailing list
> > Squ...@li...
> > https://lists.sourceforge.net/lists/listinfo/squeak-vmdev
>
> --
> ===========================================================================
> John M. McIntosh <jo...@sm...> 1-800-477-2659
> Corporate Smalltalk Consulting Ltd. http://www.smalltalkconsulting.com
> ===========================================================================
|
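[Editorial illustration] The three steps Andreas lists at the top of his message (set a disableGC flag on primitive entry, let allocation grow memory instead of collecting, run the deferred check on return) can be sketched roughly as below. This is a toy model in C, not code from the Squeak VM: the Heap struct, its fields, growMemory, collectGarbage, callPrimitive, and examplePrimitive are all hypothetical names, and the "collector" merely halves a byte counter. Only the name sufficientSpaceToAllocate is borrowed from the message, and its behaviour here is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy heap; every name here is invented for illustration. */
typedef struct {
    size_t heapLimit;     /* current end of usable memory                   */
    size_t maxHeapLimit;  /* hard cap on how far memory may grow            */
    size_t allocated;     /* bytes currently in use                         */
    size_t gcThreshold;   /* usage level that normally triggers a GC        */
    bool   disableGC;     /* set while a primitive is running               */
    bool   gcPending;     /* a GC was wanted but deferred                   */
    int    gcCount;       /* collections performed so far                   */
} Heap;

/* Stand-in collector: pretends half of the heap was garbage. */
static void collectGarbage(Heap *h) {
    h->allocated /= 2;
    h->gcCount++;
    h->gcPending = false;
}

/* Grow memory so that `needed` bytes fit, if the cap allows it. */
static bool growMemory(Heap *h, size_t needed) {
    if (needed > h->maxHeapLimit) return false;
    if (needed > h->heapLimit) h->heapLimit = needed;
    return true;
}

/* Step 2: with disableGC set, fall through to growing, never to GC. */
static bool sufficientSpaceToAllocate(Heap *h, size_t bytes) {
    if (h->allocated + bytes <= h->heapLimit) return true;
    if (!h->disableGC) {
        collectGarbage(h);
        if (h->allocated + bytes <= h->heapLimit) return true;
    }
    return growMemory(h, h->allocated + bytes);
}

static bool allocate(Heap *h, size_t bytes) {
    if (!sufficientSpaceToAllocate(h, bytes)) return false;
    h->allocated += bytes;
    if (h->allocated >= h->gcThreshold) {
        if (h->disableGC) h->gcPending = true;  /* defer until return */
        else collectGarbage(h);
    }
    return true;
}

typedef bool (*Primitive)(Heap *);

/* Steps 1 and 3: set the flag on entry, do the deferred check on return. */
static bool callPrimitive(Heap *h, Primitive prim) {
    h->disableGC = true;
    bool ok = prim(h);
    h->disableGC = false;
    if (h->gcPending) collectGarbage(h);
    return ok;
}

/* Example primitive: allocates 1k and records how many GCs it has seen. */
static int gcCountSeenInsidePrimitive = -1;

static bool examplePrimitive(Heap *h) {
    bool ok = allocate(h, 1024);
    gcCountSeenInsidePrimitive = h->gcCount;
    return ok;
}
```

In this model no collection can run between primitive entry and return, so nothing the primitive holds would need remapping; the cost is the one extra gcPending check in callPrimitive. A "gc-safe" primitive such as #new: would, under the same assumptions, simply clear disableGC itself before allocating.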
From: Ian P. <ian...@in...> - 2004-04-08 12:37:34
|
On 08 Apr 2004, at 11:46, Andreas Raab wrote:

> Yup, that's pretty much what I had in mind, e.g.,
>
> * when we go into a primitive we set a disableGC flag

The last two (maybe three) attempts at jitter did precisely this (the flag
was called by exactly that name). It was less important for the primitives
than it was for internal mechanisms (such as flushing a volatile context
into the heap during return, which is a real pain if you have to keep every
single pointer remappable over every context allocation when flushing).

Cheers,
Ian
|
From: Andreas R. <and...@gm...> - 2004-04-08 13:14:17
|
> The last two (maybe three) attempts at jitter did precisely this (the
> flag was called by exactly that name). It was less important for the
> primitives than it was for internal mechanisms (such as flushing a
> volatile context into the heap during return, which is a real pain if
> you have to keep every single pointer remappable over every context
> allocation when flushing).

I take that as a clear "yes", right?! ;-)

Cheers,
  - Andreas
|