From: Ian P. <ian...@in...> - 2004-04-08 23:27:12
|
Hi Andreas,
On 09 Apr 2004, at 01:10, Andreas Raab wrote:
>> just one simple, consistent, linear thing.
>
> Hm... this would actually be a strong argument against introducing more
> immediates. They would have similar properties wouldn't they?
Yes, absolutely. (At least in the lower magnitudes of the price/performance equation.)
> Not "64 bits deep" but "64 bits aligned".
Gotcha. Thanks for the clarification.
> SmaCC.
Anyone know if SmaCC-generated parsers can "coroutine" (not really, but you get the idea) to each other (or coexist in parallel under a common abstract scanner) with the same source stream?
Thanks!
Ian
|
|
From: Andreas R. <and...@gm...> - 2004-04-08 23:10:13
|
Hi Ian,
[Compact classes]
> If minimising (Context) membership test time by keeping
> CCs around is our prime goal, then I'm entirely against
> it. It would make certain classes "privileged", in that
> you can *never* make them non-compact without breaking
> your VM. Otherwise it means the VM can never use CC
> fields exclusively to check for membership (it would
> still have to load the class generically and test
> pointer eqv). This kind of defeats the argument for
> keeping CCs around -- entirely.
Agreed. Let's kill them if we don't worry about the space savings then.
> I actually think it would make for the fastest possible execution
> if compact classes just went away leaving all class pointers
> explicit in the method header. No bit fields to extract.
> No privileged CC indices to "secretly know about". No
> branch-over-check-for-CC-index that might be zero requiring
> a punt to class header load and compare. Instead:
> just one simple, consistent, linear thing.
Hm... this would actually be a strong argument against introducing more immediates. They would have similar properties wouldn't they?
[Forms]
> Something is itching me about making Forms 64-bits deep while
> retaining 31-bit integers. I can't really put my finger on it,
> but the phrase "LargeInteger explosion" is floating around my
> head for some reason...
Not "64 bits deep" but "64 bits aligned". The form bits still contain 32-bit integers (otherwise we'd pay horrible prices in 32-bit BitBlt versions).
[Anthony's closure work]
> Is this a modified St-80 Compiler, or one based on SmaCC?
SmaCC.
Cheers,
- Andreas
|
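To make the "64 bits aligned" point concrete, here is a minimal sketch of how padding each scan line to a 32-bit or a 64-bit boundary changes the raster width while the bitmap stays an array of 32-bit words. The helper name `words_per_row` and its parameters are made up for illustration and are not taken from the Squeak VM or BitBlt sources.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the Squeak sources): number of 32-bit
   words in one scan line when every line is padded to a multiple of
   align_bits. The pixels themselves are still stored in 32-bit words. */
static size_t words_per_row(size_t width_px, size_t depth_bits, size_t align_bits)
{
    assert(align_bits % 32 == 0 && align_bits > 0);
    size_t row_bits = width_px * depth_bits;
    size_t padded   = (row_bits + align_bits - 1) / align_bits * align_bits;
    return padded / 32;
}
```

Under these assumptions a 10-pixel-wide, 8-bit-deep Form needs 3 words per line with 32-bit padding and 4 words per line with 64-bit padding; widths that are already a multiple of 64 bits are unaffected.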
|
From: Andreas R. <and...@gm...> - 2004-04-08 23:03:58
|
Dan,
This sounds all good to me. Couple of extra notes:
> Order of evaluation
> I know we've chatted about this in the past, and believe me
> I *want* to do this. Unfortunately, I think for Squeak it
> would be a mistake. There is a fair amount of useful
> software (mainly compilers) in the community that has
> the left-to-right convention built in. Then there is the
> occasional bad usage of the form (strm next * 256 + strm next),
> but they deserve what they get.
Uh, ah. That wasn't quite clear to me. Would this mean that, say:
self foo: self arg1
bar: self arg2
baz: self arg3.
Changes the order of evaluation for arg1-arg3? Would this be unavoidable? If
so I'm absolutely against changing order of evaluation to right-to-left.
Reading direction is *so* important for people that evaluating the above
"the other way around" would almost certainly lead to the most horrible
subtle bugs.
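The stream idiom quoted above has a close C analogue, which may help show why code that depends on operand order is fragile. In C the two operands of `+` may be evaluated in either order, so `next_byte(s) * 256 + next_byte(s)` is not guaranteed to read the high byte first. The sketch below sequences the reads explicitly; `ByteStream`, `next_byte` and `read_u16_be` are illustrative names only, not anything from Squeak.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stream type; not part of any Squeak or VM API. */
typedef struct {
    const uint8_t *bytes;
    size_t pos;
} ByteStream;

static unsigned next_byte(ByteStream *s)
{
    return s->bytes[s->pos++];
}

/* Reads a big-endian 16-bit value. Each read is a separate statement,
   so the result does not depend on operand evaluation order. */
static unsigned read_u16_be(ByteStream *s)
{
    unsigned hi = next_byte(s);
    unsigned lo = next_byte(s);
    return hi * 256 + lo;
}
```

Code written this way keeps working whichever evaluation order the language or VM settles on.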
> Decoupling GC from Primitives
> This seems like an all-around win, although not really V4-related.
> I would love it if someone would take it upon themselves to spec
> this and estimate difficulty, so we can review it, and then actually
> do it.
Sounds good. I'm not sure I'll find the time to do it myself but I'll see if
I can stick it in one lazy afternoon (say, wasn't there an easter weekend
somewhere? hm...)
> Separate Allocation of some Bits Objects
> I understand that this one matters. I'm just the wrong person
> to spec it or do it (since I know relatively little about media
> engines and OS resource management). Most other serious
> implementations have such a facility, and I'm sure it would
> help with any serious media work. I'm happy to manage it as
> part of this project.
The "most important" thing is to have some field in the image header that's
set to zero - then we got plenty of time to figure out what we stick in when
we have it :-)
> Flat Rectangles
> I never wrote about this, but I came very close to doing
> this in the early history of Squeak. It saves time getting
> to BitBlt, doing intersections, etc, and it saves a little
> space, too. I wouldn't do this for me right now, but I'm
> wondering if Mr. Graphics has ever wished for it.
> I think it's not a hugely hard change to make (at least it
> wasn't back then).
Hm... I don't think there are too many advantages to it. BitBlt is the only
object in the image that uses a "fat" primitive interface - almost all other
prims dealing with rectangles just pass points or numbers verbatim. In
short, I've never truly wished for it, but I don't think it would do any
harm either. The most important part might be the space savings (avoids two
object headers for points) but that's about it.
Cheers,
- Andreas
|
|
From: Andreas R. <and...@gm...> - 2004-04-08 22:47:21
|
Tim,
> That could save time in the primitive code, but what would it cost in
> the calling code? Outside of a translator, how would we load up the
> registers (and think of the platform differences in which registers and
> how many etc)? Wouldn't it end up with primitiveResponse looking like
>
> switch(numArgs) {
> case 1: (prim)(*sp); break;
> case 2: (prim)(*sp, *(sp - 1)); break;
> etc
> which surely wouldn't net much benefit?
The idea was that you would specify primitives with arguments (like FFI
and/or SmartInterpreterplugin) and then glue gets generated for the
interpreter. So it would actually look somewhat along the lines of:
int gluePrimitiveAt() {
return primitiveAt(stackTop());
}
int primitiveAt(oop index) {
/* yaddaya */
}
Cheers,
- Andreas
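A slightly fuller sketch of the glue idea, under stated assumptions: the primitives take ordinary C arguments, and a generator emits one small glue function per primitive that pulls the arguments off the interpreter stack. The toy stack, `push`, `stackValue` and the placeholder primitive bodies below are invented for illustration and do not reflect the real VM's stack layout.

```c
#include <assert.h>
#include <stdint.h>

typedef intptr_t oop;

/* Toy interpreter stack; sp points at the top element. */
static oop stack[16];
static oop *sp = stack;

static void push(oop value)       { *++sp = value; }
static oop  stackValue(int depth) { return *(sp - depth); }  /* 0 = top */

/* Primitives written with plain C arguments (placeholder bodies). */
static oop primitiveAt(oop index)               { return index * 2; }
static oop primitiveAtPut(oop index, oop value) { return index + value; }

/* Glue a generator could emit: one function per primitive, each
   knowing exactly how many arguments to fetch, so no switch on
   numArgs is needed at call time. */
static oop gluePrimitiveAt(void)
{
    return primitiveAt(stackValue(0));
}

static oop gluePrimitiveAtPut(void)
{
    return primitiveAtPut(stackValue(1), stackValue(0));
}
```

The interpreter would then call the glue function through the primitive table, while a translator could call `primitiveAt` directly with arguments already in registers.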
|
|
From: Tim R. <ti...@su...> - 2004-04-08 22:05:51
|
In message <165...@ka...>
Bryce Kampjes <br...@ka...> wrote:
> It would probably be more worthwhile looking at at: and at:put:. Could
> Tim's improvements to primitives be used to create simple at:
> primitives that are quick to execute?
That's one of the aims; dispatch prims from a value cached in the
method lookup cache (I'm not actually doing that yet, but soon I hope)
so that specialized mini-prims can be used. It's an old trick and not
anything I thought up but it usually has benefits.
For example, one could have an at mini-prim that knows the rcvr is a
word array and so doesn't need to check the format, another for byte
arrays etc. The correct version is chosen and installed at lookup time.
For primPerform type code I imagine it is simpler to use the 'normal'
versions but perhaps it could be extended.
What I'd love to achieve is a send that goes something like:-
lookup message in cache
not found? do lookup & cache fill
find cache entry
branch to cached function addr
err, that's it. Oh, return!
Sadly we need to have the backup for if the prim function fails. I
considered making primfailure go via a tail call to the normal method
activation but that would require rewriting huge amounts of stuff.
Nobody is currently paying me enough to take on _that_ particular
labour. It would be nice though because instead of having a primIndex
of 0 we could have a pointer direct to methodActivate.
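As a rough illustration of the send path described above, the sketch below caches a specialised "mini-prim" function pointer in a method cache entry, chosen at lookup time from the receiver's class. Everything here (`ToyArray`, the class and selector constants, `fullLookup`, `sendAt`, the hash) is a simplified assumption for the example and not the VMMaker code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { CLASS_WORD_ARRAY = 1, CLASS_BYTE_ARRAY = 2 };
enum { SELECTOR_AT = 1 };
#define CACHE_SIZE 64

typedef struct {
    int classId;
    size_t size;
    const void *data;
} ToyArray;

/* A mini-prim returns 1 on success, 0 on failure (fall back to Smalltalk). */
typedef int (*MiniPrim)(const ToyArray *rcvr, size_t index, long *result);

/* Specialised for word arrays: no format check needed. */
static int wordArrayAt(const ToyArray *rcvr, size_t index, long *result)
{
    if (index < 1 || index > rcvr->size) return 0;
    *result = ((const uint32_t *)rcvr->data)[index - 1];
    return 1;
}

/* Specialised for byte arrays. */
static int byteArrayAt(const ToyArray *rcvr, size_t index, long *result)
{
    if (index < 1 || index > rcvr->size) return 0;
    *result = ((const uint8_t *)rcvr->data)[index - 1];
    return 1;
}

typedef struct { int selector; int classId; MiniPrim prim; } CacheEntry;
static CacheEntry cache[CACHE_SIZE];
static int fullLookups = 0;

/* Slow path: pick the right specialised primitive for this class. */
static MiniPrim fullLookup(int selector, int classId)
{
    fullLookups++;
    if (selector != SELECTOR_AT) return NULL;
    if (classId == CLASS_WORD_ARRAY) return wordArrayAt;
    if (classId == CLASS_BYTE_ARRAY) return byteArrayAt;
    return NULL;
}

/* Send of at: -- probe the cache, fill it on a miss, branch to the
   cached function address. */
static int sendAt(const ToyArray *rcvr, size_t index, long *result)
{
    unsigned slot = ((unsigned)SELECTOR_AT * 31u + (unsigned)rcvr->classId) % CACHE_SIZE;
    CacheEntry *e = &cache[slot];
    if (e->selector != SELECTOR_AT || e->classId != rcvr->classId) {
        e->selector = SELECTOR_AT;
        e->classId  = rcvr->classId;
        e->prim     = fullLookup(SELECTOR_AT, rcvr->classId);
    }
    if (e->prim == NULL) return 0;
    return e->prim(rcvr, index, result);
}
```

A return value of 0 stands in for primitive failure, where the real VM would have to activate the backup Smalltalk method.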
Current state is in
http://sumeru.stanford.edu/tim/pooters/SqFiles/packages/VMMaker/VMMaker3-7b2.sar
for what it's worth.
tim
--
Tim Rowledge, ti...@su..., http://sumeru.stanford.edu/tim
Strange OpCodes: SCEU: Simulate Correct Execution, Usually
|
|
From: Yoshiki O. <Yos...@ac...> - 2004-04-08 22:02:23
|
Ned,
> That's not the case unless you have decided on a standard encoding. For
> instance, if you load Ian's X11 Fonts package into a stock image, you will
> have fonts that display the same Character as different glyphs, because the
> fonts in the stock image use the MacRoman encoding, and the X11 fonts use
> Latin-1.
Oh. I am simply assuming that we move to a Unicode-based encoding, which means that the first 256 chars are compatible with Latin-1.
> > To which point? We should go with 24 bit immediate? We should go
> > with "naked code point + attributes" approach?
>
> I think that we should probably stick to 24 bits and put other information
> into the Strings.
It is going to be a somewhat more intrusive change than we might want. The Compiler would have to carry the attributes from the thing it evaluates over to the result. For example, suppose you put an expression in a workspace, something like:
dict at: ',' put: ','
and give the second comma a different language tag from the first. In that case, the commas look different in the workspace. If you do-it the expression, you want the key and value in the dictionary to carry the same attributes *they* had in the workspace. The self-containedness is nice in this regard.
But, what would be the best argument for making the wide characters immediate objects?
-- Yoshiki
|
|
From: Ned K. <ne...@bi...> - 2004-04-08 21:40:26
|
On Tuesday 06 April 2004 10:57 pm, Yoshiki Ohshima wrote:
> What do you mean by "they don't know their encoding?" You mean that
> they don't know the font to use by default, etc.?
>
> For the first 256 chars, the default font simply *works*. For the
> other characters, this is not always the case.
That's not the case unless you have decided on a standard encoding. For instance, if you load Ian's X11 Fonts package into a stock image, you will have fonts that display the same Character as different glyphs, because the fonts in the stock image use the MacRoman encoding, and the X11 fonts use Latin-1.
> > How often do you find yourself inspecting individual Characters inside a
> > String?
> >
> > Even more to the point: how often do you find yourself inspecting
> > Characters that aren't inside a String?
>
> To which point? We should go with 24 bit immediate? We should go
> with "naked code point + attributes" approach?
I think that we should probably stick to 24 bits and put other information into the Strings.
> I guess you're saying that you like the *first* approach I wrote.
> Which isn't too different from my position.
Right.
--
Ned Konz
http://bike-nomad.com
GPG key ID: BEEA7EFE
|
|
From: Dan I. <Da...@Sq...> - 2004-04-08 21:37:57
|
Hi, Ian -
<many arguments for truth and beauty snipped>
>George Bernard Shaw: The reasonable man adapts himself to the world;
>the unreasonable one persists in trying to adapt the world to himself.
>Therefore all progress depends on the unreasonable man.
Thanks for the nudge. That's what friends are for.
>Smalltalk programmers should be every bit as careful: (AFAIK) the eval
>order is not specified by the standard.
>
>(Somebody please, *please* correct me if I'm wrong about this. I can't
>find my copy of the standard to check. Is it online anywhere yet?)
Early on I urged them to keep it unspecified (you'll be glad to know). Wish I had stuck around for some of the other decisions ;-).
Thanks again, Ian. I'll make a substantive reply after some further discussion has ensued.
- Dan
|
|
From: Ian P. <ian...@in...> - 2004-04-08 21:10:24
|
Hi Dan,
Great résumé (even if I do feel obliged to take issue with a couple of items in it ;-).
> Compact classes
> Nobody seems to care much here. I was impressed that Andreas picked
> up on the slight "tagging" advantage that CCs offer. This actually
> saves an instruction or two testing for contexts, as I recall.
But, it's also a royal pain when you decide you want to evict Context from the set of compact classes.
If minimising space by keeping CCs is our prime goal, then I'm entirely neutral on the issue.
If minimising (Context) membership test time by keeping CCs around is our prime goal, then I'm entirely against it. It would make certain classes "privileged", in that you can *never* make them non-compact without breaking your VM. Otherwise it means the VM can never use CC fields exclusively to check for membership (it would still have to load the class generically and test pointer eqv). This kind of defeats the argument for keeping CCs around -- entirely.
I actually think it would make for the fastest possible execution if compact classes just went away leaving all class pointers explicit in the method header. No bit fields to extract. No privileged CC indices to "secretly know about". No branch-over-check-for-CC-index that might be zero requiring a punt to class header load and compare. Instead: just one simple, consistent, linear thing.
> SmallIntegers
> We had some interesting discussion about changing the format of
> SmallIntegers. I'm going to invoke executive privilege however, to
> say that we will not change the current representation on this go
> round. I'm sure it would be an interesting exercise, but I don't see
> a really big payoff, and I can imagine it ending up being a lot of
> work.
>
> If anyone wants to do this on their own and give me the code, or if
> they want to argue one last time for the benefits, I'm still willing
> to reconsider. I really mean this -- I just don't want to get bogged
> down in gratuitous changes.
The only tag change I would push for is putting SI tag in the topmost bit. But Tim (or rather risc-os) killed that one stone dead.
> The 64-bit format
> My inclination is to tie this as closely to the 32-bit format as
> possible for now, just to simplify the project. To me that means just
> extending every oop, and header word with zeroes, and sign-extending
> every "small" integer. For now, we'll also stick with the same 31-bit
> value range in both systems.
>
> Form bits modulus
> I want to change the modulus of Form bitmaps to 64 bits in both 32-
> and 64-bit images. As per discussion this does not change the format,
> only the raster width.
Something is itching me about making Forms 64-bits deep while retaining 31-bit integers. I can't really put my finger on it, but the phrase "LargeInteger explosion" is floating around my head for some reason...
> CompiledMethods
> A major question to decide is whether to split up bytecodes and
> literals
*Yes*.
> (and source pointer)
In the CM (or in the literals if you can fathom a reason to prefer that).
> and throw in a "serendipity" field for fun this summer.
Let's just make the VM completely agnostic w.r.t. the size of CMs. Stick the variable pointers where pointers belong, stick the bytecodes where variable bytes belong, and stick fixed fields (including method header) where fixed fields belong -- in something about which the VM never makes assumptions beyond the first N fields and which never cares if someone subclasses it in cruel and unusual ways.
> [By the way, I'm assuming this adds another very high bandwidth
> register to access literals independent of bytecodes, right? And Ian
> please comment on the relative importance of 3 objects vs two]
For interpretation you maybe want to care (a teeny bit) about minimising the number of "hot" registers, but for all architectures bar one (the one that lets you get at 8 out of the 64? 128? registers ;) this is utterly irrelevant: we still have plenty of "fixed" registers to spare.
For a translating runtime I'd want the simplest and most generic model possible. Each object has precisely one function (separate out the CM, literals, bytecodes) and -- whatever else we end up deciding to do or not to do -- get those #^$*@! block bodies out of their defining CM and into their own, independent CM once and for all! ;)
> Closures
> I'm just assuming we would load some form of Anthony's work, which
> probably also means using his compiler.
Is this a modified St-80 Compiler, or one based on SmaCC?
If the latter then I'd vote for this (compiler change) faster than I'd vote for Ralph Nader (which is pretty darn fast ;).
> Ian, are you happy with the structures he chose, or should we make
> some changes before loading them up? We probably want to start with
> something that doesn't change the existing context structures much.
I'd like to talk with Anthony about this (great) detail, in a much higher-bandwidth setting than a mailing list, and come to a unanimous consensus on the optimal bang-for-the-buck to be had in a "short-term project" (paying equal attention to the efficiency of interpretive and translational approaches). I think we can come up with something novel that will be sufficiently close to what we have to make for very little implementation impact, yet yield significant performance benefits.
In any case I think it's important not to lose much of Anthony's work, but at the same time I think we have a great opportunity to "undo" several mistakes that were made in the past (by all concerned -- I'm not pointing fingers at Anthony or anyone else in particular, nor disclaiming responsibility for my own fair share of them!).
> Order of evaluation
> I know we've chatted about this in the past, and believe me I *want*
> to do this. Unfortunately, I think for Squeak it would be a mistake.
I think _not_ to do it would invalidate a lot of very useful stuff that could be done elsewhere, in primitive calling conventions, translating runtimes, etc...
> There is a fair amount of useful software (mainly compilers) in the
> community that has the left-to-right convention built in.
Then maybe _they_ are broken? We should fix them (not make concessions to them).
George Bernard Shaw: The reasonable man adapts himself to the world; the unreasonable one persists in trying to adapt the world to himself. Therefore all progress depends on the unreasonable man.
(I like that one almost as much as I do the one about "cheating and not getting caught". ;)
> Then there is the occasional bad usage of the form (strm next * 256 +
> strm next), but they deserve what they get.
Hear hear!
> I'm mainly concerned about bucking a convention that has paid off in
> synergy over the years.
Depending on any undefined (by some standard someplace) implementation behaviour is a bug (even if the program runs without exhibiting any erroneous behaviour). You would not find any C programmer (beyond elementary school, or Redmond maybe ;-) writing the C equivalent of the above Stream example. Smalltalk programmers should be every bit as careful: (AFAIK) the eval order is not specified by the standard.
(Somebody please, *please* correct me if I'm wrong about this. I can't find my copy of the standard to check. Is it online anywhere yet?)
> Separate Allocation of some Bits Objects
Separate allocation of "locked-down" (or "wired" or "non-moving" or whatever) objects of any format.
> Proxy support (divergent "self" and "receiver")
> Is 'receiverMap' (ie a spare slot in Contexts) enough for now?
I think so.
Hope that was useful (and not too contentious and/or factually inaccurate)!
Cheers,
Ian
|
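The 64-bit format quoted above (zero-extend every oop and header word, sign-extend every small integer) can be sketched per word as follows. This assumes the tagging mentioned elsewhere in the thread, where SmallIntegers carry a 1 in the low bit; the function names are hypothetical, and a real image converter would also have to relocate pointers, which is ignored here.

```c
#include <assert.h>
#include <stdint.h>

/* Widen one 32-bit image word to 64 bits. Words with the low bit set
   are treated as tagged SmallIntegers and sign-extended; all other
   words (oops, header words) are zero-extended. */
static uint64_t widen_word(uint32_t word)
{
    if ((word & 1u) && (word & 0x80000000u))
        return (uint64_t)word | 0xFFFFFFFF00000000ull;  /* negative SmallInteger */
    return (uint64_t)word;
}

/* Decode a tagged SmallInteger (value * 2 + 1). Assumes the usual
   two's-complement conversion from uint64_t to int64_t. */
static int64_t small_integer_value(uint64_t oop)
{
    assert(oop & 1u);
    return ((int64_t)oop - 1) / 2;
}
```

With this scheme the 31-bit value range is unchanged: the SmallInteger -3, stored as 0xFFFFFFFB in a 32-bit image, widens to 0xFFFFFFFFFFFFFFFB and still decodes to -3.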
|
From: Dan I. <Da...@Sq...> - 2004-04-08 19:48:41
|
Hi, Guys -
Thanks for all your comments and suggestions. I don't know about you, but I find this to be a lot of fun.
Now that discussion has tapered off, I will do my best to summarize the various areas, and to make some tentative decisions for your approval or further discussion in the second round. I do this, not because I make the best decisions, but because we need to reduce uncertainty to move forward. Please object now if you feel like I'm leaning the wrong way on items above the line here.
Cleanups
First of these is the split primitive field of method headers. I think there's good agreement on returning to a 9-bit field and doing away with (or converting to named access) all those that were using higher indices. I have a specific proposal which I will document soon on the web site, but it is essentially what Tim suggested, except I would drop of the loadInstVar range so we leave room for 50 or so unused indexed primitives at the top. My reasoning is that, ugly or not, indexed primitives are the easiest way for curious hackers to experiment with new capabilities in Squeak.
[By the way, it looks like "External primitive support primitives" 570-574 are still in use and would need to be slid down]
Compact classes
Nobody seems to care much here. I was impressed that Andreas picked up on the slight "tagging" advantage that CCs offer. This actually saves an instruction or two testing for contexts, as I recall.
I checked the stats of 'mini.image', the "complete" MVC Smalltalk in 520k. It has 13426 instances of compact classes, most of which would be 4 bytes bigger without CCs (some are large enough to have big headers already). The real count is less, since I picked up 1000 contexts in my stats, but those are meaningful, too. So the space cost for this example is about 54k. I expect this factor of 10% would scale pretty well to any small kernel, since it will mostly be full of the same stuff (methods, symbols, arrays, points and rectangles).
However... Consider if we revamp CompiledMethods as well. Then the 4600 methods in this benchmark become either 9200 or 13,800 objects, all with 2-word headers. So the cost (versus doing new CMs, but with CCs) would be an additional 18.4k or 37k depending on whether methods become 2 or 3 objects.
I think this change is worth one more round of contemplation on grounds of IIABDFI (if it ain't broke...). The part that bothers me most is actually compatibility with Version-3 projects (image segments), but we're going to have to do something for this anyway. What's nasty about this change is that it's harder to do on the fly, since some objects get bigger.
SmallIntegers
We had some interesting discussion about changing the format of SmallIntegers. I'm going to invoke executive privilege however, to say that we will not change the current representation on this go round. I'm sure it would be an interesting exercise, but I don't see a really big payoff, and I can imagine it ending up being a lot of work.
If anyone wants to do this on their own and give me the code, or if they want to argue one last time for the benefits, I'm still willing to reconsider. I really mean this -- I just don't want to get bogged down in gratuitous changes.
Immediate objects (tagging)
I like Andreas's proposal for an extensible set of Immediate Objects. I still haven't researched the GC conflict with the 10 tag. Also I haven't looked at how small a kernel of support we could have in the VM.
The 64-bit format
My inclination is to tie this as closely to the 32-bit format as possible for now, just to simplify the project. To me that means just extending every oop, and header word with zeroes, and sign-extending every "small" integer. For now, we'll also stick with the same 31-bit value range in both systems.
Form bits modulus
I want to change the modulus of Form bitmaps to 64 bits in both 32- and 64-bit images. As per discussion this does not change the format, only the raster width.
CompiledMethods
A major question to decide is whether to split up bytecodes and literals (and source pointer) as in Tim's NewCompiledMethod work. This adds a little space overhead, but it certainly makes things simpler. I'm going to assume that we go with Tim's 2-object format, and throw in a "serendipity" field for fun this summer.
[By the way, I'm assuming this adds another very high bandwidth register to access literals independent of bytecodes, right? And Ian please comment on the relative importance of 3 objects vs two]
Closures
I'm just assuming we would load some form of Anthony's work, which probably also means using his compiler. Ian, are you happy with the structures he chose, or should we make some changes before loading them up? We probably want to start with something that doesn't change the existing context structures much.
Order of evaluation
I know we've chatted about this in the past, and believe me I *want* to do this. Unfortunately, I think for Squeak it would be a mistake. There is a fair amount of useful software (mainly compilers) in the community that has the left-to-right convention built in. Then there is the occasional bad usage of the form (strm next * 256 + strm next), but they deserve what they get.
Hey, I know -- how about left-to-right on big-endian machines, and right-to-left on little-endian. Just kidding. I'm mainly concerned about bucking a convention that has paid off in synergy over the years.
Decoupling GC from Primitives
This seems like an all-around win, although not really V4-related. I would love it if someone would take it upon themselves to spec this and estimate difficulty, so we can review it, and then actually do it.
Separate Allocation of some Bits Objects
I understand that this one matters. I'm just the wrong person to spec it or do it (since I know relatively little about media engines and OS resource management). Most other serious implementations have such a facility, and I'm sure it would help with any serious media work. I'm happy to manage it as part of this project. If someone can pull this off in a compatible time frame, I'll do my best to merge it with the V4 changes. If not, perhaps we could at least spec it in such a way that future V4 VMs could add it without needing further image changes.
-----------------------------------------
Other Changes
"While we have it in the shop..."
Beyond here are changes I haven't thought about enough to render even a temporary judgement. I'd love to hear more pros and cons and maybe some estimate of how much work and who would do it.
Immutability bit
This sounds intriguing. Has anyone here thought seriously about what is needed to support it?
Proxy support (divergent "self" and "receiver")
Is 'receiverMap' (ie a spare slot in Contexts) enough for now? Also I've heard of "resend" bytecodes for delegation that re-use the context without having to load and push all the args as well, when all you want to do is forward the message to, eg, an aspect variable.
Squat Support
Need to hear more discussion.
Flat Rectangles
I never wrote about this, but I came very close to doing this in the early history of Squeak. It saves time getting to BitBlt, doing intersections, etc, and it saves a little space, too. I wouldn't do this for me right now, but I'm wondering if Mr. Graphics has ever wished for it. I think it's not a hugely hard change to make (at least it wasn't back then).
I probably left something out -- please remind me
Thanks all -- you guys are great!
- Dan
|
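The compact-class space figures in the message above can be checked with a few lines of arithmetic. The sketch assumes one extra 4-byte header word per affected object; reading the 18.4k and 37k figures as 4 bytes for each of the 4600 or 9200 additional objects is an inference from the numbers, not something the message states. The function name is made up for the example.

```c
#include <assert.h>

/* Extra bytes needed when each of `objects` objects grows by one
   4-byte header word (assumed 32-bit image). */
static unsigned long extra_header_bytes(unsigned long objects)
{
    return objects * 4ul;
}
```

Under that reading, 13426 compact-class instances cost 53,704 bytes (about 54k), 4600 additional objects cost 18,400 bytes (18.4k) and 9200 additional objects cost 36,800 bytes (about 37k).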
|
From: Bryce K. <br...@ka...> - 2004-04-08 19:38:49
|
Ned Konz writes:
> On a related note, does it seem wasteful to anyone but me that we do the
> following in primBytecodeAdd:
Yes, that sequence does seem wasteful. Ideally I'd like to generate:
and arg1, arg2, andResult
and andResult, 2r11, dummy1
branch-if-not-zero overflowBlock
add addResult, arg1, arg2
branch-on-overflow overflowBlock
This sequence is possible if we change from integers having tag 1 to integers having tag 0, leaving the tag in the lower bits. The Self community wrote some nice papers on tagging and high performance simple garbage collection.
The above sequence only has the add on the critical path during the common case where the arguments are SmallIntegers and the result is also a SmallInteger. The other four instructions do not matter so long as both branches are correctly predicted as not taken.
Looking up oops can be done by using the small offset provided in most load or store instructions. This means that a non-zero oop tag has no overhead.
But interpreted code probably spends most of its time recovering from branch mispredicts. Compiled code should be sufficiently faster that the garbage collector will be the major bottleneck. Realistically any changes to tagging should be looking to provide a large improvement to garbage collector performance.
It would probably be more worthwhile looking at at: and at:put:. Could Tim's improvements to primitives be used to create simple at: primitives that are quick to execute? My limited benchmarking indicates that code using arrays is dominated by at: and at:put: overhead.
We're not yet at the point where our inefficient SmallInteger format matters. Once we're at the point where SmallInteger addition is worth improving, our garbage collector will be a serious bottleneck. Currently the garbage collector uses about 10% of the time; reduce the other 90% by enough to make these few instructions matter and garbage collection will be closer to 50%.
So, yes it does seem wasteful, but first let's improve things so that the waste really matters. Even then other things (the garbage collector), which require image changes, will be more important. Anything that simplifies and speeds up at: and at:put: will matter much more. Tim, that primitive work?
Bryce
|
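A small C sketch of the tag-0 addition idea, under stated assumptions: SmallIntegers are stored as the value shifted left by two bits with both tag bits zero, so the tagged words can be added directly. Note that this sketch ORs the operands before masking, so that a tag bit set in either one is detected, whereas the quoted sequence writes `and` for that step. The function name is hypothetical, and the overflow test is done in 64-bit arithmetic instead of with a hardware overflow flag.

```c
#include <assert.h>
#include <stdint.h>

/* Adds two tag-0 SmallIntegers (value << 2, low two bits 00).
   Returns 1 and stores the tagged sum on success; returns 0 when an
   operand is not a SmallInteger or the sum overflows, in which case
   the caller would fall back to a normal send. */
static int add_small_integers(int32_t a, int32_t b, int32_t *result)
{
    if (((a | b) & 3) != 0)
        return 0;                       /* some operand carries a tag */
    int64_t sum = (int64_t)a + (int64_t)b;
    if (sum < INT32_MIN || sum > INT32_MAX)
        return 0;                       /* overflow */
    *result = (int32_t)sum;             /* still tagged: low bits 00 */
    return 1;
}
```

Because the tag is zero, no untagging or retagging is needed around the add, which is the point of the proposed format change.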
|
From: Dan I. <Da...@Sq...> - 2004-04-08 19:30:11
|
>On 08 Apr 2004, at 20:07, Avi Bryant wrote:
>
>>On Apr 8, 2004, at 7:32 AM, Jecel Assumpcao Jr wrote:
>>
>>>My Neo Smalltalk evaluates right-to-left.
>>
>>Not that this really matters at all, but - this also makes
>>implementing Forth-like languages easier.
Ian commented...
>This matters even less, but right-to-left arg eval order makes
>implementing Java on top of the Squeak VM harder.
and likewise implementing Squeak on top of the Java VM ;-)
- Dan
|
|
From: Ian P. <ian...@in...> - 2004-04-08 19:20:33
|
On 08 Apr 2004, at 20:07, Avi Bryant wrote:
> On Apr 8, 2004, at 7:32 AM, Jecel Assumpcao Jr wrote:
>
>> My Neo Smalltalk evaluates right-to-left.
>
> Not that this really matters at all, but - this also makes
> implementing Forth-like languages easier.
This matters even less, but right-to-left arg eval order makes implementing Java on top of the Squeak VM harder.
Tchao,
Ian
|
|
From: Avi B. <av...@be...> - 2004-04-08 18:07:46
|
On Apr 8, 2004, at 7:32 AM, Jecel Assumpcao Jr wrote:
> My Neo Smalltalk evaluates right-to-left. That lets me get by with just
> one send instruction. The first instructions in a compiled method grab
> the right number of arguments from the caller's stack. Unless it is a
> short leaf method - in that case it can skip creating its own stack and
> run directly with the caller's stack. This is a very common case since
> I currently do very little inlining (just the usual ifTrue: stuff).
Not that this really matters at all, but - this also makes implementing Forth-like languages (see my Sorrow package on SM) easier.
|
|
From: Avi B. <av...@be...> - 2004-04-08 18:02:52
|
On Apr 8, 2004, at 9:16 AM, Ian Piumarta wrote:
> Which reminds me, for V4 the following might be on my wishlist too:
>
> - separate the notion of "self" and "receiver".
>
> For most purposes they remain the same thing. Within
> #doesNotUnderstand: (or in some kind of explicit primitive #perform:
> type thing that allows divergence) they can be made different objects.
> Receiver inst vars remain with the "receiver" definition, while sends
> to self remain with the "self" definition.
>
> This allows (for example) "bullet-proof proxies" where #dNU: sets
> receiver to the object that fields the send and retains the proxy as
> "self". Any subsequent send to self goes straight back to the proxy,
> rather than the receiver, without affecting how state is accessed
> within the receiver; i.e., it's completely transparent to the
> (non-meta ;) programmer.
Yes, please, I'd love to have that.
|
|
From: Andreas R. <and...@gm...> - 2004-04-08 17:31:07
|
> Which reminds me, for V4 the following might be on my wishlist too:
>
> - separate the notion of "self" and "receiver".
IIRC, then that's what Chango did - using the "receiverMap" in contexts as "self" (as you describe it).
Cheers,
- Andreas |
|
From: Andreas R. <and...@gm...> - 2004-04-08 17:26:11
|
> Why go to the trouble of adding extra code simply to turn off
> something? Simpler, cleaner and more intelligible to just remove the
> damn code!
Err ... that was a joke, yes? Removing GC? Or what do you mean...
- A. |
|
From: Tim R. <ti...@su...> - 2004-04-08 16:51:36
|
In message <07e401c41d4e$66c86440$b2d0fea9@R22>
"Andreas Raab" <and...@gm...> wrote:
> John,
>
> Yup, that's pretty much what I had in mind, e.g.,
>
> * when we go into a primitive we set a disableGC flag
> * if there is allocation we fall through in sufficientSpaceToAllocate:
> (possibly growing mem if needed but not GCing)
> * upon primitive return we do the allocation check and GC if necessary
>
Why go to the trouble of adding extra code simply to turn off
something? Simpler, cleaner and more intelligible to just remove the
damn code!
tim
--
Tim Rowledge, ti...@su..., http://sumeru.stanford.edu/tim
Useful random insult:- Hypnotized as a child and couldn't be woken.
|
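A minimal sketch of the scheme quoted above: set a flag on primitive entry, let sufficientSpaceToAllocate: grow memory instead of collecting, and run any deferred GC on primitive return. Only the names disableGC and sufficientSpaceToAllocate come from the thread; the Heap struct, callPrimitive, the threshold values and the stand-in collector are hypothetical and are not taken from the Squeak VM sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy heap model. All fields are assumptions made for illustration. */
typedef struct {
    size_t used;        /* bytes currently allocated */
    size_t limit;       /* current end of memory */
    size_t gcThreshold; /* allocating past this normally triggers a GC */
    bool   disableGC;   /* set while a primitive is running */
    bool   gcPending;   /* a GC was wanted but had to be deferred */
    int    gcCount;     /* number of collections that actually ran */
} Heap;

/* Stand-in collector: pretends half of the heap was garbage. */
void collectGarbage(Heap *h) {
    h->used /= 2;
    h->gcCount++;
    h->gcPending = false;
}

/* While disableGC is set this never collects: it grows memory if
   needed and records that a collection is due. */
bool sufficientSpaceToAllocate(Heap *h, size_t bytes) {
    if (h->used + bytes <= h->gcThreshold) return true;
    if (h->disableGC) {
        if (h->used + bytes > h->limit) h->limit = h->used + bytes;
        h->gcPending = true;
        return true;
    }
    collectGarbage(h);
    return h->used + bytes <= h->limit;
}

bool allocate(Heap *h, size_t bytes) {
    assert(h != NULL);
    if (!sufficientSpaceToAllocate(h, bytes)) return false;
    h->used += bytes;
    return true;
}

typedef bool (*Primitive)(Heap *);

/* Wraps a primitive call: no GC inside, deferred GC on return. */
bool callPrimitive(Heap *h, Primitive prim) {
    h->disableGC = true;
    bool ok = prim(h);
    h->disableGC = false;
    if (h->gcPending) collectGarbage(h);
    return ok;
}

/* Example primitive: nothing can move between its two allocations. */
bool examplePrimitive(Heap *h) {
    return allocate(h, 600) && allocate(h, 600);
}
```

The point of the flag is that a primitive holding raw object pointers does not need to remap them across its own allocations; the cost is that memory may grow past the usual threshold until the primitive returns.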
|
From: Tim R. <ti...@su...> - 2004-04-08 16:48:57
|
In message <083a01c41d51$64cbf0a0$b2d0fea9@R22>
"Andreas Raab" <and...@gm...> wrote:
> > I'd suggest that the VM changes are pretty small but ought to include
> > the ability to pass back an error value (fortunately I have ancient
> > code to do that sitting somewhere) so the image knows what the problem
> > was and does the smart thing.
>
> Yes, that is a "must do" in my understanding. Thanks for reminding. I think
> we might just extend #primitiveFail to include an "error reason", so you
> would use it via:
>
> interpreterProxy->primitiveFail(ERROR_BAD_ARGUMENT);
>
> or somesuch.
Well, I guess I'll dig out the old code then. Needs a trivial change to
the compiler when handling prims IIRC.
tim
--
Tim Rowledge, ti...@su..., http://sumeru.stanford.edu/tim
Strange OpCodes: PO: Punch Operator
|
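A rough sketch of what a primitiveFail that carries an error reason could look like, following the call shape quoted above. Only interpreterProxy, primitiveFail and ERROR_BAD_ARGUMENT appear in the thread; the other error code, primitiveFailureCode, primitiveHalve and runPrimitive are hypothetical names, and this is not the actual Squeak plugin interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical error reasons; only ERROR_BAD_ARGUMENT is from the thread. */
enum { PRIM_OK = 0, ERROR_BAD_ARGUMENT = 1, ERROR_NO_MEMORY = 2 };

/* A cut-down proxy: a table of function pointers handed to primitives. */
typedef struct {
    void (*primitiveFail)(int reason);
    int  (*primitiveFailureCode)(void);
} InterpreterProxy;

static int failureCode = PRIM_OK;

/* Keep the first failure reason so later failures do not mask it. */
static void primitiveFail(int reason) {
    if (failureCode == PRIM_OK) failureCode = reason;
}

static int primitiveFailureCode(void) {
    return failureCode;
}

InterpreterProxy proxyStorage = { primitiveFail, primitiveFailureCode };
InterpreterProxy *interpreterProxy = &proxyStorage;

/* Example primitive: halves a non-negative argument, fails otherwise. */
int primitiveHalve(int arg) {
    if (arg < 0) {
        interpreterProxy->primitiveFail(ERROR_BAD_ARGUMENT);
        return 0;
    }
    return arg / 2;
}

/* What the interpreter might do around a primitive call: clear the
   code, run the primitive, and hand the reason back to the caller
   (standing in for "the image") so it can react to the specific error. */
int runPrimitive(int arg, int *errorOut) {
    assert(errorOut != NULL);
    failureCode = PRIM_OK;
    int result = primitiveHalve(arg);
    *errorOut = interpreterProxy->primitiveFailureCode();
    return result;
}
```

On the image side the compiler would also need a way to bind the error value in the method that follows the primitive declaration, which is presumably the "trivial change to the compiler" mentioned above.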
|
From: Ian P. <ian...@in...> - 2004-04-08 16:16:28
|
On 08 Apr 2004, at 18:05, Andreas Raab wrote:
> Incidentally, talking about message forwarding. We might want to keep in
> mind the stuff that Stephen Pair did for the Chango VM - is he on this list?
> If not, we should invite him. He had some fairly interesting stuff going
> with delegation and that might be of some interest here.
Which reminds me, for V4 the following might be on my wishlist too:
- separate the notion of "self" and "receiver".
For most purposes they remain the same thing. Within #doesNotUnderstand: (or in some kind of explicit primitive #perform: type thing that allows divergence) they can be made different objects. Receiver inst vars remain with the "receiver" definition, while sends to self remain with the "self" definition.
This allows (for example) "bullet-proof proxies" where #dNU: sets receiver to the object that fields the send and retains the proxy as "self". Any subsequent send to self goes straight back to the proxy, rather than the receiver, without affecting how state is accessed within the receiver; i.e., it's completely transparent to the (non-meta ;) programmer.
I think VisualWorks does (or did or was going to do a long time ago) something similar, precisely for the proxy interest.
Cheers,
Ian |
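One way to picture the self/receiver split is an activation record that carries two object pointers instead of one. The sketch below is purely illustrative: Activation, activateForwarded and the field names are hypothetical, and nothing here reflects how Squeak, Chango or VisualWorks actually lay out contexts.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Object Object;

struct Object {
    const char *name;
    int         value;   /* stands in for the object's instance variables */
    Object     *target;  /* non-NULL only for a proxy: the real object */
};

/* An activation with the two roles kept apart. */
typedef struct {
    Object *self;      /* where "self foo" sends are dispatched */
    Object *receiver;  /* whose instance variables the method reads */
} Activation;

/* Ordinary send: both roles are played by the same object. */
Activation activate(Object *obj) {
    Activation a = { obj, obj };
    return a;
}

/* What a proxy's #doesNotUnderstand: could set up under the proposal:
   the target fields the send, the proxy stays as "self". */
Activation activateForwarded(Object *proxy) {
    assert(proxy != NULL && proxy->target != NULL);
    Activation a = { proxy, proxy->target };
    return a;
}

/* Instance variable access goes through the receiver... */
int readInstVar(const Activation *a) {
    return a->receiver->value;
}

/* ...while self-sends go back through self (the proxy, if any). */
Object *selfSendTarget(const Activation *a) {
    return a->self;
}
```

With this shape a method running on behalf of a proxy reads the real object's state, yet every send to self re-enters the proxy, which is the "bullet-proof" property described above.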
|
From: Andreas R. <and...@gm...> - 2004-04-08 16:05:47
|
Hi Craig,
> > What kind of method-lookup change? What is needed here...
>
> The current Squat implementation checks if the receiver's class is a
> special "proxy" class (currently called "Other"), which is in the
> special objects array. If it is, it kicks the send back to image, using
> the same mechanism as for >>doesNotUnderstand:, so that the message may
> be forwarded by the proxy.
I'm missing something here - if it uses the same mechanism why would we need to change the lookup? Put differently, wouldn't it have the same effect if you just make Other a ProtoObject?
Incidentally, talking about message forwarding. We might want to keep in mind the stuff that Stephen Pair did for the Chango VM - is he on this list? If not, we should invite him. He had some fairly interesting stuff going with delegation and that might be of some interest here.
> > I'm slightly hesitant to reserve an object format for something like
> > method dictionary marking. Can you elaborate on why this would be
> > needed?
>
> The VM is the thing that does the activation marking. I'm using that
> mark to tell when any method from a method dictionary has been run; it's
> useful for, e.g., calculating which classes can be swapped out of an
> image at some point in time. (There are also primitives for reading and
> clearing the marks.)
I'm missing something here too (guess I'll have to check the code). This sounds as if a method dictionary with an extra iVar or so would achieve the same effect at the same cost.
> > ...we only have so many object format types.
>
> Well, the one I want to use (5) has gone unused for Squeak's entire
> history, as far as I can tell. :) And it's not the only one (format 7
> is also unused). And in eight years I've yet to hear of anyone else
> wanting one of them. :) And this is a very good cause. :) And I'm not
> asking for any additional header bits. ;) (oops, smilie overload...)
Oh, I'm not saying we shouldn't do this. I'm trying to understand if there is a "generalized benefit" from it, e.g., what else could one potentially do with these changes.
> > Hm... something a little more specific would be nice ;-)
>
> 'Sorry, I was pressed for time (still am :). See
> http://www.netjam.org/squat/releases/current/vmChanges.zip . Pick a VM
> and snapshot from http://www.netjam.org/squat/releases/current/#theBits
> to see the image side of things.
Thanks, I'll check it out - guess that'll help with the above.
Cheers,
- Andreas |
|
From: Craig L. <cr...@ne...> - 2004-04-08 15:49:20
|
Hi Andreas--
> > Incorporate method-lookup change to support remote message-sending,
> > in anticipation of Squat's support for minimal snapshots and
> > inter-system communication.
>
> What kind of method-lookup change? What is needed here...
The current Squat implementation checks if the receiver's class is a special "proxy" class (currently called "Other"), which is in the special objects array. If it is, it kicks the send back to image, using the same mechanism as for >>doesNotUnderstand:, so that the message may be forwarded by the proxy.
> and what are the implications for both speed and potential impact on
> JIT?
I have found the performance impact negligible; the usual case is one additional identity check against a special object (in C, "class == (longAt(((((char *) specialObjectsOop)) + someInteger"). I see no impact on JIT (assuming it can deal acceptably with the way method lookup happens now). Hopefully Ian will chime in about that.
> > Allocate header format five for Squat's method dictionary
> > [activation] marking, and version bits in method trailers for
> > Squat's module system.
>
> I'm slightly hesitant to reserve an object format for something like
> method dictionary marking. Can you elaborate on why this would be
> needed?
The VM is the thing that does the activation marking. I'm using that mark to tell when any method from a method dictionary has been run; it's useful for, e.g., calculating which classes can be swapped out of an image at some point in time. (There are also primitives for reading and clearing the marks.)
> ...we only have so many object format types.
Well, the one I want to use (5) has gone unused for Squeak's entire history, as far as I can tell. :) And it's not the only one (format 7 is also unused). And in eight years I've yet to hear of anyone else wanting one of them. :) And this is a very good cause. :) And I'm not asking for any additional header bits. ;) (oops, smilie overload...)
> Version bits in method trailers shouldn't be needed if I understand the
> CM changes correctly - you might just add another iVar to CMs.
True, it just seemed like a logical thing to put in the trailer, and I was already adding a known-to-the-VM bit there. Oops, I forgot to mention that in my previous message. That's for *method* activation marking.
> > The Squat homepage is http://netjam.org/squat/. The release page has
> > links to all the relevant code.
>
> Hm... something a little more specific would be nice ;-)
'Sorry, I was pressed for time (still am :). See http://www.netjam.org/squat/releases/current/vmChanges.zip . Pick a VM and snapshot from http://www.netjam.org/squat/releases/current/#theBits to see the image side of things.
thanks again,
-C
--
Craig Latta
improvisational musical informaticist
cr...@ne...
www.netjam.org
[|] Proceed for Truth! |
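The lookup change described above amounts to one pointer comparison in front of the normal method lookup. Here is a toy model of that idea; the Class layout, the one-selector "method dictionary", sendMessage and PROXY_CLASS_INDEX are all hypothetical, and the real check in Squat's VM reads the class out of the special objects array with longAt rather than a C array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct Class Class;
typedef struct Object { Class *cls; } Object;

struct Class {
    const char *name;
    Class      *superclass;
    const char *selector;   /* toy method dictionary: at most one selector */
};

typedef enum {
    SEND_RAN_METHOD,          /* normal lookup found a method */
    SEND_FORWARDED_BY_PROXY,  /* receiver is a proxy: kicked back to the image */
    SEND_NOT_UNDERSTOOD       /* normal lookup failed */
} SendResult;

Class objectClass = { "Object", NULL, "yourself" };
Class otherClass  = { "Other",  NULL, NULL };

/* Stand-in for the special objects array; index 0 is an assumption. */
enum { PROXY_CLASS_INDEX = 0 };
Class *specialObjects[] = { &otherClass };

SendResult sendMessage(Object *receiver, const char *selector) {
    assert(receiver != NULL && selector != NULL);
    Class *cls = receiver->cls;

    /* The one extra identity check on every send. */
    if (cls == specialObjects[PROXY_CLASS_INDEX])
        return SEND_FORWARDED_BY_PROXY;

    /* Ordinary lookup up the superclass chain. */
    for (; cls != NULL; cls = cls->superclass) {
        if (cls->selector != NULL && strcmp(cls->selector, selector) == 0)
            return SEND_RAN_METHOD;
    }
    return SEND_NOT_UNDERSTOOD;
}
```

Because the check is a single comparison against a known object, its cost in the common (non-proxy) case is one load and one compare, which matches the "negligible" impact reported above.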
|
From: Jecel A. Jr <je...@me...> - 2004-04-08 14:32:06
|
Ooops... I sent this only to Ian the first time instead of the list. He wrote a very nice reply giving even more examples of why this is a good idea.
On Wednesday 07 April 2004 22:02, Ian Piumarta wrote:
> Which reminds me of something else Dan & I talked about in the past:
> evaluating arguments from right to left. Saves an awful lot of
> tedious peeking into the middle of the stack to pick up the receiver.
> (Combined with the above, potentially wins Really Big for 386 too.
> OTOH, the tradeoffs for register architectures are a little more
> complex.)
My Neo Smalltalk evaluates right-to-left. That lets me get by with just one send instruction. The first instructions in a compiled method grab the right number of arguments from the caller's stack. Unless it is a short leaf method - in that case it can skip creating its own stack and run directly with the caller's stack. This is a very common case since I currently do very little inlining (just the usual ifTrue: stuff).
In the green book Dan mentioned that the change to left-to-right from Smalltalk-78 to -80 was supposed to make people stop complaining about surprising results and to make creating really simple compilers possible. I didn't get this last part as the right-to-left order is easier in the recursive compilers I have written so far.
-- Jecel
P.S.: I hope comparing how my own and other implementations (I am very familiar with Self, for example) do things isn't too annoying. If it is, please let me know. |
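The stack-layout difference behind this can be shown with a small model. With left-to-right evaluation the receiver is pushed first and ends up buried under the arguments, so the send has to know the argument count to find it; with right-to-left evaluation the receiver is pushed last and is always on top. The Stack type and function names below are made up for illustration and do not correspond to Squeak or Neo Smalltalk code.

```c
#include <assert.h>

enum { STACK_DEPTH = 16 };

typedef struct {
    int slots[STACK_DEPTH];
    int sp;               /* index of the top slot, -1 when empty */
} Stack;

void push(Stack *s, int v) {
    assert(s->sp + 1 < STACK_DEPTH);
    s->slots[++s->sp] = v;
}

/* Left-to-right: receiver first, then arg 1 .. arg N. */
void pushSendLeftToRight(Stack *s, int receiver, const int *args, int nargs) {
    push(s, receiver);
    for (int i = 0; i < nargs; i++) push(s, args[i]);
}

/* Right-to-left: arg N .. arg 1, then the receiver. */
void pushSendRightToLeft(Stack *s, int receiver, const int *args, int nargs) {
    for (int i = nargs - 1; i >= 0; i--) push(s, args[i]);
    push(s, receiver);
}

/* Finding the receiver needs the argument count... */
int receiverLeftToRight(const Stack *s, int nargs) {
    return s->slots[s->sp - nargs];
}

/* ...whereas here it is simply the top of the stack. */
int receiverRightToLeft(const Stack *s) {
    return s->slots[s->sp];
}
```

Since the right-to-left send can find the receiver (and hence its class and the method) without knowing the argument count, a single send instruction suffices and the callee can pick up its own arguments, as described above.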
|
From: Ian P. <ian...@in...> - 2004-04-08 13:56:25
|
On 08 Apr 2004, at 15:14, Andreas Raab wrote:
> I take that as a clear "yes", right?! ;-)
Absolutely.
The two GC things I'd most like to see are:
- being able to lock-down objects (so they don't move, allocated outside the heap, preferably via a generalised mechanism that isn't restricted just to variable words -- our recent exchange about LIFO context allocation, etc...)
- a way to reserve a certain guaranteed amount of heap memory that will never be allocated during normal instantiation, but remains available for (e.g.) primitives (LargeInts, whatever) and other non-primitive internal mechanisms. (In addition to context flushing, there are things like Message creation for send/return exceptions that could benefit from this. Orthogonal example: Unix filesystem partitions reach 100% capacity for users before they're really full -- the additional space remains available to root processes.)
Cheers,
Ian |
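The second wish, a guaranteed reserve, can be sketched as an allocator with two ceilings, in the spirit of the filesystem analogy above. This is only an illustration under assumed names (Heap, AllocKind, heapAllocate); it says nothing about how the Squeak object memory would actually implement it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    size_t size;     /* total heap bytes */
    size_t used;     /* bytes handed out so far */
    size_t reserve;  /* bytes held back from normal instantiation */
} Heap;

typedef enum {
    ALLOC_NORMAL,      /* ordinary instantiation from the image */
    ALLOC_PRIVILEGED   /* primitives and internal VM mechanisms */
} AllocKind;

/* Normal allocations see the heap as full once only the reserve is
   left; privileged ones may use the whole heap. */
bool heapAllocate(Heap *h, size_t bytes, AllocKind kind) {
    assert(h != NULL && h->reserve <= h->size);
    size_t ceiling = (kind == ALLOC_PRIVILEGED) ? h->size
                                                : h->size - h->reserve;
    if (h->used > ceiling || bytes > ceiling - h->used) return false;
    h->used += bytes;
    return true;
}
```

For example, with a 1000-byte heap and a 100-byte reserve, normal allocation fails once 900 bytes are in use, while something like context flushing or Message creation could still draw on the last 100 bytes.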
|
From: Andreas R. <and...@gm...> - 2004-04-08 13:14:17
|
> The last two (maybe three) attempts at jitter did precisely this (the
> flag was called by exactly that name). It was less important for the
> primitives than it was for internal mechanisms (such as flushing a
> volatile context into the heap during return, which is a real pain if
> you have to keep every single pointer remappable over every context
> allocation when flushing).
I take that as a clear "yes", right?! ;-)
Cheers,
- Andreas |