|
From: Dan I. <Da...@Sq...> - 2004-04-06 19:47:39
|
Hi, folks -

I know many of you are busy, but I'd like some input in the next few days about various possible or desirable changes to be included in the Version 4 image format. Feel free to mark up the swiki page, and I'll try to pull some consensus out of the discussion toward the end of the week.

Cleanups
There are a couple of cleanups, like the size of the numbered primitive field and the layout of the format field in classes. I am assuming we'll just do those, and that they'll take a day or two each. Any problems with Tim's suggestion to drop the limit to 511?

More important to me are...

Compact classes
I put these in way back, and they do save almost a word per object in small kernel images. They complicate things a bit, but the code is already written, and it has been stable for nearly a decade now. On the other hand, the complexity is an inhibition to trying out various other cool hacks. Moreover, if we dropped the compact class feature, we could pick up 5 more bits of object hash, which would definitely be nice. With regard to compactness, my experience is that the savings due to this feature are less than the difference between a well-designed kernel and a poor one, so I don't consider it a big issue. Even PDAs have plenty of space these days (yes, there's an upside to the Microsoft world ;-), and for extreme compactness issues (like web downloads), there are all sorts of other compactness strategies.
I'm inclined to drop them and give the bits back to the hash, but it's only a mild inclination. How do you guys feel?

Immediate objects (tagging)
We *could* do lots of things here -- enough so I start to get worried about doing too much. For instance, we could have immediate Floats of, say, 30 bits, and immediate Characters of 30 bits, in addition to our current immediate SmallIntegers.
I'm inclined not to do any of these, but how do you feel?

The 64-bit format
My inclination is to tie this as closely to the 32-bit format as possible for now, just to simplify the project. To me that means just extending every oop and header word with zeroes, and sign-extending every "small" integer. The trade-off is complex between having a different range of SmallIntegers in the 64- and 32-bit worlds, and sticking with the same 31-bit range in each. I hope to understand this better in the coming days. If you do already, please speak up.

Form bits modulus
I want to change the modulus of Form bitmaps to 64 bits in both 32- and 64-bit images. It will simplify things for them to be the same, and I think we can convert old ones on the fly for compatibility (ie, when reading in a project, or an old '.form' file). Do you see any problems with this?

CompiledMethods
A major question to decide is whether to split up bytecodes and literals (and source pointer) as in Tim's NewCompiledMethod work. This adds a little space overhead, but it certainly makes things simpler.
And what about native methods support? Is there something simple that would suffice, or something better? This is a time when we could open some possibilities. If we had some idea, then at least we could put the right hooks in the image, even if we need to do some VM work to use it later.

Closures
I'm just assuming we would load some form of Anthony's work, which probably also means using his compiler. Ian, are you happy with the structures he chose, or should we make some changes before loading them up? I'm assuming this is separable from the contiguous stack work, but maybe not. We probably want to start with something that doesn't change the existing context structures much.

Immutability bit
This sounds intriguing. Has anyone here thought seriously about what is needed to support it?

Monitor bit
This was a concern of Alan's way back. Andreas - is this needed for any of the Croquet synchronization work?
|
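Editorial sketch: the widening rule described under "The 64-bit format" (zero-extend every oop and header word, sign-extend every SmallInteger) might look roughly like the following. This is only an illustration under stated assumptions, not VM code: it assumes SmallIntegers are marked by a low tag bit of 1 with the value in the remaining bits, and every function name here is hypothetical.

```python
MASK32 = 0xFFFFFFFF

def encode_small_integer(value, bits):
    """Tag `value` as a SmallInteger in a `bits`-wide word (assumed: low bit = 1)."""
    return ((value << 1) | 1) & ((1 << bits) - 1)

def small_integer_value(word, bits):
    """Decode a tagged SmallInteger stored in a `bits`-wide word."""
    sign_bit = 1 << (bits - 1)
    signed = (word ^ sign_bit) - sign_bit  # reinterpret as two's complement
    return signed >> 1                     # drop the tag bit

def widen_word(word32):
    """Widen one 32-bit image word to 64 bits."""
    word32 &= MASK32
    if word32 & 1 and word32 & 0x80000000:
        # Negative SmallInteger: copy the sign bit into the upper 32 bits.
        return word32 | 0xFFFFFFFF00000000
    # Oops, header words and non-negative SmallIntegers are zero-extended.
    return word32
```

Under these assumptions a SmallInteger keeps its value after widening, which is consistent with keeping the same 31-bit value range in both worlds; an oop with its top bit set is left numerically unchanged.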
|
From: <gor...@bl...> - 2004-04-06 21:26:14
|
Hi Dan and all!

I don't have the knowledge to really respond but just a few things:

Dan Ingalls <Da...@Sq...> wrote:
[SNIP]
> Compact classes
> I'm inclined to drop them and give the bits back to the hash, but it's only a mild inclination.
> How do you guys feel?

It sounds like I agree. I love simplicity. :)

> And what about native methods support? Is there something simple that would suffice, or something better? This is a time when we could open some possibilities. If we had some idea, then at least we could put the right hooks in the image, even if we need to do some VM work to use it later.

Bryce Kampjes should have some answers, I cced him with this post. Hey Bryce, if you aren't on this list you should be. :)

> Immutability bit
> This sounds intriguing. Has anyone here thought seriously about what is needed to support it?

I think Stephen Pair is interested in a bit or two, though it was a while since I heard from Stephen. IIRC a "dirty bit" would be nice for transactional images like Magma and Stephen's work etc. And I also think Stephen was playing with some bits for managing an object cache.

regards, Göran
|
|
From: Ian P. <ian...@in...> - 2004-04-06 22:08:53
|
On 06 Apr 2004, at 23:45, Andreas Raab wrote:

> Oh, I almost forgot that: Here is one issue which is actually VERY
> high up on my VM/image-wishlist. It is the ability to have bits
> objects that won't be moved around by GC.

Seconded.

> The easiest way to solve the above problem is [...] via a C
> allocator and just make sure these bit objects are properly read and
> written.

Motion passed. :)

Ciao,
Ian
|
|
From: Tim R. <ti...@su...> - 2004-04-06 22:51:06
|
In message <F99...@in...>
Ian Piumarta <ian...@in...> wrote:
> On 06 Apr 2004, at 23:45, Andreas Raab wrote:
>
> > Oh, I almost forgot that: Here is one issue which is actually VERY
> > high up
> > on my VM/image-wishlist. It is the ability to have bits objects that
> > won't
> > be moved around by GC.
>
> Seconded.
>
> > The easiest way to solve the above
> > problem is [...] via a C
> > allocator and just make sure these bit objects are properly read and
> > written.
>
> Motion passed.
I've done this sort of thing several times before and so far as I can
recall there are two basic approaches:
1) Create a format of object that allows a memory pointer to some other
place. I'd really hate to see this more complicated than allowing
just not-oops in an object; mixing real oops and external pointers....
euch. Primitives encapsulate access easily enough. Needs gc support to
free external blocks when object dies. Ugly with a scavenger IIRC.
2) Allow object allocation outside ObjectSpace. Did this for Interval
and I think Ian might be using the rump-version of this that I used to
have for the RISC OS DisplayScreen. Main problem is with OSs- it's a
pain to not have a uniform allocation direction. My memory is that Mac
and RISC OS would allocate new outside-objects above ObjectSpace and
Windows would do it below. Since one of the things you need to do is
have a quick way to handle the young/old object decision, Windows was a
bugger - no simply saying addr>youngStart => young object. The Interval
code would allow any kind of object to be external but if we really
only want a 'large non-pointer object space' we could simplify by only
allowing word/byte objects and so not need to trace into them. AFAICR
there were no GC problems particularly. If there is going to be a lot
of traffic (allocate/free) then you need to either trust the OS
allocator or write your own and deal with all that fun.
With either approach you have the obvious issue of sandbarring. This
might not be considered a problem for big desktop machine with virtual
memory but it could be a major pain for PDAs and embedded systems.
Let's make sure we don't block anyone out.
tim
--
Tim Rowledge, ti...@su..., http://sumeru.stanford.edu/tim
My computer isn't that nervous, it's just a bit ANSI.
|
|
From: Andreas R. <and...@gm...> - 2004-04-06 23:15:43
|
Tim,

> I've this sort of thing several times before and so far as I can recall
> there are two basic approaches
>
> 1) Create a format of object that allows a memory pointer to some other
> place. I'd really hate to see this more complicated than allowing
> just not-oops in an object; mixing real oops and external pointers....
> euch. Primitives encapsulate access easily enough. Needs gc support to
> free external blocks when object dies. Ugly with a scavenger IIRC.

Yup. Wouldn't want to go that way. Much too complex for what I really need.

> 2) Allow object allocation outside ObjectSpace. Did this for Interval
> and I think Ian might be using the rump-version of this that I used to
> have for the RISC OS DisplayScreen. Main problem is with OSs- it's a
> pain to not have a uniform allocation direction. My memory is that Mac
> and RISC OS would allocate new outside-objects above ObjectSpace and
> Windows would do it below. Since one of the things you need to do is
> have a quick way to handle the young/old object decision, Windows was a
> bugger - no simply saying addr>youngStart => young object.

Doesn't matter for bits - the above is the precise reason for the restriction towards bit objects.

> With either approach you have the obvious issue of sandbarring. This
> might not be considered a problem for big desktop machine with virtual
> memory but it could be a major pain for PDAs and embedded systems.
> Let's make sure we don't block anyone out.

Sorry, didn't get this - what do you mean with "sandbarring"?

Cheers,
- Andreas
|
|
From: John M M. <jo...@sm...> - 2004-04-07 02:46:20
|
On Apr 6, 2004, at 4:15 PM, Andreas Raab wrote:
>
>> With either approach you have the obvious issue of sandbarring. This
>> might not be considered a problem for big desktop machine with virtual
>> memory but it could be a major pain for PDAs and embedded systems.
>> Let's make sure we don't block anyone out.
>
> Sorry, didn't get this - what do you mean with "sandbarring"?
>
> Cheers,
> - Andreas
Sandbarring was an issue with Mac OS pre-X if you used Handles: memory
that had double indirection so it could move on a compact-memory-space
event. The issue was that you could lock down memory and make it
unmoveable.
The problem was that after running a while you could end up with those
locked-down bits of memory distributed randomly across, say, 128MB.
Then, gee, you want, say, 8MB contiguous? Nope. I've got 48MB free, but
the nasty locked-down pieces prevent the allocation: because of their
distribution you just don't have 8MB contiguous...
Kind of like a ship sliding onto a sandbar, and, well, it can be a
disaster...
This is less of an issue (by far) in a 64-bit address space, since you
just allocate those bigger objects higher up.
Mind, I'm wondering here about the impact of having those unmoveable
chunks sitting there, versus having a header and the body elsewhere,
from the hardware perspective: as we are busy sweeping we are scanning
memory linearly, the hardware is attempting to prefetch data, and then
we jump 64K away? Although I like the idea of not having a performance
issue if you say foo makeFixed or foo makeNotFixed.
{Mind, should this feature be on 32-bit VMs?} Might need to offer both
ways?
========================================================================
===
John M. McIntosh <jo...@sm...> 1-800-477-2659
Corporate Smalltalk Consulting Ltd. http://www.smalltalkconsulting.com
========================================================================
===
|
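Editorial sketch: the fragmentation effect described in the message above can be reproduced with a toy model. The layout below is an assumption chosen only to match the quoted figures (a 128MB heap, 48MB free, no 8MB contiguous run); it models memory as one slot per megabyte and is not how any real allocator tracks blocks.

```python
def largest_free_run(heap):
    """Return the length of the longest run of free (False) slots."""
    best = run = 0
    for pinned in heap:
        run = 0 if pinned else run + 1
        best = max(best, run)
    return best

HEAP_MB = 128
heap = [False] * HEAP_MB  # one slot per megabyte, False = free

# Hypothetical layout: a locked 5 MB block at the start of every 8 MB stride.
for start in range(0, HEAP_MB, 8):
    for slot in range(start, start + 5):
        heap[slot] = True

free_mb = heap.count(False)
can_allocate_8mb = largest_free_run(heap) >= 8
```

With this layout `free_mb` is 48 while the largest free run is only 3 slots, so an 8MB request fails even though six times that amount is free in total.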
|
From: Ned K. <ne...@bi...> - 2004-04-06 23:01:47
|
On Tuesday 06 April 2004 12:47 pm, Dan Ingalls wrote:
> I'm inclined to drop them and give the bits back to the hash, but it's only
> a mild inclination. How do you guys feel?

I'd be in favor of this. It's apparent that improving hashing for things like IdentityDictionary instances would require both better choices of table size and a better hash distribution; having more bits should improve the distribution.

--
Ned Konz
http://bike-nomad.com
GPG key ID: BEEA7EFE
|
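Editorial sketch: a rough way to see why five extra hash bits help distribution. The thread only says that dropping compact classes frees 5 bits; the current hash width of 12 bits used below is an assumption about the V3 header that the thread does not state, and the function names are hypothetical.

```python
CURRENT_HASH_BITS = 12  # assumption: not stated in the thread
EXTRA_HASH_BITS = 5     # from the thread: bits freed by dropping compact classes

def distinct_hashes(bits):
    """Number of distinct identity-hash values a field of `bits` bits can hold."""
    return 1 << bits

def average_objects_per_hash(n_objects, bits):
    """Average number of objects sharing one hash value, assuming a uniform spread."""
    return n_objects / distinct_hashes(bits)

before = average_objects_per_hash(100_000, CURRENT_HASH_BITS)
after = average_objects_per_hash(100_000, CURRENT_HASH_BITS + EXTRA_HASH_BITS)
```

Under these assumptions, 100,000 objects share hash values about 24 to one before the change and fewer than one to one after it, which is the kind of improvement an identity-hashed collection would notice.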
|
From: Tim R. <ti...@su...> - 2004-04-06 23:07:43
|
In message <003101c41c1e$345e2c00$b2d0fea9@R22>
"Andreas Raab" <and...@gm...> wrote:
> Hi Dan,
>
[snip]
> > or two each. Any problems with Tim's suggestion to drop the
> > limit to 511?
>
> Fine with me.
[snip]
The ones we'd lose are the top 8 quick instvar pushing options. Hard to
imagine that will cost anything beyond a trivial Compiler patch.
> > Immediate objects (tagging)
[snip]
> A very clear "NO" to the idea of taking away a bit from SmallInteger. 30bit
> floats are almost useless
I agree.
>
> Here is an alternative that I considered a while ago: Instead of using
> "another tag" bit we could declare a tag pattern which looks like:
>
> 00 - OOP
> 10 - Immediate
> x1 - SmallInteger
>
> With "Immediate" being described by 6 further bits which specify the index
> into the "immediate classes array" (precisely the same as compact classes).
> This would leave 24 bits which I could (out of my mind) think of 4
> situations in which they would be useful:
> * Colors (8 bits each)
> * Immediate Points (signed 12 bits x/y)
> * Wide characters (Unicode is defined as 21 bit; but Yoshiki really needs
> 24)
> * Scaled decimals (16.8 or 12.12)
Sounds plausible to me. There is some definite image level impact to
changing things to be immediates; for example no more Point>x: usage.
>
> The only problem with using the 10 tag is that the garbage collector
> currently uses that - but I believe this is fixable. In any case, there is a
> clear objection from my end against taking away a bit from SmallIntegers.
>
> > Form bits modulus
> > I want to change the modulus of Form bitmaps to 64 bits
> > in both 32- and 64-bit images. It will simplify things
> > for them to be the same, and I think we can convert old ones
> > on the fly for compatibility (ie, when reading in a project,
> > or an old '.form' file. Do you see any problems with this?
>
> A key question is whether the "bits" of a 64bit form would store 32bit
> quantities or 64bit. If the latter I'd say this is going to get problematic.
> If the first, e.g., we merely extend the form bits to make sure we have
> always 64bit aligned pixel rows, that should fine.
The equivalent approach worked fine in BrouHaHa for having 32bit words
supporting 16bit Form granularity.
>
> > CompiledMethods
[snip]
>
> I don't have a good understanding about the tradeoffs involved (Ian should
> comment on this) so I'll just throw in a general thought: I think that if we
> can (e.g., not being overly wasteful in terms of space) we should leave some
> "unused" slot. Much of the experimentation we saw in the past was only
> possible because such "unused" places existed (say, MethodContext's
> receiverMap).
For the cleaned up CM format adding extra instvars is no problem at all
since it's just a normal object. Ian once strongly advocated having the
bytecodes and literals as separate arrays, whereas my initial
implementation had the literals as indexable vars following the header,
bytecode oop, etc. The VM cost is one more pointer to chase
balanced against the flexibility provided.
tim
--
Tim Rowledge, ti...@su..., http://sumeru.stanford.edu/tim
For every action, there is an equal and opposite criticism.
|
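Editorial sketch: the tag pattern quoted in the message above (00 = OOP, 10 = Immediate, x1 = SmallInteger, with 6 bits of class index and 24 bits of payload for immediates) could be decoded along these lines. The exact bit positions are an assumption (tag in bits 0-1, class index in bits 2-7, payload in bits 8-31); the thread gives only the field widths. All names are hypothetical.

```python
def classify(word):
    """Classify a 32-bit word by its low tag bits."""
    if word & 1:
        return "SmallInteger"  # x1
    if word & 3 == 2:
        return "Immediate"     # 10
    return "OOP"               # 00

def encode_immediate(class_index, payload):
    """Pack a 6-bit class index and a 24-bit payload into an immediate word."""
    if not 0 <= class_index < 64:
        raise ValueError("class index must fit in 6 bits")
    if not 0 <= payload < (1 << 24):
        raise ValueError("payload must fit in 24 bits")
    return (payload << 8) | (class_index << 2) | 0b10

def decode_immediate(word):
    """Return (class_index, payload) for an immediate word."""
    if classify(word) != "Immediate":
        raise ValueError("not an immediate")
    return (word >> 2) & 0x3F, (word >> 8) & 0xFFFFFF
```

For example, a 21-bit Unicode code point such as 0x10FFFF fits in the payload with room to spare, and SmallIntegers keep their full range because only the low bit identifies them.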
|
From: Dan I. <Da...@Sq...> - 2004-04-08 19:48:41
|
Hi, Guys -

Thanks for all your comments and suggestions. I don't know about you, but I find this to be a lot of fun.

Now that discussion has tapered off, I will do my best to summarize the various areas, and to make some tentative decisions for your approval or further discussion in the second round. I do this, not because I make the best decisions, but because we need to reduce uncertainty to move forward. Please object now if you feel like I'm leaning the wrong way on items above the line here.

Cleanups
First of these is the split primitive field of method headers. I think there's good agreement on returning to a 9-bit field and doing away with (or converting to named access) all those that were using higher indices. I have a specific proposal which I will document soon on the web site, but it is essentially what Tim suggested, except I would drop some of the loadInstVar range so we leave room for 50 or so unused indexed primitives at the top. My reasoning is that, ugly or not, indexed primitives are the easiest way for curious hackers to experiment with new capabilities in Squeak.
[By the way, it looks like "External primitive support primitives" 570-574 are still in use and would need to be slid down]

Compact classes
Nobody seems to care much here. I was impressed that Andreas picked up on the slight "tagging" advantage that CCs offer. This actually saves an instruction or two testing for contexts, as I recall.
I checked the stats of 'mini.image', the "complete" MVC Smalltalk in 520k. It has 13426 instances of compact classes, most of which would be 4 bytes bigger without CCs (some are large enough to have big headers already). The real count is less, since I picked up 1000 contexts in my stats, but those are meaningful, too. So the space cost for this example is about 54k. I expect this factor of 10% would scale pretty well to any small kernel, since it will mostly be full of the same stuff (methods, symbols, arrays, points and rectangles). However...
Consider if we revamp CompiledMethods as well. Then the 4600 methods in this benchmark become either 9200 or 13,800 objects, all with 2-word headers. So the cost (versus doing new CMs, but with CCs) would be an additional 18.4k or 37k depending on whether methods become 2 or 3 objects.
I think this change is worth one more round of contemplation on grounds of IIABDFI (if it ain't broke...). The part that bothers me most is actually compatibility with Version-3 projects (image segments), but we're going to have to do something for this anyway. What's nasty about this change is that it's harder to do on the fly, since some objects get bigger.

SmallIntegers
We had some interesting discussion about changing the format of SmallIntegers. I'm going to invoke executive privilege, however, to say that we will not change the current representation on this go round. I'm sure it would be an interesting exercise, but I don't see a really big payoff, and I can imagine it ending up being a lot of work.
If anyone wants to do this on their own and give me the code, or if they want to argue one last time for the benefits, I'm still willing to reconsider. I really mean this -- I just don't want to get bogged down in gratuitous changes.

Immediate objects (tagging)
I like Andreas's proposal for an extensible set of Immediate Objects. I still haven't researched the GC conflict with the 10 tag. Also I haven't looked at how small a kernel of support we could have in the VM.

The 64-bit format
My inclination is to tie this as closely to the 32-bit format as possible for now, just to simplify the project. To me that means just extending every oop and header word with zeroes, and sign-extending every "small" integer. For now, we'll also stick with the same 31-bit value range in both systems.

Form bits modulus
I want to change the modulus of Form bitmaps to 64 bits in both 32- and 64-bit images. As per discussion this does not change the format, only the raster width.

CompiledMethods
A major question to decide is whether to split up bytecodes and literals (and source pointer) as in Tim's NewCompiledMethod work. This adds a little space overhead, but it certainly makes things simpler. I'm going to assume that we go with Tim's 2-object format, and throw in a "serendipity" field for fun this summer.
[By the way, I'm assuming this adds another very high bandwidth register to access literals independent of bytecodes, right? And Ian please comment on the relative importance of 3 objects vs two]

Closures
I'm just assuming we would load some form of Anthony's work, which probably also means using his compiler. Ian, are you happy with the structures he chose, or should we make some changes before loading them up? We probably want to start with something that doesn't change the existing context structures much.

Order of evaluation
I know we've chatted about this in the past, and believe me I *want* to do this. Unfortunately, I think for Squeak it would be a mistake. There is a fair amount of useful software (mainly compilers) in the community that has the left-to-right convention built in. Then there is the occasional bad usage of the form (strm next * 256 + strm next), but they deserve what they get. Hey, I know -- how about left-to-right on big-endian machines, and right-to-left on little-endian. Just kidding. I'm mainly concerned about bucking a convention that has paid off in synergy over the years.

Decoupling GC from Primitives
This seems like an all-around win, although not really V4-related. I would love it if someone would take it upon themselves to spec this and estimate difficulty, so we can review it, and then actually do it.

Separate Allocation of some Bits Objects
I understand that this one matters. I'm just the wrong person to spec it or do it (since I know relatively little about media engines and OS resource management). Most other serious implementations have such a facility, and I'm sure it would help with any serious media work.
I'm happy to manage it as part of this project. If someone can pull this off in a compatible time frame, I'll do my best to merge it with the V4 changes. If not, perhaps we could at least spec it in such a way that future V4 VMs could add it without needing further image changes.

-----------------------------------------
Other Changes "While we have it in the shop..."
Beyond here are changes I haven't thought about enough to render even a temporary judgement. I'd love to hear more pros and cons and maybe some estimate of how much work and who would do it.

Immutability bit
This sounds intriguing. Has anyone here thought seriously about what is needed to support it?

Proxy support (divergent "self" and "receiver")
Is 'receiverMap' (ie a spare slot in Contexts) enough for now? Also I've heard of "resend" bytecodes for delegation that re-use the context without having to load and push all the args as well, when all you want to do is forward the message to, eg, an aspect variable.

Squat Support
Need to hear more discussion.

Flat Rectangles
I never wrote about this, but I came very close to doing this in the early history of Squeak. It saves time getting to BitBlt, doing intersections, etc, and it saves a little space, too. I wouldn't do this for me right now, but I'm wondering if Mr. Graphics has ever wished for it. I think it's not a hugely hard change to make (at least it wasn't back then).

I probably left something out -- please remind me.

Thanks all -- you guys are great!

- Dan
|
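Editorial sketch: the space figures quoted in the compact-classes summary above can be re-derived from the stated counts. The interpretation is an assumption: each compact-class instance is taken to grow by one 4-byte header word, "520k" is read as 520,000 bytes, and the CompiledMethod figure is read as counting only the additional objects created by the split. Names are hypothetical.

```python
BYTES_PER_WORD = 4

compact_instances = 13426    # from the message
image_bytes = 520 * 1000     # "520k", assuming k = 1000
methods = 4600               # from the message

# Cost of dropping compact classes: one extra header word per instance.
cc_cost_bytes = compact_instances * BYTES_PER_WORD
cc_cost_percent = 100 * cc_cost_bytes / image_bytes

def split_method_cost_bytes(objects_per_method):
    """Extra header words if each method becomes several objects without CCs."""
    additional_objects = (objects_per_method - 1) * methods
    return additional_objects * BYTES_PER_WORD
```

This gives 53,704 bytes (the "about 54k", roughly 10% of the image), then 18,400 bytes for a 2-object method format and 36,800 bytes (the "37k") for a 3-object format.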
|
From: Ian P. <ian...@in...> - 2004-04-08 21:10:24
|
Hi Dan,

Great résumé (even if I do feel obliged to take issue with a couple of items in it ;-).

> Compact classes
> Nobody seems to care much here. I was impressed that Andreas picked
> up on the slight "tagging" advantage that CCs offer. This actually
> saves an instruction or two testing for contexts, as I recall.

But, it's also a royal pain when you decide you want to evict Context from the set of compact classes.

If minimising space by keeping CCs is our prime goal, then I'm entirely neutral on the issue.

If minimising (Context) membership test time by keeping CCs around is our prime goal, then I'm entirely against it. It would make certain classes "privileged", in that you can *never* make them non-compact without breaking your VM. Otherwise it means the VM can never use CC fields exclusively to check for membership (it would still have to load the class generically and test pointer eqv). This kind of defeats the argument for keeping CCs around -- entirely.

I actually think it would make for the fastest possible execution if compact classes just went away leaving all class pointers explicit in the method header. No bit fields to extract. No privileged CC indices to "secretly know about". No branch-over-check-for-CC-index that might be zero requiring a punt to class header load and compare. Instead: just one simple, consistent, linear thing.

> SmallIntegers
> We had some interesting discussion about changing the format of
> SmallIntegers. I'm going to invoke executive privilege however, to
> say that we will not change the current representation on this go
> round. I'm sure it would be an interesting exercise, but I don't see
> a really big payoff, and I can imagine it ending up being a lot of
> work.
>
> If anyone wants to do this on their own and give me the code, or if
> they want to argue one last time for the benefits, I'm still willing
> to reconsider. I really mean this -- i just don't want to get bogged
> down in gratuitous changes.

The only tag change I would push for is putting SI tag in the topmost bit. But Tim (or rather risc-os) killed that one stone dead.

> The 64-bit format
> My inclination is to tie this as closely to the 32-bit format as
> possible for now, just to simplify the project. To me that means just
> extending every oop, and header word with zeroes, and sign-extending
> every "small" integer. For now, we'll also stick with the same 31-bit
> value range in both systems.
>
> Form bits modulus
> I want to change the modulus of Form bitmaps to 64 bits in both 32-
> and 64-bit images. As per discussion this does not change the format,
> only the raster width.

Something is itching me about making Forms 64-bits deep while retaining 31-bit integers. I can't really put my finger on it, but the phrase "LargeInteger explosion" is floating around my head for some reason...

> CompiledMethods
> A major question to decide is whether to split up bytecodes and
> literals

*Yes*.

> (and source pointer)

In the CM (or in the literals if you can fathom a reason to prefer that).

> and throw in a "serendipity" field for fun this summer.

Let's just make the VM completely agnostic w.r.t. the size of CMs. Stick the variable pointers where pointers belong, stick the bytecodes where variable bytes belong, and stick fixed fields (including method header) where fixed fields belong -- in something about which the VM never makes assumptions beyond the first N fields and which never cares if someone subclasses it in cruel and unusual ways.

> [By the way, I'm assuming this adds another very high bandwidth
> register to access literals independent of bytecodes, right? And Ian
> please comment on the relative importance of 3 objects vs two]

For interpretation you maybe want to care (a teeny bit) about minimising the number of "hot" registers, but for all architectures bar one (the one that lets you get at 8 out of the 64? 128? registers ;) this is utterly irrelevant: we still have plenty of "fixed" registers to spare.

For a translating runtime I'd want the simplest and most generic model possible. Each object has precisely one function (separate out the CM, literals, bytecodes) and -- whatever else we end up deciding to do or not to do -- get those #^$*@! block bodies out of their defining CM and into their own, independent CM once and for all! ;)

> Closures
> I'm just assuming we would load some form of Anthony's work, which
> probably also means using his compiler.

Is this a modified St-80 Compiler, or one based on SmaCC?

If the latter then I'd vote for this (compiler change) faster than I'd vote for Ralph Nader (which is pretty darn fast ;).

> Ian, are you happy with the structures he chose, or should we make
> some changes before loading them up? We probably want to start with
> something that doesn't change the existing context structures much.

I'd like to talk with Anthony about this (great) detail, in a much higher-bandwidth setting than a mailing list, and come to a unanimous consensus on the optimal bang-for-the-buck to be had in a "short-term project" (paying equal attention to the efficiency of interpretive and translational approaches). I think we can come up with something novel that will be sufficiently close to what we have to make for very little implementation impact, yet yield significant performance benefits.

In any case I think it's important not to lose much of Anthony's work, but at the same time I think we have a great opportunity to "undo" several mistakes that were made in the past (by all concerned -- I'm not pointing fingers at Anthony or anyone else in particular, nor disclaiming responsibility for my own fair share of them!).

> Order of evaluation
> I know we've chatted about this in the past, and believe me I *want*
> to do this. Unfortunately, I think for Squeak it would be a mistake.

I think _not_ to do it would invalidate a lot of very useful stuff that could be done elsewhere, in primitive calling conventions, translating runtimes, etc...

> There is a fair amount of useful software (mainly compilers) in the
> community that has the left-to-right convention built in.

Then maybe _they_ are broken? We should fix them (not make concessions to them).

George Bernard Shaw: The reasonable man adapts himself to the world; the unreasonable one persists in trying to adapt the world to himself. Therefore all progress depends on the unreasonable man.

(I like that one almost as much as I do the one about "cheating and not getting caught". ;)

> Then there is the occasional bad usage of the form (strm next * 256 +
> strm next), but they deserve what they get.

Hear hear!

> I'm mainly concerned about bucking a convention that has paid off in
> synergy over the years.

Depending on any undefined (by some standard someplace) implementation behaviour is a bug (even if the program runs without exhibiting any erroneous behaviour). You would not find any C programmer (beyond elementary school, or Redmond maybe ;-) writing the C equivalent of the above Stream example. Smalltalk programmers should be every bit as careful: (AFAIK) the eval order is not specified by the standard.

(Somebody please, *please* correct me if I'm wrong about this. I can't find my copy of the standard to check. Is it online anywhere yet?)

> Separate Allocation of some Bits Objects

Separate allocation of "locked-down" (or "wired" or "non-moving" or whatever) objects of any format.

> Proxy support (divergent "self' and "receiver")
> Is 'receiverMap' (ie a spare slot in Contexts) enough for now?

I think so.

Hope that was useful (and not too contentious and/or factually inaccurate)!

Cheers,
Ian
|
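Editorial sketch: why the expression (strm next * 256 + strm next) depends on evaluation order. Python itself evaluates operands left to right, so the two orders are simulated explicitly here; the function names are hypothetical and this only illustrates the hazard, not any Squeak compiler behaviour.

```python
def read_u16_left_to_right(stream):
    """Evaluate the left `next` first, as the expression's author intended."""
    left = next(stream)
    right = next(stream)
    return left * 256 + right

def read_u16_right_to_left(stream):
    """Evaluate the right `next` first, as a right-to-left order would."""
    right = next(stream)  # right-hand operand consumes the first byte
    left = next(stream)   # left-hand operand gets the second byte
    return left * 256 + right

data = bytes([0x12, 0x34])
ltr = read_u16_left_to_right(iter(data))
rtl = read_u16_right_to_left(iter(data))
```

The same two bytes decode as 0x1234 in one order and 0x3412 in the other. Binding each read to a temporary before combining them gives the same answer under either convention.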
|
From: Dan I. <Da...@Sq...> - 2004-04-08 21:37:57
|
Hi, Ian -

<many arguments for truth and beauty snipped>

>George Bernard Shaw: The reasonable man adapts himself to the world; the unreasonable one persists in trying to adapt the world to himself. Therefore all progress depends on the unreasonable man.

Thanks for the nudge. That's what friends are for.

>Smalltalk programmers should be every bit as careful: (AFAIK) the eval order is not specified by the standard.
>
>(Somebody please, *please* correct me if I'm wrong about this. I can't find my copy of the standard to check. Is it online anywhere yet?)

Early on I urged them to keep it unspecified (you'll be glad to know). Wish I had stuck around for some of the other decisions ;-).

Thanks again, Ian. I'll make a substantive reply after some further discussion has ensued.

- Dan
|
|
From: Andreas R. <and...@gm...> - 2004-04-08 23:10:13
|
Hi Ian,

[Compact classes]
> If minimising (Context) membership test time by keeping
> CCs around is our prime goal, then I'm entirely against
> it. It would make certain classes "privileged", in that
> you can *never* make them non-compact without breaking
> your VM. Otherwise it means the VM can never use CC
> fields exclusively to check for membership (it would
> still have to load the class generically and test
> pointer eqv). This kind of defeats the argument for
> keeping CCs around -- entirely.

Agreed. Let's kill them if we don't worry about the space savings then.

> I actually think it would make for the fastest possible execution
> if compact classes just went away leaving all class pointers
> explicit in the method header. No bit fields to extract.
> No privileged CC indices to "secretly know about". No
> branch-over-check-for-CC-index that might be zero requiring
> a punt to class header load and compare. Instead:
> just one simple, consistent, linear thing.

Hm... this would actually be a strong argument against introducing more immediates. They would have similar properties, wouldn't they?

[Forms]
> Something is itching me about making Forms 64-bits deep while
> retaining 31-bit integers. I can't really put my finger on it,
> but the phrase "LargeInteger explosion" is floating around my
> head for some reason...

Not "64 bits deep" but "64 bits aligned". The form bits still contain 32bit integers (otherwise we'd pay horrible prices in 32bit BitBlt versions).

[Anthony's closure work]
> Is this a modified St-80 Compiler, or one based on SmaCC?

SmaCC.

Cheers,
- Andreas
|
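Editorial sketch: the "64 bits aligned, not 64 bits deep" distinction amounts to padding each raster row to a multiple of 64 bits while still storing 32-bit words. The helper below is a hypothetical illustration of that row-pitch calculation, not code from the VM or BitBlt.

```python
def row_pitch_words(width_pixels, depth_bits, align_bits):
    """Number of 32-bit words per raster row when rows are padded to `align_bits`."""
    row_bits = width_pixels * depth_bits
    padded_bits = -(-row_bits // align_bits) * align_bits  # round up to alignment
    return padded_bits // 32

# A 10-pixel-wide, 8-bit-deep Form needs 80 bits per row.
pitch_32 = row_pitch_words(10, 8, 32)  # padded to 96 bits
pitch_64 = row_pitch_words(10, 8, 64)  # padded to 128 bits
```

Here the row grows from 3 words to 4 under 64-bit alignment, while a row that is already a multiple of 64 bits is unchanged. Converting an old Form on the fly would then mean re-padding rows rather than changing what each word stores.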
|
From: Ian P. <ian...@in...> - 2004-04-08 23:27:12
|
Hi Andreas,
On 09 Apr 2004, at 01:10, Andreas Raab wrote:
>> just one simple, consistent, linear thing.
>
> Hm... this would actually be a strong argument against introducing more
> immediates. They would have similar properties wouldn't they?
Yes, absolutely. (At least in the lower magnitudes of the
price/performance equation.)
> Not "64 bits deep" but "64 bits aligned".
Gotcha. Thanks for the clarification.
> SmaCC.
Anyone know if SmaCC-generated parsers can "coroutine" (not really, but
you get the idea) to each other (or coexist in parallel under a common
abstract scanner) with the same source stream?
Thanks!
Ian
|
|
From: Ian P. <ian...@in...> - 2004-04-08 23:52:51
|
Hi Ian,
On 09 Apr 2004, at 01:27, Ian Piumarta wrote:
>> Hm... this would actually be a strong argument against introducing
>> more
>> immediates. They would have similar properties wouldn't they?
>
> Yes, absolutely. (At least in the lower magnitudes of the
> price/performance equation.)
No, not necessarily.
We have an array containing the classes for each tag combination (init
from splObj at startup, ensure remap during GC) then (for a 2-bit tag
field):
    Interp
        instVars: '... immediateClasses ...'
    Interp>>classOf: anObj
        | tag |
        self inline: true.
        ^(tag _ anObj bitAnd: 3) == 0
            ifTrue: [self classHeaderOf: anObj]
            ifFalse: [immediateClasses at: tag]
or something similar. Costs the same as if we only had one tag bit
(with ClassSmallInt sucked out of splObj).
Cheers,
Ian
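The lookup in the message above can be modelled in a few lines. This is a minimal Python sketch, not VM code: the 2-bit tag field comes from the message, while the tag assignments, the `SmallFloat` entry, and the `heap_classes` table standing in for the class header load are all assumptions made for illustration.

```python
# Hypothetical model of the tag-indexed class lookup; names are illustrative
# and are not taken from the Squeak VM sources.
TAG_MASK = 3  # 2-bit tag field, as in the message

# One class per non-zero tag value. Which class gets which tag is an assumption.
immediate_classes = {1: "SmallInteger", 2: "Character", 3: "SmallFloat"}

# Stand-in for loading the class from the header of an ordinary (tag 0) object.
heap_classes = {0x1000: "Array", 0x2000: "String"}


def class_of(oop: int) -> str:
    """Return the class of an oop: header load for tag 0, table lookup otherwise."""
    tag = oop & TAG_MASK
    if tag == 0:
        return heap_classes[oop]
    return immediate_classes[tag]
```

Whatever the number of immediate kinds, the cost is one mask, one test, and one indexed load, which is the point the message makes about a single tag bit costing the same.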
|
|
From: Dan I. <Da...@Sq...> - 2004-04-09 17:00:10
|
>[Anthony's closure work]
>> Is this a modified St-80 Compiler, or one based on SmaCC?
>
>SmaCC.
Hi, Guys -
Just anticipating the release process...
1. I haven't played with SMACC. I assume it's good. I gather Marcus has also used this to do some fun work with Parse Trees.
2. Is there a pretty clean package that converts a current Squeak to use SMACC with decent reliability?
2a. Is this a part of Anthony's Closure package?
3. Is there general agreement on this list that SMACC is the way to go?
... If so, then I want to work with the Guides to get SMACC into 3.8alpha asap, so that it's not an added cause of confusion later, when we merge the other (lower-level) V4 stuff with 3.8.
Thanks
- Dan
|
|
From: Marcus D. <de...@ac...> - 2004-04-09 17:37:34
|
On 09.04.2004 at 19:00, Dan Ingalls wrote:
>> [Anthony's closure work]
>>> Is this a modified St-80 Compiler, or one based on SmaCC?
>>
>> SmaCC.
>
> Hi, Guys -
>
> Just anticipating the release process...
>
> 1. I haven't played with SMACC. I assume it's good. I gather Marcus
> has also used this to do some fun work with Parse Trees.
>
I did some experiments regarding scripting syntax last year. That is,
my Squeak happily accepts a method like
    function testForIn2() {
        var a,c,i;
        a = new OrderedCollection;
        a.add(1);
        a.add(2);
        c = 0;
        for (i in a) { c += i; }
        this.assert(c == 3);
    }
or
    to exampleTurtle
        repeat 36 [repeat 4 [
            fd 90
            rt 90
        ] rt 10 ]
    end
or
    def testWhile():
        a = 1
        while a < 10:
            a = a + 1
        b = a
        self.assert(b == 10)
> 2. Is there a pretty clean package that converts a current Squeak to
> use SMACC with decent reliability?
>
Yes. Anthony provides all the changesets needed.
I used this while hacking on AOStA, no problems so far.
The new compiler is installed in a way that it can be disabled via a
preference.
We really need to put some work into cleaning up the old compiler and
making everything use the new parsetrees.
The SmaCC-based compiler uses the RB Parsenodes. They are really nice
and a clear improvement, IMHO.
But some changes are needed to make everything use them. We need to make
both Slang and eToys work with the new Parsenodes instead of the old ones.
(I think Anthony already did some work on Slang).
> 2a. Is this a part of Anthony's Closure package?
>
Yes.
Parts of the closure package already were added to 3.7 (e.g. the
IRBuilder). I need to make a
package with all the rest.
> 3. Is there general agreement on this list that SMACC is the way to
> go?
>
I vote for this.
> ... If so, then I want to work with the Guides to get SMACC into
> 3.8alpha asap, so that it's not an added cause of confusion later,
> when we merge the other (lower-level) V4 stuff with 3.8.
>
Ok. I'll help with that.
Marcus
--
Marcus Denker de...@ac...
|
|
From: Andreas R. <and...@gm...> - 2004-04-08 23:03:58
|
Dan,
This sounds all good to me. Couple of extra notes:
> Order of evaluation
> I know we've chatted about this in the past, and believe me
> I *want* to do this. Unfortunately, I think for Squeak it
> would be a mistake. There is a fair amount of useful
> software (mainly compilers) in the community that has
> the left-to-right convention built in. Then there is the
> occasional bad usage of the form (strm next * 256 + strm next),
> but they deserve what they get.
Uh, ah. That wasn't quite clear to me. Would this mean that, say:
    self foo: self arg1
        bar: self arg2
        baz: self arg3.
changes the order of evaluation for arg1-arg3? Would this be unavoidable? If
so I'm absolutely against changing order of evaluation to right-to-left.
Reading direction is *so* important for people that evaluating the above
"the other way around" would almost certainly lead to the most horrible
subtle bugs.
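To make the concern concrete, here is a minimal Python sketch of the quoted `strm next * 256 + strm next` expression under the two evaluation orders. The function names are hypothetical, and the two-element iterator simply stands in for a read stream.

```python
from typing import Iterator


def read_left_to_right(stream: Iterator[int]) -> int:
    """Receiver evaluated before the argument: the first byte read is the high byte."""
    high = next(stream)
    low = next(stream)
    return high * 256 + low


def read_right_to_left(stream: Iterator[int]) -> int:
    """Argument evaluated before the receiver: the first byte read ends up as the low byte."""
    low = next(stream)
    high = next(stream)
    return high * 256 + low
```

With the bytes 1 then 2 on the stream, the first function yields 258 and the second 513, so any code that reads a stream twice in one expression silently changes meaning when the order flips.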
> Decoupling GC from Primitives
> This seems like an all-around win, although not really V4-related.
> I would love it if someone would take it upon themselves to spec
> this and estimate difficulty, so we can review it, and then actually
> do it.
Sounds good. I'm not sure I'll find the time to do it myself but I'll see if
I can stick it in one lazy afternoon (say, wasn't there an easter weekend
somewhere? hm...)
> Separate Allocation of some Bits Objects
> I understand that this one matters. I'm just the wrong person
> to spec it or do it (since I know relatively little about media
> engines and OS resource management). Most other serious
> implementations have such a facility, and I'm sure it would
> help with any serious media work. I'm happy to manage it as
> part of this project.
The "most important" thing is to have some field in the image header that's
set to zero - then we got plenty of time to figure out what we stick in when
we have it :-)
> Flat Rectangles
> I never wrote about this, but I came very close to doing
> this in the early history of Squeak. It saves time getting
> to BitBlt, doing intersections, etc, and it saves a little
> space, too. I wouldn't do this for me right now, but I'm
> wondering if Mr. Graphics has ever wished for it.
> I think it's not a hugely hard change to make (at least it
> wasn't back then).
Hm... I don't think there are too many advantages to it. BitBlt is the only
object in the image that uses a "fat" primitive interface - almost all other
prims dealing with rectangles just pass points or numbers verbatim. In
short, I've never truly wished for it, but I don't think it would do any
harm either. The most important part might be the space savings (avoids two
object headers for points) but that's about it.
Cheers,
- Andreas
|
|
From: Tim R. <ti...@su...> - 2004-04-08 23:57:55
Attachments:
errorReturns.18Aug606pm.cs
|
In message <017b01c41dbd$c16ac180$6688b8d9@R22>
"Andreas Raab" <and...@gm...> wrote:
> > Decoupling GC from Primitives
> > This seems like an all-around win, although not really V4-related.
> > I would love it if someone would take it upon themselves to spec
> > this and estimate difficulty, so we can review it, and then actually
> > do it.
>
> Sounds good. I'm not sure I'll find the time to do it myself but I'll see if
> I can stick it in one lazy afternoon (say, wasn't there an easter weekend
> somewhere? hm...)
allocateChunk:, sufficientSpaceAfterGC: are first targets of course,
then the callers to provide a back out route. I think the VM changes are
quite small.
I've managed to track down the (ancient) code for returning error codes
from prims - from 2.7 days believe it or not. It's mostly Parser changes
but the current compiler stuff has changed enough that there appears to be
quite a disconnect. Essentially the idea is to add an optional named temp to
each prim calling method with a default (the brilliantly named
'primErrorReturnVariable') if the writer can't be bothered. This temp is
always the first one and so the VM knows where to stick the return object.
Image code can use it to find out what happened and obviously one of the main
interests here is to know that a GC is needed.
Since it's such a tiny piece of code I've attached it FYI.
I hope we can get round to this: I've been asking for it for several
years...
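The error-temp idea described above can be sketched roughly as follows. This is a Python model under stated assumptions, not the attached changeset: `Frame`, `run_primitive`, and the `PRIM_ERR_GC_NEEDED` value are hypothetical names, and a primitive is modelled as a callable returning a success flag and a value.

```python
# Hypothetical error object a failing primitive might hand back.
PRIM_ERR_GC_NEEDED = "gc needed"


class Frame:
    """Activation of a primitive-calling method; temp 0 is reserved for the error."""

    def __init__(self, num_temps: int):
        self.temps = [None] * num_temps


def run_primitive(frame: Frame, primitive):
    """Run the primitive; on failure store its error object in the first temp."""
    ok, value = primitive()
    if ok:
        return True, value
    frame.temps[0] = value  # fixed position, so the VM needs no lookup
    return False, None
```

The fallback code in the image would then inspect the first temp to decide, for example, whether to trigger a GC and retry.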
>
> > Flat Rectangles
Chief advantage is a reduction in allocation traffic; one object
instead of three etc. You do have to be careful about some inadvertent
semantic changes though; currently
    aRect origin x: foo
will actually change the x value of aRect's origin point. A flat
rectangle would almost certainly implement #origin as
    ^originX@originY
and thus aRect origin x: foo would have a rather different effect.
tim
--
Tim Rowledge, ti...@su..., http://sumeru.stanford.edu/tim
Strange OpCodes: IOP: Insult OPerator
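The semantic change described for flat rectangles can be illustrated with a small Python model. It assumes a Point whose x coordinate can be set in place, as the message does; the class names are hypothetical.

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y


class NestedRect:
    """Rectangle holding two Point objects (three objects in total)."""

    def __init__(self, x0, y0, x1, y1):
        self._origin = Point(x0, y0)
        self._corner = Point(x1, y1)

    @property
    def origin(self):
        return self._origin  # shared: a caller can mutate the stored point


class FlatRect:
    """Rectangle storing its four coordinates directly (one object)."""

    def __init__(self, x0, y0, x1, y1):
        self.origin_x, self.origin_y = x0, y0
        self.corner_x, self.corner_y = x1, y1

    @property
    def origin(self):
        return Point(self.origin_x, self.origin_y)  # fresh copy on every call
```

Setting `rect.origin.x = 5` moves a `NestedRect` but leaves a `FlatRect` untouched, because the assignment lands on a temporary Point.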
|
|
From: Bert F. <be...@im...> - 2004-04-09 10:40:08
|
On 09.04.2004 at 01:57, Tim Rowledge wrote:
>>> Flat Rectangles
> Chief advantage is a reduction in allocation traffic; one object
> instead of three etc. You do have to be careful about some inadvertent
> semantic changes though; currently
>     aRect origin x:foo
> will actually change the x value of aRect's origin point. A flat
> rectangle would almost certainly implement #origin as
>     ^originX@originY
> and thus aRect origin x: foo would have a rather different effect.
Actually, Point>>x: does not exist (can't remember when we removed it). So
this should be no problem at all.
- Bert -
|
|
From: Cees de G. <cg...@cd...> - 2004-04-09 12:02:27
|
> Immutability bit
> This sounds intriguing. Has anyone here thought seriously about what is needed to support it?
>
Well, as far as you can classify my thinking on this subject as being
serious (me being a total nitwit in this area), pseudocode-wise it would be:
    Object>>instVarAt: anInteger put: anObject
        self isImmutable ifTrue: [
            AttemptToWriteImmutableObjectException new
                receiver: self;
                index: anInteger;
                value: anObject;
                raise].
        <original code>
    Object>>beImmutable: aBoolean
        <primitive...>
That's probably all the support you'd need: an extra bit and some code in
#primitiveInstVarAtPut. I'm using this feature in VisualWorks to support
automatic dirty marking, and it is really nice - it allows you to build very
complex persistent object models that you'd normally avoid because of the
dirty marking hassles.
|
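A rough Python model of the pseudocode in this message, plus one possible way dirty marking could be layered on top. The trap, record, and retry scheme in `tracked_write` is an assumption for illustration and is not taken from VisualWorks; all names are hypothetical, and slot indices are 0-based here rather than 1-based as in Smalltalk.

```python
class ImmutableWriteError(Exception):
    """Raised when a slot of an immutable object is written."""

    def __init__(self, receiver, index, value):
        super().__init__(f"attempt to write slot {index} of an immutable object")
        self.receiver, self.index, self.value = receiver, index, value


class Obj:
    def __init__(self, *slots):
        self.slots = list(slots)
        self.immutable = False  # stands in for the header bit

    def inst_var_at_put(self, index, value):
        if self.immutable:
            raise ImmutableWriteError(self, index, value)
        self.slots[index] = value


def tracked_write(obj, index, value, dirty):
    """Assumed dirty-marking scheme: trap the write, record the object, retry."""
    try:
        obj.inst_var_at_put(index, value)
    except ImmutableWriteError:
        dirty.add(id(obj))
        obj.immutable = False  # stays writable until it is next saved
        obj.inst_var_at_put(index, value)
```

Objects loaded from a store would be marked immutable, so the first write to each one is the only write that pays for the bookkeeping.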
|
From: Andreas R. <and...@gm...> - 2004-04-09 12:31:06
Attachments:
DisablePrimitiveGC-ar.3.cs
|
Hi Guys,
Here's a cheap little proposal for evaluation and comments. It does just
what I said - disabling GC in #primitiveResponse and checking if we have one
pending afterwards. AFAICT, this should do the trick.
Cheers,
- Andreas
|
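The shape of this proposal, as described in the message, can be sketched in Python. This is a model under assumptions and does not reflect the contents of the attached changeset: `Memory`, `request_gc`, and `primitive_response` are hypothetical names, and what happens when an allocation cannot be satisfied while GC is disabled is deliberately left out.

```python
class Memory:
    def __init__(self):
        self.gc_disabled = False
        self.gc_pending = False
        self.gc_runs = 0

    def request_gc(self):
        """Collect now, or only note the request if GC is currently disabled."""
        if self.gc_disabled:
            self.gc_pending = True
        else:
            self.collect()

    def collect(self):
        self.gc_pending = False
        self.gc_runs += 1


def primitive_response(memory, primitive):
    """Run a primitive with GC disabled, then honour any request made meanwhile."""
    memory.gc_disabled = True
    try:
        result = primitive(memory)
    finally:
        memory.gc_disabled = False
    if memory.gc_pending:
        memory.collect()
    return result
```

Because no collection runs inside the primitive, object pointers it holds cannot move underneath it, which is the decoupling the thread asks for.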
|
From: Tim R. <ti...@su...> - 2004-04-09 01:02:54
|
In message <p05210605bc99dfc4d28a@[192.168.1.100]>
Dan Ingalls <Da...@Sq...> wrote:
> First of these is the split primitive field of method headers. I think there's good agreement on returning to a 9-bit field and doing away with (or converting to named access) all those that were using higher indices. I have a specific proposal which I will document soon on the web site, but it is essentially what Tim suggested, except I would drop the loadInstVar range so we leave room for 50 or so unused indexed primitives at the top. My reasoning is that, ugly or not, indexed primitives are the easiest way for curious hackers to experiment with new capabilities in Squeak.
>
> [By the way, it looks like "External primitive support primitives" 570-574 are still in use and would need to be slid down]
Whoops, forgot about them for a moment. Still, this shouldn't be a
problem since there appear to be 110 primitiveObsoleteIndexedPrimitive
entries in the primitiveTable as well as 50 primitiveFail entries (some of
which are needed I guess).
> CompiledMethods
> A major question to decide is whether to split up bytecodes and literals (and source pointer) as in Tim's NewCompiledMethod work. This adds a little space overhead, but it certainly makes things simpler. I'm going to assume that we go with Tim's 2-object format, and throw in a "serendipity" field for fun this summer.
IIRC the difference is:-
    header
    bytecodes
    literals
    other ivars
or
    header
    bytecodes
    other ivars
    indexed literals
The difference is that to allow adding ivars in subclasses the second
option means the VM has to check the number of fixed fields so it can
find the literals correctly. So it's an extra object or an extra size
check on every send (well one could cache it in some limited
implementations but you lose subclassability). I'd go for the extra
object, personally.
tim
--
Tim Rowledge, ti...@su..., http://sumeru.stanford.edu/tim
Bug? That's not a bug, that's a feature. -T. John Wendel
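The extra size check mentioned for the second layout can be shown with a short Python model. The class and field names are hypothetical; the only thing taken from the message is that literals sit in indexed slots after a variable number of named instance variables.

```python
class MethodClass:
    def __init__(self, name, fixed_fields):
        self.name = name
        self.fixed_fields = fixed_fields  # number of named instance variables


class Method:
    def __init__(self, cls, named, literals):
        assert len(named) == cls.fixed_fields
        self.cls = cls
        # Named ivars come first, indexed literals follow.
        self.slots = list(named) + list(literals)


def literal_at(method, index):
    """Fetch literal `index` (0-based); requires the class's fixed-field count."""
    return method.slots[method.cls.fixed_fields + index]
```

A subclass that adds one named ivar shifts every literal by one slot, which is why the VM cannot hard-code the offset in this layout.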
|
|
From: Dan I. <Da...@Sq...> - 2004-04-09 01:29:09
|
> > CompiledMethods
>> A major question to decide is whether to split up bytecodes and literals (and source pointer) as in Tim's NewCompiledMethod work. This adds a little space overhead, but it certainly makes things simpler. I'm going to assume that we go with Tim's 2-object format, and throw in a "serendipity" field for fun this summer.
>IIRC the difference is:-
>header
>bytecodes
>literals
>other ivars
>
>or
>header
>bytecodes
>other ivars
>indexed literals
>
>The difference is that to allow adding ivars in subclasses the second
>option means the VM has to check the number of fixed fields so it can
>find the literals correctly. So it's an extra object or an extra size
>check on every send (well one could cache it in some limited
>implementations but you lose subclassability). I'd go for the extra
>object, personally.
Not to choose sides yet, but I figured this offset would go in the method cache, where it's hardly significant added to the cost of setting up the base reg anyway. And I don't see why "you lose subclassability".
- D
|
|
From: Tim R. <ti...@su...> - 2004-04-09 02:21:56
|
In message <p0521061dbc9bab05c832@[192.168.1.100]>
Dan Ingalls <Da...@Sq...> wrote:
>
>
> Not to choose sides yet, but I figured this offset would go in the method cache, where it's hardly significant added to the cost of setting up the base reg anyway.
Hmm, caching it. Must admit I hadn't thought of caching that particular
value in the mcache.
> And I don't see why "you lose subclassability".
If you cache the offset as a vm 'global' (which is what we did on
one of the Interval versions) then you can only manage the one value and
thus all your cm's must be the same shape. IIRC it cost a bit less than
1% on the benchmarks used back then to look up fixedFieldsOf each time.
But we have different machines and different benchmarks now so who knows?
tim
--
Tim Rowledge, ti...@su..., http://sumeru.stanford.edu/tim
Emacs is a nice operating system, but I prefer UNIX. - Tom Christiansen
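The difference between the two caching strategies in this exchange can be sketched in Python. This is a hypothetical model, not the Squeak method cache: here a cache entry stores the literal offset next to the method's slots, so methods of different shapes can coexist, whereas a single VM-wide offset would only be correct if every method had the same number of fixed fields.

```python
# Cache keyed by (receiver class, selector); each entry is (slots, literal_offset).
method_cache = {}


def lookup(receiver_class, selector, method_dict):
    """Return the cache entry for a send, filling the cache on a miss."""
    key = (receiver_class, selector)
    entry = method_cache.get(key)
    if entry is None:
        fixed_fields, slots = method_dict[key]
        entry = (slots, fixed_fields)  # offset computed once, on the miss
        method_cache[key] = entry
    return entry


def literal_at(entry, index):
    """Fetch a literal using the offset stored in the cache entry."""
    slots, offset = entry
    return slots[offset + index]
```

On a cache hit the fixed-field count is never consulted again, which matches the expectation in the quoted message that the added cost would be small.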
|
|
From: Hans-Martin M. <hm...@he...> - 2004-04-09 12:33:02
|
Cees de Groot wrote:
>>Immutability bit
>>This sounds intriguing. Has anyone here thought seriously about what is needed to support it?
>>
>Well, as far as you can classify my thinking on this subject as being serious (me being a total nitwit in this area), pseudocode-wise it would be:
>
>Object>>instVarAt: anInteger put: anObject
>    self isImmutable ifTrue: [
>        AttemptToWriteImmutableObjectException new
>            receiver: self;
>            index: anInteger;
>            value: anObject;
>            raise].
>    <original code>
>
The instVarAt:put: primitive (and the at:put: primitive for indexed access)
can just check the immutability bit - the additional overhead should be small.
The inst var assignments in methods are more complicated. One possible
optimization would be to allow immutability only for old objects - when
assigning into old objects, we need additional code anyway. The beImmutable
primitive would make sure that the receiver is either old already, or it would
move the oldspace boundary to point after the objects.
Does that sound sane?
Cheers,
Hans-Martin
|
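The old-objects-only optimization suggested here can be modelled as follows. This Python sketch makes several assumptions: addresses are list indices, everything below `young_start` counts as old, and the remembered set stands in for whatever extra work a store into an old object already does. All names are hypothetical, and the side effects of moving the boundary in a real VM are not modelled.

```python
class ImmutableStoreError(Exception):
    pass


class Heap:
    def __init__(self):
        self.objects = []      # address == index into this list
        self.young_start = 0   # addresses below this are "old"
        self.immutable = set()
        self.remembered = set()

    def allocate(self, slots):
        self.objects.append(list(slots))
        return len(self.objects) - 1

    def is_old(self, addr):
        return addr < self.young_start

    def be_immutable(self, addr):
        """Make sure the object is old, moving the boundary past it if needed."""
        if not self.is_old(addr):
            self.young_start = addr + 1
        self.immutable.add(addr)

    def store(self, addr, index, value):
        if self.is_old(addr):  # slow path that exists for old objects anyway
            if addr in self.immutable:
                raise ImmutableStoreError(addr)
            self.remembered.add(addr)
        self.objects[addr][index] = value
```

Stores into young objects skip the immutability test entirely, so the common case pays nothing for the new bit.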