## [a530bb]: doc / cmucl / internals / back.tex  Maximize  Restore  History

  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 % -*- Dictionary: design -*- \chapter{Copy propagation} File: {\tt copyprop} This phase is optional, but should be done whenever speed or space is more important than compile speed. We use global flow analysis to find the reaching definitions for each TN. This information is used here to eliminate unnecessary TNs, and is also used later on by loop invariant optimization. In some cases, VMR conversion will unnecessarily copy the value of a TN into another TN, since it may not be able to tell that the initial TN has the same value at the time the second TN is referenced. This can happen when ICR optimize is unable to eliminate a trivial variable binding, or when the user does a setq, or may also result from creation of expression evaluation temporaries during VMR conversion. Whatever the cause, we would like to avoid the unnecessary creation and assignment of these TNs. What we do is replace TN references whose only reaching definition is a Move VOP with a reference to the TN moved from, and then delete the Move VOP if the copy TN has no remaining references. There are several restrictions on copy propagation: \begin{itemize} \item The TNs must be `ordinary'' TNs, not restricted or otherwise unusual. Extending the life of restricted (or wired) TNs can make register allocation impossible. Some other TN kinds have hidden references. \item We don't want to defeat source-level debugging by replacing named variables with anonymous temporaries. \item We can't delete moves that representation selected might want to change into a representation conversion, since we need the primitive types of both TNs to select a conversion. \end{itemize} Some cleverness reduces the cost of flow analysis. As for lifetime analysis, we only need to do flow analysis on global packed TNs. We can't do the real local TN assignment pass before this, since we allocate TNs afterward, so we do a pre-pass that marks the TNs that are local for our purposes. We don't care if block splitting eventually causes some of them to be considered global. Note also that we are really only are interested in knowing if there is a unique reaching definition, which we can mash into our flow analysis rules by doing an intersection. Then a definition only appears in the set when it is unique. We then propagate only definitions of TNs with only one write, which allows the TN to stand for the definition. \chapter{Representation selection} File: {\tt represent} Some types of object (such as {\tt single-float}) have multiple possible representations. Multiple representations are useful mainly when there is a particularly efficient non-descriptor representation. In this case, there is the normal descriptor representation, and an alternate non-descriptor representation. This possibility brings up two major issues: \begin{itemize} \item The compiler must decide which representation will be most efficient for any given value, and \item Representation conversion code must be inserted where the representation of a value is changed. \end{itemize} First, the representations for TNs are selected by examining all the TN references and attempting to minimize reference costs. Then representation conversion code is introduced. This phase is in effect a pre-pass to register allocation. The main reason for its existence is that representation conversions may be farily complex (e.g. involving memory allocation), and thus must be discovered before register allocation. VMR conversion leaves stubs for representation specific move operations. Representation selection recognizes {\tt move} by name. Argument and return value passing for call VOPs is controlled by the {\tt :move-arguments} option to {\tt define-vop}. Representation selection is also responsible for determining what functions use the number stack. If any representation is chosen which could involve packing into the {\tt non-descriptor-stack} SB, then we allocate the NFP register throughout the component. As an optimization, permit the decision of whether a number stack frame needs to be allocated to be made on a per-function basis. If a function doesn't use the number stack, and isn't in the same tail-set as any function that uses the number stack, then it doesn't need a number stack frame, even if other functions in the component do. \chapter{Lifetime analysis} File: {\tt life} This phase is a preliminary to Pack. It involves three passes: -- A pre-pass that computes the DEF and USE sets for live TN analysis, while also assigning local TN numbers, splitting blocks if necessary. \#\#\# But not really... -- A flow analysis pass that does backward flow analysis on the component to find the live TNs at each block boundary. -- A post-pass that finds the conflict set for each TN. \#| Exploit the fact that a single VOP can only exhaust LTN numbers when there are large more operands. Since more operand reference cannot be interleaved with temporary reference, the references all effectively occur at the same time. This means that we can assign all the more args and all the more results the same LTN number and the same lifetime info. |\# \section{Flow analysis} It seems we could use the global-conflicts structures during compute the inter-block lifetime information. The pre-pass creates all the global-conflicts for blocks that global TNs are referenced in. The flow analysis pass just adds always-live global-conflicts for the other blocks the TNs are live in. In addition to possibly being more efficient than SSets, this would directly result in the desired global-conflicts information, rather that having to create it from another representation. The DFO sorted per-TN global-conflicts thread suggests some kind of algorithm based on the manipulation of the sets of blocks each TN is live in (which is what we really want), rather than the set of TNs live in each block. If we sorted the per-TN global-conflicts in reverse DFO (which is just as good for determining conflicts between TNs), then it seems we could scan though the conflicts simultaneously with our flow-analysis scan through the blocks. The flow analysis step is the following: If a TN is always-live or read-before-written in a successor block, then we make it always-live in the current block unless there are already global-conflicts recorded for that TN in this block. The iteration terminates when we don't add any new global-conflicts during a pass. We may also want to promote TNs only read within a block to always-live when the TN is live in a successor. This should be easy enough as long as the global-conflicts structure contains this kind of info. The critical operation here is determining whether a given global TN has global conflicts in a given block. Note that since we scan the blocks in DFO, and the global-conflicts are sorted in DFO, if we give each global TN a pointer to the global-conflicts for the last block we checked the TN was in, then we can guarantee that the global-conflicts we are looking for are always at or after that pointer. If we need to insert a new structure, then the pointer will help us rapidly find the place to do the insertion.] \section{Conflict detection} [\#\#\# Environment, :more TNs.] This phase makes use of the results of lifetime analysis to find the set of TNs that have lifetimes overlapping with those of each TN. We also annotate call VOPs with information about the live TNs so that code generation knows which registers need to be saved. The basic action is a backward scan of each block, looking at each TN-Ref and maintaining a set of the currently live TNs. When we see a read, we check if the TN is in the live set. If not, we: -- Add the TN to the conflict set for every currently live TN, -- Union the set of currently live TNs with the conflict set for the TN, and -- Add the TN to the set of live TNs. When we see a write for a live TN, we just remove it from the live set. If we see a write to a dead TN, then we update the conflicts sets as for a read, but don't add the TN to the live set. We have to do this so that the bogus write doesn't clobber anything. [We don't consider always-live TNs at all in this process, since the conflict of always-live TNs with other TNs in the block is implicit in the global-conflicts structures. Before we do the scan on a block, we go through the global-conflicts structures of TNs that change liveness in the block, assigning the recorded LTN number to the TN's LTN number for the duration of processing of that block.] Efficiently computing and representing this information calls for some cleverness. It would be prohibitively expensive to represent the full conflict set for every TN with sparse sets, as is done at the block-level. Although it wouldn't cause non-linear behavior, it would require a complex linked structure containing tens of elements to be created for every TN. Fortunately we can improve on this if we take into account the fact that most TNs are "local" TNs: TNs which have all their uses in one block. First, many global TNs will be either live or dead for the entire duration of a given block. We can represent the conflict between global TNs live throughout the block and TNs local to the block by storing the set of always-live global TNs in the block. This reduces the number of global TNs that must be represented in the conflicts for local TNs. Second, we can represent conflicts within a block using bit-vectors. Each TN that changes liveness within a block is assigned a local TN number. Local conflicts are represented using a fixed-size bit-vector of 64 elements or so which has a 1 for the local TN number of every TN live at that time. The block has a simple-vector which maps from local TN numbers to TNs. Fixed-size vectors reduce the hassle of doing allocations and allow operations to be open-coded in a maximally tense fashion. We can represent the conflicts for a local TN by a single bit-vector indexed by the local TN numbers for that block, but in the global TN case, we need to be able to represent conflicts with arbitrary TNs. We could use a list-like sparse set representation, but then we would have to either special-case global TNs by using the sparse representation within the block, or convert the local conflicts bit-vector to the sparse representation at the block end. Instead, we give each global TN a list of the local conflicts bit-vectors for each block that the TN is live in. If the TN is always-live in a block, then we record that fact instead. This gives us a major reduction in the amount of work we have to do in lifetime analysis at the cost of some increase in the time to iterate over the set during Pack. Since we build the lists of local conflict vectors a block at a time, the blocks in the lists for each TN will be sorted by the block number. The structure also contains the local TN number for the TN in that block. These features allow pack to efficiently determine whether two arbitrary TNs conflict. You just scan the lists in order, skipping blocks that are in only one list by using the block numbers. When we find a block that both TNs are live in, we just check the local TN number of one TN in the local conflicts vector of the other. In order to do these optimizations, we must do a pre-pass that finds the always-live TNs and breaks blocks up into small enough pieces so that we don't run out of local TN numbers. If we can make a block arbitrarily small, then we can guarantee that an arbitrarily small number of TNs change liveness within the block. We must be prepared to make the arguments to unbounded arg count VOPs (such as function call) always-live even when they really aren't. This is enabled by a panic mode in the block splitter: if we discover that the block only contains one VOP and there are still too many TNs that aren't always-live, then we promote the arguments (which we'd better be able to do...). This is done during the pre-scan in lifetime analysis. We can do this because all TNs that change liveness within a block can be found by examining that block: the flow analysis only adds always-live TNs. When we are doing the conflict detection pass, we set the LTN number of global TNs. We can easily detect global TNs that have not been locally mapped because this slot is initially null for global TNs and we null it out after processing each block. We assign all Always-Live TNs to the same local number so that we don't need to treat references to them specially when making the scan. We also annotate call VOPs that do register saving with the TNs that are live during the call, and thus would need to be saved if they are packed in registers. We adjust the costs for TNs that need to be saved so that TNs costing more to save and restore than to reference get packed on the stack. We would also like more often saved TNs to get higher costs so that they are packed in more savable locations. \chapter{Packing} File: {\tt pack} \#| Add lifetime/pack support for pre-packed save TNs. Fix GTN/VMR conversion to use pre-packed save TNs for old-cont and return-PC. (Will prevent preference from passing location to save location from ever being honored?) We will need to make packing of passing locations smarter before we will be able to target the passing location on the stack in a tail call (when that is where the callee wants it.) Currently, we will almost always pack the passing location in a register without considering whether that is really a good idea. Maybe we should consider schemes that explicitly understand the parallel assignment semantics, and try to do the assignment with a minimum number of temporaries. We only need assignment temps for TNs that appear both as an actual argument value and as a formal parameter of the called function. This only happens in self-recursive functions. Could be a problem with lifetime analysis, though. The write by a move-arg VOP would look like a write in the current env, when it really isn't. If this is a problem, then we might want to make the result TN be an info arg rather than a real operand. But this would only be a problem in recursive calls, anyway. [This would prevent targeting, but targeting across passing locations rarely seems to work anyway.] [\#\#\# But the :ENVIRONMENT TN mechanism would get confused. Maybe put env explicitly in TN, and have it only always-live in that env, and normal in other envs (or blocks it is written in.) This would allow targeting into environment TNs. I guess we would also want the env/PC save TNs normal in the return block so that we can target them. We could do this by considering env TNs normal in read blocks with no successors. ENV TNs would be treated totally normally in non-env blocks, so we don't have to worry about lifetime analysis getting confused by variable initializations. Do some kind of TN costing to determine when it is more trouble than it is worth to allocate TNs in registers. Change pack ordering to be less pessimal. Pack TNs as they are seen in the LTN map in DFO, which at least in non-block compilations has an effect something like packing main trace TNs first, since control analysis tries to put the good code first. This could also reduce spilling, since it makes it less likely we will clog all registers with global TNs. If we pack a TN with a specified save location on the stack, pack in the specified location. Allow old-cont and return-pc to be kept in registers by adding a new "keep around" kind of TN. These are kind of like environment live, but are only always-live in blocks that they weren't referenced in. Lifetime analysis does a post-pass adding always-live conflicts for each "keep around" TN to those blocks with no conflict for that TN. The distinction between always-live and keep-around allows us to successfully target old-cont and return-pc to passing locations. MAKE-KEEP-AROUND-TN (ptype), PRE-PACK-SAVE-TN (tn scn offset). Environment needs a KEEP-AROUND-TNS slot so that conflict analysis can find them (no special casing is needed after then, they can be made with :NORMAL kind). VMR-component needs PRE-PACKED-SAVE-TNS so that conflict analysis or somebody can copy conflict info from the saved TN. Note that having block granularity in the conflict information doesn't mean that a localized packing scheme would have to do all moves at block boundaries (which would clash with the desire the have saving done as part of this mechanism.) All that it means is that if we want to do a move within the block, we would need to allocate both locations throughout that block (or something). Load TN pack: A location is out for load TN packing if: The location has TN live in it after the VOP for a result, or before the VOP for an argument, or The location is used earlier in the TN-ref list (after) the saved results ref or later in the TN-Ref list (before) the loaded argument's ref. To pack load TNs, we advance the live-tns to the interesting VOP, then repeatedly scan the vop-refs to find vop-local conflicts for each needed load TN. We insert move VOPs and change over the TN-Ref-TNs as we go so the TN-Refs will reflect conflicts with already packed load-TNs. If we fail to pack a load-TN in the desired SC, then we scan the Live-TNs for the SB, looking for a TN that can be packed in an unbounded SB. This TN must then be repacked in the unbounded SB. It is important the load-TNs are never packed in unbounded SBs, since that would invalidate the conflicts info, preventing us from repacking TNs in unbounded SBs. We can't repack in a finite SB, since there might have been load TNs packed in that SB which aren't represented in the original conflict structures. Is it permissible to "restrict" an operand to an unbounded SC? Not impossible to satisfy as long as a finite SC is also allowed. But in practice, no restriction would probably be as good. We assume all locations can be used when an sc is based on an unbounded sb. ] TN-Refs are be convenient structures to build the target graph out of. If we allocated space in every TN-Ref, then there would certainly be enough to represent arbitrary target graphs. Would it be enough to allocate a single Target slot? If there is a target path though a given VOP, then the Target of the write ref would be the read, and vice-versa. To find all the TNs that target us, we look at the TN for the target of all our write refs. We separately chain together the read refs and the write refs for a TN, allowing easy determination of things such as whether a TN has only a single definition or has no reads. It would also allow easier traversal of the target graph. Represent per-location conflicts as vectors indexed by block number of per-block conflict info. To test whether a TN conflicts on a location, we would then have to iterate over the TNs global-conflicts, using the block number and LTN number to check for a conflict in that block. But since most TNs are local, this test actually isn't much more expensive than indexing into a bit-vector by GTN numbers. The big win of this scheme is that it is much cheaper to add conflicts into the conflict set for a location, since we never need to actually compute the conflict set in a list-like representation (which requires iterating over the LTN conflicts vectors and unioning in the always-live TNs). Instead, we just iterate over the global-conflicts for the TN, using BIT-IOR to combine the conflict set with the bit-vector for that block in that location, or marking that block/location combination as being always-live if the conflict is always-live. Generating the conflict set is inherently more costly, since although we believe the conflict set size to be roughly constant, it can easily contain tens of elements. We would have to generate these moderately large lists for all TNs, including local TNs. In contrast, the proposed scheme does work proportional to the number of blocks the TN is live in, which is small on average (1 for local TNs). This win exists independently from the win of not having to iterate over LTN conflict vectors. [\#\#\# Note that since we never do bitwise iteration over the LTN conflict vectors, part of the motivation for keeping these a small fixed size has been removed. But it would still be useful to keep the size fixed so that we can easily recycle the bit-vectors, and so that we could potentially have maximally tense special primitives for doing clear and bit-ior on these vectors.] This scheme is somewhat more space-intensive than having a per-location bit-vector. Each vector entry would be something like 150 bits rather than one bit, but this is mitigated by the number of blocks being 5-10x smaller than the number of TNs. This seems like an acceptable overhead, a small fraction of the total VMR representation. The space overhead could also be reduced by using something equivalent to a two-dimensional bit array, indexed first by LTN numbers, and then block numbers (instead of using a simple-vector of separate bit-vectors.) This would eliminate space wastage due to bit-vector overheads, which might be 50% or more, and would also make efficient zeroing of the vectors more straightforward. We would then want efficient operations for OR'ing LTN conflict vectors with rows in the array. This representation also opens a whole new range of allocation algorithms: ones that store allocate TNs in different locations within different portions of the program. This is because we can now represent a location being used to hold a certain TN within an arbitrary subset of the blocks the TN is referenced in. Pack goals: Pack should: Subject to resource constraints: -- Minimize use costs -- "Register allocation" Allocate as many values as possible in scarce "good" locations, attempting to minimize the aggregate use cost for the entire program. -- "Save optimization" Don't allocate values in registers when the save/restore costs exceed the expected gain for keeping the value in a register. (Similar to "opening costs" in RAOC.) [Really just a case of representation selection.] -- Minimize preference costs Eliminate as many moves as possible. "Register allocation" is basically an attempt to eliminate moves between registers and memory. "Save optimization" counterbalances "register allocation" to prevent it from becoming a pessimization, since saves can introduce register/memory moves. Preference optimization reduces the number of moves within an SC. Doing a good job of honoring preferences is important to the success of the compiler, since we have assumed in many places that moves will usually be optimized away. The scarcity-oriented aspect of "register allocation" is handled by a greedy algorithm in pack. We try to pack the "most important" TNs first, under the theory that earlier packing is more likely to succeed due to fewer constraints. The drawback of greedy algorithms is their inability to look ahead. Packing a TN may mess up later "register allocation" by precluding packing of TNs that are individually "less important", but more important in aggregate. Packing a TN may also prevent preferences from being honored. Initial packing: Pack all TNs restricted to a finite SC first, before packing any other TNs. One might suppose that Pack would have to treat TNs in different environments differently, but this is not the case. Pack simply assigns TNs to locations so that no two conflicting TNs are in the same location. In the process of implementing call semantics in conflict analysis, we cause TNs in different environments not to conflict. In the case of passing TNs, cross environment conflicts do exist, but this reflects reality, since the passing TNs are live in both the caller and the callee. Environment semantics has already been implemented at this point. This means that Pack can pack all TNs simultaneously, using one data structure to represent the conflicts for each location. So we have only one conflict set per SB location, rather than separating this information by environment environment. Load TN packing: We create load TNs as needed in a post-pass to the initial packing. After TNs are packed, it may be that some references to a TN will require it to be in a SC other than the one it was packed in. We create load-TNs and pack them on the fly during this post-pass. What we do is have an optional SC restriction associated with TN-refs. If we pack the TN in an SC which is different from the required SC for the reference, then we create a TN for each such reference, and pack it into the required SC. In many cases we will be able to pack the load TN with no hassle, but in general we may need to spill a TN that has already been packed. We choose a TN that isn't in use by the offending VOP, and then spill that TN onto the stack for the duration of that VOP. If the VOP is a conditional, then we must insert a new block interposed before the branch target so that the value TN value is restored regardless of which branch is taken. Instead of remembering lifetime information from conflict analysis, we rederive it. We scan each block backward while keeping track of which locations have live TNs in them. When we find a reference that needs a load TN packed, we try to pack it in an unused location. If we can't, we unpack the currently live TN with the lowest cost and force it into an unbounded SC. The per-location and per-TN conflict information used by pack doesn't need to be updated when we pack a load TN, since we are done using those data structures. We also don't need to create any TN-Refs for load TNs. [??? How do we keep track of load-tn lifetimes? It isn't really that hard, I guess. We just remember which load TNs we created at each VOP, killing them when we pass the loading (or saving) step. This suggests we could flush the Refs thread if we were willing to sacrifice some flexibility in explicit temporary lifetimes. Flushing the Refs would make creating the VMR representation easier.] The lifetime analysis done during load-TN packing doubles as a consistency check. If we see a read of a TN packed in a location which has a different TN currently live, then there is a packing bug. If any of the TNs recorded as being live at the block beginning are packed in a scarce SB, but aren't current in that location, then we also have a problem. The conflict structure for load TNs is fairly simple, the load TNs for arguments and results all conflict with each other, and don't conflict with much else. We just try packing in targeted locations before trying at random. \chapter{Code generation} This is fairly straightforward. We translate VOPs into instruction sequences on a per-block basis. After code generation, the VMR representation is gone. Everything is represented by the assembler data structures. \chapter{Assembly} In effect, we do much of the work of assembly when the compiler is compiled. The assembler makes one pass fixing up branch offsets, then squeezes out the space left by branch shortening and dumps out the code along with the load-time fixup information. The assembler also deals with dumping unboxed non-immediate constants and symbols. Boxed constants are created by explicit constructor code in the top-level form, while immediate constants are generated using inline code. [\#\#\# The basic output of the assembler is: A code vector A representation of the fixups along with indices into the code vector for the fixup locations A PC map translating PCs into source paths This information can then be used to build an output file or an in-core function object. ] The assembler is table-driven and supports arbitrary instruction formats. As far as the assembler is concerned, an instruction is a bit sequence that is broken down into subsequences. Some of the subsequences are constant in value, while others can be determined at assemble or load time. Assemble Node Form* Allow instructions to be emitted during the evaluation of the Forms by defining Inst as a local macro. This macro caches various global information in local variables. Node tells the assembler what node ultimately caused this code to be generated. This is used to create the pc=>source map for the debugger. Assemble-Elsewhere Node Form* Similar to Assemble, but the current assembler location is changed to somewhere else. This is useful for generating error code and similar things. Assemble-Elsewhere may not be nested. Inst Name Arg* Emit the instruction Name with the specified arguments. Gen-Label Emit-Label (Label) Gen-Label returns a Label object, which describes a place in the code. Emit-Label marks the current position as being the location of Label. \chapter{Dumping} So far as input to the dumper/loader, how about having a list of Entry-Info structures in the VMR-Component? These structures contain all information needed to dump the associated function objects, and are only implicitly associated with the functional/XEP data structures. Load-time constants that reference these function objects should specify the Entry-Info, rather than the functional (or something). We would then need to maintain some sort of association so VMR conversion can find the appropriate Entry-Info. Alternatively, we could initially reference the functional, and then later clobber the reference to the Entry-Info. We have some kind of post-pass that runs after assembly, going through the functions and constants, annotating the VMR-Component for the benefit of the dumper: Resolve :Label load-time constants. Make the debug info. Make the entry-info structures. Fasl dumper and in-core loader are implementation (but not instruction set) dependent, so we want to give them a clear interface. open-fasl-file name => fasl-file Returns a "fasl-file" object representing all state needed by the dumper. We objectify the state, since the fasdumper should be reentrant. (but could fail to be at first.) close-fasl-file fasl-file abort-p Close the specified fasl-file. fasl-dump-component component code-vector length fixups fasl-file Dump the code, constants, etc. for component. Code-Vector is a vector holding the assembled code. Length is the number of elements of Vector that are actually in use. Fixups is a list of conses (offset . fixup) describing the locations and things that need to be fixed up at load time. If the component is a top-level component, then the top-level lambda will be called after the component is loaded. load-component component code-vector length fixups Like Fasl-Dump-Component, but directly installs the code in core, running any top-level code immediately. (???) but we need some way to glue together the componenents, since we don't have a fasl table. Dumping: Dump code for each component after compiling that component, but defer dumping of other stuff. We do the fixups on the code vectors, and accumulate them in the table. We have to grovel the constants for each component after compiling that component so that we can fix up load-time constants. Load-time constants are values needed my the code that are computed after code generation/assembly time. Since the code is fixed at this point, load-time constants are always represented as non-immediate constants in the constant pool. A load-time constant is distinguished by being a cons (Kind . What), instead of a Constant leaf. Kind is a keyword indicating how the constant is computed, and What is some context. Some interesting load-time constants: (:label .