
\part{System Architecture}% -*- Dictionary: int:design -*-

\chapter{Package and File Structure}

\section{RCS and build areas}

The CMU CL sources are maintained using RCS in a hierarchical directory
structure which supports:
\begin{itemize}
\item a shared RCS config file across a build area,
\item frozen sources for multiple releases, and
\item separate system build areas for different architectures.
\end{itemize}

Since this organization maintains multiple copies of the source, it is somewhat
space intensive. But it is easy to delete and later restore a copy of the
source using RCS snapshots.

There are three major subtrees of the root \verb|/afs/cs/project/clisp|:
\begin{description}
\item[rcs] holds the RCS source (suffix \verb|,v|) files.

\item[src] holds ``checked out'' (but not locked) versions of the source files,
and is subdivided by release.
Each release directory in the source tree has a
symbolic link named ``{\tt RCS}'' which points to the RCS subdirectory of the
corresponding directory in the {\tt rcs} tree. At top-level in a source tree
is the ``{\tt RCSconfig}'' file for that area. All subdirectories also have a
symbolic link to this RCSconfig file, allowing the configuration for an area to
be easily changed.

\item[build] holds compiled object files, and is subdivided by machine type and
version. The CMU CL search-list mechanism is used to allow the source files to
be located in a different tree than the object files. C programs are compiled
by using the \verb|tools/dupsrcs| command to make symbolic links to the
corresponding source tree.
\end{description}

In order to modify a file in RCS, it must be checked out with a lock to produce
a writable working file. Each programmer checks out files into a personal
``play area'' subtree of \verb|clisp/hackers|. These trees duplicate the
structure of source trees, but are normally empty except for files actively
being worked on.

See \verb|/afs/cs/project/clisp/pmax_mach/alpha/tools/| for various tools we
use for RCS hacking:
\begin{description}
\item[rcs.lisp] Hemlock (editor) commands for RCS file manipulation.

\item[rcsupdate.c] Program to check out all files in a tree that have been
modified since last checkout.

\item[updates] Shell script to produce a single listing of all RCS log
entries in a tree since a date.

\item[snapshot-update.lisp] Lisp program to generate a shell script which
generates a listing of updates since a particular RCS snapshot ({\tt RCSSNAP})
file was created.
\end{description}

You can easily operate on all RCS files in a subtree using:
\begin{verbatim}
find . -follow -name '*,v' -exec <command> {} \;
\end{verbatim}

\subsection{Configuration Management}

Config files are useful, especially in combination with ``{\tt snapshot}''. You
can snapshot any particular version, giving an RCSconfig that designates that
configuration.
You can also use config files to specify the system as of a particular date.
For example:
\begin{verbatim}
<3-jan-91
\end{verbatim}
in the config file will cause the version as of 3-jan-91 to be checked out,
instead of the latest version.

\subsection{RCS Branches}

Branches and named revisions are used together to allow multiple paths of
development to be supported. Each separate development has a branch, and each
branch has a name. This project uses branches in two somewhat different cases
of divergent development:
\begin{itemize}
\item For systems that we have imported from the outside, we generally assign a
``{\tt cmu}'' branch for our local modifications. When a new release comes
along, we check it in on the trunk, and then merge our branch back in.

\item For the early development and debugging of major system changes, where
the development and debugging is expected to take long enough that we wouldn't
want the trunk to be in an inconsistent state for that long.
\end{itemize}

\section{Releases}

We name releases according to the normal alpha, beta, default convention.
Alpha releases are frequent, intended primarily for internal use, and are thus
not subject to as high documentation and configuration management standards.
Alpha releases are designated by the date on which the system was built; the
alpha releases for different systems may not be in exact correspondence, since
they are built at different times.

Beta and default releases are always based on a snapshot, ensuring that all
systems are based on the same sources. A release name is an integer and a
letter, like ``15d''. The integer is the name of the source tree which the
system was built from, and the letter represents the release from that tree:
``a'' is the first release, etc. Generally the numeric part increases when
there are major system changes, whereas changes in the letter represent
bug-fixes and minor enhancements.
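
As a minimal sketch of the release naming convention above, the following shell
function splits a release name into its source-tree number and release letter.
The function name {\tt parse\_release} is hypothetical and is not one of the
tools in \verb|tools/|; it only illustrates the convention.

```shell
#!/bin/sh
# Hypothetical helper (not part of the CMU CL tools): split a release name
# such as "15d" into its source-tree number and its release letter.
parse_release() {
    name=$1
    tree=${name%[a-z]}       # drop one trailing lowercase letter: "15d" -> "15"
    letter=${name#"$tree"}   # whatever follows the number: "d"
    case $tree in
        ''|*[!0-9]*)         # the tree part must be a non-empty run of digits
            echo "invalid release name: $name" >&2
            return 1 ;;
    esac
    case $letter in
        [a-z]) ;;            # exactly one release letter is required
        *)
            echo "invalid release name: $name" >&2
            return 1 ;;
    esac
    echo "$tree $letter"
}
```

For example, \verb|parse_release 15d| prints \verb|15 d|, while a name with no
trailing letter, such as \verb|15|, is rejected with a non-zero status.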

\section{Source Tree Structure}

A source tree (and the master ``{\tt rcs}'' tree) has subdirectories for each
major subsystem:
\begin{description}
\item[{\tt assembly/}] Holds the CMU CL source-file assembler, and has machine
specific subdirectories holding assembly code for that architecture.

\item[{\tt clx/}] The CLX interface to the X11 window system.

\item[{\tt code/}] The Lisp code for the runtime system and standard CL
utilities.

\item[{\tt compiler/}] The Python compiler. Has architecture-specific
subdirectories which hold backends for different machines. The {\tt generic}
subdirectory holds code that is shared across most backends.

\item[{\tt hemlock/}] The Hemlock editor.

\item[{\tt lisp/}] The C runtime system code and low-level Lisp debugger.

\item[{\tt pcl/}] CMU version of the PCL implementation of CLOS.

\item[{\tt tools/}] System building command files and source management tools.
\end{description}

\section{Package structure}

Goals: with the single exception of LISP, we want to be able to export from the
package that the code lives in.

\begin{description}
\item[Mach, CLX...] These implementation-dependent system-interface packages
provide direct access to specific features available in the operating system
environment, but hide details of how OS communication is done.

\item[system] contains code that must know about the operating system
environment: I/O, etc. Hides the operating system environment. Provides OS
interface extensions such as {\tt print-directory}, etc.

\item[kernel] hides state and types used for system integration: package
system, error system, streams (?), reader, printer. Also, hides the VM, in
that we don't export anything that reveals the VM interface. Contains code
that needs to use the VM and SYSTEM interface, but is independent of OS and VM
details. This code shouldn't need to be changed in any port of CMU CL, but
won't work when plopped into an arbitrary CL. Uses SYSTEM, VM, EXTENSIONS.
We export ``hidden'' symbols related to implementation of CL: setf-inverses,
possibly some global variables.

The boundary between KERNEL and VM is fuzzy, but this fuzziness reflects the
fuzziness in the definition of the VM. We can make the VM large, and bring
everything inside, or we can make it small. Obviously, we want the VM to be as
small as possible, subject to efficiency constraints. Pretty much all of the
code in KERNEL could be put in VM. The issue is more what VM hides from
KERNEL: VM knows about everything.

\item[lisp] Originally, this package had all the system code in it. The
current ideal is that this package should have {\it no} code in it, and only
exist to export the standard interface. Note that the name has been changed by
x3j13 to common-lisp.

\item[extensions] contains code that any random user could have written: list
operations, syntactic sugar macros. Uses only LISP, so code in EXTENSIONS is
pure CL. Exports everything defined within that is useful elsewhere. This
package doesn't hide much, so it is relatively safe for users to use
EXTENSIONS, since they aren't getting anything they couldn't have written
themselves. Contrast this to KERNEL, which exports additional operations on
CL's primitive data structures: PACKAGE-INTERNAL-SYMBOL-COUNT, etc. Although
some of the functionality exported from KERNEL could have been defined in CL,
the kernel implementation is much more efficient because it knows about
implementation internals. Currently this package contains only extensions to
CL, but in the ideal scheme of things, it should contain the implementations of
all CL functions that are in KERNEL (the library.)

\item[VM] hides information about the hardware and data structure
representations. Contains all code that knows about this sort of thing: parts
of the compiler, GC, etc. The bulk of the code is the compiler back-end.
Exports useful things that are meaningful across all implementations, such as
operations for examining compiled functions, system constants. Uses COMPILER
and whatever else it wants. Actually, there are different {\it machine}{\tt
-VM} packages for each target implementation. VM is a nickname for whatever
implementation we are currently targeting.

\item[compiler] hides the algorithms used to map Lisp semantics onto the
operations supplied by the VM. Exports the mechanisms used for defining the
VM. All the VM-independent code in the compiler, partially hiding the compiler
intermediate representations. Uses KERNEL.

\item[eval] holds code that does direct execution of the compiler's ICR. Uses
KERNEL, COMPILER. Exports debugger interface to interpreted code.

\item[debug-internals] presents a reasonable, unified interface to
manipulation of the state of both compiled and interpreted code. (could be in
KERNEL) Uses VM, INTERPRETER, EVAL, KERNEL.

\item[debug] holds the standard debugger, and exports the debugger
\end{description}

\chapter{System Building}

It's actually rather easy to build a CMU CL core with exactly what you want in
it. But to do this you need two things: the source and a working CMU CL.

Basically, you use the working copy of CMU CL to compile the sources, then run
a process called ``genesis'' which builds a ``kernel'' core. You then load
whatever you want into this kernel core, and save it.

In the \verb|tools/| directory in the sources there are several files that
compile everything, and build cores, etc. The first step is to compile the C
startup code.

{\bf Note:} {\it the various scripts mentioned below have hard-wired paths in
them set up for our directory layout here at CMU. Anyone anywhere else will
have to edit them before they will work.}

\section{Compiling the C Startup Code}

There is a circular dependency between lisp/internals.h and lisp/lisp.map that
causes bootstrapping problems.
The easiest way to get around this problem is to make a fake lisp.nm file that
has nothing in it but a version number:
\begin{verbatim}
% echo "Map file for lisp version 0" > lisp.nm
\end{verbatim}
and then run genesis with NIL for the list of files:
\begin{verbatim}
* (load ".../compiler/generic/new-genesis") ; compile before loading
* (lisp::genesis nil ".../lisp/lisp.nm" "/dev/null"
                 ".../lisp/lisp.map" ".../lisp/lisp.h")
\end{verbatim}
It will generate a whole bunch of warnings about things being undefined, but
ignore that, because it will also generate a correct lisp.h. You can then
compile lisp, producing a correct lisp.map:
\begin{verbatim}
% make
\end{verbatim}
and then use \verb|tools/do-worldbuild| and \verb|tools/mk-lisp| to build
\verb|kernel.core| and \verb|lisp.core| (see section \ref{building-cores}.)

\section{Compiling the Lisp Code}

The \verb|tools| directory contains various lisp and C-shell utilities for
building CMU CL:
\begin{description}
\item[compile-all*] Will compile lisp files and build a kernel core. It has
numerous command-line options to control what to compile and how. Try -help to
see a description. It runs a separate Lisp process to compile each
subsystem. Error output is generated in files with ``{\tt .log}'' extension in
the root of the build area.

\item[setup.lisp] Some lisp utilities used for compiling changed files in batch
mode and collecting the error output. Sort of a crude defsystem. Loads into the
``user'' package. See {\tt with-compiler-log-file} and {\tt comf}.

\item[{\it foo}com.lisp] Each system has a ``\verb|.lisp|'' file in
\verb|tools/| which compiles that system.
\end{description}

\section{Building Core Images}
\label{building-cores}

Both the kernel and final core build are normally done using shell script
drivers:
\begin{description}
\item[do-worldbuild*] Builds a kernel core for the current machine. The
version to build is indicated by an optional argument, which defaults to
``alpha''.
The \verb|kernel.core| file is written either in the \verb|lisp/|
directory in the build area, or in \verb|/usr/tmp/|. The directory which
already contains \verb|kernel.core| is chosen. You can create a dummy version
with e.g. ``touch'' to select the initial build location.

\item[mk-lisp*] Builds a full core, with conditional loading of subsystems.
The version is the first argument, which defaults to ``alpha''. Any additional
arguments are added to the \verb|*features*| list, which controls system
loading (among other things.) The \verb|lisp.core| file is written in the
current working directory.
\end{description}

These scripts load Lisp command files. When \verb|tools/worldbuild.lisp| is
loaded, it calls genesis with the correct arguments to build a kernel core.
Similarly, \verb|worldload.lisp| builds a full core. Adding certain symbols to
\verb|*features*| before loading worldload.lisp suppresses loading of different
parts of the system. These symbols are:
\begin{description}
\item[:no-compiler] don't load the compiler.
\item[:no-clx] don't load CLX.
\item[:no-hemlock] don't load hemlock.
\item[:no-pcl] don't load PCL.
\item[:runtime] build a runtime core, implies all of the above, and then some.
\end{description}

Note: if you don't load the compiler, you can't (successfully) load the
pretty-printer or pcl. And if you compiled hemlock with CLX loaded, you can't
load it without CLX also being loaded.
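
The interaction of these symbols can be sketched as a small shell function.
This is only an illustrative model under stated assumptions: the name
{\tt subsystems\_loaded} is hypothetical, the real logic lives in
\verb|worldload.lisp| and may differ, and the sketch assumes that PCL is simply
skipped when the compiler is absent. The pretty-printer and the Hemlock/CLX
dependency are not modeled.

```shell
#!/bin/sh
# Hypothetical model of the feature symbols described above. This is NOT the
# logic of tools/worldload.lisp, only an illustration of how the symbols combine.
subsystems_loaded() {
    compiler=yes clx=yes hemlock=yes pcl=yes
    for feature in "$@"; do
        case $feature in
            :no-compiler) compiler=no ;;
            :no-clx)      clx=no ;;
            :no-hemlock)  hemlock=no ;;
            :no-pcl)      pcl=no ;;
            # :runtime implies all of the suppressions above.
            :runtime)     compiler=no clx=no hemlock=no pcl=no ;;
        esac
    done
    # Assumption: PCL cannot be loaded successfully without the compiler,
    # so the model drops it whenever the compiler is suppressed.
    if [ "$compiler" = no ]; then pcl=no; fi
    result=
    if [ "$compiler" = yes ]; then result="$result compiler"; fi
    if [ "$clx" = yes ];      then result="$result clx"; fi
    if [ "$hemlock" = yes ];  then result="$result hemlock"; fi
    if [ "$pcl" = yes ];      then result="$result pcl"; fi
    echo "${result# }"       # strip the leading separator space
}
```

For example, \verb|subsystems_loaded :no-clx :no-hemlock| prints
\verb|compiler pcl|, and \verb|subsystems_loaded :no-compiler| prints
\verb|clx hemlock|.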