<!-- -*- mode: SGML; sgml-parent-document: ("user-manual.sgml" "BOOK") -*- -->
<chapter id="beyond-ansi"><title>Beyond The &ANSI; Standard</>

<para>&SBCL; is mostly an implementation of the &ANSI; standard for
Common Lisp. However, there's some important behavior which extends
or clarifies the standard, and various behavior which outright
violates the standard.
</para>

<sect1 id="non-conformance"><title>Non-Conformance With The &ANSI; Standard</>

<para>
Essentially every type of non-conformance is considered a bug.
(The exceptions involve internal inconsistencies in the standard.)
In &SBCL; 0.7.6, the master record of known bugs is in
the <filename>BUGS</> file in the distribution.
Highlights of the known bugs may also be found in the
manual page. The recommended way to report bugs is through the sbcl-help or
sbcl-devel mailing lists.
</para>

</sect1>

<sect1 id="idiosyncrasies"><title>Idiosyncrasies</>

<para>The information in this section describes some of the ways
that &SBCL; deals with choices that the &ANSI; standard 
leaves to the implementation.</para>

<para>Declarations are generally treated as assertions. This general
principle, and its implications, and the bugs which still keep the
compiler from quite satisfying this principle, are discussed in the
<link linkend="compiler">chapter on the compiler</link>.</para>

<para>&SBCL; is essentially a compiler-only implementation of
&CommonLisp;. That is, for all but a few special cases,
<function>eval</> creates a
lambda expression, calls <function>compile</> on the lambda
expression to create a compiled function, and then calls
<function>funcall</> on the resulting function object. This 
is explicitly allowed by the &ANSI; standard, but leads to some
oddities, e.g. collapsing <function>functionp</> and 
<function>compiled-function-p</> into the same predicate.</para>
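
<para>A minimal sketch of what this means in practice, assuming the
default behavior described above (the helper name
<function>evaluated-lambda-info</function> is purely illustrative):
<programlisting>(defun evaluated-lambda-info ()
  ;; EVAL hands back a function object that has already been
  ;; compiled, so both predicates are expected to agree on it.
  (let ((fn (eval '(lambda (x) (* 2 x)))))
    (list (functionp fn)
          (compiled-function-p fn)
          (funcall fn 21))))</programlisting>
On an implementation with a separate interpreter, the second element of
the returned list could be false.</para>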

<para>&SBCL; is quite strict about &ANSI;'s definition of
<function>defconstant</>. &ANSI; says that doing <function>defconstant</>
of the same symbol more than once is undefined unless the new value
is <function>eql</> to the old value. Conforming to this specification
is a nuisance when the "constant" value is only constant under some
weaker test like <function>string=</> or <function>equal</>. It's
especially annoying because <function>defconstant</> takes effect
not only at load time but also at compile time, so that just 
compiling and loading reasonable code like 
<programlisting>(defconstant +foobyte+ '(1 4))</>
runs into this undefined behavior. Many
implementations of Common Lisp try to help the programmer around
this annoyance by silently accepting the undefined code and 
trying to do what the programmer probably meant. &SBCL; instead
treats the undefined behavior as an error. Often
such code can be rewritten
in portable &ANSI; Common Lisp which has the desired behavior.
E.g., the code above can be given an exactly defined meaning by replacing
<function>defconstant</> either with <function>defparameter</> or 
with a customized macro which does the right thing, possibly along the
lines of the <function>defconstant-eqx</> macro used internally in the
implementation of &SBCL; itself.</para>
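
<para>As a rough sketch of such a customized macro (the name
<function>define-constant</function> is hypothetical, and this is not the
internal <function>defconstant-eqx</function>), one portable approach
keeps the existing value whenever the symbol is already bound, so that
re-evaluation never presents <function>defconstant</function> with a
value that fails the <function>eql</function> test:
<programlisting>(defmacro define-constant (name value &amp;optional doc)
  ;; If NAME already has a value, reuse that very object; otherwise
  ;; evaluate VALUE for the first time.
  `(defconstant ,name
     (if (boundp ',name) (symbol-value ',name) ,value)
     ,@(when doc (list doc))))

(define-constant +foobyte+ '(1 4))</programlisting>
This sketch deliberately ignores a changed value form on redefinition. A
stricter variant could compare the old and new values with
<function>equal</function> and signal an error on a mismatch.</para>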

<para>&SBCL; gives style warnings about various kinds of perfectly
legal code, e.g.
<itemizedlist>
  <listitem><para><function>defmethod</> without
    <function>defgeneric</></para></listitem>
  <listitem><para>multiple <function>defun</>s of the same
    symbol</para></listitem>
  <listitem><para>special variables not named in the conventional
    <varname>*foo*</> style, and lexical variables unconventionally named
    in the <varname>*foo*</> style</para></listitem>
</itemizedlist>
This causes friction with people
who point out that other ways of organizing code (especially
avoiding the use of <function>defgeneric</>)
are just as aesthetically stylish.
However, these warnings should be read not
as "warning, bad aesthetics detected, you have no style" but
"warning, this style keeps the compiler from understanding
the code as well as you might like." That is, 
unless the compiler warns about such conditions, there's no
way for the compiler to warn 
about some programming errors which would otherwise be
easy to overlook. (related bug: The warning about
multiple <function>defun</>s is pointlessly annoying when you compile
and then load a function containing <function>defun</> wrapped
in <function>eval-when</>, and ideally should be suppressed in 
that case, but still isn't as of &SBCL; 0.7.6.)</para>

</sect1>

<sect1 id="extensions"><title>Extensions</>

<para>&SBCL; is derived from &CMUCL;, which implements many extensions
to the &ANSI; standard. &SBCL; doesn't support as many extensions as
&CMUCL;, but it still has quite a few.</para>

<sect2><title>Things Which Might Be In The Next &ANSI; Standard</>

<para>&SBCL; provides extensive support for 
calling external C code, described 
<link linkend="ffi">in its own chapter</link>.</para>

<para>&SBCL; provides additional garbage collection functionality not
specified by &ANSI;. Weak pointers allow references to objects to be
maintained without keeping them from being GCed. And "finalization"
hooks are available to cause code to be executed when an object is
GCed.</para> <!-- FIXME: Actually documenting these would be good.:-| -->
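
<para>A minimal sketch of both facilities, using
<function>sb-ext:make-weak-pointer</function>,
<function>sb-ext:weak-pointer-value</function> and
<function>sb-ext:finalize</function>. The variable and function names
are illustrative only:
<programlisting>(defvar *payload* (list 1 2 3))

;; The weak pointer refers to the list without keeping it alive.
(defvar *weak* (sb-ext:make-weak-pointer *payload*))

;; The finalizer must not close over the object itself, or the
;; object could never become garbage.
(sb-ext:finalize *payload*
                 (lambda () (format t "payload was collected~%")))

(defun payload-alive-p ()
  ;; The second value of WEAK-POINTER-VALUE tells whether the
  ;; pointer is still valid.
  (nth-value 1 (sb-ext:weak-pointer-value *weak*)))</programlisting>
Once no strong references to the list remain, a later garbage collection
may invalidate the pointer and run the finalizer. Exactly when that
happens is up to the collector.</para>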

<para>&SBCL; supports Gray streams, user-overloadable CLOS classes
whose instances can be used as Lisp streams (e.g. passed as the
first argument to <function>format</>).</para>
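
<para>A minimal sketch of a user-defined output stream, assuming the
Gray stream classes and generic functions exported from the
<literal>sb-gray</literal> package. The class
<literal>counting-stream</literal> and its accessor
<function>char-count</function> are hypothetical names:
<programlisting>(defclass counting-stream (sb-gray:fundamental-character-output-stream)
  ((count :initform 0 :accessor char-count)))

;; Every character written to the stream ends up here; this stream
;; discards the character and only counts it.
(defmethod sb-gray:stream-write-char ((stream counting-stream) char)
  (incf (char-count stream))
  char)

(defun count-formatted-chars ()
  (let ((stream (make-instance 'counting-stream)))
    (format stream "hello ~a" 42)
    (char-count stream)))</programlisting>
Only <function>stream-write-char</function> is specialized here. The
sketch relies on the default methods for the other output operations
being expressed in terms of it.</para>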

<para>&SBCL; supports a MetaObject Protocol which is intended to be
compatible with &AMOP;. The known exception to this (as distinct from current
bugs<!-- Such as the distinction between CL:FIND-CLASS and
SB-PCL::FIND-CLASS :-( -->) is that
<function>compute-effective-method</> returns only one value, not
two<!-- FIXME: anything else? What about extensions? (e.g. COMPUTE-SLOTS
behaviour) -->.</para>

</sect2>

<sect2><title>Threading (a.k.a. Multiprocessing)</>

<para>&SBCL; (as of version 0.x.y, on Linux x86 only) supports a
fairly low-level threading interface that maps onto the host operating
system's concept of threads or lightweight processes.</para>

<sect3><title>Lisp-level view</title>

<para>A rudimentary interface for creating and managing multiple threads
can be found in the <literal>sb-thread</literal> package.  This is
intended for public consumption, so look at the exported symbols and
their documentation strings.</para>

<para>Dynamic bindings to symbols are per-thread.  Signal handlers
are per-thread.</para>
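
<para>A minimal sketch of what per-thread dynamic bindings mean in
practice, assuming that <function>sb-thread:make-thread</function>
accepts a function of no arguments (check the exported symbols and
documentation strings of your version). The variable and function names
are illustrative only:
<programlisting>(defvar *request-id* :global)

(defun worker ()
  ;; This binding exists only in the thread running WORKER.
  (let ((*request-id* :worker))
    (format t "worker sees ~s~%" *request-id*)))

(sb-thread:make-thread #'worker)

;; The creating thread is unaffected by the worker's binding.
(format t "creator sees ~s~%" *request-id*)</programlisting>
</para>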

<para><function>sb-ext:quit</function> exits the current thread, not
necessarily the whole environment.  The environment will be shut down
when the last thread exits.</para>

<para>Threads arbitrate between themselves for the user's attention.
A thread may be in one of three notional states: foreground,
background, or stopped.  When a background thread attempts to print a
repl prompt or to enter the debugger, it will stop and print a message
saying that it has stopped.  The user may then switch to that thread at
leisure to find out what it needs.  If a background thread enters
the debugger, selecting any restart will put it back into the
background before it resumes.</para>

<para>If the user has multiple views onto the same Lisp image (for
example, using multiple terminals, or a windowing system, or network
access) they are typically set up as multiple `sessions' such that each
view has its own collection of foreground/background/stopped threads.
<function>sb-thread:make-listener-thread</function> can be used to
start a new thread in its own `session'.</para>

<para>A small selection of locking primitives is available for
managing access to shared data: see <programlisting>(apropos "mutex"
:sb-thread)</programlisting> and the documentation strings for the
functions it lists.</para>
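
<para>A minimal sketch of guarding shared data with a mutex, assuming
that <function>sb-thread:make-mutex</function>,
<function>sb-thread:with-mutex</function> and
<function>sb-thread:make-thread</function> are among the symbols your
version exports (the <function>apropos</function> call above will
confirm this). The counter and its lock are illustrative names:
<programlisting>(defvar *counter* 0)
(defvar *counter-lock* (sb-thread:make-mutex :name "counter lock"))

(defun bump-counter ()
  ;; Only one thread at a time may execute the body.
  (sb-thread:with-mutex (*counter-lock*)
    (incf *counter*)))

;; Start a few threads that all update the same counter.
(dotimes (i 4)
  (sb-thread:make-thread #'bump-counter))</programlisting>
Without the lock, two threads could read the same old value of
<varname>*counter*</varname> and one of the increments would be
lost.</para>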

</sect3>

<sect3><title>Implementation (Linux x86)</title>

<para>On Linux x86, this is implemented using
<function>clone()</function> and does not involve pthreads.  This is
not because there is anything wrong with pthreads <emphasis>per
se</emphasis>, but because there is plenty wrong (from our perspective) with
LinuxThreads.  &SBCL; threads are mapped 1:1 onto Linux tasks which
share a VM but nothing else: each has its own process id and can be
seen in e.g. <command>ps</command> output.</para>

<para>Per-thread local bindings for special variables are achieved
using the %gs segment register to point to a per-thread storage area.
This may cause interesting results if you link to foreign code that
expects threading or creates new threads, and the thread library in
question uses %gs in an incompatible way.</para>

<para>Threads waiting for locks are put to sleep using
<function>sigtimedwait()</function> and woken with SIGALRM.</para>

<para>&SBCL; at present will always have at least two tasks running as
seen from Linux: when the first process has done startup
initialization (mapping files in place, installing signal handlers,
etc.) it creates a new thread to run the Lisp startup and initial listener.
The original thread is then used to run GC and to reap dead subthreads
when they exit.</para>

<para>Garbage collection is done with the existing Conservative
Generational GC.  Allocation is done in small (typically 8k) regions;
each thread has its own region, so this involves no stopping. However,
when a region fills, a lock must be obtained while another is
allocated, and when a collection is required, all processes are
stopped.  This is achieved using <function>ptrace()</function>, so you
should be very careful if you wish to examine an &SBCL; worker thread
using <command>strace</command>, <command>truss</command>,
<command>gdb</command> or similar.  It may be prudent to disable GC
before doing so.</para>

<para>Large parts of the &SBCL; library have not been inspected for
thread-safety.  Some of the obviously unsafe areas have large locks
around them, so compilation and fasl loading, for example, cannot be
parallelized.  Work is ongoing in this area.</para>

<para>A new thread by default is created in the same POSIX process
group and session as the thread it was created by.  This has an impact
on keyboard interrupt handling: pressing your terminal's intr key
(typically Control-C) will interrupt all processes in the foreground
process group, including Lisp threads that &SBCL; considers to be
notionally `background'.  This is undesirable, so background threads
are set to ignore the SIGINT signal.  Arbitration for the input stream
is managed by locking on
<varname>sb-thread::*session-lock*</varname>.</para>

<para>A thread can be created in a new Lisp `session' (new terminal or
window) using <function>sb-thread:make-listener-thread</function>.
These sessions map directly onto POSIX sessions, so that pressing
Control-C in the wrong window will not interrupt them; such stray
interrupts have often been found to be embarrassing.</para>

</sect3>

</sect2>

<sect2><title>Support For Unix</>

<para>The UNIX command line can be read from the variable
<varname>sb-ext:*posix-argv*</>. The UNIX environment can be queried with the
<function>sb-ext:posix-getenv</> function.</para>
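
<para>A minimal sketch of both facilities. The helper name
<function>home-directory-string</function> and the choice of the
<literal>HOME</literal> variable are illustrative only:
<programlisting>;; The command line arrives as a list of strings.
(format t "command line: ~s~%" sb-ext:*posix-argv*)

(defun home-directory-string ()
  ;; POSIX-GETENV returns NIL when the variable is not set.
  (or (sb-ext:posix-getenv "HOME")
      "(HOME is not set)"))</programlisting>
</para>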

<para>The &SBCL; system can be terminated with <function>sb-ext:quit</>,
optionally returning a specified numeric value to the calling Unix
process. The normal Unix idiom of terminating on end of file on input
is also supported.</para>

</sect2>

<sect2><title>Customization Hooks for Users</title>

<para>The behaviour of <function>require</function> when called with only
one argument is implementation-defined.  In &SBCL; it calls functions
on the user-settable list <varname>sb-ext:*module-provider-functions*</varname>;
see the <function>require</function> documentation string for details.</para>
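
<para>A minimal sketch of installing such a function, on the assumption
(taken from the <function>require</function> documentation string, which
remains the authority) that each function is called with the module name
and returns true once it has provided the module. The module name
<literal>demo-module</literal> and the function
<function>provide-demo-module</function> are hypothetical:
<programlisting>(defun provide-demo-module (name)
  ;; Handle only the one module this provider knows about and
  ;; decline everything else by returning NIL.
  (when (string-equal (string name) "demo-module")
    ;; A real provider would load or define the module here.
    (provide "demo-module")
    t))

(pushnew #'provide-demo-module sb-ext:*module-provider-functions*)

(require "demo-module")</programlisting>
</para>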

<para>The toplevel repl prompt may be customized, and the function
that reads user input may be replaced completely.  <!-- FIXME but I 
don't currently remember how --></para>

</sect2>

<sect2><title>Tools To Help Developers</title>

<para>&SBCL; provides a profiler and other extensions to the &ANSI;
<function>trace</> facility. See the online function documentation for
<function>trace</> for more information.</para>

<para>The debugger supports a number of options. Its documentation is
accessed by typing <userinput>help</> at the debugger prompt.</para>
<!-- FIXME:
     A true debugger section in the manual would be good. Start
     with CMU CL's debugger section, but remember:
       * no QUIT command (TOPLEVEL restart instead)
       * no GO command (CONTINUE restart instead)
       * Limitations of the x86 port of the debugger should be 
         documented or fixed where possible.
       * Discuss TRACE and its unification with PROFILE. -->

<para>Documentation for <function>inspect</> is accessed by typing
<userinput>help</> at the <function>inspect</> prompt.</para>

</sect2>

<sect2><title>Interface To Low-Level &SBCL; Implementation</title>

<para>&SBCL; has the ability to save its state as a file for later
execution. This functionality is important for its bootstrapping
process, and is also provided as an extension to the user. See the
documentation for <function>sb-ext:save-lisp-and-die</> for more
information.</para>
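
<para>A minimal sketch of typical use. The file name
<filename>hello.core</filename> and the functions
<function>main</function> and <function>dump-image</function> are
illustrative, and the documentation string remains the authority on the
supported keyword arguments:
<programlisting>(defun main ()
  (format t "hello from a saved core~%"))

(defun dump-image ()
  ;; Writes the core file and then terminates the running Lisp.
  (sb-ext:save-lisp-and-die "hello.core" :toplevel #'main))</programlisting>
After <literal>(dump-image)</literal> has been called, the saved state
can be resumed by starting the runtime with
<userinput>sbcl --core hello.core</userinput>.</para>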

<note><para>&SBCL; has inherited from &CMUCL; various hooks to allow
the user to tweak and monitor the garbage collection process. These
are somewhat stale code, and their interface might need to be cleaned
up. If you have urgent need of them, look at the code in
<filename>src/code/gc.lisp</filename> and bring it up on the
developers' mailing list.</para></note>

<note><para>&SBCL; has various hooks inherited from &CMUCL;, like
<function>sb-ext:float-denormalized-p</>, to allow a program to take
advantage of &IEEE; floating point arithmetic properties which aren't
conveniently or efficiently expressible using the &ANSI; standard. These
look good, and their interface looks good, but &IEEE; support is
slightly broken due to a stupid decision to remove some support for
infinities (because it wasn't in the &ANSI; spec and it didn't occur to
me that it was in the &IEEE; spec). If you need this stuff, take a look
at the code and bring it up on the developers' mailing
list.</para></note>

</sect2>

<sect2><title>Efficiency Hacks</title>

<para>The <function>sb-ext:purify</function> function causes &SBCL;
first to collect all garbage, then to mark all uncollected objects as
permanent, never again attempting to collect them as garbage. This can
cause a large increase in efficiency when using a primitive garbage
collector, or a more moderate increase in efficiency when using a more
sophisticated garbage collector which is well suited to the program's
memory usage pattern. It also allows permanent code to be frozen at
fixed addresses, a precondition for using copy-on-write to share code
between multiple Lisp processes. This is less important with modern
generational garbage collectors.</para>

<para>The <function>sb-ext:truly-the</> operator does what the
<function>cl:the</> operator does in a more conventional
implementation of &CommonLisp;, declaring the type of its argument
without any runtime checks. (Ordinarily in &SBCL;, any type
declaration is treated as an assertion and checked at runtime.)</para>
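
<para>A minimal sketch contrasting the two operators. The function names
are illustrative only:
<programlisting>(defun checked-double (x)
  ;; THE is treated as an assertion here, so a non-fixnum argument
  ;; is expected to be caught at runtime.
  (* 2 (the fixnum x)))

(defun unchecked-double (x)
  ;; TRULY-THE is trusted blindly: passing a non-fixnum here has
  ;; undefined consequences.
  (* 2 (sb-ext:truly-the fixnum x)))</programlisting>
Both functions agree on valid input. They differ only in what happens
when the declaration turns out to be false, which is why
<function>sb-ext:truly-the</function> is best kept for places where the
type has already been established by other means.</para>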

<para>The <function>sb-ext:freeze-type</> declaration declares that a
type will never change, which can make type testing
(<function>typep</>, etc.) more efficient for structure types.</para>
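
<para>A minimal sketch of the declaration applied to a structure type.
The structure <literal>point</literal> and the function
<function>point-object-p</function> are illustrative only:
<programlisting>(defstruct point x y)

;; Promise that the POINT type will not change from here on.
(declaim (sb-ext:freeze-type point))

(defun point-object-p (object)
  ;; Type tests against a frozen structure type are the case the
  ;; declaration is meant to speed up.
  (typep object 'point))</programlisting>
</para>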

<para>The <function>sb-ext:constant-function</> declaration specifies
that a function will always return the same value for the same
arguments, which may allow the compiler to optimize calls
to it. This is appropriate for functions like <function>sqrt</>, but
is <emphasis>not</> appropriate for functions like <function>aref</>,
which can change their return values when the underlying data are
changed.</para>

</sect2>

</sect1>

</chapter>