Readable Lisp S-expressions Wiki

Readable Lisp/S-expressions with infix, functions, and indentation

Brought to you by: dwheeler

Rationale - Miscellaneous

Here are some miscellaneous points.

Writing out results

An obvious question is, "how do you write them out?" After all, with these notations there is more than one way to present expressions.

But no Lisp guarantees that what it writes out is the same sequence of characters that it read in. For example, on most systems, "(quote x)" would be written back as 'x. Similarly, if you enter "(x . (y . ()))", Lisps will generally write that back as "(x y)". So nothing has fundamentally changed; as always, you should implement your Lisp expression writer so that it presents a format convenient to both human and machine readers.

For the moment, we've not been changing writers. Always showing prefix form is helpful for people learning the notation, and not all readers will be able to accept what other writers could produce.

It'd be useful to have functions to write out expressions in other forms. In most cases, detecting likely infix operators (and using curly-infix), and writing out neoteric-expressions if the first item in a list is a symbol, would be just fine. See the "sweeten" program for one way to do this. You'd probably want to invoke a separate function to output indented sweet-expressions, in the same way that people specially invoke pretty-printers now. You probably want several functions for writing things out, depending on what format the user wants to see... but since that is already true, that is no real change.
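As a rough illustration of the heuristic just described, here is a minimal sketch of such a writer. It is not how "sweeten" works; the function name, the operator set, and the modelling of s-expressions as nested Python lists (with symbols as strings) are all assumptions made for this example.

```python
# Hypothetical sketch: s-expressions are nested Python lists, symbols are strings.
# None of these names come from the "sweeten" program.

# Assumed set of operators worth showing as infix; a real tool would tune this.
INFIX_OPERATORS = {"+", "-", "*", "/", "<", "<=", ">", ">=", "=", "and", "or"}


def write_readable(expr):
    """Return expr as text, using curly-infix and neoteric forms where they fit."""
    if not isinstance(expr, list):
        return str(expr)                      # atom: symbol or number
    if not expr:
        return "()"                           # empty list
    head, args = expr[0], expr[1:]
    rendered = [write_readable(a) for a in args]
    if isinstance(head, str) and head in INFIX_OPERATORS and len(args) >= 2:
        # (op a b c) is shown as {a op b op c}
        return "{" + f" {head} ".join(rendered) + "}"
    if isinstance(head, str):
        # (f x y) is shown as f(x y)
        return head + "(" + " ".join(rendered) + ")"
    # The first item is not a symbol: fall back to traditional notation.
    return "(" + " ".join(write_readable(e) for e in expr) + ")"


print(write_readable(["define", ["f", "x"], ["*", "x", ["+", "x", 1]]]))
# prints: define(f(x) {x * {x + 1}})
```

A separate function for indented sweet-expression output could be layered on top of this, much as a pretty-printer is layered on an ordinary writer.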

Miscellaneous Comments on the Notations

Note that usual Lisp quoting rules still work, so 'a still maps to (quote a). But they work with the new capabilities, so 'f(x) maps to (quote (f x)). Same with quasiquoting and comma-lifting. A ";" still begins a comment that continues to the end of a line, and "#" still begins special processing.

Implementations might call underlying implementations when they encounter "#", but in those cases, an expression begun by "#" will not continue to support sweet-expressions. For example, in Scheme, use vector(...) instead of #(...). Many Scheme implementations have nonstandard extensions for "#", so a portable sweet-reader can't easily reimplement the functionality of a local "#". Nor can the sweet-reader easily call on the underlying implementation of "#" on some implementations, e.g., Scheme only supports a one-character peek with no unget character.

If an implementation called a "standard" s-expression reader when it encountered an open parenthesis, and you had to use [ ... ] for neoteric-expressions to keep working, it would be extremely backward-compatible with essentially all existing Lisp files. However, this would be hard to use; it would mean that you must use [ ... ] for lists, and failure to do so would produce mysterious errors. After some experimentation, I found that it was a bad idea and dropped it.

Notice that since all the transforms happen in the reader, all of them (including sweet-expressions) are highly compatible with macros. Sweet-expressions simply define new abbreviations, just as 'x became (over time) a standard abbreviation for (quote x). As long as simple infix expressions are used (ones that don't create nfx), every expression is a normal s-expression after reading, with the operator in the initial position. So macros, such as those defined with Common Lisp's defmacro, will work as expected.

Common Lisp has some hideously confusing terminology, though. Common Lisp has macros, but it also has a completely different capability: "macro characters", which introduce "reader macros" - i.e., hooks into the reader used during read time. The Common Lisp HyperSpec clearly states in its glossary on macro characters, "macro characters have nothing to do with macros", but I think they should have chosen a name that had nothing to do with macros as well. Obviously sweet-expressions can affect macro characters, since they implement a different reading syntax. This doesn't affect most real Common Lisp programs, which often avoid macro characters anyway. Common Lisp macro functions (e.g., defmacro and macrolet) work just fine with sweet-expressions.
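To make the "simple infix" condition concrete, here is a small sketch of the curly-infix mapping, loosely following the rules in SRFI-105. The function name and the list-based representation are assumptions for illustration, and the marker name `$nfx$` is taken from SRFI-105 (the text above abbreviates it as nfx). This sketch compares operators with Python's `==`, which may be looser than what a given reader does.

```python
def curly_infix_to_prefix(items):
    """Map the already-read contents of {...} to a prefix list (hypothetical helper)."""
    n = len(items)
    if n == 0:
        return []                              # {} -> ()
    if n == 1:
        return items[0]                        # {e} -> e
    if n == 2:
        return list(items)                     # {e f} -> (e f)
    operators = items[1::2]                    # every second item, starting at index 1
    if n % 2 == 1 and all(op == operators[0] for op in operators):
        # Simple infix: {a op b op c} -> (op a b c)
        return [operators[0]] + list(items[0::2])
    # Mixed operators or an even count: not simple, so mark it for later handling.
    return ["$nfx$"] + list(items)


print(curly_infix_to_prefix(["a", "+", "b", "+", "c"]))   # ['+', 'a', 'b', 'c']
print(curly_infix_to_prefix(["a", "+", "b", "*", "c"]))   # ['$nfx$', 'a', '+', 'b', '*', 'c']
```

In the first case the result is an ordinary prefix list, so a macro receiving it cannot tell that it was written in infix form.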

Backwards compatibility

Backwards compatibility with traditional Lisp notation is important. These notations are fully compatible with well-formatted Lisp, which is almost (but not quite) the same thing. Having multiple tiers helps, too:

Curly-expressions are completely compatible, since {...} isn't in traditional Lisp.
Neoteric-expressions are compatible for what I'd call "normal" formatting. The key issue is that they change the meaning of an opening paren after a character other than whitespace or another opening paren. Basically, a(b) becomes "(a b)", not "a (b)". So if you're used to omitting whitespace, the result will be different: if you wrote "a(b)" expecting it to be "a (b)", you will need to insert the space. There are millions of lines of Lisp code that would never see the difference.
Sweet-expressions add indentation processing, but since indentation is disabled inside (...), ordinary Lisp expressions immediately disable this and typically don't cause issues. Nevertheless, in rare circumstances problems can occur. Note that:
- A top-level line that holds more than one datum and doesn't begin with space/tab is interpreted differently. Thus, at the topmost level, "(a) (b)" on one line is interpreted as two datums "(a)" followed by "(b)" in traditional Lisp, but as the single datum "((a) (b))" in sweet-expressions.
- Sweet-expressions also count "!" at the beginning of a line as an indent character. This rarely causes any issue, since once you use an open parenthesis to start an expression, this meaning for "!" is disabled. Generally, you'd have to have a symbol whose name starts with "!" before any issue could come up.
- You can disable sweet-expression processing with a space indent, so just inserting a space on every line ensures compatibility with the sweet-expression processing.
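The following is a rough, line-based heuristic for spotting two of the cases above in existing traditional Lisp source. It is only a sketch: the function name and regular expression are hypothetical, strings and comments are not handled (so false positives are possible), and detecting several datums on one top-level line would need a real reader rather than a pattern match.

```python
import re

# Hypothetical heuristic: "(" directly after a character that is not whitespace,
# an opening bracket, or a prefix character such as ' ` , @ # \
NEOTERIC_RISK = re.compile(r"[^\s(\[{'`,@#\\]\(")


def compatibility_warnings(line):
    """Return rough warnings for one line of traditional Lisp source."""
    warnings = []
    if NEOTERIC_RISK.search(line):
        # e.g. a(b) or (a)(b): the "(" would be read as a neoteric call
        warnings.append("neoteric: '(' directly follows a non-space character")
    if line.startswith("!"):
        # a leading "!" is treated as an indent character by sweet-expressions
        warnings.append("sweet: leading '!' counts as indentation")
    return warnings


for sample in ["(a (b))", "(a)(b)", "'(a b)", "!important-symbol"]:
    print(repr(sample), compatibility_warnings(sample))
```

Lines reported here are candidates for a manual look, or for a pass through a pretty-printer.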

Transitioning

You can transition code incrementally, since these notations are basically backwards compatible. If you want to do it manually:

Curly-infix-expressions: Whenever you'd like to use infix, just replace (infix-op arg1 arg2 ...) with {arg1 infix-op arg2 ...}.
Neoteric-expressions: First, pretty-print to get rid of odd expressions like (A)(B) as they would be interpreted differently. Then, where convenient, replace any (NAME ...) you'd like with NAME(...).
Sweet-expressions: First, pretty-print, both for the neoteric-expression reasons and because some expressions on the same line might (in certain odd circumstances) be interpreted differently. Then remove the parentheses, outside in, that are no longer needed because the indentation will do it for you.

We also provide the "sweeten" tool that can automatically reformat a file into sweet-expression form. You can then modify its results to your liking.

Alternative: Q2

An extremely interesting experimental notation, "Q2", was developed by Per Bothner:
http://per.bothner.com/blog/2010/Q2-extensible-syntax/

This has somewhat similar goals to the "readable" project, though with a different approach. The big difference is that David A. Wheeler decided it was important to have a generic notation for any s-expression.

It's probably fairest to compare Q2 to the "sweet-expression" notation:

Sweet-expressions have infix, though not precedence. Actually, precedence could be added, as discussed in SRFI-105... I just don't think it's worth it.
Both have "juxtaposition for function application"
Q2 has "Naming a zero-argument function applies it", but this is awkward; indeed, "The exact rule for a distinguishing between a variable reference and a zero-argument function application isn't decided yet." In sweet-expressions, a zero-argument function name is called by adding () after it or around it, e.g., pi().
"Flexible token format" - both require operators to be delimited.
"Use indentation for grouping" - both use indentation for grouping
"Block expressions yield multiple values" - In sweet-expressions, you use the usual Scheme procedures, including values, instead of having special syntax.
REPL: In sweet-expressions, you usually end a line with ENTER ENTER. Q2 doesn't, but I worry that you have to be careful or it'll end where it syntactically might not need to.

David A. Wheeler thinks Q2 is interesting, but believes that for many purposes (e.g., for use in many different Lisps), the "readable" notations described here are a better trade-off.

Philip Wadler's critique

"A critique of Abelson and Sussman -or- Why calculating is better than scheming" by Philip Wadler ( http://www.cs.kent.ac.uk/people/staff/dat/miranda/wadler87.pdf ) has interesting comments about Lisp-based languages, and argues that KRC and Miranda (on which Haskell was later based) are better for teaching programming. The key point here is that Wadler represents the view of many, namely, that there are serious problems with Lisp's traditional s-expression syntax.

Here are some examples of comments in the paper:

"The definition (above) is also harder to read because of the syntax (or, rather, lack of syntax) of Lisp."
"... the prefix notation (of Lisp) means more mental effort is needed for each step. This effect is strongest for the associative law, but in general any algebraic manipulation is easier in an infix notation; this is one reason such notations have evolved."
"Some people may wish to dismiss many of the issues raised in this paper as being 'just syntax'. It is true that much debate over syntax is of little value. But it is also true that a good choice of notation can greatly aid learning and thought, and a poor choice can hinder it... (for example,) mathematical notation is easier to manipulate algebraically than Lisp."
"Lisp programs often have much more sheer bulk than the corresponding Miranda programs. Also, as noted above, S-expression notation hinders reasoning with algebraic properties, such as associativity. Perhaps most importantly, the unfamiliarity of Lisp syntax can be a real stumbling block to beginning students."

Conclusions

In conclusion, these three notations make it easy to create much more readable expressions in Lisp-like languages, without losing the power of these languages.
