Rationale

Here is the detailed rationale for the various "readable" notations. If you just want to use the readable results, you don't need all this, but if you want to understand the rationale behind them, here it is.

Problem

As discussed in [Problem], software in Lisp-based programming languages has traditionally been written using s-expressions. But many software developers consider s-expression notation difficult to read when used for programs. For example, s-expressions do not directly support infix operators, do not support traditional math function notation such as f(x), and require a large number of parentheses for even simple operations.

Past work to create readable formats

There have been a huge number of past efforts to create readable formats, going all the way back to the original M-expression syntax that Lisp's creator expected to be used when programming. Generally, they were unsuccessful, or they ended up creating a completely different language that lacks the advantages of Lisp-based languages. After examining many of them, David A. Wheeler noticed a pattern in these failures: most were not both general and homoiconic:

  • A readable Lisp format must be general. A general format is not tied to any specific underlying semantics. Most readability efforts focused on creating special syntax for each construct of an underlying language. But since Lisp-based languages can trivially create new semantic constructs (via macros), and are often used to process fragments of other languages, these efforts did not work well. It was often difficult to keep updating the parser to match the underlying system, so the parser was always less capable than s-expressions... leading to its abandonment. Sometimes the parser was continuously maintained, but it soon led to the development of a completely new language that was less suitable for self-analysis of program fragments and similar tasks (and thus no longer a suitable "Lisp"). It's easy to create a new "operator" in a Lisp, yet many infix systems cannot handle an operator unless its precedence has been predefined.

  • A readable Lisp format must be homoiconic. A homoiconic format is a surface format in which the human reader can easily determine what the underlying representation is. It is very difficult to take advantage of Lisp capabilities, such as macros, without a homoiconic format. Yet many past readability efforts made it difficult to determine exactly what structures were being created by the notation. Typical infix notations with precedence were especially common examples of this problem - they would quietly create multiple lists without obvious indications that this was happening. Top Down Operator Precedence by Douglas Crockford (2007-02-21), for example, discusses Vaughan Pratt's "Top Down Operator Precedence" and shows how important homoiconicity is. He stated that "parsing techniques are not greatly valued in the LISP community, which celebrates the Spartan denial of syntax. There have been many attempts since LISP's creation to give the language a rich ALGOL-like syntax, including Pratt's CGOL, LISP 2, MLISP, Dylan, Interlisp's Clisp, and McCarthy's original M-expressions. All failed to find acceptance. That community found the correspondence between programs and data to be much more valuable than expressive syntax. But the mainstream programming community likes its syntax, so LISP has never been accepted by the mainstream."
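
To make both failure modes concrete, here is a minimal Python sketch of a typical precedence-based infix parser. It is not taken from any of the systems named above; the PRECEDENCE table and the parse_with_precedence function are hypothetical names invented for illustration. It shows that an operator missing from the table cannot be parsed at all (not general), and that a flat-looking expression quietly becomes nested lists (not homoiconic).

```python
# Hypothetical precedence table, for illustration only.
# Any operator missing from it cannot be parsed.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def parse_with_precedence(tokens):
    """Precedence climbing over a flat token list; returns nested prefix lists."""

    def parse(pos, min_prec):
        left = tokens[pos]
        pos += 1
        while pos < len(tokens):
            op = tokens[pos]
            if op not in PRECEDENCE:
                # A user-defined operator has no entry, so parsing fails.
                raise ValueError(f"unknown operator: {op}")
            prec = PRECEDENCE[op]
            if prec < min_prec:
                break
            # Parse the right operand with a higher minimum precedence,
            # which makes the operators left-associative.
            right, pos = parse(pos + 1, prec + 1)
            left = [op, left, right]
        return left, pos

    tree, _ = parse(0, 1)
    return tree


# The flat input silently becomes two lists, with no visible grouping.
print(parse_with_precedence(["a", "+", "b", "*", "c"]))
# prints ['+', 'a', ['*', 'b', 'c']]
```

Calling parse_with_precedence(["a", "<=>", "b"]) raises ValueError, because "<=>" was never registered. That is the situation a Lisp user lands in after defining a new operator with a macro.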

Now that this pattern has been identified, new notations can be devised that are general and homoiconic - avoiding the problems of past efforts.

See http://www.dwheeler.com/readable/readable-s-expressions.html for a longer discussion on past efforts.

Why these three tiers?

We have three tiers, each of which builds on the previous one, as described in [Solution]. We have devised them to try to meet our [Goals]. First, let's discuss why there are three tiers.

Each tier improves the notation, but has a trade-off; by creating three tiers, people can choose the tier they are comfortable with, yet use them together:

  1. Curly-infix-expressions (c-expressions) add infix notation, the notation people are trained in and most programming languages support out-of-the-box. This notation uses {...}, which are not used by many Lisps (including Common Lisp and Scheme), so in most cases there is no compatibility issue adding this. It's also trivial to add to Common Lisp (just modify the readtable). Thus, this creates a simple "first step" that people can adopt without concern.
  2. Neoteric-expressions (n-expressions) allow people to use a function name outside of the parentheses, which is the notation most people are taught in school and is used by most programming languages. This involves a subtle incompatible change in most Lisp readers, so a few might hesitate to do this. But code where this difference matters is considered extremely poor style, and is unlikely in modern code - and a pretty-printer would eliminate any such problems (just apply it first). Neoteric-expressions build on curly infix, since the prefixed {} requires handling curly infix.
  3. Sweet-expressions (t-expressions) add indentation support, eliminating the need for many parentheses. However, this adds another subtle change, and some people won't want indentation to be relevant. By making this a separate tier, people can adopt neoteric-expressions if they don't want syntactically meaningful indentation.
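
The first two tiers can be sketched as simple list rewrites. The following Python sketch is illustrative only: the function names curly_infix and neoteric are hypothetical, the input is assumed to be already tokenized into Python lists, and the exact rules (including the "$nfx$" marker for mixed operators) are my reading of the curly-infix definition rather than something spelled out on this page. Sweet-expression indentation handling is not sketched here.

```python
def curly_infix(items):
    """Map the contents of {...} to a prefix list, with no precedence table."""
    if len(items) == 0:
        return []
    if len(items) == 1:
        return items[0]             # {x} is just x
    if len(items) == 2:
        return list(items)          # {- x} is (- x)
    operators = items[1::2]         # every second item, starting at index 1
    if len(items) % 2 == 1 and all(op == operators[0] for op in operators):
        # Same operator throughout: {a + b + c} is (+ a b c).
        return [operators[0]] + list(items[0::2])
    # Mixed operators: keep everything visible under a marker symbol
    # (assumed here to be "$nfx$") instead of guessing a precedence.
    return ["$nfx$"] + list(items)


def neoteric(head, args):
    """Map head(args...) to (head args...): the name moves inside the list."""
    return [head] + list(args)


# {a + b} becomes (+ a b).
print(curly_infix(["a", "+", "b"]))
# {a + {b * c}} needs explicit inner braces and becomes (+ a (* b c)).
print(curly_infix(["a", "+", curly_infix(["b", "*", "c"])]))
# f{x + y} becomes (f (+ x y)).
print(neoteric("f", [curly_infix(["x", "+", "y"])]))
```

In this sketch, every list in the result corresponds to a pair of delimiters the writer typed, so the underlying structure stays visible. A newly defined operator also works without any registration step.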

It would be possible to mix-and-match each idea, but that makes it more complicated for users to know what's allowed (they would have to answer three questions, instead of simply answering "which notation is supported?"). Neoteric-expressions add support for prefix {...}, which doesn't make sense without curly infix. But perhaps more importantly, each tier has an obvious additional cost in terms of implementation effort and compatibility; it would make less sense to add one of the later tiers without adding the former ones. Thus, to make things simple for users, it's best to define this as a set of 3 tiers that build on each other.

More detailed rationale

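Now, let's focus on each one, and see why they've been defined the way they are. The detailed rationale for each is on its own page: [Rationale-curly-infix], [Rationale-neoteric], [Rationale-sweet], and [Rationale-miscellaneous].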


Related

Wiki: Goals
Wiki: Home
Wiki: Hubris
Wiki: Join
Wiki: Problem
Wiki: Rationale-curly-infix
Wiki: Rationale-miscellaneous
Wiki: Rationale-neoteric
Wiki: Rationale-sweet
Wiki: Solution