Here is the detailed rationale for the various "readable" notations. If you just want to use the readable results, you don't need all this, but if you want to understand the rationale behind them, here it is. This essay was originally written by David A. Wheeler (the "I" below), though there may be some edits by others.
As discussed in [Problem], software in Lisp-based programming languages has traditionally been written using s-expressions. But many software developers consider s-expression notation difficult to read when used for programs. For example, s-expressions do not directly support infix operators, do not support traditional math function notation such as f(x), and require a large number of parentheses for even simple operations.
There have been a huge number of past efforts to create readable formats, going all the way back to the original M-expression syntax that Lisp's creator expected to be used when programming. Generally, they've been unsuccessful, or they end up creating a completely different language that lacks the advantages of Lisp-based languages. After examining a huge number of them, David A. Wheeler noticed a pattern in most of these failures: most failed to be general and homoiconic:
A readable Lisp format must be general. A general format is not tied to some specific underlying semantics. Most readability efforts focused on creating special syntax for each language construct of an underlying language. But since Lisp-based languages can trivially create new semantic constructs (via macros), and are often used to process fragments of other languages, these did not work well. It was often difficult to keep updating the parser to match the underlying system, so the parser was always less capable than using s-expressions... leading to its abandonment. Sometimes the parser was continuously maintained, but soon the parser led to the development of a completely new language that was less suitable for self-analysis of program fragments and similar tasks (and thus no longer a suitable "Lisp"). It's easy to create a new "operator" in a Lisp, yet many infix systems cannot handle an operator unless its precedence has been predefined.
A readable Lisp format must be homoiconic. A homoiconic format is a surface format in which the human reader can easily determine what the underlying representation is. It is very difficult to take advantage of Lisp capabilities, such as macros, without a homoiconic format. Yet many past readability efforts made it difficult to determine exactly what structures were being created by the notation. Typical infix notations with precedence were especially common examples of this problem - they would quietly create multiple lists without obvious indications that this was happening. Top Down Operator Precedence by Douglas Crockford (2007-02-21), for example, discusses Vaughan Pratt's "Top Down Operator Precedence" and shows how important homoiconicity is. He stated that "parsing techniques are not greatly valued in the LISP community, which celebrates the Spartan denial of syntax. There have been many attempts since LISP's creation to give the language a rich ALGOL-like syntax, including Pratt's CGOL, LISP 2, MLISP, Dylan, Interlisp's Clisp, and McCarthy's original M-expressions. All failed to find acceptance. That community found the correspondence between programs and data to be much more valuable than expressive syntax. But the mainstream programming community likes its syntax, so LISP has never been accepted by the mainstream."
Now that this pattern has been identified, new notations can be devised that are general and homoiconic - avoiding the problems of past efforts.
See http://www.dwheeler.com/readable/readable-s-expressions.html for a longer discussion on past efforts.
We have three tiers, each of which builds on the previous one, as described in [Solution]. We have devised them to try to meet our [Goals]. First, let's discuss why there are three tiers.
Each tier improves the notation, but has a trade-off; by creating three tiers, people can choose the tier they are comfortable with, yet use them together:
It would be possible to mix-and-match each idea, but that makes it more complicated for users to know what's allowed (they would have to answer three questions, instead of simply answering "which notation is supported?"). Neoteric-expressions add support for prefix {...}, which doesn't make sense without curly infix. But perhaps more importantly, each tier has an obvious additional cost in terms of implementation effort and compatibility; it would make less sense to add one of the later tiers without adding the former ones. Thus, to make things simple for users, it's best to define this as a set of 3 tiers that build on each other.
Now, let's focus on each one, and see why they've been defined the way they are.
The reality is that nearly everyone prefers infix notation where it's traditionally used. People will specifically avoid Lisp-based systems for some problems, solely because they lack built-in infix support. Even Paul Graham, a well-known Lisp advocate, admits that "Sometimes infix syntax is easier to read. This is especially true for math expressions. I've used Lisp my whole programming life and I still don't find prefix math expressions natural." Paul Prescod remarked, “[Regarding] infix versus prefix... I have more faith that you could convince the world to use esperanto than prefix notation.” Nearly all developers prefer to read infix for many operations. I believe Lisp-based systems have often been specifically ignored even where they were generally the best tool for the job, solely because there was no built-in support for infix operations. After all, if language creators can’t be bothered to support the standard notation for mathematical operations, then clearly it isn’t very powerful (as far as they are concerned). So let’s see some ways we can support infix, yet with minimal changes to s-expression notation.
Many previous systems have implemented "infix" systems as a named macro or function (often "nfx" meaning infix). This looks ugly, and it does the wrong thing - the resulting list has "nfx" at the beginning, not the operator. Many of these systems also created a whole new notation which simultaneously lost Lisp's abilities for quoting, quasiquoting, and so on. Therefore, they haven't caught on.
Many "infix" systems in Lisp also implemented precedence, as precedence is usually baked into languages with infix support. However, it is not possible to preset the precedence rules for all uses. Lisp systems often process other languages, freely mixing different types of language, and thus the same symbol may have different meanings. What's worse, these precedence systems hid where lists were being created, losing homoiconicity. So having an infix system that forces the use of precedence makes the system harder, not easier, to use.
The key insight here is that although other languages implement precedence, building precedence into a language is not necessary to have a useful infix system. You can easily add infix notation if you're willing to not force precedence into the reader, and it turns out that is enough for real-world use. Even in languages with precedence, people often parenthesize to make things clear, so not having precedence systems is actually not a big impact. (It's even less of an impact because of the nfx rule described below).
By intentionally not building in a precedence system, we make things amazingly simple - we don't need to register functions, decide their order, or anything like it - making programming much simpler and easier. There's no need to memorize a precedence system, code transfers easily, and code is generally easy to read too (again, because you don't have to memorize a precedence system). The reader implementation is easier to get "obviously right" since it has less to do. As discussed later, curly-infix supports adding an "nfx" operator that can provide precedence in those few cases where it's valuable, without harm to the language or complex infrastructure.
Actually, there's a reasonable case to be made for precedence for just a few operators, in particular, +, -, *, and /. Even when these operators don't mean add, subtract, multiply, and divide, people using them as infix operators would typically be willing to agree on their precedence. And these operators are certainly widely used.
But there are several arguments against support for limited precedence:
If you really want precedence, see the page [Precedence] which describes an approach for adding precedence to curly-infix.
An older version of my infix notation (version 0.1) tried to automatically detect when an operator was infix, but this turned out to be a poor choice. It's hard to express good rules, e.g., most operators typically used as infix are punctuation-only, but some (like "and" and "or") are not. More sophisticated rules made it harder, not easier, to use. And for automatic detection to be useful, there must be an escape mechanism anyway. After experimentation, it was determined that letting the user quickly express exactly which lists should be interpreted as infix was far more effective. The curly brace pair {...} makes that easy to express, resulting in a simple, clear rule.
"Simple infix" expressions (expressions that can represent a single list with one operator and a list of two or more parameters) are actually the common case, and supporting them is enough to have a useful infix notation. Now you can write {a + b} and it has its obvious meaning: it is read as (+ a b). Expressions like {a < b < c} also have the obvious meaning, and are read as (< a b c). Notice that once it is read in, the operator has its correct place without macro processing or other complications, making these easy to use for common cases.
At first I considered reporting an error if the contents of {...} aren't a simple infix expression, but prepending "nfx" is much more flexible. This way, if you do want a precedence system, you can build one. And because it's not locked inside the reader, you can choose whatever precedence system you want.
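To make the two cases concrete, here is a minimal sketch in Python, not the actual reader. It models lists as Python lists and symbols as strings. The names `curly_infix` and `nfx_precedence`, and the precedence table passed in, are hypothetical and exist only for this illustration. Forms with fewer than three elements are not discussed in this section, so the sketch rejects them rather than guess.

```python
def curly_infix(items):
    """Map the elements read inside {...} to the list the reader returns."""
    if len(items) < 3:
        # Short forms are outside what this section describes.
        raise ValueError("short {...} forms are not covered by this sketch")
    operators = items[1::2]
    if len(items) % 2 == 1 and all(op == operators[0] for op in operators):
        # Simple infix: one operator, two or more operands.
        # The operator moves to the front, as in a normal s-expression.
        return [operators[0]] + items[0::2]
    # Not simple: keep everything in order and prepend nfx.
    return ["nfx"] + items


def nfx_precedence(items, precedence):
    """Hypothetical user-level handler for the body of an (nfx ...) form.

    Assumes operands and operators alternate, and that every operator
    appears in the caller-supplied precedence table.
    """
    operands = [items[0]]
    pending = []

    def reduce_top():
        op = pending.pop()
        right = operands.pop()
        left = operands.pop()
        operands.append([op, left, right])

    for op, operand in zip(items[1::2], items[2::2]):
        # Left-associative: reduce while the pending operator binds
        # at least as tightly as the incoming one.
        while pending and precedence[pending[-1]] >= precedence[op]:
            reduce_top()
        pending.append(op)
        operands.append(operand)
    while pending:
        reduce_top()
    return operands[0]


print(curly_infix(["a", "+", "b"]))            # ['+', 'a', 'b']
form = curly_infix(["a", "+", "b", "*", "c"])  # ['nfx', 'a', '+', 'b', '*', 'c']
print(nfx_precedence(form[1:], {"+": 1, "*": 2}))
```

The point of the split is that `curly_infix` knows nothing about any operator, while the precedence table lives entirely in user code and can be swapped for a different one.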
It is trivial in most Lisps to define a macro (let's call it INFIX though its name is irrelevant) to process infix expressions, but this simply doesn't do the job.
First of all, it is obviously not real infix. The expression "(INFIX a + b)" is visibly worse notation than practically all other programming languages; Fortran, Basic, Java, C, C#, and many other languages manage to do this with "a + b". But even if you used {...} to notate (INFIX ...) in all cases, it would still be wrong for the usual case; this interferes with general list processing (including quoting, quasiquoting, and so on). Someone who enters '{3 + 4} probably didn't want "(quote (INFIX 3 + 4))", they wanted "(quote (+ 3 4))". Macro processing is simply too late in most Lisps. You really want infix baked into the reader, so that the reader will transform the infix expression before macros, quoting, and so on get hold of it.
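The difference between a reader-level transformation and a macro can be seen in a toy reader. The following is a rough sketch in Python under simplifying assumptions: symbols and numbers are kept as strings, only (...), {...}, and the ' abbreviation are handled, and malformed input is not checked. The names `read_string`, `read`, and `curly_infix` are made up for this example.

```python
import re

# Delimiters and the quote mark are tokens by themselves; anything else
# is a run of characters up to whitespace or a delimiter.
TOKEN = re.compile(r"""[(){}']|[^\s(){}']+""")


def curly_infix(items):
    """{a op b op c} -> (op a b c); anything else gets nfx prepended."""
    ops = items[1::2]
    if len(items) >= 3 and len(items) % 2 == 1 and all(op == ops[0] for op in ops):
        return [ops[0]] + items[0::2]
    return ["nfx"] + items


def read(tokens):
    """Read one datum from a token list, consuming the tokens it uses."""
    token = tokens.pop(0)
    if token == "'":
        # The quote wraps whatever datum the reader produces next,
        # so an infix form is already transformed when it is quoted.
        return ["quote", read(tokens)]
    if token in ("(", "{"):
        closer = ")" if token == "(" else "}"
        items = []
        while tokens[0] != closer:
            items.append(read(tokens))
        tokens.pop(0)  # drop the closing delimiter
        return items if token == "(" else curly_infix(items)
    return token  # an atom, kept as a string


def read_string(text):
    return read(TOKEN.findall(text))


print(read_string("'{3 + 4}"))        # ['quote', ['+', '3', '4']]
print(read_string("'(INFIX 3 + 4)"))  # ['quote', ['INFIX', '3', '+', '4']]
```

In the second call the quoted data still carries the INFIX marker and the untransformed operands, which is exactly what list-processing code does not want to see.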
Some Lisps implement a completely different and complex language in their readers. For example, Gambit has its "Scheme infix syntax extension (SIX)", as described at: http://www.iro.umontreal.ca/~gambit/doc/gambit-c.html#Scheme-infix-syntax-extension
But these have problems on a number of fronts:
Some mechanisms don't work well with other macros because even in simple cases they don't return the same structure a "regular" s-expression would return. For example, the SIX extensions insert many new function names like "six.x+y", instead of simply returning "+" for addition.
However, mechanisms like Gambit's SIX show that there is a desire to support infix.
There is no perfect character. However, you really want a balanced pair of characters to identify this, since you can have infix-in-infix. Parentheses are already spoken for. R6RS Scheme already uses up square brackets as a synonym for parentheses. Angle brackets are already used for comparison. The curly braces are visually pleasant pairs. It makes sense to use these precious characters on something extremely common: infix notation. They are also widely available for use in many (though not all) Lisp implementations. We're sorry to interfere with the implementations that use {...} for something else, but there are fewer such Lisps. For example, they are not standard in Common Lisp or Scheme, so at most they are local extensions in those languages (which could be enabled and disabled, and code that uses them isn't portable anyway).
It's true that {...} are often used in math for set notation. But infix notation is far more basic, and common, than sets. Also, traditional function call notation and infix are helpful when working with sets, so infix notation is the more important need. Once you allow neoteric-expressions, the notation set(...) is a reasonable alternative.
And yes, some Lisps already use {...} for other reasons. Clojure, for example, uses them for maps. Some Lisps won't be able to switch to {...}, but some may decide that they can use prefix forms such as map(...) for operations they formerly used braces for. That's a decision specific implementations will have to make, but other characters won't be any better. We know that these are available in Common Lisp, the Scheme standard (including many implementations of it), and many others.
Racket allows an infix notation of the form (a . > . b), as defined here:
http://docs.racket-lang.org/guide/Pairs__Lists__and_Racket_Syntax.html
A pro is that it doesn't need to use up {}, so it might be easier to implement in some Lisps which already define {} for use in a local extension.
However, it has many cons:
In short, infix is extremely common, so its notation should be convenient. The Racket "infix convention" may be the next-best notation for infix notation after curly-infix, but it's next-best, and we should strive for the best available for a common need.
Curly-infix does not conflict with the Racket infix convention; implementations could implement both. We recommend that an implementation that implements the Racket infix convention should allow multiple operands and use curly-infix semantics, pretending that . op . is a single parameter. In that case, (a . + . b . + . c) would map to (+ a b c), and (a . + . b . * . c) would map to (nfx a + b * c).
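As a rough illustration of that recommendation (a sketch only, not taken from any Racket or curly-infix implementation), the following Python function treats each `. op .` run in an already-tokenized list as a single operator and then applies the curly-infix rules. The name `racket_infix` is hypothetical, and symbols are modeled as strings.

```python
def racket_infix(items):
    """Collapse each `. op .` run into one operator, then apply curly-infix rules."""
    collapsed = []
    found = False
    i = 0
    while i < len(items):
        if items[i] == "." and i + 2 < len(items) and items[i + 2] == ".":
            collapsed.append(items[i + 1])  # keep only the operator
            found = True
            i += 3
        else:
            collapsed.append(items[i])
            i += 1
    if not found:
        # An ordinary list (or dotted pair) is left exactly as it was.
        return items
    ops = collapsed[1::2]
    if len(collapsed) >= 3 and len(collapsed) % 2 == 1 and all(op == ops[0] for op in ops):
        return [ops[0]] + collapsed[0::2]
    return ["nfx"] + collapsed


print(racket_infix("a . + . b . + . c".split()))  # ['+', 'a', 'b', 'c']
print(racket_infix("a . + . b . * . c".split()))  # ['nfx', 'a', '+', 'b', '*', 'c']
```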
Curly-infix requires that infix operators be delimited (e.g., by spaces). This is consistent with Lisp history; operators are always delimited in traditional s-expressions (typically by left parentheses on the left, and space on the right). It's also impractical to do otherwise; most Lisps allow and predefine symbols that include characters (like "-") that are typically used for infix operators. And while many other languages permit infix operators to be used without delimiters, many developers will put space around infix operators even in languages that don't require them. Thus, it would be difficult to allow infix operators without delimiters, and since delimited operators are already common in real-world code in other languages, the result looks quite customary to typical software developers.
This use of {...} is highly compatible with various Lisps. I think this rule would be a great backwards-compatible addition to the standard reader of any Scheme and Common Lisp implementation. Scheme specifically reserves {...} for future use (R5RS section 2.3, R6RS section 4.2.1). Common Lisp does not define {} (see section 2.4 of the Common Lisp Hyperspec, based on ANSI Common Lisp X3.226), but notes its potential use by users. BitC spec version 0.10 (June 17, 2006) section 2.4.3 also reserves {...}. It is trivial to implement this in Common Lisp.
It's important to note that inside the infix expression you can do anything you can do in normal Lisp. This is different from nearly all other Lisp infix systems, which have their own incompatible language inside that can't handle arbitrary s-expressions. You can use arbitrary s-expressions with quasi-quoting, unquote-splicing, or whatever inside, and all without "registering" anything.
Surprisingly, this simple mechanism is enough to do what people actually want in an infix mechanism for Lisp. You can add things, like {x + 1}, or compare values, like {x <= 5}.
This is an unusually simple mechanism, but like much of Lisp, its power comes from its simplicity.
Scheme output is not predictable and can change from version to version, whether or not curly-infix exists. That is true of both the Scheme spec and its implementations.
The Scheme specification R7RS draft 7 for "write" is "Writes a representation of obj to the given textual output port". Note that this is a representation, not the representation, as there are many possible representations without curly-infix. Similar text exists for R6RS (library section 8.2.12 on put-datum), and R5RS (section 6.6.3); it always says "a" not "the" and does not prescribe a particular representation.
Different Scheme implementations do write the same list differently, too. Let's run the trivial program (write (read)) and give the program the input ''x (x quoted twice). The scsh version 0.6.7 implementation reports ''x, while guile version 1.8.7 reports (quote (quote x)) - obviously different from scsh.
Since Scheme does not guarantee a particular format for a list - and permits implementations to use abbreviations when they choose - curly-infix represents no change in this matter.
The "sweeten" heuristic is to write infix form if there are 3-6 parameters, and the operator is punctuation, "and", or "or". It's a longer heuristic, but it means that users can see {a > b} instead of (> a b), and a lot of users prefer the infix representation.
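A heuristic of this kind might be sketched as follows in Python. This is not the actual "sweeten" code. It makes two assumptions that should be read as such: "3-6" is taken to be the total length of the list including the operator (so that (> a b) qualifies, as in the example above), and "punctuation" is approximated by Python's `string.punctuation`. Atoms are modeled as strings, and the names `writes_as_infix` and `show` are made up.

```python
import string


def writes_as_infix(form):
    """Decide whether a list should be displayed as {a op b ...}."""
    # Assumption: the 3-6 range counts the operator as well as its operands.
    if not isinstance(form, list) or not 3 <= len(form) <= 6:
        return False
    operator = form[0]
    if not isinstance(operator, str) or not operator:
        return False
    # Assumption: "punctuation" means every character is in string.punctuation.
    return operator in ("and", "or") or all(ch in string.punctuation for ch in operator)


def show(form):
    """Render a nested list, using infix form where the heuristic allows it."""
    if not isinstance(form, list):
        return form
    parts = [show(item) for item in form]
    if writes_as_infix(form):
        return "{" + f" {parts[0]} ".join(parts[1:]) + "}"
    return "(" + " ".join(parts) + ")"


print(show([">", "a", "b"]))                  # {a > b}
print(show(["f", "a", "b"]))                  # (f a b)
print(show(["and", ["<", "a", "b"], "c"]))    # {{a < b} and c}
```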
Lisp’s standard notation is different from “normal” notation in that the parentheses precede the function name, rather than follow it. Others have commented that it'd be valuable to be able to say name(x) instead of (name x):
Neoteric-expressions build on curly-infix's use of {...}. If (...), {...}, or [ ... ] are prefixed with a symbol or list (i.e., have no whitespace between them), they have a new meaning in neoteric-expressions:
These combine well with curly-infix forms of {...}. For example, {-(x) * y} maps to (* (- x) y).
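The interaction can be sketched with a small character-level reader in Python. This is an illustration under stated assumptions, not the real reader: it implements only the prefixed (...) case, where name(x) is read as (name x), plus simple curly-infix; prefixed {...} and [ ... ] are left out. Symbols and numbers are kept as strings, and all function names are hypothetical.

```python
def curly_infix(items):
    """{a op b op c} -> (op a b c); anything else gets nfx prepended."""
    ops = items[1::2]
    if len(items) >= 3 and len(items) % 2 == 1 and all(op == ops[0] for op in ops):
        return [ops[0]] + items[0::2]
    return ["nfx"] + items


def _skip_space(text, pos):
    while pos < len(text) and text[pos].isspace():
        pos += 1
    return pos


def _read_list(text, pos, closer):
    """Read data until the closing delimiter; return (items, next position)."""
    items = []
    while True:
        pos = _skip_space(text, pos)
        if pos >= len(text):
            raise SyntaxError("missing " + closer)
        if text[pos] == closer:
            return items, pos + 1
        item, pos = _read(text, pos)
        items.append(item)


def _read(text, pos):
    pos = _skip_space(text, pos)
    if pos >= len(text):
        raise SyntaxError("unexpected end of input")
    if text[pos] == "(":
        datum, pos = _read_list(text, pos + 1, ")")
    elif text[pos] == "{":
        items, pos = _read_list(text, pos + 1, "}")
        datum = curly_infix(items)
    else:
        start = pos
        while pos < len(text) and not text[pos].isspace() and text[pos] not in "(){}":
            pos += 1
        if pos == start:
            raise SyntaxError("unexpected " + text[pos])
        datum = text[start:pos]
    # Neoteric step: an opening paren with no whitespace before it
    # makes the datum just read the head of a new list.
    while pos < len(text) and text[pos] == "(":
        args, pos = _read_list(text, pos + 1, ")")
        datum = [datum] + args
    return datum, pos


def read_neoteric(text):
    datum, _ = _read(text, 0)
    return datum


print(read_neoteric("{-(x) * y}"))  # ['*', ['-', 'x'], 'y']
print(read_neoteric("(f (x))"))     # ['f', ['x']]
print(read_neoteric("(f(x))"))      # [['f', 'x']]
```

The last two calls differ only by one space, which shows why whitespace between a prefix and its opening parenthesis matters.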
Neoteric-expressions require that (. x) must mean x.
It is critically important that expressions like read(. port) be supported so you can represent, in the obvious way, (read . port). If (. x) didn't mean x, then it would be easy to get this case wrong. What's more, implementing this would require special treatment. Also, if someone wanted to build on top of an existing reader, they would have to reimplement parts of the list-processing system if this wasn't handled.
This also provides a simple way to escape certain constructs, in particular, some special symbols in sweet-expressions. Sweet-expressions (to be discussed later) need an escape mechanism for characters and symbols like !, \\, and $, so that they can be directly represented. This means that (. $) just means the symbol $, even if $ by itself would normally have some other meaning when being processed by a higher-level reader (such as a sweet-expression reader).
It was far easier to define this escape mechanism as part of neoteric-expressions. This rule could have been defined as part of sweet-expressions, but that would create an implementation problem. We fully expect that implementers will work in stages; in particular, some may not want to build in indentation processing, but they might be willing to build a neoteric-expression system into their reader. If you built a sweet-expression reader on top of a neoteric-expression reader, but that reader didn't implement (. e), then you'd have to re-implement the whole reader underneath anyway. But if all neoteric-expression readers support (. e), then a sweet-expression reader is far more trivial to build on top.
It is already true that (. x) is x in guile, so there was already a working example that this is a reasonable extension. In fact, in a typical implementation of a list reader, it takes extra effort to prevent this extension, so this is a relatively easy extension to include.
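To see why a list reader tends to accept (. x) without extra work, consider this minimal Python sketch. It is not guile's reader. Pairs are modeled as Python tuples (car, cdr), the empty list as None, and input is a pre-split token list; the names `read` and `read_list_body` are made up for the example.

```python
def read_list_body(tokens):
    """Read up to the closing paren; the opening paren is already consumed."""
    items = []
    tail = None  # None stands for the empty list
    while tokens[0] != ")":
        if tokens[0] == ".":
            tokens.pop(0)
            tail = read(tokens)  # the datum after the dot becomes the final cdr
            break
        items.append(read(tokens))
    tokens.pop(0)  # closing paren
    # Build the pairs from the right. With no items before the dot,
    # this loop does nothing and the tail itself is returned.
    # Rejecting (. x) would need an extra "if not items" check here.
    for item in reversed(items):
        tail = (item, tail)
    return tail


def read(tokens):
    token = tokens.pop(0)
    if token == "(":
        return read_list_body(tokens)
    return token  # an atom, kept as a string


print(read("( . x )".split()))          # x
print(read("( read . port )".split()))  # ('read', 'port')
print(read("( a b )".split()))          # ('a', ('b', None))
```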
It would be possible to define neoteric-expressions to have comma-separated values in a function call; this would make it even more similar to traditional function call notation. If you simply threw out commas, this would interfere with ,-lifting - and this was quickly rejected.
A better rule, that would indeed work, would be to require each parameter to end with a comma, and then remove that ending comma. However, this rule:
Many other languages do use commas, but they are required in those languages because infix operators are not surrounded by any marker. Since infix operations are already surrounded by {...} in our notation, there is no need for the additional commas for parameter separation.
Experimentation found that separating parameters solely by whitespace worked well, so that approach was selected.
Originally the prefix had to be a symbol or list. The theory was that by ignoring others, the sweet-reader would be backwards-compatible with some badly-formatted code, and some errors might not result in incorrectly-interpreted expressions. But this was an odd limitation, and in some cases other prefixes made sense (e.g., for strings). This was changed to eliminate the inconsistency.
At one time it was required that unprefixed [ ... ] be the same as ( ... ), but some Lisps interpret unprefixed [ ... ] specially (e.g., Arc). Thus, it was decided that it'd be better to simply leave [ ... ] unchanged in interpretation. Note that in Scheme R6, [ ... ] does have the same meaning as ( ... ) when unprefixed, but this is a property of Scheme R6 not of neoteric-expressions.
Neoteric-expressions used to be called "modern-expressions". But some people didn't like that name, and the obvious abbreviation ("m-expression") was easily confused with the original Lisp M-expression language. So the name was changed to neoteric, which has a similar meaning and abbreviates nicely. They weren't called "function-expressions" because "f-expressions" is already used (and can sound bad if said quickly), and they weren't called "prefix-expressions" because "p-expressions" sounds like "pee-expressions". They weren't called "name-prefix" because the prefix need not be a name.
The neoteric rules do introduce the risk of someone inserting a space between the function name and the opening “(”. But whitespace is already significant as a parameter separator, so this is consistent with how the system works anyway... this is not really a change at all.
Obviously, this is trivial to parse. We don’t lose any power, because this is completely optional -- we only use it when we want to, and we can switch back to the traditional s-expression notation if we want to. It’s trivially quoted... if you quote a symbol followed by “(”, just keep going until its matching “)” -- essentially the same rule as before!
The article "Improving lisp syntax is harder than it looks" discusses name-prefixing systems like this, but it makes a number of errors:
Technically, this is a change from some official Lisp s-expression notations and implementations. For example, “a(b)” in traditional Scheme or Common Lisp is the same as “a (b)” -- its parser tries to return the value of a, followed by running the function b. But it’s not clear it’s a big change in practice; commonly accepted style always separates parameters (including the first function call name) with whitespace. So normally, what follows a function call’s name is whitespace or “)”, and this is enforced by pretty-printers. Thus, many large existing Lisp programs could go through this kind of parsing without resulting in a change in meaning!
With neoteric-expressions you can easily use the traditional Lisp read-eval-print loop (at the command line), e.g., as a calculator. Just remember to surround infix expressions with {...} and surround infix operators with whitespace. For example, "{3 + 4}" will be mapped to (+ 3 4), which when executed will produce "7". Use normal function notation for unary functions, e.g., "{-(x) / 2}" maps to "(/ (- x) 2)". Nest {...} when you need to, e.g., "{3 + {4 * 5}}" will map to "(+ 3 (* 4 5))".
Neoteric-expressions are also very compatible with most existing text editors for Lisp. Editors do not "understand" the code, but many work to match (...), {...}, and [ ... ], and that is enough to be useful. After all, Common Lisp readers are designed to allow { ... } to be overridden, so many text editors are designed to support this.
Sweet-expressions start with neoteric-expressions and add indentation as meaningful.
Sweet-expressions eliminate many parentheses, and thus make programs more readable, by making indentation itself meaningful. Real Lisp programs are already indented, and tools (like editors and pretty-printers) are used to try to keep the indentation (used by humans) and parentheses (used by the computers) in sync. By making the indentation (which humans depend on) actually used by the computer, they are automatically kept in sync, and many parentheses become unnecessary.
The page http://www.gregslepak.com/on-lisps-readability shows one of the many examples of endless closing parentheses and brackets to close an expression, and the confusion that happens when indentation does not match the parentheses. bhurt's response to that article is telling: "I'm always somewhat amazed by the claim that the parens 'just disappear', as if this is a good thing. Bugs live in the difference between the code in your head and the code on the screen - and having the parens in the wrong place causes bugs. And autoindenting isn't the answer- I don't want the indenting to follow the parens, I want the parens to follow the indenting. The indenting I can see, and can see is correct."
An IDE can help keep the indentation consistent with the parentheses, but needing IDEs is considered by some a language smell. If you need special tools to work around problems with the notation, then the notation itself is a problem.
A solution, of course, is to make the indentation actually matter: Now you don't need an endless march of parentheses, and indentation can't be confusing because it is actually used.
"In praise of mandatory indentation..." notes that it can be helpful to have mandatory indentation: