Rationale

Here is the detailed rationale for the various "readable" notations. If you just want to use the readable results, you don't need all this, but if you want to understand the rationale behind them, here it is. This essay was originally written by David A. Wheeler (the "I" below), though there may be some edits by others.

Problem

As discussed in [Problem], software in Lisp-based programming languages has traditionally been written using s-expressions. But many software developers consider s-expression notation difficult to read when used for programs. For example, s-expressions do not directly support infix operators, fail to support traditional math function notation such as f(x), and require a large number of parentheses for even simple operations.

Past work to create readable formats

There have been a huge number of past efforts to create readable formats, going all the way back to the original M-expression syntax that Lisp's creator expected to be used when programming. Generally, they've been unsuccessful, or they end up creating a completely different language that lacks the advantages of Lisp-based languages. After examining a huge number of them, David A. Wheeler noticed a pattern in most of these failures: most failed to be general and homoiconic:

  • A readable Lisp format must be general. A general format is not tied to some specific underlying semantics. Most readability efforts focused on creating special syntax for each language construct of an underlying language. But since Lisp-based languages can trivially create new semantic constructs (via macros), and are often used to process fragments of other languages, these did not work well. It was often difficult to keep updating the parser to match the underlying system, so the parser was always less capable than using s-expressions... leading to its abandonment. Sometimes the parser was continuously maintained, but soon the parser led to the development of a completely new language that was less suitable for self-analysis of program fragments and similar tasks (and thus no longer a suitable "Lisp"). It's easy to create a new "operator" in a Lisp, yet many infix systems cannot work without having each operator's precedence predefined.

  • A readable Lisp format must be homoiconic. A homoiconic format is a surface format in which the human reader can easily determine what the underlying representation is. It is very difficult to take advantage of Lisp capabilities, such as macros, without a homoiconic format. Yet many past readability efforts made it difficult to determine exactly what structures were being created by the notation. Typical infix notations with precedence were especially common examples of this problem - they would quietly create multiple lists without obvious indications that this was happening. Top Down Operator Precedence by Douglas Crockford (2007-02-21), for example, discusses Vaughan Pratt's "Top Down Operator Precedence" and shows how important homoiconicity is. He stated that "parsing techniques are not greatly valued in the LISP community, which celebrates the Spartan denial of syntax. There have been many attempts since LISP's creation to give the language a rich ALGOL-like syntax, including Pratt's CGOL, LISP 2, MLISP, Dylan, Interlisp's Clisp, and McCarthy's original M-expressions. All failed to find acceptance. That community found the correspondence between programs and data to be much more valuable than expressive syntax. But the mainstream programming community likes its syntax, so LISP has never been accepted by the mainstream."

Now that this pattern has been identified, new notations can be devised that are general and homoiconic - avoiding the problems of past efforts.

See http://www.dwheeler.com/readable/readable-s-expressions.html for a longer discussion on past efforts.

Why these three tiers?

We have three tiers, each of which builds on the previous one, as described in [Solution]. We have devised them to try to meet our [Goals]. First, let's discuss why these three tiers.

Each tier improves the notation, but has a trade-off; by creating three tiers, people can choose the tier they are comfortable with, yet use them together:

  1. Curly-infix-expressions (c-expressions) add infix notation, the notation people are trained in and most programming languages support out-of-the-box. This notation uses {...}, which are not used by many Lisps (including Common Lisp and Scheme), so in most cases there is no compatibility issue adding this. It's also trivial to add to Common Lisp (just modify the readtable). Thus, this creates a simple "first step" that people can adopt without concern.
  2. Neoteric-expressions (n-expressions) allow people to use a function name outside of the parentheses, which is the notation most people are taught in school and is used by most programming languages. This involves a subtle incompatible change in most Lisp readers, so a few might hesitate to do this. But code where this difference matters is considered extremely poor style, and is unlikely in modern code - and a pretty-printer would eliminate any such problems (just apply it first). Neoteric-expressions build on curly infix, since the prefixed {} requires handling curly infix.
  3. Sweet-expressions (t-expressions) add indentation support, eliminating the need for many parentheses. However, this adds another subtle change, and some people won't want indentation to be relevant. By making this a separate tier, people can adopt neoteric-expressions if they don't want meaningful indentation.

It would be possible to mix-and-match each idea, but that makes it more complicated for users to know what's allowed (they would have to answer three questions, instead of simply answering "which notation is supported?"). Neoteric-expressions add support for prefix {...}, which doesn't make sense without curly infix. But perhaps more importantly, each tier has an obvious additional cost in terms of implementation effort and compatibility; it would make less sense to add one of the later tiers without adding the former ones. Thus, to make things simple for users, it's best to define this as a set of 3 tiers that build on each other.

Now, let's focus on each one, and see why they've been defined the way they are.

Curly-infix-expressions (c-expressions)

The reality is that nearly everyone prefers infix notation where it's traditionally used. People will specifically avoid Lisp-based systems for some problems, solely because they lack built-in infix support. Even Paul Graham, a well-known Lisp advocate, admits that "Sometimes infix syntax is easier to read. This is especially true for math expressions. I've used Lisp my whole programming life and I still don't find prefix math expressions natural." Paul Prescod remarked, “[Regarding] infix versus prefix... I have more faith that you could convince the world to use esperanto than prefix notation.” Nearly all developers prefer to read infix for many operations. I believe Lisp-based systems have often been specifically ignored even where they were generally the best tool for the job, solely because there was no built-in support for infix operations. After all, if language creators can’t be bothered to support the standard notation for mathematical operations, then clearly it isn’t very powerful (as far as they are concerned). So let’s see some ways we can support infix, yet with minimal changes to s-expression notation.

Many previous systems have implemented "infix" systems as a named macro or function (often "nfx" meaning infix). This looks ugly, and it does the wrong thing - the resulting list has "nfx" at the beginning, not the operator. Many of these systems also created a whole new notation which simultaneously lost Lisp's abilities for quoting, quasiquoting, and so on. Therefore, they haven't caught on.

Key insight: No built-in precedence

Many "infix" systems in Lisp also implemented precedence, as precedence is usually baked into languages with infix support. However, it is not possible to preset the precedence rules for all uses. Lisp systems often process other languages, freely mixing different types of language, and thus the same symbol may have different meanings. What's worse, these precedence systems hid where lists were being created, losing homoiconicity. So having an infix system that forces the use of precedence makes the system harder, not easier, to use.

The key insight here is that although other languages implement precedence, building precedence into a language is not necessary to have a useful infix system. You can easily add infix notation if you're willing to not force precedence into the reader, and it turns out that is enough for real-world use. Even in languages with precedence, people often parenthesize to make things clear, so not having precedence systems is actually not a big impact. (It's even less of an impact because of the nfx rule described below).

By intentionally not building in a precedence system, we make things amazingly simple - we don't need to register functions, decide their order, or anything like it - making programming much simpler and easier. There's no need to memorize a precedence system, code transfers easily, and code is generally easy to read too (again, because you don't have to memorize a precedence system). The reader implementation is easier to get "obviously right" since it has less to do. As discussed later, curly-infix supports adding an "nfx" operator that can provide precedence in those few cases where it's valuable, without harm to the language or complex infrastructure.

Why not limited precedence for a short list of operators?

Actually, there's a reasonable case to be made for precedence for just a few operators, in particular, +, -, *, and /. Even when these operators don't mean add, subtract, multiply, and divide, people using them as infix operators would typically be willing to agree on their precedence. And these operators are certainly widely used.

But there are several arguments against support for limited precedence:

  • Any such system harms homoiconicity, which is a serious concern.
  • Any useful rule takes more time to explain, and I worry that this would inhibit - not encourage - acceptance.
  • Supporting precedence turns out to be less important than you'd think. It's trivial to group operations so that explicitly surrounding them is easy, e.g., {{a + b + c} - d - e - f}. Similarly, although it'd be nice to be able to say {a < b <= c}, many widely-accepted languages work just fine with the equivalent of {{a < b} and {b <= c}}.
  • The call out to "nfx" also makes this less important.
  • Where do you end? Once you allow precedence, it's much harder to determine where to stop. Stopping at these 4 operators is reasonable enough, but then someone asks about power-of (often spelled ** or ^), comparisons (<, <=, =, ==, !=, /=, <>, >=, >), and so on - all of which could have precedence. You need to stop somewhere, and it's harder to agree on that stopping point.
  • In general, it would be much more difficult to get widespread agreement on it. There would be many arguments about the operators to include, their relative precedence, whether or not to support right-to-left (and if so, for which operators), and so on.
  • This creates a little more implementation work. It's not much - implementing precedence is well-understood - but we want to get this approach widely adopted, so minimizing implementation effort is useful.
  • Precedence could be added later to curly-infix, once the curly-infix rules as given have been accepted, so there is no need to add it now.

If you really want precedence, see the page [Precedence] which describes an approach for adding precedence to curly-infix.
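
As a rough illustration of how precedence can live outside the reader, here is a minimal Python sketch in which Python lists stand in for s-expressions and strings stand in for symbols. It is not the approach described on the [Precedence] page; the function name nfx_to_prefix, the PRECEDENCE table, and the choice to make every operator left-associative are assumptions made only for this example. It takes the flat sequence that follows "nfx" and rebuilds it as nested prefix lists.

```python
# Hypothetical precedence table; a real user could choose any table they like,
# which is the point of keeping precedence out of the reader.
PRECEDENCE = {"*": 2, "/": 2, "+": 1, "-": 1}


def nfx_to_prefix(items, table=PRECEDENCE):
    """Turn a flat [operand, op, operand, ...] sequence into nested prefix lists.

    Uses precedence climbing; every operator is treated as left-associative.
    An operator missing from the table raises KeyError.
    """
    pos = 0

    def parse(min_level):
        nonlocal pos
        left = items[pos]          # an operand (a symbol or an already-read list)
        pos += 1
        while pos < len(items) and table[items[pos]] >= min_level:
            op = items[pos]
            pos += 1
            # Operators that bind tighter than op are grouped on the right first.
            right = parse(table[op] + 1)
            left = [op, left, right]
        return left

    return parse(1)


# (nfx a + b * c) would be rewritten as (+ a (* b c)).
print(nfx_to_prefix(["a", "+", "b", "*", "c"]))
```

Because the table is an ordinary value passed to an ordinary function, two programs can use different tables without either one changing the reader.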

Detecting infix

An older version of my infix notation (version 0.1) tried to automatically detect when an operator was infix, but this turned out to be a poor choice. It's hard to express good rules, e.g., most operators typically used as infix are punctuation-only, but some (like "and" and "or") are not. More sophisticated rules made it harder, not easier, to use. And for automatic detection to be useful, there must be an escape mechanism anyway. After experimentation, it was determined that letting the user quickly express exactly which lists should be interpreted as infix was far more effective. The curly brace pair {...} makes that easy to express, resulting in a simple, clear rule.

Simple infix expressions (expressions that represent a single list with one operator and two or more parameters) are actually the common case, and supporting them is enough to have a useful infix notation. Now you can write {a + b} and it has its obvious meaning, (+ a b). Expressions like {a < b < c} also have the obvious meaning, (< a b c). Notice that once it is read in, the operator has its correct place without macro processing or other complications, making these easy to use for common cases.

At first I considered reporting an error when the contents of {...} are not a simple infix expression, but prepending "nfx" to the list instead is much more flexible. This way, if you do want a precedence system, you can build one. And because it's not locked inside the reader, you can choose whatever precedence system you want.
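
The rule described in this section can be sketched in a few lines. The following Python is only an illustration (Python lists stand in for s-expressions, strings for symbols, and the name curly_infix_to_list is made up for this example). The handling of {x} and of "nfx" follows this page; the handling of the empty and two-element cases is an assumption based on my understanding of the later SRFI 105 wording.

```python
def curly_infix_to_list(items):
    """Map the already-read contents of one {...} to an ordinary list."""
    n = len(items)
    if n == 0:
        return []                       # {} -> ()        (assumption, see lead-in)
    if n == 1:
        return items[0]                 # {x} -> x        (the escape mechanism)
    if n == 2:
        return list(items)              # {e f} -> (e f)  (assumption, see lead-in)
    operators = items[1::2]             # every second element, starting at index 1
    if n % 2 == 1 and all(op == operators[0] for op in operators):
        # Simple infix: one repeated operator, so it moves to the front.
        return [operators[0]] + list(items[0::2])
    # Anything else is left for a user-supplied nfx to interpret.
    return ["nfx"] + list(items)


print(curly_infix_to_list(["a", "+", "b"]))            # ['+', 'a', 'b']
print(curly_infix_to_list(["a", "+", "b", "*", "c"]))  # ['nfx', 'a', '+', 'b', '*', 'c']
```

Note that nothing is registered and no operator table is consulted: the function only checks that the operator positions all hold the same element, which is why a brand-new operator works without any setup.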

Why not just use a macro name?

It is trivial in most Lisps to define a macro (let's call it INFIX, though its name is irrelevant) to do infix processing, but this simply doesn't do the job.

First of all, it is obviously not real infix. The expression "(INFIX a + b)" is visibly worse notation than practically all other programming languages; Fortran, Basic, Java, C, C#, and many other languages manage to do this with "a + b". But even if you used {...} to notate (INFIX ...) in all cases, it would still be wrong for the usual case; this interferes with general list processing (including quoting, quasiquoting, and so on). Someone who enters '{3 + 4} probably didn't want "(quote (INFIX 3 + 4))", they wanted "(quote (+ 3 4))". Macro processing is simply too late in most Lisps. You really want infix baked into the reader, so that the reader will transform the infix expression before macros, quoting, and so on get hold of it.

Why not a completely different language?

Some Lisps implement a completely different and complex language in their readers. For example, Gambit has its "Scheme infix syntax extension (SIX)", as described at: http://www.iro.umontreal.ca/~gambit/doc/gambit-c.html#Scheme-infix-syntax-extension

But these have problems on a number of fronts:

  • They typically fail to be general. These typically only allow certain preset operations to be infix. If you define a new +!+ operator, they have no way of knowing that you want to use it as infix. Thus, they are never as capable as the "real" reader, and slowly become obsolete.
  • They typically fail to be homoiconic. These often create many complex lists, in a non-obvious way.
  • They are complex. Definitions of SIX and similar are often very long and complex, even though they cannot automatically handle new operators.

Some mechanisms don't work well with other macros because even in simple cases they don't return the same structure a "regular" s-expression would return. For example, the SIX extensions insert many new function names like "six.x+y", instead of simply returning "+" for addition.

However, mechanisms like Gambit's SIX show that there is a desire to support infix.

Why use curly brace characters for infix?

There is no perfect character. However, you really want a balanced pair of characters to identify this, since you can have infix-in-infix. Parentheses are already spoken for. R6RS Scheme already uses up square brackets as a synonym for parentheses. Angle brackets are already used for comparison. The curly braces are visually pleasant pairs. It makes sense to use these precious characters on something extremely common: infix notation. They are also widely available for use in many (though not all) Lisp implementations. We're sorry to interfere with the implementations that use {...} for something else, but there are fewer such Lisps. For example, they are not standard in Common Lisp or Scheme, so at most they are local extensions in those languages (which could be enabled and disabled, and code that uses them isn't portable anyway).

It's true that {...} are often used in math for set notation. But infix notation is far more basic, and common, than sets. Also, traditional function call notation and infix are helpful when working with sets, so infix notation is the more important need. Once you allow neoteric-expressions, the notation set(...) is a reasonable alternative.

And yes, some Lisps already use {...} for other reasons. Clojure, for example, uses them for maps. Some Lisps won't be able to switch to {...}, but some may decide that they can use prefix forms such as map(...) for operations they formerly used braces for. That's a decision specific implementations will have to make, but other characters won't be any better. We know that these are available in Common Lisp, the Scheme standard (including many implementations of it), and many others.

Why not use the Racket "infix convention" (a . > . b)?

Racket allows an infix notation of the form (a . > . b), as defined here:

http://docs.racket-lang.org/guide/Pairs__Lists__and_Racket_Syntax.html

A pro is that it doesn't need to use up {}, so it might be easier to implement in some Lisps which already define {} for use in a local extension.

However, it has many cons:

  • This notation is much longer and more awkward. Every infix operator adds 6 characters, including "." characters not used in any other infix notation. Infix operations are a common operation, so convenience matters. An expression like (1 . + . 2) is far longer, and less convenient, than {1 + 2}.
  • It doesn't look like other languages or math. A human notation should be maximally understandable to people given what they already know. {a + b} is much more similar to standard notation than (a . + . b).
  • It is easy to make mistakes. If you forget a "." somewhere, you end up with the wrong thing. This would also make it harder to see improper lists; improper lists are important but rarer, so it's good to make them obvious, and this notation doesn't do that. The Racket documentation even goes out of its way to emphasize that infix convention use is unrelated to improper lists, which suggests that they are easily confused.
  • We could redefine prefixed f to instead allow f{a . + . b}, for consistency, but it's still ugly.
  • We'd lose {x} as an escape mechanism. We could revert to (. x) as the escape mechanism, at which point dots-in-lists becomes rather busy (!).
  • Racket's implementation does not allow multiple operations, e.g., (a . + . b . + . c . + . d). That could be added, but this makes the notation even more unwieldy; compare this to {a + b + c + d}.
  • Even Racket users don't use it much. Its documentation says that "Racket programmers use the infix convention sparingly—mostly for asymmetric binary operators such as < and is-a?."

In short, infix is extremely common, so its notation should be convenient. The Racket "infix convention" may be the next-best notation for infix notation after curly-infix, but it's next-best, and we should strive for the best available for a common need.

Curly-infix does not conflict with the Racket infix convention; implementations could implement both. We recommend that an implementation that implements the Racket infix convention should allow multiple operands and use curly-infix semantics, pretending that . op . is a single parameter. In that case, (a . + . b . + . c) would map to (+ a b c), and (a . + . b . * . c) would map to (nfx a + b * c).
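
The recommended mapping can be sketched as follows. This is a hypothetical Python illustration, not Racket's implementation: the list contents are given as a flat list of token strings, the function name racket_infix_to_list is invented for this example, and a list that contains no ". op ." group (including an ordinary improper list such as (a . b)) is returned untouched.

```python
def racket_infix_to_list(tokens):
    """Apply curly-infix semantics to list contents written as a . op . b ..."""
    items, op_slots = [], []
    i = 0
    while i < len(tokens):
        # Collapse each ". op ." group into the single element op.
        if tokens[i] == "." and i + 2 < len(tokens) and tokens[i + 2] == ".":
            op_slots.append(len(items))
            items.append(tokens[i + 1])
            i += 3
        else:
            items.append(tokens[i])
            i += 1
    if not op_slots:
        return list(tokens)             # no infix convention used here
    operators = items[1::2]
    simple = (
        len(items) % 2 == 1
        and op_slots == list(range(1, len(items), 2))   # ops sit between operands
        and all(op == operators[0] for op in operators)
    )
    if simple:
        return [operators[0]] + items[0::2]
    return ["nfx"] + items


print(racket_infix_to_list("a . + . b . + . c".split()))  # ['+', 'a', 'b', 'c']
```

The second example from the text, (a . + . b . * . c), comes out as the list starting with nfx, since the two operators differ.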

Delimiters required

Curly-infix requires that infix operators be delimited (e.g., by spaces). This is consistent with Lisp history; operators are always delimited in traditional s-expressions (typically by left parentheses on the left, and space on the right). It's also impractical to do otherwise; most Lisps allow and predefine symbols that include characters (like "-") that are typically used for infix operators. And while many other languages permit infix operators to be used without delimiters, many developers will put space around infix operators even in languages that don't require them. Thus, it would be difficult to allow infix operators without delimiters, and delimited operators already look customary to typical software developers because that style is common in real-world code in other languages.

Curly-infix utility

This use of {...} is highly compatible with various Lisps. I think this rule would be a great backwards-compatible addition to the standard reader of any Scheme and Common Lisp implementation. Scheme specifically reserves {...} for future use (R5RS section 2.3, R6RS section 4.2.1). Common Lisp does not define {} (see section 2.4 of the Common Lisp HyperSpec, based on ANSI Common Lisp X3.226), but notes its potential use by users. BitC spec version 0.10 (June 17, 2006) section 2.4.3 also reserves {...}. It is trivial to implement this in Common Lisp.

It's important to note that inside the infix expression you can do anything you can do in normal Lisp. This is different from nearly all other Lisp infix systems, which have their own incompatible language inside that can't handle arbitrary s-expressions. You can use arbitrary s-expressions with quasi-quoting, unquote-splicing, or whatever inside, and all without "registering" anything.

Surprisingly, this simple mechanism is enough to do what people actually want in an infix mechanism for Lisp. You can add things, like {x + 1}, or compare values, like {x <= 5}.

This is an unusually simple mechanism, but like much of Lisp, its power comes from its simplicity.

Does this mean that there's more than one way to write a list?

Scheme output is not predictable and can change from version to version, whether or not curly-infix exists. That is true of both the Scheme spec and its implementations.

The Scheme specification R7RS draft 7 for "write" is "Writes a representation of obj to the given textual output port". Note that this is a representation, not the representation, as there are many possible representations without curly-infix. Similar text exists for R6RS (library section 8.2.12 on put-datum), and R5RS (section 6.6.3); it always says "a" not "the" and does not prescribe a particular representation.

Different Scheme implementations do write the same list differently, too. Let's run the trivial program (write (read)) and give the program the input ''x (x quoted twice). The scsh version 0.6.7 implementation reports ''x, while guile version 1.8.7 reports (quote (quote x)) - obviously different from scsh.

Since Scheme does not guarantee a particular format for a list - and permits implementations to use abbreviations when they choose - curly-infix represents no change in this matter.

The "sweeten" heuristic is to write infix form if there are 3-6 parameters, and the operator is punctuation, "and", or "or". It's a longer heuristic, but it means that users can see {a > b} instead of (> a b), and a lot of users prefer the infix representation.

Neoteric-Expressions (n-expressions)

Lisp’s standard notation is different from “normal” notation in that the parentheses precede the function name, rather than follow it. Others have commented that it'd be valuable to be able to say name(x) instead of (name x):

  • Jorgen ‘forcer’ Schaefer argues that this is a more serious problem than the lack of infix notation; in July 2000 he said “I think most people would like Scheme a lot better if they could say lambda (expression) ... instead of (lambda (expression) ...”
  • Peter Norvig had a reader implementation in which “if a function name ends with an open parentheses, move it inside the list (when converting to an s-expression)”. This means that “(fact x)” and “fact(x)” will mean the same thing.
  • Skill from Cadence, a proprietary Lisp-based extension language, also supports name-prefixing.

Basic neoteric forms

Neoteric-expressions build on curly-infix's use of {...}. If (...), {...}, or [ ... ] are prefixed with a symbol or list (i.e., have no whitespace between them), they have a new meaning in neoteric-expressions:

  1. Prefixed (...). Syntax of the form e(...), with no whitespace between e and the open parenthesis, is mapped to (e ...). Any parameters in "..." are space-separated. This produces another expression, so this can be repeated (left-to-right). This adds support for traditional function notation. For example, "cos(x)" maps to "(cos x)", "max(3 4)" maps to "(max 3 4)", and "f(x)(a b)" maps to "((f x) a b)". Note that this is especially convenient for certain styles of functional programming, including lambda expressions; in Scheme, lambda((x) {x + x})(4) would compute as 8.
  2. Prefixed {...}. A prefixed expression e{...} is an abbreviation for e({...}). This rule simplifies combining function calls and infix expressions when there is only one parameter to the function call. This is a common case; for example, "not" (which is normally given only one parameter) often encloses infix "and" and "or". Thus, f{n - 1} maps to (f (- n 1)). When there is more than one function parameter, use the normal prefixed (...) format instead, e.g., f({x - 1} {y - 1}) maps to (f (- x 1) (- y 1)).
  3. Prefixed [...]. Prefixed square brackets e[...] map to (bracketaccess e ...). Thus, "t[x]" maps to "(bracketaccess t x)". This is intended to simplify use of indexed arrays, associative arrays, and similar constructs. You could even define bracketaccess as a macro that simply returns its arguments; in this case f[5] would eventually map to (f 5).

These combine well with curly-infix forms of {...}. For example, {-(x) * y} maps to (* (- x) y).
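
To make the three prefixed forms concrete, here is a small Python sketch of a reader that applies them. It is an illustration under stated assumptions, not a reference implementation: Python lists stand in for s-expressions, every atom (including numbers) is kept as a string, quoting, strings, comments, and dotted pairs are not handled, an unprefixed [ ... ] is simply read as a list, and the names read_neoteric and curly are invented for this example.

```python
CLOSERS = {"(": ")", "{": "}", "[": "]"}


def curly(items):
    """Curly-infix rule: simple infix moves the operator first, else nfx."""
    ops = items[1::2]
    if len(items) >= 3 and len(items) % 2 == 1 and all(op == ops[0] for op in ops):
        return [ops[0]] + items[0::2]
    if len(items) == 1:
        return items[0]                      # {x} -> x
    if len(items) in (0, 2):
        return list(items)                   # assumption: {} -> (), {e f} -> (e f)
    return ["nfx"] + items


def read_neoteric(text):
    """Read one neoteric-expression from text and return it as nested lists."""
    pos = 0

    def skip_whitespace():
        nonlocal pos
        while pos < len(text) and text[pos].isspace():
            pos += 1

    def read_items(closer):
        nonlocal pos
        items = []
        while True:
            skip_whitespace()
            if pos >= len(text):
                raise SyntaxError("missing " + closer)
            if text[pos] == closer:
                pos += 1
                return items
            items.append(read_expr())

    def read_expr():
        nonlocal pos
        ch = text[pos]
        if ch in CLOSERS:                    # unprefixed (...), {...}, [...]
            pos += 1
            items = read_items(CLOSERS[ch])
            expr = curly(items) if ch == "{" else items
        else:                                # an atom: run of non-delimiters
            start = pos
            while pos < len(text) and not text[pos].isspace() and text[pos] not in "(){}[]":
                pos += 1
            if start == pos:
                raise SyntaxError("unexpected " + ch)
            expr = text[start:pos]
        # An opening bracket with no whitespace before it makes expr a prefix;
        # the loop lets this repeat left-to-right, as in f(x)(a b).
        while pos < len(text) and text[pos] in CLOSERS:
            opener = text[pos]
            pos += 1
            items = read_items(CLOSERS[opener])
            if opener == "(":
                expr = [expr] + items                    # e(...) -> (e ...)
            elif opener == "{":
                expr = [expr, curly(items)]              # e{...} -> e({...})
            else:
                expr = ["bracketaccess", expr] + items   # e[...]
        return expr

    skip_whitespace()
    return read_expr()


print(read_neoteric("f(x)(a b)"))     # [['f', 'x'], 'a', 'b']
print(read_neoteric("{-(x) * y}"))    # ['*', ['-', 'x'], 'y']
```

The only place whitespace matters is the final loop in read_expr: it looks at the very next character, so "f(x)" is a call while "f (x)" is the atom f followed by a separate list.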

Why the (. e) rule?

Neoteric-expressions require that (. x) must mean x.

It is critically important that expressions like read(. port) be supported so you can represent, in the obvious way, (read . port). If (. x) didn't mean x, then it would be easy to get this case wrong. What's more, implementing this would require special treatment. Also, if someone wanted to build on top of an existing reader, they would have to reimplement parts of the list-processing system if this wasn't handled.

This also provides a simple way to escape certain constructs, in particular, some special symbols in sweet-expressions. Sweet-expressions (to be discussed later) need an escape mechanism for characters and symbols like !, \\, and $, so that they can be directly represented. This means that (. $) just means the symbol $, even if $ by itself would normally have some other meaning when being processed by a higher-level reader (such as a sweet-expression reader).

It was far easier to define this escape mechanism as part of neoteric-expressions. This rule could have been defined as part of sweet-expressions, but that would create an implementation problem. We fully expect that implementers will work in stages; in particular, some may not want to build in indentation processing, but they might be willing to build a neoteric-expression system into their reader. If you built a sweet-expression reader on top of a neoteric-expression reader, but that reader didn't implement (. e), then you'd have to re-implement the whole reader underneath anyway. But if all neoteric-expression readers support (. e), then a sweet-expression reader is far easier to build on top.

It is already true that (. x) is x in guile, so there was already a working example that this is a reasonable extension. In fact, in a typical implementation of a list reader, it takes extra effort to prevent this extension, so this is a relatively easy extension to include.
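
The claim that a typical list reader gets (. x) "for free" can be seen in a minimal sketch. The Python below is only an illustration, not guile's code: input is a pre-split list of token strings, a pair is a Python tuple (car, cdr), None plays the role of the empty list, and the function names are invented for this example.

```python
def read_datum(tokens):
    """Read one datum from the front of tokens, consuming what it uses."""
    tok = tokens.pop(0)
    if tok == "(":
        return read_list_tail(tokens)
    return tok                          # atoms stay plain strings


def read_list_tail(tokens):
    """Read the rest of a list after its opening parenthesis."""
    if tokens[0] == ")":
        tokens.pop(0)
        return None                     # end of a proper list
    if tokens[0] == ".":
        tokens.pop(0)
        tail = read_datum(tokens)       # whatever follows the dot is the tail
        if tokens.pop(0) != ")":
            raise SyntaxError("expected ')' after the dotted tail")
        return tail
    head = read_datum(tokens)
    return (head, read_list_tail(tokens))


def read_prefixed(prefix, tokens):
    """e(...) maps to (e ...): cons the prefix onto the list that follows."""
    if tokens.pop(0) != "(":
        raise SyntaxError("expected '(' after the prefix")
    return (prefix, read_list_tail(tokens))


print(read_datum("( . x )".split()))                 # x
print(read_prefixed("read", "( . port )".split()))   # ('read', 'port')
```

The dot branch never checks whether any element came before the dot, so (. x) yields x without extra code; rejecting it would require an additional test. The same branch is what makes read(. port) come out as the pair (read . port) with no special treatment in the prefix rule.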

Comma-separated parameters

It would be possible to define neoteric-expressions to have comma-separated values in a function call; this would make it even more similar to traditional function call notation. If you simply threw out commas, this would interfere with ,-lifting - and this was quickly rejected.

A better rule, that would indeed work, would be to require each parameter to end with a comma, and then remove that ending comma. However, this rule:

  • would obscure any comma used for ,-lifting (making them hard to find).
  • is inconsistent with "normal" Lisp lists, which do not use commas this way, as well as being inconsistent with the simple space-separated parameters of sweet-expressions described below. This would make it harder to switch formats and possibly hamper adoption.
  • is completely unnecessary. Whitespace is quite sufficient (and clear) for syntactically separating parameters.
  • creates clutter. Parameters are very common, so requiring an extra character for every parameter, to write and read, appeared to be a poor approach.

Many other languages do use commas, but they are required in those languages because infix operators are not surrounded by any marker. Since infix operations are already surrounded by {...} in our notation, there is no need for the additional commas for parameter separation.

Experimentation found that separating parameters solely by whitespace worked well, so that approach was selected.

Neoteric alternatives

Originally the prefix had to be a symbol or list. The theory was that by ignoring others, the sweet-reader would be backwards-compatible with some badly-formatted code, and some errors might not result in incorrectly-interpreted expressions. But this was an odd limitation, and in some cases other prefixes made sense (e.g., for strings). This was changed to eliminate the inconsistency.

At one time it was required that unprefixed [ ... ] be the same as ( ... ), but some Lisps interpret unprefixed [ ... ] specially (e.g., Arc). Thus, it was decided that it'd be better to simply leave [ ... ] unchanged in interpretation. Note that in R6RS Scheme, [ ... ] does have the same meaning as ( ... ) when unprefixed, but this is a property of R6RS Scheme, not of neoteric-expressions.

Neoteric-expressions used to be called "modern-expressions". But some people didn't like that name, and the obvious abbreviation ("m-expression") was easily confused with the original Lisp M-expression language. So the name was changed to neoteric, which has a similar meaning and abbreviates nicely. It wasn't called "function-expressions" because "f-expressions" were already in use (and can sound bad if said quickly), and it wasn't called "prefix-expressions" because "p-expressions" sounds like "pee-expressions". It's not called "name-prefix" because the prefix need not be a name.

Comments on neoteric rules

The neoteric rules do introduce the risk of someone inserting a space between the function name and the opening “(”. But whitespace is already significant as a parameter separator, so this is consistent with how the system works anyway... this is not really a change at all.

Obviously, this is trivial to parse. We don’t lose any power, because this is completely optional -- we only use it when we want to, and we can switch back to the traditional s-expression notation if we want to. It’s trivially quoted... if you quote a symbol followed by “(”, just keep going until its matching “)” -- essentially the same rule as before!

Errors that have made people reject this in the past

The article "Improving lisp syntax is harder than it looks" discusses name-prefixing systems like this, but it makes a number of errors:

  • It first claims that this would be hard to integrate with macros, which isn't true. The author says, "Under the current syntax macros can treat the code as a series of nested lists, which makes it easy to write intuitive looking macro expansions, for example if a macro expands into '(display "text") it is pretty obvious what it does. Although in theory you could keep this macro system with a new Lisp syntax it would look strange, and basically force users of the language to know both the old syntax and the new syntax. Thus we would expect the macros to read in the function call under this kind of syntax as some new kind of object, with one operation returning the function symbol, and another operation returning the list of parameters. Thus a macro expansion would have to look something like this: build-call('display '("text"))." Absolutely false. There's no reason you have to have those kinds of "two object" semantics, and once you realize there are easier ways to handle it, integration with macros is trivial. And "having to know the old and new syntax" is no big deal; it's already true that people need to know that 'x and (quote x) mean the same thing. If the reader transforms a(x) into (a x), then when the macro has a chance to run all it sees is (a x) - exactly what it was expecting to see.
  • He continues: "The real motivation for leaving Lisp syntax as it is comes from macros. Not only does this expansion fail to visually look the same it also is much more complicated. One could try to get around this by altering the quasiquote operator, as in TwinLisp, so that the expansion becomes `display("text"), but then we have sacrificed the simplicity of the quasiquote, which no longer operates on lists. No matter how you solve the problem you end up in a bit of a bind." Not true. Quasiquote still works on lists, and display("text") is just an alternate way to express a list. After all, (display "text") is actually not the real representation of a list either; a more accurate representation would be (display . ("text" . ())).
  • He also says that "Another disadvantage to this change of syntax is that it makes functional programming much more odd looking. Let's say you have a list containing functions and you want to call the first one. In Scheme you write ((car lst) params) and in Common Lisp (funcall (car lst) params). However in our new syntax it looks like: car(lst)(params) and funcall(car(lst) (params)). Neither of these is very elegant, and it only gets worse if that call in turn returns a function, which would look like: car(lst)(params)(params2) and funcall(funcall(car(lst) (params)) (params2))." But I think this argument is backwards; I find this notation remarkably elegant, and better than the traditional notation. This way, to do functional programming, just cuddle up the parentheses. It's much easier to understand sequential parentheses compared to a deeply nested list. But again this misses the point; if you prefer to write ((car lst) params) then do so; these prefixed forms are simply convenient notations that you can use when you find them useful, just as you can always write (quote x) if you don't find it helpful to write 'x.

Technically, this is a change from some official Lisp s-expression notations and implementations. For example, “a(b)” in traditional Scheme or Common Lisp is read the same as “a (b)”: two separate expressions, the symbol a followed by the list (b), so evaluating them returns the value of a and then calls the function b. But it’s not clear that this is a big change in practice; commonly accepted style always separates parameters (including the first function call name) with whitespace. So normally, what follows a function call’s name is whitespace or “)”, and this is enforced by pretty-printers. Thus, many large existing Lisp programs could go through this kind of parsing without any change in meaning!

User impact of neoteric-expressions

With neoteric-expressions you can easily use the traditional Lisp read-eval-print loop (at the command line), e.g., as a calculator. Just remember to surround infix expressions with {...} and surround infix operators with whitespace. For example, "{3 + 4}" will be mapped to (+ 3 4), which when executed will produce "7". Use normal function notation for unary functions, e.g., "{-(x) / 2}" maps to "(/ (- x) 2)". Nest {...} when you need to, e.g., "{3 + {4 * 5}}" will map to "(+ 3 (* 4 5))".
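
The mappings above can be sketched in a few lines of Python. This is a hedged illustration rather than the real reader: the function names (tokenize, infix_to_prefix, read_seq, read_expr, to_sexpr) are invented for this example, atoms stay as plain strings, and only the simple case shown above is handled, namely operands separated by one repeated operator. Anything else raises an error here, although the actual notation defines what happens in those cases.

```python
import re

# A token is a bracket or a run of other non-space characters; leading
# whitespace is captured so that "-(x)" can be told apart from "- (x)".
TOKEN = re.compile(r"(\s*)([(){}]|[^\s(){}]+)")

def tokenize(text):
    """Return a list of (token, preceded_by_whitespace) pairs."""
    tokens, pos, text = [], 0, text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        tokens.append((m.group(2), m.group(1) != ""))
        pos = m.end()
    return tokens

def infix_to_prefix(items):
    """Map the contents of {...}: [a, op, b, op, c] becomes [op, a, b, c]."""
    operators = items[1::2]
    simple = len(items) >= 3 and len(items) % 2 == 1
    if simple and all(op == operators[0] for op in operators):
        return [operators[0]] + items[0::2]
    raise SyntaxError("this sketch only handles one repeated infix operator")

def read_seq(tokens, i, closer):
    """Read expressions up to the closing bracket; return (items, next_index)."""
    items = []
    while i < len(tokens) and tokens[i][0] != closer:
        item, i = read_expr(tokens, i)
        items.append(item)
    if i >= len(tokens):
        raise SyntaxError("missing " + closer)
    return items, i + 1

def read_expr(tokens, i):
    """Read one expression starting at tokens[i]; return (expr, next_index)."""
    tok, _ = tokens[i]
    if tok in (")", "}"):
        raise SyntaxError("unexpected " + tok)
    if tok == "(":
        expr, i = read_seq(tokens, i + 1, ")")
    elif tok == "{":
        items, i = read_seq(tokens, i + 1, "}")
        expr = infix_to_prefix(items)
    else:
        expr, i = tok, i + 1
    # Normal function notation: f(x) with no space before "(" becomes (f x).
    while i < len(tokens) and tokens[i] == ("(", False):
        args, i = read_seq(tokens, i + 1, ")")
        expr = [expr] + args
    return expr, i

def to_sexpr(expr):
    """Print nested lists in traditional s-expression notation."""
    if isinstance(expr, list):
        return "(" + " ".join(to_sexpr(e) for e in expr) + ")"
    return expr

for source in ["{3 + 4}", "{-(x) / 2}", "{3 + {4 * 5}}"]:
    expr, _ = read_expr(tokenize(source), 0)
    print(source, "maps to", to_sexpr(expr))
```

Note that the operators must be surrounded by whitespace for the tokenizer to see them as separate items, which is the same requirement stated above for the notation itself.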

Neoteric-expressions are also very compatible with most existing text editors for Lisp. Editors do not "understand" the code, but many work to match (...), {...}, and [ ... ], and that is enough to be useful. After all, Common Lisp readers are designed to allow { ... } to be overridden, so many text editors are designed to support this.

Sweet-expressions (t-expressions)

Sweet-expressions start with neoteric-expressions and make indentation meaningful.

Making indentation meaningful eliminates many parentheses, which makes programs more readable. Real Lisp programs are already indented, and tools (like editors and pretty-printers) are used to try to keep the indentation (used by humans) and the parentheses (used by the computer) in sync. When the indentation (which humans depend on) is what the computer actually uses, the two are automatically kept in sync, and many parentheses become unnecessary.

The page http://www.gregslepak.com/on-lisps-readability shows one of the many examples of endless closing parentheses and brackets to close an expression, and the confusion that happens when indentation does not match the parentheses. bhurt's response to that article is telling: "I'm always somewhat amazed by the claim that the parens 'just disappear', as if this is a good thing. Bugs live in the difference between the code in your head and the code on the screen - and having the parens in the wrong place causes bugs. And autoindenting isn't the answer- I don't want the indenting to follow the parens, I want the parens to follow the indenting. The indenting I can see, and can see is correct."

An IDE can help keep the indentation consistent with the parentheses, but needing IDEs is considered by some a language smell. If you need special tools to work around problems with the notation, then the notation itself is a problem.

A solution, of course, is to make the indentation actually matter: Now you don't need an endless march of parentheses, and indentation can't be confusing because it is actually used.
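
As a rough sketch of what "making the indentation matter" can mean, the Python below turns indented lines into nested lists. It is a simplified stand-in, not the sweet-expression reader: the names parse_indented and to_sexpr are invented for this example, each line is simply split on whitespace (so parentheses, braces, and strings inside a line are not supported), and the rule used is an assumption chosen for illustration: a line with several terms is a list, more deeply indented lines are appended to the line above as children, and a line with a single term and no children is just that term.

```python
def parse_indented(text):
    """Turn indented lines into nested lists (simplified, atoms only)."""
    lines = []
    for line in text.splitlines():
        if line.strip():
            indent = len(line) - len(line.lstrip(" "))
            lines.append((indent, line.split()))
    if not lines:
        return []

    def parse_block(i, indent):
        """Parse consecutive sibling lines at exactly this indentation."""
        nodes = []
        while i < len(lines) and lines[i][0] == indent:
            terms = list(lines[i][1])
            i += 1
            if i < len(lines) and lines[i][0] > indent:
                # Deeper lines are children of the current line.
                children, i = parse_block(i, lines[i][0])
                node = terms + children
            else:
                # A lone term with no children stands for itself.
                node = terms[0] if len(terms) == 1 else terms
            nodes.append(node)
        return nodes, i

    nodes, i = parse_block(0, lines[0][0])
    if i != len(lines):
        raise SyntaxError("inconsistent indentation")
    return nodes

def to_sexpr(expr):
    """Print nested lists in traditional s-expression notation."""
    if isinstance(expr, list):
        return "(" + " ".join(to_sexpr(e) for e in expr) + ")"
    return expr

source = """
when ready
  display greeting
  display name
"""
for form in parse_indented(source):
    print(to_sexpr(form))
```

With this rule the three indented lines read as (when ready (display greeting) (display name)). No closing parentheses are typed, and the structure cannot disagree with the layout, because the layout is the structure.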

"In praise of mandatory indentation..." notes that it can be helpful to have mandatory indentation:

