Here's a duplication of the comment I made on the site. I speak as if I'm speaking directly to the author.
---
> it’s possible to read it without parsing.
Yes, it's possible to perform lexical analysis without parsing too.
> the front end of every interpreter and compiler looks pretty much the same: [diagram]
You forgot the step where lexical analysis occurs. This is akin to what Lisp's READ does, except the output isn't a stream of tokens; it's a tree of tokens. So, actually, even superficially, Lisp and $OTHER_LANGUAGE front-ends are not all that different.
> it knows nothing about the contents of the expression. All it knows is the macro definition of for.
This is because the `for`-statement is akin to a special form in Lisp. When Scheme encounters IF, and assuming it doesn't get transformed into a COND, then what does it do? It does not pass the arguments to some IF macro. IF is fundamental.
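To make the special-form point concrete, here's a toy sketch of the relevant bit of an evaluator. The node shapes, strings-as-atoms, and the evaluator itself are my own illustration, not how any real Scheme is implemented:

```javascript
// Sketch of an evaluator's core dispatch: IF is handled structurally,
// inside the evaluator itself, never dispatched to a user-defined macro.
function evaluate(node, env) {
  // Atoms: numbers evaluate to themselves, symbols look up in the environment.
  if (!Array.isArray(node)) return isNaN(node) ? env[node] : Number(node);
  const [head, ...rest] = node;
  if (head === "IF") { // special form: fundamental, built into the evaluator
    const [test, then, alt] = rest;
    return evaluate(test, env) ? evaluate(then, env) : evaluate(alt, env);
  }
  // ...ordinary function calls and macro expansion would go here...
}

evaluate(["IF", "1", "42", "0"], {}); // → 42
```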
> What should the expander do when it sees: quux (mumble, flarg) [1, 2, 3] { foo: 3 } grunch /wibble/i
It expands it. The expander should not see a stream of characters, as you show in that example, it should see a tree. The expander does not care what the original stream of characters was, nor should it.
READ isn't the point. READ is a way of turning a stream of characters into a data structure. The only time an expander cares about the stream is when it's a reader macro, like in Common Lisp. Otherwise, once the stream has been converted into some tree-like structure, syntax is gone. It just so happens we can convert the tree back into syntax in a standard way (most of the time, at least).
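To show how little machinery "a way of turning a stream of characters into a data structure" actually needs, here's a minimal reader sketch. This is my own toy, not any real Lisp's READ; it handles only atoms and parenthesized lists:

```javascript
// Minimal READ-like reader: characters in, tree out.
function read(src) {
  // Tokenize: parens are their own tokens, everything else between
  // whitespace/parens is an atom.
  const tokens = src.match(/[()]|[^\s()]+/g) ?? [];
  let pos = 0;
  function readForm() {
    const tok = tokens[pos++];
    if (tok === "(") {
      const list = [];
      while (tokens[pos] !== ")") {
        if (pos >= tokens.length) throw new Error("unbalanced parens");
        list.push(readForm());
      }
      pos++; // consume ")"
      return list;
    }
    if (tok === ")") throw new Error("unexpected )");
    return tok; // atom
  }
  return readForm();
}

// Once read, syntax is gone: only the tree remains.
const tree = read("(DOTIMES (X 5) (SETQ A (+ A 5)))");
// tree = ["DOTIMES", ["X", "5"], ["SETQ", "A", ["+", "A", "5"]]]
```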
This is what makes Lisp/Scheme different from C. Macros in C operate on streams of characters, and do string insertion and deletion. This isn't true with Lisp, which does node insertion/deletion/manipulation on a tree. In other words, Lisp makes changes to structure/semantics, not syntax.
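The "node manipulation, not string manipulation" distinction can be sketched directly. Here's a toy expander that rewrites DOTIMES nodes by pure tree surgery; the node shapes and the hypothetical LOOP target form are my own illustration, not any real macro system:

```javascript
// Sketch: a macro expander that walks a tree, not a string.
function expand(node) {
  if (!Array.isArray(node)) return node; // atoms pass through untouched
  if (node[0] === "DOTIMES") {
    const [, [v, n], ...body] = node;
    // Rewrite into a hypothetical lower-level LOOP form: pure node surgery,
    // no character-level insertion or deletion anywhere.
    return ["LOOP", v, "0", n, ...body.map(expand)];
  }
  return node.map(expand); // recurse into ordinary forms
}

expand(["DOTIMES", ["X", "5"], ["SETQ", "A", ["+", "A", "5"]]]);
// → ["LOOP", "X", "0", "5", ["SETQ", "A", ["+", "A", "5"]]]
```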
> If you’ve ever wondered why Lisp weirdos are so inexplicably attached to their parentheses, this is what it’s all about. Parentheses make it unambiguous for the expander to understand what the arguments to a macro are, because it’s always clear where the arguments begin and end.
No, it's not that Lisp weirdos are attached to parentheses. They're attached to regularity. As said above, the expander doesn't give a single flying fsck about parentheses. An expander could expand DOTIMES[{X, 5}, SETQ[A, A+5]] just as well as it could (DOTIMES (X 5) (SETQ A (+ A 5))). In fact, Lisp started with the first kind of syntax (M-expressions, to a degree), and it worked fine. See the book Anatomy of Lisp, which is written entirely using this syntax.
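To back up the regularity claim, here's a toy reader for the bracket syntax that produces the same kind of tree as a parenthesized reader would. It's my own sketch: it only handles head[arg, arg] and {a, b} forms, and leaves infix like A+5 as a single atom for brevity:

```javascript
// Sketch: a reader for bracket/brace syntax. Different characters, same trees.
function readBrackets(src) {
  // Brackets, braces, and commas are their own tokens; the rest are atoms.
  const tokens = src.match(/[\[\]{},]|[^\s\[\]{},]+/g) ?? [];
  let pos = 0;
  function form() {
    const tok = tokens[pos++];
    if (tok === "{") { // {a, b} is a headless list
      const items = [];
      while (tokens[pos] !== "}") {
        items.push(form());
        if (tokens[pos] === ",") pos++;
      }
      pos++; // consume "}"
      return items;
    }
    if (tokens[pos] === "[") { // head[a, b] is a list headed by the atom
      pos++;
      const items = [tok];
      while (tokens[pos] !== "]") {
        items.push(form());
        if (tokens[pos] === ",") pos++;
      }
      pos++; // consume "]"
      return items;
    }
    return tok; // bare atom
  }
  return form();
}

readBrackets("DOTIMES[{X, 5}, SETQ[A, A+5]]");
// → ["DOTIMES", ["X", "5"], ["SETQ", "A", "A+5"]]
```

The expander downstream never knows which surface syntax produced the tree.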
The field of lexical analysis and parsing is well developed. Lexers and parsers can be written so that nothing is ambiguous to the computer. Maybe ambiguous to the human, but not to the computer.
In conclusion, I do think homoiconicity is the point: that the language's representation as a data structure is accessible and transformable. The stream of characters used to represent this structure does not matter at all.
> It expands it. The expander should not see a stream of characters, as you show in that example, it should see a tree.
This is the point: what tree? How many arguments does `quux` have? The answer isn't obvious, because `quux` isn't surrounded by any pair of delimiters. The net result: there isn't a clear way to write a JavaScript "lexer" that produces trees---i.e., a reader. JavaScript's syntax was not designed with this in mind, so there isn't a clear tree structure inherent in the syntax.
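The point can be seen in JavaScript itself: nothing in the fragment marks where quux's "arguments" end, and the parser simply commits to one reading. The definitions below are my own illustrative stand-ins for the post's names:

```javascript
// JS parses "quux (mumble, flarg) [1, 2, 3]" as a call followed by a
// computed property access, where [1, 2, 3] is the comma operator → 3.
const quux = (a, b) => ({ 3: "reached index 3" });
const mumble = 1, flarg = 2;

const result = quux (mumble, flarg) [1, 2, 3]; // = quux(mumble, flarg)[3]
// result === "reached index 3"
```

A reader would have to pick some such grouping without parsing, and the syntax gives it nothing to go on.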
(And even if you solve that problem, there are aspects of the JS syntax that depend on knowing the parsing context, such as whether
grunch /wibble/i
lexes as an identifier followed by a regexp literal, or as a chain of divisions (identifier / identifier / identifier). So the number of sub-nodes is not even clear until you expand enough to know what context that fragment appears in.)
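You can watch JavaScript make both choices with the same characters; the variable bindings here are my own illustrative setup:

```javascript
// The same characters "/wibble/i" lex two different ways depending on
// what precedes them.
const wibble = 2, i = 5;
const grunch = 100;

// After an expression, "/" is division: grunch / wibble / i
const asDivision = grunch /wibble/i; // → 10

// After an operator or keyword, "/" starts a regexp literal instead:
const asRegexp = typeof /wibble/i;   // → "object"
```

So the lexer's output depends on the parse context, which is exactly what a context-free reader cannot know.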
These are all issues you'd have to sort out to figure out how to make JS amenable to macros. But what it amounts to is figuring out how to design a reader for JS. Or at least, if you wanted to design a Lisp-like macro system for JS, you would.
The point is, the fact that Lisp macros operate on trees (though syntactic trees, not "semantic" trees as you claim---that's a debate for another day) is directly enabled by being able to generate a tree rather than a stream of tokens from the reader. And that's only possible because the syntax was designed to support that. There's of course not just a single unique syntax that has this property, but Lispy languages have syntaxes that are designed to support a reader, and that's at the heart of what makes the Lisp approach to macros work.