As promised, I'll describe my solution to the Pretty Printer Puzzle I proposed last week. To recap, we wish to pretty print a Lisp form to a string and identify the textual positions of arbitrary subforms therein.
First attempt

A couple of folks proposed a clever solution that goes like this: (1) replace the CAR of each subform with some unique token (a gensym should be close enough), (2) pretty-print that, (3) find the token positions and replace them with the original CARs.
Problem with this solution: changing the form affects pretty-printing. In particular, the pretty printer can no longer indent macros and special forms properly, because it picks their layout from the very CAR that was replaced.
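To see the effect for yourself, here is a minimal sketch that masks only the top-level CAR, which is a simplification of the proposed scheme. The names pretty-string, mask-car and *example-form* are made up for illustration, and the exact layouts depend on your implementation's pretty printer.

```lisp
;; Hypothetical helpers, not taken from the proposals themselves.
(defun pretty-string (form &key (right-margin 30))
  "Pretty-print FORM to a string, using a narrow margin to force line breaks."
  (write-to-string form :pretty t :right-margin right-margin))

(defun mask-car (form)
  "Return a list like FORM but with a fresh gensym as its CAR.
Only the top level is masked, and the tail is shared with FORM."
  (cons (gensym "TOKEN") (cdr form)))

(defparameter *example-form*
  '(let ((x 1) (y 2))
    (frobnicate x y)
    (frobnicate y x)))

;; Compare the two layouts at the REPL:
;;   (pretty-string *example-form*)
;;   (pretty-string (mask-car *example-form*))
```

With the LET gone, the printer has no reason to treat the second element as a binding list or the rest as a body, so the line breaks and indentation will typically differ.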
Second attempt

Another approach is to pretty-print, then read the form back and track positions, either by using a custom reader that keeps track of form positions (such as hu.dwim.reader) or by instrumenting the standard readtable: wrap the #\( reader macro and do the reading from a so-called form-tracking-stream.
Problem with this solution: it breaks down if the form contains unreadable objects.
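The failure is easy to reproduce without any position tracking at all. The sketch below uses a hypothetical helper, round-trips-p, and assumes that functions print in the usual #<...> syntax, which the standard reader rejects with a reader-error.

```lisp
;; Hypothetical helper for illustration only.
(defun round-trips-p (form)
  "True if FORM can be pretty-printed and read back without a reader error."
  (handler-case
      (progn
        (read-from-string (write-to-string form :pretty t))
        t)
    ;; Reading #< signals a READER-ERROR.
    (reader-error () nil)))

;; Try at the REPL:
;;   (round-trips-p '(foo (bar 1)))
;;   (round-trips-p (list 'foo #'car))
```

The first call should succeed, while the second should fail on any implementation that prints function objects unreadably.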
Third attempt, getting closer

The pretty printer is customisable through a pprint-dispatch-table, which is analogous to the reader's readtable. So we try to instrument it as in the previous approach: each time a list is about to be pretty-printed, we store the current position in the output stream.
Problem: we have been defeated by the pretty printer's intermediate buffer. Turns out the pretty printer only writes to the output stream at the very end of the process. Back to the drawing board.
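For the curious, the instrumentation looks roughly like this. It is a sketch rather than my original code: record-list-positions is a name invented here, and it assumes that file-position returns something useful for string output streams, which not every implementation guarantees.

```lisp
;; Hypothetical sketch of the instrumentation described above.
(defun record-list-positions (form)
  "Pretty-print FORM, noting the output position each time a list is printed.
Returns the printed string and a list of (subform . position) pairs."
  (let* ((original *print-pprint-dispatch*)
         (table (copy-pprint-dispatch))
         (positions '())
         (string
           (with-output-to-string (out)
             (set-pprint-dispatch
              'cons
              (lambda (stream object)
                ;; Ask the underlying stream OUT, not the pretty-printing STREAM.
                (push (cons object (file-position out)) positions)
                ;; Delegate the actual printing to the untouched table.
                (funcall (pprint-dispatch object original) stream object))
              1 table)
             (let ((*print-pprint-dispatch* table))
               (write form :stream out :pretty t)))))
    (values string (reverse positions))))

;; Try: (record-list-positions '(a (b c)))
```

Every list does get visited, but the recorded positions tend to be the same stale value, since nothing has reached the underlying stream yet when the nested lists are dispatched.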
Fourth and final attempt

But these attempts have not been in vain, and my final solution involves elements from all three. It goes like this:
1. Pretty print the form normally.
2. Pretty print the form again, this time instrumenting the pprint-dispatch-table to wrap lists with some token identifying the subform being printed. (I decided to use the Unicode range U+E000..U+F8FF, which is reserved for private use, which seemed neat.) This messes up the pretty-printing a little bit, but not too much, it turns out.
3. Cross-reference the token positions in #2 with #1 by taking advantage of the fact that these outputs differ by whitespace (and tokens) only!
And that's it!
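Here is a condensed sketch of the three steps, not the actual implementation. All names are invented for illustration. It assumes a Unicode-capable Lisp, it only marks where each list starts (no closing token), and it will run out of tokens after 6400 lists.

```lisp
(defparameter *token-base* #xE000
  "First code point of the private-use range used for tokens.")

(defun token-char-p (char)
  (<= #xE000 (char-code char) #xF8FF))

(defun whitespace-char-p (char)
  (member char '(#\Space #\Newline #\Tab)))

(defun print-with-tokens (form)
  "Pretty-print FORM with a token character in front of every list.
Returns the string and a vector mapping token index to subform."
  (let ((original *print-pprint-dispatch*)
        (table (copy-pprint-dispatch))
        (subforms (make-array 0 :adjustable t :fill-pointer 0)))
    (set-pprint-dispatch
     'cons
     (lambda (stream object)
       ;; VECTOR-PUSH-EXTEND returns the index at which OBJECT was stored.
       (let ((index (vector-push-extend object subforms)))
         (write-char (code-char (+ *token-base* index)) stream))
       ;; Let the original table do the real printing.
       (funcall (pprint-dispatch object original) stream object))
     1 table)
    (let ((*print-pprint-dispatch* table))
      (values (write-to-string form :pretty t) subforms))))

(defun subform-positions (form)
  "Return the normal pretty-printed string and a list of (subform . start)."
  (let ((plain (write-to-string form :pretty t)) ; step 1
        (i 0)
        (result '()))
    (flet ((skip-whitespace ()
             (loop while (whitespace-char-p (char plain i))
                   do (incf i))))
      (multiple-value-bind (marked subforms) (print-with-tokens form) ; step 2
        ;; Step 3: walk both strings in lockstep, ignoring whitespace.
        (loop for char across marked
              do (cond ((token-char-p char)
                        (skip-whitespace)
                        (push (cons (aref subforms
                                          (- (char-code char) *token-base*))
                                    i)
                              result))
                       ((whitespace-char-p char) nil)
                       (t
                        (skip-whitespace)
                        ;; The outputs must agree outside whitespace and tokens.
                        (assert (char= char (char plain i)))
                        (incf i))))))
    (values plain (nreverse result))))

;; Try: (subform-positions '(a (b c)))
```

Whitespace inside strings or escaped symbols is skipped on both sides alike, so it does not throw the alignment off.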
With this tool in hand, there are some interesting utilities that can be built in SLIME, but that's another blog post. :-)