Kvardek Du: 2015-05

As promised, I'll describe my solution to the Pretty Printer Puzzle I proposed last week. To recap, we wish to pretty print a Lisp form to a string and identify the textual positions of arbitrary subforms therein.

First attempt

A couple of folks proposed a clever solution that goes like this: (1) replace the CAR of each subform with some unique token (a gensym should be close enough), (2) pretty-print that, (3) find the token positions and replace them with the original CARs.

Problem with this solution: changing the form affects pretty-printing. In particular, it will no longer be able to properly indent macros and special forms.

Second attempt

Another approach is to pretty-print then read the form back and track positions by either using a custom reader that keeps track of form positions (such as hu.dwim.reader) or instrumenting the standard readtable by wrapping the #\( reader-macro and doing the reading from a so-called form-tracking-stream.

Problem with this solution: it breaks down if the form contains unreadable objects.

Third attempt, getting closer

The pretty printer is customisable through a pprint-disptach-table. It is analogous to the reader's readtable. So, we try and instrument it like in the previous approach. Each time a list is about to be pretty-printed, we store the current position in the output stream.

Problem: we have been defeated by the pretty printer's intermediate buffer. Turns out the pretty printer only writes to the output stream at the very end of the process. Back to the drawing board.

Fourth and final attempt

But these attempts have not been in vain, and my final solution involves elements from all three. It goes like this:

Pretty print the form normally.
Pretty print the form again, this time instrumenting the pprint-dispatch-table to wrap lists with some token identifying the subform being printed. (I decided to use the unicode range U+E000..U+F8FF which is reserved for private-use, which seemed neat.) This messes up the pretty-printing a little bit, but not too much, it turns out.
Cross-reference the token positions in #2 with #1 by taking advantage of the fact these outputs differ by whitespace (and tokens) only!

And that's it!

With this tool in hand, there are some interesting utilities that can be built in SLIME, but that's another blog post. :-)

This past week, I came across a Lisp challenge that turned out to be trickier than one might expect at first. I needed to pretty-print a Lisp form to a string and identify the positions of certain subforms. Here's an example:

CL-USER> (with-output-to-string (*standard-output*)
           (pprint '(defun factorial (x) (if (zerop x) 1 (* x (factorial (1- x)))))))
"
(DEFUN FACTORIAL (X)
  (IF (ZEROP X)
      1
      (* X (FACTORIAL (1- X)))))"

In this, output the bounding indices of the IF form are 24 and 77. In other words:

CL-USER> (subseq * 24 77)
"(IF (ZEROP X)
      1
      (* X (FACTORIAL (1- X))))"

The challenge, then, is to write a function that, given a form and list of subforms, returns a string with the pretty-printed output and a list of bounding indices for each subform. E.g.

CL-USER> (pprinted-bounds '(defun factorial (x) #1=(if (zerop x) 1 (* x #2=(factorial (1- x)))))
                          (list '#1# '#2#))
"
(DEFUN FACTORIAL (X)
  (IF (ZEROP X)
      1
      (* X (FACTORIAL (1- X)))))"
((24 77) (57 75))

I'll post my solution later. Have fun. :-)

Kvardek Du

2015-05-30

My answer to the Pretty Printer Puzzle

First attempt

Second attempt

Third attempt, getting closer

Fourth and final attempt

2015-05-24

Pretty printer puzzle

Arkivoj

Kategorioj

Ligoj