Pretty printer puzzle

This past week, I came across a Lisp challenge that turned out to be trickier than one might expect at first. I needed to pretty-print a Lisp form to a string and identify the positions of certain subforms. Here's an example:

CL-USER> (with-output-to-string (*standard-output*)
           (pprint '(defun factorial (x) (if (zerop x) 1 (* x (factorial (1- x)))))))
      (* X (FACTORIAL (1- X)))))"
In this, output the bounding indices of the IF form are 24 and 77. In other words:
CL-USER> (subseq * 24 77)
      (* X (FACTORIAL (1- X))))"
The challenge, then, is to write a function that, given a form and list of subforms, returns a string with the pretty-printed output and a list of bounding indices for each subform. E.g.
CL-USER> (pprinted-bounds '(defun factorial (x) #1=(if (zerop x) 1 (* x #2=(factorial (1- x)))))
                          (list '#1# '#2#))
      (* X (FACTORIAL (1- X)))))"
((24 77) (57 75))
I'll post my solution later. Have fun. :-)

2 komentoj:

Anonymous said...

Does your solution handle things like "(" in strings and symbols, too?

Eg. for the input

(defun |())())| (one &key #1=(two 1))
"comment string ()()((("
(if (plusp two)
(|())())| (1- one) (1- two))
(print #2=(+ one two))))

do you get correct output?

The "cleaner" approach of counting parenthesis would need to have syntactic knowledge (strings, comments, symbols, etc.); the other approach I can think of would be to change a symbol in each sought list, and to find the change in the resulting string output - this actually sounds easier.

Luís said...

My solution does not get confused by non-syntactic ('s no. I rely on the pretty printer dispatch table to know what is what.

Your second solution seems close to what I've got, but why change a symbol? What if the form doesn't have symbols? (Also, changing the symbol will probably screw the pretty printing.)