Wednesday, May 31, 2006

How to Write an Unhygienic Macro - Introducing an Identifier Into the Lexical Context of a Macro Call

The standard syntax-rules macro system of R5RS Scheme is hygienic:
If a macro transformer inserts a binding for an identifier (variable or keyword), the identifier will in effect be renamed throughout its scope to avoid conflicts with other identifiers.
It is also referentially transparent:
If a macro transformer inserts a free reference to an identifier, the reference refers to the binding that was visible where the transformer was specified, regardless of any local bindings that may surround the use of the macro.
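Both properties can be observed with a small syntax-rules macro. The following is a minimal sketch; my-or, hygiene-result and transparency-result are hypothetical names chosen for illustration and are not part of the original post.

```scheme
;; my-or is a hypothetical two-argument `or`, used only for illustration.
(define-syntax my-or
  (syntax-rules ()
    ((_ a b)
     (let ((t a))        ; the macro inserts a binding for t
       (if t t b)))))    ; and free references to let and if

;; Hygiene: the user's t is not captured by the t inserted by my-or.
(define hygiene-result
  (let ((t 5))
    (my-or #f t)))       ; => 5

;; Referential transparency: rebinding if at the use site does not
;; change the meaning of the if inserted by my-or.
(define transparency-result
  (let ((if list))
    (my-or #f 2)))       ; => 2
```

Without hygiene the first expression would expand to (let ((t #f)) (if t t t)) and return #f instead of 5.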
These properties of the macro system make the majority of macros easy to write and understand. In some situations it is nevertheless convenient to break the rules. A classic example is the if-it macro.
Syntax: (if-it test consequent alternative)

Semantics: An if-it expression is evaluated as follows: first, test is evaluated and its result is bound to it. If the result is a true value, then consequent is evaluated and its value(s) is(are) returned. Otherwise alternative is evaluated and its value(s) is(are) returned. In consequent and alternative, references to it refer to the binding of it introduced by if-it.

Examples:
(if-it 1 it 'bomb) ; => 1
(let ((it 'bomb)) (if-it 1 it it)) ; => 1
It isn't too difficult to write such a macro with the syntax-case macro system. It is, however, a little tricky to ensure that the resulting if-it macro can be used by other macros without knowledge of how if-it is implemented - at least until one discovers the following easy-to-use technique:
(define-syntax (if-it stx)
  (syntax-case stx ()
    [(if-it e1 e2 e3)
     (with-syntax ([it (syntax-local-introduce
                        (syntax-local-get-shadower #'it))])
       #'(let ([it e1])
           (if it e2 e3)))]))
The solution uses two seldom-used functions, namely syntax-local-introduce and syntax-local-get-shadower. The latter is relatively unknown - a search reveals that it is used only four times in total in the PLT code base - so a quote from the documentation is in order.
(syntax-local-get-shadower identifier) returns identifier if no binding in the current expansion context shadows identifier, if identifier has no module context, and if the current expansion context is not a module. If a binding of inner-identifier shadows identifier, the result is the same as (syntax-local-get-shadower inner-identifier), except that it has the location and properties of identifier. Otherwise, the result is the same as identifier with its module context (if any) removed and the current module context (if any) added. Thus, the result is an identifier corresponding to the innermost shadowing of identifier in the current context if it is shadowed, and a module-contextless version of identifier otherwise.
In short, syntax-local-get-shadower allows the macro writer to break referential transparency. The call (syntax-local-get-shadower #'it) returns the identifier that the name it denotes at the site of the macro call (and not at the site of the macro definition).

Inserting the result of (syntax-local-get-shadower #'it) directly into the template of if-it won't work as expected, though. The macro system is hygienic by default, so the identifier will be subjected to renaming and will therefore not bind uses at the call site. Preventing the renaming is easy, though: just call syntax-local-introduce on the result.
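For comparison, a common way to introduce it is to build the identifier with datum->syntax-object, taking the lexical context from the macro keyword. The sketch below is not from the original post, and naive-if-it is a hypothetical name. It is assumed to run in the mzscheme language, like the other code here.

```scheme
;; naive-if-it is a hypothetical name; this is the conventional
;; datum->syntax-object approach, shown only for contrast.
(define-syntax (naive-if-it stx)
  (syntax-case stx ()
    [(k e1 e2 e3)
     ;; Give it the lexical context of the keyword k as it was
     ;; written in this particular use of the macro.
     (with-syntax ([it (datum->syntax-object #'k 'it)])
       #'(let ([it e1])
           (if it e2 e3)))]))

;; Direct use: the keyword is written by the user, so it is captured.
(naive-if-it 1 it 'bomb) ; => 1
```

The weakness is that it takes its context from wherever the keyword naive-if-it was written. When another macro expands into naive-if-it, the keyword comes from that macro's template, so the it introduced here is not the it written by that macro's user. This is the composition problem that the syntax-local-get-shadower version is meant to avoid.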

To test that if-it behaves properly when used in the definition of other macros, and that it also works with the module system, I used the following tests. The four tests all return #t.

(module mod-if-it mzscheme
  (provide if-it)
  (define-syntax (if-it stx)
    (syntax-case stx ()
      [(if-it e1 e2 e3)
       (with-syntax ([it (syntax-local-introduce
                          (syntax-local-get-shadower #'it))])
         #'(let ([it e1])
             (if it e2 e3)))])))

(module mod-cond-it mzscheme
  (require mod-if-it)
  (provide cond-it)
  (define-syntax (cond-it stx)
    (syntax-case stx (else)
      [(cond-it)
       #'(void)]
      [(cond-it [else e])
       #'e]
      [(cond-it [q1 a1] more ...)
       #'(if-it q1 a1 (cond-it more ...))])))

(require mod-cond-it)

(cond-it
  [#t it]
  [#t 'nope])

(let ((it 42))
  (cond-it
    [#t it]
    [#t 'nope]))

(let ((if-it 43))
  (let ((it 42))
    (cond-it
      [#t it]
      [#t 'nope])))

(require mod-if-it)

(equal? '((b c) (y z))
        (cond-it
          [(memq 'b '(a b c))
           (let ((it0 it))
             (if-it (memq 'y '(x y z))
                    (list it0 it)
                    'nope0))]
          [#t 'nope1]))
