352 Program Transformation
and traverses its first argument us twice. To transform it into a more efficient one-pass program, we instantiate h's definition twice, followed by some unfolding and finally folding, as follows.
4. h Nil vs ws         -> append (append Nil vs) ws           (1,2)
5. h (Cons u us) vs ws -> append (append (Cons u us) vs) ws   (1,3)
Unfolding the innermost append calls gives
6. h Nil vs ws         -> append vs ws                        (4,2)
7. h (Cons u us) vs ws -> append (Cons u (append us vs)) ws   (5,3)
and unfolding the outermost call in this gives
8. h (Cons u us) vs ws -> Cons u (append (append us vs) ws)   (7,3)
Finally, folding with the original definition of h gives
9. h (Cons u us) vs ws -> Cons u (h us vs ws)                 (8,1)
Selecting a minimal complete set gives the desired program, which traverses the first argument of h only once:
6. h Nil vs ws           -> append vs ws
9. h (Cons u us) vs ws   -> Cons u (h us vs ws)
2. append Nil ys         -> ys
3. append (Cons x xs) ys -> Cons x (append xs ys)
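The transformation can be checked concretely. Below is a sketch in Python (not the book's rule notation, and the function names are ours): h_orig is rule 1, traversing its first argument twice, while h_new is the derived one-pass program of rules 6 and 9. Both build the same list.

```python
def append(xs, ys):
    if not xs:                             # rule 2
        return list(ys)
    return [xs[0]] + append(xs[1:], ys)    # rule 3

def h_orig(us, vs, ws):                    # rule 1: traverses us twice
    return append(append(us, vs), ws)

def h_new(us, vs, ws):
    if not us:                             # rule 6
        return append(vs, ws)
    return [us[0]] + h_new(us[1:], vs, ws) # rule 9: traverses us once
```

Running both on the same inputs confirms they agree.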
First thoughts towards automation
The example above follows the cycle `define, instantiate, unfold, fold,' and some characteristics suggest how the process may be more fully automated. Suppose we think of the transformation process as beginning with a program and a `rule of interest' Linitial -> Rinitial, and that we separate the original program from the newly constructed rules.
In the example we began with a rule of interest and added new rules which were then processed in turn. It is natural to describe this by a set Pending, of rules that have been selected for attention, together with a set Out, of finished rules that have been added to the transformed program. In the example just seen, initially Pending = {h us vs ws -> append (append us vs) ws} and Out = { }.
Transformation processes the rules in Pending. Some lead to new definitions that are added to Pending for further processing, and others are added to Out at once, to appear in the final transformed program. We have not yet described the core problem: how new definitions are devised. Two special transformation algorithms, one for partial evaluation and one for deforestation, and each with its own definition strategy, will be seen in later sections. To our knowledge, however, no one has successfully automated the full fold/unfold paradigm.
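The Pending/Out bookkeeping just described can be sketched as a worklist loop. In this minimal Python sketch the two hard parts are deliberately left abstract as parameters (both names are ours): derive_new stands for the unautomated definition/instantiation step that may produce new rules, and finish for unfolding and folding a rule into its final form.

```python
def transform(initial_rule, derive_new, finish):
    pending = [initial_rule]   # rules selected for attention
    out = []                   # finished rules of the transformed program
    while pending:
        rule = pending.pop(0)
        pending.extend(derive_new(rule))  # new definitions, if any
        out.append(finish(rule))          # rule joins the final program
    return out
```

Note that nothing here guarantees termination: if derive_new keeps inventing definitions, the loop runs forever, which is one face of the automation problem mentioned above.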
Fibonacci
Applying the same approach to an exponential time program to compute the nth Fibonacci number yields a more dramatic speedup. For simplicity n is represented using constructors as 0+1+1...+1, writing the binary constructor + in infix notation. This example follows the cycle `define, instantiate, unfold, fold,' but a bit more inventively.
1. fib x   -> f x
2. f 0     -> 0
3. f 0+1   -> 0+1
4. f n+1+1 -> f(n+1) + f(n)
First introduce a new rule 5 by a so-called `eureka!' step, instantiate it twice, and then unfold some f calls. Here ( , ) is a pair constructor, written in `outfix' notation.
5. g x   -> (f(x+1), f(x))
6. g 0   -> (f(0+1), f(0)) -> (0+1, 0)
7. g x+1 -> (f(x+1+1), f(x+1))
8. g x+1 -> (f(x+1) + f(x), f(x+1))
Call f(x+1) occurs twice, suggesting a new definition to eliminate the common subexpression:
9. h (u,v) -> (u + v, u)
Folding with the definitions of h and then the original g gives:
10. g x+1 -> h(f(x+1), f(x))
11. g x+1 -> h(g(x))
The last transformation is to express fib in terms of g:
12. snd (u, v)         -> v
13. snd (f(x+1), f(x)) -> f(x)
14. snd (g(x))         -> f(x)
15. fib x              -> snd(g(x))
Selecting a minimal complete set gives the transformed program:
15. fib x   -> snd(g(x))
6.  g 0     -> (0+1, 0)
11. g x+1   -> h(g(x))
9.  h (u,v) -> (u + v, u)
A major improvement has occurred since the transformed program's running time has been reduced from exponential to linear, assuming either call-by-value or call-by-need evaluation.
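The speedup is easy to witness. Here is a Python sketch of the transformed program (rules 15, 6, 11, 9; naming ours), using machine integers in place of the 0+1+1... notation: g(n) computes the pair (fib(n+1), fib(n)), so fib makes one recursive call per step rather than two.

```python
def fib_slow(n):                 # rules 2-4: exponential time
    if n < 2:
        return n
    return fib_slow(n - 1) + fib_slow(n - 2)

def h(pair):                     # rule 9
    u, v = pair
    return (u + v, u)

def g(n):                        # rules 6 and 11: g(n) = (fib(n+1), fib(n))
    if n == 0:
        return (1, 0)
    return h(g(n - 1))

def fib(n):                      # rule 15: snd(g(n))
    return g(n)[1]
```

Comparing against the naive version confirms the two agree while fib runs in linear time.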
17.2.2 Correctness
For general optimization the rule of interest is just a call to the program's main function, as in the examples above. For partial evaluation we are interested in function specialization, so the rule of interest is a call giving values to some static inputs, e.g. f57 y z -> f 5 7 y z.
Correctness means that any call to Linitial in the original program gives exactly the same results in the transformed program. For some purposes this may be relaxed so a transformed program is acceptable if it terminates on at least all the inputs on which the original program terminates, and gives the same answers on them. It is less acceptable that a transformed program terminates less often than the original, and quite unacceptable if it gives different answers.
A full discussion of correctness is beyond the scope of this book; see Kott and Courcelle for formal treatments [154,63]. Intuitively, unfolding is correct since it follows execution as described in the rewriting rules, and definition and instantiation cannot go wrong. On the other hand, folding in effect `runs the program backwards', which can in some cases cause a loss of termination. To see this, consider rule f x -> x+1. Here term x+1 can be folded into the same rule's left side, yielding f x -> f x. This is a trivially complete and non-overlapping program, but never terminates.
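The folding pitfall is visible in any language. In this small sketch (names ours), f_good is the rule f x -> x+1; folding its right side with its own left side yields f_bad, which rewrites forever — in Python, it recurses until the stack limit is hit.

```python
def f_good(x):          # rule f x -> x+1
    return x + 1

def f_bad(x):           # result of the illegal self-fold: f x -> f x
    return f_bad(x)     # never terminates; do not call
```

Both functions are complete and non-overlapping as rule sets; only the folded one has lost termination.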
Unfolding can increase termination under call-by-value, an example being
f x   -> g x (h x)
g u v -> u+1
h y   -> h y
where unfolding the g call yields f x -> x+1. This terminates for all x, though the original program never terminates under call-by-value.
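Python is itself call-by-value, so it exhibits exactly this behaviour: in the sketch below (names ours), f_orig must evaluate h(x) before calling g and so never returns, while the unfolded f_unf terminates for all x.

```python
def h(y):
    while True:          # h y -> h y: loops forever
        pass

def g(u, v):
    return u + 1         # ignores its second argument

def f_orig(x):
    return g(x, h(x))    # diverges: h(x) is evaluated first

def f_unf(x):            # after unfolding the g call
    return x + 1
```

Under a lazy (call-by-need) semantics f_orig would terminate too, since g never demands its second argument; the discrepancy only arises under call-by-value.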
17.2.3 Steering program transformation
The rules of Figure 17.3 are highly non-deterministic, particularly with regard to making new definitions and instantiation. We saw that one must take care to avoid changing program termination properties. On the other hand substantial payoff can be realized, for instance improving the Fibonacci program from exponential to linear time.
In this section we describe a general goal-directed transformation method. It will first be specialized to yield a simple online partial evaluator, and will later give a way to remove intermediate data structures, e.g. as in `double append' of Section 17.2.1.
An approach to deterministic transformation
The method uses two sets of rules in addition to the original program: Out, the rules that so far have been added to the transformed program; and Pending, a set of rules awaiting further attention.
A partial evaluation algorithm
The algorithm is shown in Figure 17.4. It begins with an initial goal rule L0 -> R0, where R0 would typically be a function name applied to some constants and some ground terms. Attention points are leftmost innermost calls, and the rules added to Pending have form ... -> f P1 ... Pn, whose right sides are basic configurations: each Pi is either a variable or a ground term.
In a call f t1 ... tn, if argument ti contains no variables, then it will be reduced until its value ti' is known (it could be or contain a call). This paves the way for definition to create a specialized version of f, in case some ti's evaluate to ground terms. Instantiation is done to ensure completeness: that all possibly applicable rules will also be processed.
Main program

    Out := { };
    Pending := { Linitial -> Rinitial };
    while there is an unmarked rule L' -> R' ∈ Pending do
        Mark it;
        forall L -> R ∈ Program and substitutions θ, θ' such that θL = θ'R' do
            Add θ'L' -> T[[θR]] to Out;

Definition of T

    T[[x]]           = x
    T[[c t1 ... tn]] = c T[[t1]] ... T[[tn]]
    T[[f t1 ... tn]] =
        let u1 = T[[t1]], ..., un = T[[tn]] in
        let (θ', L' -> R') = Make_new(f u1 ... un) in
        if L' -> R' ∈ Pending then θ'L'
        else if there is a rule L -> R ∈ Program and θ with θL = f u1 ... un
            then T[[θR]]
        else Add L' -> R' to Pending; result is θ'L'

Definition of Make_new

    Make_new(f t1 ... tn) = (θ, g xi1 ... xim -> f s1 ... sn) where
        si = if ti is ground then ti else xi
        θ(f s1 ... sn) = f t1 ... tn
        g is a new function name, depending only on f and the ground si's
        {i1, ..., im} = {i | ti is non-ground}

Figure 17.4: Partial evaluation by fold/unfold.
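The name-generation part of Make_new can be sketched concretely. In this hypothetical miniature (representation ours), arguments are either ground integers or variable names given as strings; ground arguments are frozen into the specialized function's name, and the non-ground ones become its parameters, much as in the f57 example above.

```python
def make_new(f, args):
    # Split the call's arguments into ground (int) and non-ground (str).
    ground = [a for a in args if isinstance(a, int)]
    params = [a for a in args if not isinstance(a, int)]
    # The new name depends only on f and the ground arguments.
    new_name = f + "".join("_" + str(k) for k in ground)
    return new_name, params
```

A real Make_new would also return the substitution θ mapping the new function's pattern variables back to the call's arguments; this sketch keeps only the naming and parameter-selection logic.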
Explanation. In partial evaluation terms, we assume the initial call configuration contains some arguments which are ground terms and some variables. Variables
that have not been rewritten as a result of unfolding remain unchanged, and arguments of constructors are transformed.
Function T[[ ]] unfolds a call whenever the statically available information is sufficient to see which rule is to be applied. When dynamic information is needed, no left side will match at partial evaluation time, and so a new rule is created. The effect of Make_new is to create a new function, specialized to all those arguments of the current call which are ground terms. Thus new functions are created whenever a dynamic test is encountered, a strategy we have seen before.
In fold/unfold terms, Pending contains only definitions L' -> R' of new functions, either the initial one or ones devised in T[[ ]] when no left side matched the current configuration.
In the main program, the condition that L -> R ∈ Program and θL = θ'R' implies that rule θ'L' -> θR can be obtained by instantiation and unfolding.
The rules added to Out are obtained from θ'L' -> θR by applying function T[[ ]], which either unfolds further, or folds with a function whose definition has been placed in Pending.
Thus each step follows one of Burstall and Darlington's rules. It can be seen that Out will be a complete and non-overlapping set, provided the original program was.
Example
We specialize Ackermann's function ack m n to static values of m, writing 1, 2, ... instead of 0+1, 0+1+1, .... The initial rule is a2 x -> ack 2 x, and the original program is
ack 0 n     -> n+1
ack m+1 0   -> ack m 1
ack m+1 n+1 -> ack m (ack m+1 n)
Applying the algorithm above constructs in turn
Pending = { a2 x -> ack 2 x }            Initial
Out     = { a2 0 -> 3 }                  Match x to 0, evaluate completely
Pending = { a1 y -> ack 1 y, ... }       Match x to n+1, add fcn. a1
Out     = { a2 n+1 -> a1 (a2 n), ... }   Call the new function
Out     = { a1 0 -> 2, ... }             Match x to 0, evaluate completely
Out     = { a1 n+1 -> (a1 n)+1, ... }    By a1 n+1 -> ack 0 (ack 1 n)
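The residual program assembled in the trace above can be written out and checked against the original. This Python sketch uses machine integers in place of the 0+1+1... constructor notation:

```python
def ack(m, n):                   # the original Ackermann program
    if m == 0:
        return n + 1
    if n == 0:
        return ack(m - 1, 1)
    return ack(m - 1, ack(m, n - 1))

def a1(n):                       # residual: ack specialized to m = 1
    return 2 if n == 0 else a1(n - 1) + 1

def a2(n):                       # residual: ack specialized to m = 2
    return 3 if n == 0 else a1(a2(n - 1))
```

The specialized versions no longer test m at all; each call dispatches only on the dynamic argument.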
Final remarks. Thus at least one partial evaluation algorithm can be expressed using fold/unfold transformations. We hope the reader can see that the more complex offline algorithms of earlier chapters can also be seen as instances of Burstall and Darlington's approach.
This scheme is deterministic, and folding is done in a sufficiently disciplined way to avoid losing termination. On the other hand, it suffers from three weaknesses
seen in earlier partial evaluators:
It can increase termination, since some unfolding is done when the call-by-value semantics would not do it.
Partial evaluation can produce infinitely many residual rules, or can loop non-productively while computing on ground values.
Subcomputations can be repeated.
17.4 Supercompilation and deforestation
Supercompilation (in the sense of `supervised compilation') is a program transformation technique developed in the USSR in the 1970s by Valentin Turchin. It is of interest in this book for several reasons:
Historical: it appears that Turchin and, independently, Beckman et al. were the first to realize that the third Futamura projection would yield a generator of program generators (Futamura described only the first two) [263,92,19].
Philosophical: supercompilation is a concrete manifestation of the very general concept of `metasystem transition' [266,98].
Power: supercompilation can both do partial evaluation and remove intermediate data structures as seen in the `double append' example of Section 17.2.1.
On the other hand, supercompilation in its full generality has not yet been self-applied without hand annotations, partly due to problems concerning termination.
Turchin's work has only become appreciated in the West in recent years [267]. His earlier papers were in Russian, and expressed in terms of a functional language Refal not familiar to western readers. Many important ideas and insights go back to the 1970s, but that work has some loose ends, e.g. semantics is not always preserved exactly.
Wadler's `deforestation' [274], with a more solid semantic foundation, can be regarded as a limited application of supercompilation to a lazy language. In this section we shall present what in our conception is the essence of Turchin and Wadler's ideas, also using a lazy functional language.
An example
The earlier program using infinite lists with initial rule f n -> take n ones has a call which cannot immediately be unfolded. Instantiating n and unfolding ones yields rules that can be rewritten:
f Nil         -> take Nil ones
f (Cons x xs) -> take (Cons x xs) (Cons 1 ones)
Unfolding the take calls yields:
f Nil         -> Nil
f (Cons x xs) -> Cons 1 (take xs ones)
containing the configuration take xs ones, which has already been seen. This can be folded back into f to yield a simpler and faster program:
f Nil         -> Nil
f (Cons x xs) -> Cons 1 (f xs)
Informal explanation. Computations are simulated to allow for all constructors in f n for any n. To find the outermost constructor, it is necessary to find the outermost constructor of take n ones. The first take rule requires instantiating n to Nil, and the second requires instantiating n to Cons x xs plus a single unfolding of ones. Further unfolding of ones is not done, since this would not faithfully model lazy evaluation.
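The example can be rendered in Python by simulating laziness with a generator (names ours): ones is an infinite stream, take_l consumes its first argument only as a spine, as in the book's take, and f_new is the folded result that no longer builds the intermediate stream at all.

```python
def ones():                      # ones -> Cons 1 ones (infinite stream)
    while True:
        yield 1

def take_l(spine, ys):
    # Uses its first argument only as a spine: one element of ys
    # per constructor of spine, demanded lazily.
    ys = iter(ys)
    return [next(ys) for _ in spine]

def f_orig(n):                   # f n -> take n ones
    return take_l(n, ones())

def f_new(n):                    # after folding: intermediate list gone
    return [1 for _ in n]
```

Both terminate for any finite n precisely because ones is only unfolded on demand, mirroring the single unfolding per step in the derivation above.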
A syntactic restriction
The program transformation algorithm is easier to express if we assume there are only two function definition forms, either h-functions or g-functions. The `generic' letter f will stand for either form.
h x1 ... xn    -> Rh     One rule, no pattern matching
g P1 x1 ... xn -> Rg1    A rule set, with pattern matching
...                      on the first argument only
g Pm x1 ... xn -> Rgm
where each Pi = ci y1 ... ypi is a simple pattern, only one constructor deep. Augustsson describes how an arbitrary program may be put into this form without changing its semantics, even when overlapping rules are resolved by top-to-bottom rule application [15].
Note that append has g-form. The program using infinite lists needs a little rewriting to fit the two forms (where f and ones are functions of form h):
f n    -> take n ones
ones   -> Cons 1 ones

take Nil ys         -> Nil
take (Cons x xs) ys -> u ys xs
u (Cons y ys) xs    -> Cons y (take xs ys)
