
transforms the static store vs into vs', then use:

    control component of pp      successors(pp, vs)
    ---------------------------  ----------------------------------------------------
    return                       {}
    goto pp'                     {(pp', vs')}
    if exp goto pp' else pp''    {(pp', vs')}                if exp evaluates to true
                                 {(pp'', vs')}               if exp evaluates to false
                                 {(pp', vs'), (pp'', vs')}   if exp is dynamic
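As a concrete reading of the table, the following Python sketch computes the successor set. The tagged-tuple representation of a block's control part and the static_value helper are assumptions made for this illustration, not the book's code.

    def successors(control, vs_after, static_value):
        """control is the control part of the block at pp, one of
        ('return', exp), ('goto', pp1) or ('if', exp, pp1, pp2).
        vs_after is the static store vs' after the block's assignments,
        as a hashable value (e.g. a tuple of (variable, value) pairs).
        static_value(exp) returns the test's value if exp is static,
        or None if exp is dynamic.  (Illustrative representation only.)"""
        tag = control[0]
        if tag == 'return':                    # return: no successors
            return set()
        if tag == 'goto':                      # goto pp': one successor
            return {(control[1], vs_after)}
        if tag == 'if':                        # if exp goto pp' else pp''
            _, exp, pp1, pp2 = control
            v = static_value(exp)
            if v is None:                      # dynamic test: both branches
                return {(pp1, vs_after), (pp2, vs_after)}
            return {(pp1, vs_after)} if v else {(pp2, vs_after)}
        raise ValueError(f'unknown control form: {control!r}')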

4.4.4 Transition compression

When the subject program in Figure 4.2 is slavishly specialized using the algorithm in Figure 4.6, the following residual program is produced:

(search, (z, (x y z))): goto (cont, (z, (x y z)));
(cont, (z, (x y z))):   valuelist := tl(valuelist);
                        goto (search, (z, (y z)));
(search, (z, (y z))):   goto (cont, (z, (y z)));
(cont, (z, (y z))):     valuelist := tl(valuelist);
                        goto (search, (z, (z)));
(search, (z, (z))):     goto (found, (z, (z)));
(found, (z, (z))):      value := hd(valuelist);

Though correct, this result is not very pleasing. We therefore apply the technique called transition compression to eliminate the redundant gotos.

Definition 4.5 Let pp be a label occurring in program p, and consider a jump goto pp. The replacement of this goto by a copy of the basic block labelled pp is called transition compression. □

When we compress the above program to remove superfluous jumps, we obtain the natural residual program, except that the composite label (search, (z, (x y z))) should be replaced by a simple one (a number or an identifier):

(search, (z, (x y z))): valuelist := tl(valuelist);
                        valuelist := tl(valuelist);
                        value := hd(valuelist);

The benefits of the compression are evident: the code becomes neater and more efficient. However, indiscriminate use of transition compression offers two pitfalls: code duplication and infinite compression. Code duplication can occur when two distinct transitions to the same program point are both compressed. When the residual program contains a loop, the compression can even be continued infinitely.


When should transition compression be done?

Transition compression can be performed as a separate phase after the whole residual program has been generated, making it easy to avoid the above-mentioned problems. A program flow chart can then be built and analysed to see which transitions can safely be compressed. It is, however, desirable in practice to do the compressions along with the code generation since this may be more efficient than generating a whole program containing superfluous gotos just to compress many of the gotos afterwards. Doing compression on the fly makes it more complicated to ensure safe compression. One solution is to annotate some gotos as `residual', and let mix compress transitions from all the remaining ones. (We elaborate on this approach in Chapter 5.)
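To make the separate-phase approach concrete, here is a minimal Python sketch of such a compression pass. The representation (a dict from label to (assignments, control)) and the safety criterion (only splice a goto whose target has exactly one incoming reference) are illustrative choices; together they avoid both code duplication and infinite compression around loops.

    from collections import Counter

    def compress(blocks):
        """One-shot transition compression over a finished residual program.
        blocks maps each label to (assignments, control); control is
        ('goto', l), ('if', exp, l1, l2) or ('return', exp).
        (Representation is an assumption for illustration.)"""
        refs = Counter()
        for _, control in blocks.values():
            if control[0] == 'goto':
                refs[control[1]] += 1
            elif control[0] == 'if':
                refs[control[2]] += 1
                refs[control[3]] += 1
        out = {}
        for label, (assigns, control) in blocks.items():
            assigns, seen = list(assigns), {label}
            # Splice in a goto target only if this is its sole reference
            # (no duplication) and we have not met it before (no cycles).
            while (control[0] == 'goto' and refs[control[1]] == 1
                   and control[1] not in seen):
                seen.add(control[1])
                more, control = blocks[control[1]]
                assigns.extend(more)
            out[label] = (assigns, control)
        # Spliced-in blocks are now unreachable; a follow-up reachability
        # pass could drop them.
        return out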

In this chapter we use a simpler strategy which does not involve annotating gotos. We compress all transitions that are not part of a residual conditional. Note that the language does not permit more extensive compression than directed by our strategy, since the branches of an if-statement may only contain jumps and not any other commands. The strategy causes some code duplication, but experience indicates that it is a minor problem. More important is that the compression strategy will not cause the program specializer to loop infinitely unless the subject program (with the given initial static data) contains a potential `bomb', that is, a sequence of instructions that will certainly loop infinitely whenever executed, no matter what (dynamic) input data is supplied.

Doing transition compression and code generation at the same time improves the results of self-application significantly. The explanation of this phenomenon is a little subtle, and since the issue is not, for now, important, we postpone the treatment to a later section.

4.4.5 Choosing the right division is tricky

The task of classifying variables as static or dynamic is more difficult than it might appear at first sight. A natural strategy would be to denote as static all variables that are assigned values computed from constants and static input data. As the following program fragment demonstrates, this strategy might cause the program specializer to loop infinitely.

iterate: if Y ≠ 0 then begin
             X := X + 1; Y := Y - 1; goto iterate;
         end;

This seemingly innocent program has two variables, X and Y. If the initial value of X is known to be 0 and Y is unknown, it seems natural to classify X as static and Y as dynamic. The reason is that the only value assigned to X is X + 1, which can be computed at program specialization time since X is known. But this does not work, as the following shows.


As is usually done in practice we intermingle the computation of poly and the code generation. Initially poly contains the specialized program point (iterate, 0). The code generation yields

(iterate, 0): if Y ≠ 0 then begin
                  Y := Y - 1; goto (iterate, 1);
              end;

We see that (iterate, 1) should be added to poly, hence we should add to the residual program

(iterate, 1): if Y ≠ 0 then begin
                  Y := Y - 1; goto (iterate, 2);
              end;

and so forth ad nauseam. The problem is that poly becomes infinite:

poly = {(iterate, 0), (iterate, 1), (iterate, 2), ...}

This happens because the value of X, though known, is unbounded since its values are computed under dynamic control. The problem did not arise when we specialized the example programs of this chapter (an interpreter and mix itself). The problem is handled by classifying the unbounded variable(s) as dynamic. A division which ensures finiteness of poly is itself called finite.

In this example, X should be classified as dynamic to prevent the program specializer from making use of its value, disregarding that it could have been computed at partial evaluation time. The process of classifying a variable X as dynamic, when congruence would have allowed X to be static, is called generalization. As just witnessed, generalization can be necessary to avoid non-terminating partial evaluation. Another purpose of generalization is to avoid useless specialization (see Section 4.9.2).

Classifying a sufficient number of the variables as dynamic to ensure finiteness of poly, while never classifying an unnecessarily large number, is not computable. We treat the problem, and how to find an acceptable approximate solution, in Chapter 14.

4.4.6 Simple binding-time analysis

By assuming that the same division is applicable for each program point and ignoring the problem of ensuring finiteness, it is easy to compute the division of all program variables given a division of the input variables. This process is called binding-time analysis, often abbreviated BTA, since it determines at what time the value of a variable can be computed, that is, the time when the value can be bound to the variable.

Call the program variables X1, X2, ..., XN and assume that the input variables are X1, ..., Xn, where 0 ≤ n ≤ N. Assume that we are given the binding times b1, ..., bn for the input variables, where each bj is either S (for static) or D (for dynamic). The task is now to compute a congruent division (Section 4.4.1) B = (b1, ..., bN) for all the program variables, which must map to D every input variable whose given binding time is D. This is done by the following algorithm:

1. Construct the initial division B = (b1, ..., bn, S, ..., S), extending the given input binding times with S for every non-input variable.

2. If the program contains an assignment Xk := exp, where some variable Xj appears in exp and bj = D in B, then set bk = D in B.

3. Repeat step 2 until B does not change any longer. Then the algorithm terminates with congruent division B.
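A direct Python transcription of these three steps might look as follows; representing each assignment X := exp as a (target, variables-read-in-exp) pair is an assumption made for illustration.

    def bta(input_bindings, all_vars, assignments):
        """Simple binding-time analysis.
        input_bindings: {input variable: 'S' or 'D'}.
        all_vars: every program variable (non-inputs start as 'S').
        assignments: [(target, vars_read), ...], one per X := exp.
        Returns a congruent division for all program variables."""
        division = {x: 'S' for x in all_vars}     # step 1: initial division
        division.update(input_bindings)
        changed = True
        while changed:                            # step 3: iterate until stable
            changed = False
            for target, vars_read in assignments:
                if division[target] == 'S' and \
                   any(division[x] == 'D' for x in vars_read):
                    division[target] = 'D'        # step 2: propagate D
                    changed = True
        return division

    # On the iterate example of Section 4.4.5 (X := X + 1 reads only X):
    # bta({'Y': 'D'}, ['X', 'Y'], [('X', ['X']), ('Y', ['Y'])])
    # yields {'X': 'S', 'Y': 'D'} -- congruent, yet poly is still infinite.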

4.4.7 Online and offline partial evaluation

Above we have described partial evaluation as a process which has two (or more) stages. First compute a division B from the program and the initial division of the inputs, without making use of the concrete values of the static input variables. Then the actual program specialization takes place, making use of the static inputs to the extent determined by the division, not by the concrete values computed during specialization. This approach is called offline partial evaluation, as opposed to online partial evaluation.

A partial evaluator makes (at least) two kinds of decisions: which available values should be used for specialization and which transitions should be compressed. Each decision is made according to a strategy employed by the partial evaluator.

Definition 4.6 A strategy is said to be online if the concrete values computed during program specialization can affect the choice of action taken. Otherwise the strategy is offline. □

Almost all offline strategies, including those to be presented in this book, base their decisions on the results of a preprocess, the binding-time analysis.

Many partial evaluators mix online and offline methods, since both kinds of strategies have their advantages. The main advantage of online partial evaluation is that it can sometimes exploit more static information during specialization than offline partial evaluation can, thus yielding better residual programs. Offline techniques make generation of compilers, etc., by self-application feasible, and yield faster systems using a simpler specialization algorithm.
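As a schematic contrast, consider how each kind of specializer might decide whether to reduce an expression or residualize it (a sketch only; vars_of and the representations are invented for illustration):

    def decide_offline(exp, division, vars_of):
        # Consult only the binding-time annotations computed by BTA before
        # specialization began; concrete static values are never inspected.
        return ('reduce' if all(division[x] == 'S' for x in vars_of(exp))
                else 'residualize')

    def decide_online(exp, vs, vars_of):
        # Consult the concrete store built up during specialization: any
        # variable that happens to have a known value counts as available.
        return ('reduce' if all(x in vs for x in vars_of(exp))
                else 'residualize')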

Chapter 7 contains a more detailed comparison of online and o ine partial evaluation.


4.4.8 Compiling without a compiler

We now return to compilation by specializing the Turing interpreter from Figure 4.4. The first task is to determine a division of the interpreter's variables, given that the program to be interpreted (Q) is static while its initial input tape (Right) is dynamic. It is fairly easy to see that the following variables

Q, Qtail, Instruction, Operator, Symbol and Nextlabel

may be static (S) whereas Right and Left must be dynamic (D) in the division. This information is given to the program specializer along with the interpreter text. Suppose mix is given the Turing interpreter text, a division of the interpreter variables and the Turing program in Figure 4.3

Q = (0: if 0 goto 3
     1: right
     2: goto 0
     3: write 1)

Then the residual program shown in Figure 4.5 is generated. All assignments X := exp, where X is static, and tests if exp ..., where exp is static, have been reduced away; they were performed at program specialization time. The labels lab0, lab1, and lab2 seen in Figure 4.5 are in fact aliases for specialized program points (pp, vs), where pp is an interpreter label and vs holds the values of the static interpreter variables. In the table below we show the correspondence between the labels in the target program and the specialized program points. (Since the interpreter variable Q holds the whole Turing program as its value at every program point, it is omitted from the table. The variable Operator is omitted for space.) The ()'s are the values of uninitialized variables.

Target   Interpreter   Static interpreter variables (vs):
label    label         Instruction    Qtail                          Symbol   Nextlabel
------   -----------   ------------   ----------------------------   ------   ---------
lab0     init          ( )            ( )                            ( )      ( )
lab1     cont          right          (2:goto 0 3:write 1)           0        3
lab2     jump          if 0 goto 3    (1:right 2:goto 0 3:write 1)   0        3

4.5 Algorithms used in mix

We have described techniques that together form program specialization. They were presented one at a time, and it is possible to build a program specializer that applies these techniques in sequence, yielding an algorithm like this:

Input: A subject program, a division of its variables into static and dynamic, and some of the program's input.

Output: A residual program.


Algorithm:

1. Compute poly and generate code for all the specialized program points in poly;

2. Apply transition compression to shorten the code;

3. Relabel the specialized program points to use natural numbers as labels.

This structure reflects the principles of program specialization well, but we have found it inefficient in practice since it involves first building up a large residual program, and then cutting it down to form the final version.

A more efficient algorithm

We now present the algorithm we implemented, where the phases mentioned above are intermingled. Along with the computation of poly, we generate code and apply transition compression. Variable pending holds a set of specialized program points for which code has not yet been generated, while marked holds the set of specialized program points for which code has already been generated. The algorithm is shown in Figure 4.7.

For simplicity we have omitted the relabelling of the specialized program points (pp, vs).

4.6 The second Futamura projection: compiler generation

We have seen how specializing an interpreter with respect to a source program gave a compiled version of the source program. In this section we examine how a stand-alone compiler can be generated by specializing mix with respect to the interpreter. The theoretical basis for this is the second Futamura projection:

compiler = [[mix]]L [mix, int]

This equation states that when mix is specialized with respect to an interpreter, the residual program will be a compiler. (Our mix has in fact an extra argument, division, not made explicit in the Futamura projections to avoid cluttering up the equations.) For the proof, let int be an S-interpreter written in L and let s be an S-program.

[[s]]S d = [[int]]L [s, d]                         by definition of interpreter
         = [[([[mix]]L [int, s])]]L d              by the mix equation
         = [[([[([[mix]]L [mix, int])]]L s)]]L d   by the mix equation
         = [[([[compiler]]L s)]]L d                by naming the residual program

This establishes compiler as an S-to-L-compiler written in L.


    read(program, division, vs0);
1   pending := {(pp0, vs0)};           (* pp0 is program's initial program point *)
2   marked := {};
3   while pending ≠ {} do
4   begin
5       Pick an element (pp, vs) ∈ pending and remove it;
6       marked := marked ∪ {(pp, vs)};
7       bb := lookup(pp, program);     (* Find the basic block labelled pp in program *)
        (* Now generate residual code for bb given vs *)
8       code := initial_code(pp, vs);  (* An empty basic block with label (pp, vs) *)
9       while bb is not empty do
10      begin
11,12       command := first_command(bb); bb := rest(bb);
13          case command of
14          X := exp:
15              if X is classified as static by division
16              then vs := vs[X ↦ eval(exp, vs)];                (* Static assignment *)
17              else code := extend(code, X := reduce(exp, vs)); (* Dynamic assignment *)
18          goto pp':
19              bb := lookup(pp', program);                      (* Compress the transition *)
20          if exp then goto pp' else goto pp'':
                if exp is static by division
                then begin                                       (* Static conditional *)
21                  if eval(exp, vs) = true
22                  then bb := lookup(pp', program)              (* Compress the transition *)
23                  else bb := lookup(pp'', program);            (* Compress the transition *)
                end
24              else begin                                       (* Dynamic conditional *)
25                  pending := pending ∪ ({(pp', vs)} \ marked);
26                  pending := pending ∪ ({(pp'', vs)} \ marked);
27                  code := extend(code, if reduce(exp, vs) goto (pp', vs) else (pp'', vs));
                end;
            return exp:
                code := extend(code, return reduce(exp, vs));
            otherwise: error;
28      end;                                                     (* while bb is not empty *)
29      residual := extend(residual, code);                      (* add new residual basic block *)
30  end                                                          (* while pending ≠ {} *)

Figure 4.7: The mix algorithm.
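For readers who want to experiment, the following is a compact, runnable Python transcription of Figure 4.7. The tuple-based program representation and the expression helpers (OPS, vars_of, eval_exp, reduce_exp) are our own illustrative assumptions; the book's mix is of course written in the flow chart language itself. Like Figure 4.7, the sketch loops forever on a static `bomb'.

    # Expressions: ('const', v) | ('var', x) | (op, e1, e2) with op in OPS.
    # program[pp] = (assignments, control), where assignments is a list of
    # (variable, expression) pairs and control is ('goto', pp'),
    # ('return', exp) or ('if', exp, pp', pp'').

    OPS = {'+': lambda a, b: a + b, '-': lambda a, b: a - b,
           '=': lambda a, b: a == b}

    def vars_of(exp):
        if exp[0] == 'const': return set()
        if exp[0] == 'var':   return {exp[1]}
        return vars_of(exp[1]) | vars_of(exp[2])

    def eval_exp(exp, vs):                    # eval: exp is fully static
        if exp[0] == 'const': return exp[1]
        if exp[0] == 'var':   return vs[exp[1]]
        return OPS[exp[0]](eval_exp(exp[1], vs), eval_exp(exp[2], vs))

    def reduce_exp(exp, vs):                  # reduce: fill in static parts
        if exp[0] == 'const': return exp
        if exp[0] == 'var':
            return ('const', vs[exp[1]]) if exp[1] in vs else exp
        e1, e2 = reduce_exp(exp[1], vs), reduce_exp(exp[2], vs)
        if e1[0] == 'const' and e2[0] == 'const':
            return ('const', OPS[exp[0]](e1[1], e2[1]))
        return (exp[0], e1, e2)

    def mix(program, division, vs0, pp0):
        frozen = lambda vs: tuple(sorted(vs.items()))   # hashable static store
        pending, marked, residual = {(pp0, frozen(vs0))}, set(), []
        while pending:
            pp, vs_f = pending.pop()                    # lines 5-6
            marked.add((pp, vs_f))
            vs = dict(vs_f)
            assigns, control = program[pp]              # line 7
            code = []                                   # line 8
            while True:                                 # lines 9-28
                for x, exp in assigns:                  # lines 14-17
                    if division[x] == 'S':
                        vs[x] = eval_exp(exp, vs)       # static assignment
                    else:
                        code.append(('assign', x, reduce_exp(exp, vs)))
                if control[0] == 'goto':                # lines 18-19: compress
                    assigns, control = program[control[1]]
                    continue
                if control[0] == 'if':
                    _, exp, pp1, pp2 = control
                    if all(division[y] == 'S' for y in vars_of(exp)):
                        # lines 21-23: static conditional, compress one branch
                        assigns, control = program[pp1 if eval_exp(exp, vs)
                                                   else pp2]
                        continue
                    # lines 25-27: dynamic conditional, residualize the test
                    for succ in ((pp1, frozen(vs)), (pp2, frozen(vs))):
                        if succ not in marked:
                            pending.add(succ)
                    code.append(('if', reduce_exp(exp, vs),
                                 (pp1, frozen(vs)), (pp2, frozen(vs))))
                else:                                   # ('return', exp)
                    code.append(('return', reduce_exp(exp, vs)))
                break
            residual.append(((pp, vs_f), code))         # line 29
        return residual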


4.6.1 Specializing mix

When we want to specialize mix with respect to int we have to determine a division of the variables of mix. We do not address this in full detail as we did with the Turing interpreter since mix is a somewhat larger program. In the following discussion we will refer to the mix algorithm as presented in Figure 4.7.

The question to ask now is: what information will be available to mix1 when the following run, the compiler generation, is performed (for accuracy we show also the arguments divmix and divint, which were left out above)?

compiler = [[mix1]]L [mix2, divmix, [int, divint]]

In this run mix1 is the active specializer that is actually run on its three arguments. The first argument is the program text of mix2, which is identical to mix1. The second argument is a division of mix2's variables. The third argument is the initial values of mix2's static input variables. Two of mix2's three input variables are static, namely program, whose value is the interpreter, and division, whose value is the division divint. Thus mix2 is given the interpreter text and a division of the interpreter's variables, but not the initial values of the interpreter's input variables.

Recall that mix applied to an interpreter and a source program yields a target program. When [[mix1]]L [mix2, divmix, [int, divint]] is run, only the interpreter is available to mix2, so it can only perform those actions that depend only on the interpreter text and not on the source program. It is vital for the efficiency of the generated compiler that mix2 can perform some of its computation at compiler generation time.

We shall now examine the most important mix2 variables to see which have values at hand during compiler generation time and so can be classified as static by the division.

To begin with, the variables program and division are static. Variables vs and vs0 are intended to hold the values of some interpreter variables: this information is not available before the source program is supplied, hence they are dynamic. The congruence principle now forces pending, marked, code, and residual to be classi ed as dynamic. These variables will thus not be reduced away by mix1, and so will appear in the residual program corresponding to int, namely, the compiler.

Now consider lines 5-7 in the algorithm. The variable pp gets its value from pending and is hence dynamic. The variable bb gets its value by looking up pp in program (= the interpreter). Even though program is clearly static and bb always a part of it, the congruence principle implies that bb must be dynamic since pp is dynamic. It would be quite unfortunate if it were so. The dependency principle would now classify command as dynamic, with the consequence that hardly any computation at all could be done at compiler generation time.

Variable pp can be said to be of bounded static variation, meaning that it can only assume one of finitely many values, and that its possible value set is statically computable. Here pp must be one of the labels in the interpreter, enabling us to employ a programming `trick' with the effect that bb, and thereby command, become static. The trick is seen so often in program specialization that we devote a later section (4.8.3) to an explicit treatment. For now, the reader is asked to accept without explanation that bb and command are static variables.
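Although the full treatment is deferred to Section 4.8.3, the shape of the trick can be sketched: a lookup under a dynamic key pp that ranges over a statically known, finite set of labels is replaced by static lookups guarded by equality tests (illustrative Python, not the book's code).

    def dispatch(pp, program):
        for label in program:        # program, hence each label, is static
            if pp == label:          # the comparison itself is dynamic, but...
                bb = program[label]  # ...bb is looked up under a STATIC key,
                return bb            # so the specializer can treat it as static
        raise KeyError(pp)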

4.6.2 The structure of a generated compiler

It turns out that the structure of the generated stand-alone compiler is close to that of a traditional recursive descent compiler. We have already seen an example of target code generated by specializing the interpreter, and by the mix equation the generated compiler works in exactly the same way. Our present concern is the structure and e ciency of the compiler.

Figure 4.8 shows the compiler generated from the Turing interpreter. (The compiler is syntactically sugared for readability.)

The generated compiler represents an interesting `mix' of the partial evaluator mix and the Turing interpreter. The inner while-loop, lines 10-23, closely resembles the interpretation loop. The conditionals that perform the syntactic dispatch stem directly from the interpreter. The intervening code-generating instructions are, of course, not like in the interpreter, but the connection is tight: the code generated here is exactly the instructions that the interpreter would have performed.

The inner while-loop containing syntactic dispatch and code generation looks quite natural, save perhaps the actions for compiling if-statements. This differs from a handwritten compiler using pure predictive parsing, which would be likely to perform one linear scan of the source program and generate code on the fly, followed by backpatching.

This compiler is, on the other hand, derived automatically from an interpreter, and it has thus inherited some of the interpreter's characteristics. An interpreter does not perform a linear scan of the source program; it follows the flow of control as determined by the semantics. The compiler does the same. As long as control can be determined from the source program alone a linear code sequence is generated.

When an if-statement is encountered this is no longer possible, since code must be generated for both of the branches. The compiler uses pending and marked to keep track of which source program fragments have to be compiled. After compiling an if-statement, compilation has to go on from two different points. One (to be executed on a false condition) is characterized by Qtail, the other (to be executed when a jump is made) is characterized by lbl, the target of the conditional jump. Therefore the two tuples (cont, Qtail) and (jump, lbl) are added to pending provided they are not already there and that they have not already been processed (that is, they are not in marked).

One point needs further explanation: the pairs (init, Q), (cont, Qtail), and (jump, lbl) are claimed to be of form (pp, vs). This does not seem reasonable at first sight since vs should contain the values of all of the interpreter's static


    read(Q);
1   pending := {('init, Q)};
2   marked := {};
3   while pending ≠ '() do
4   begin
5       Pick an element (pp, vs) ∈ pending and remove it;
6       marked := marked ∪ {(pp, vs)};
7       case pp of
8       init:   Qtail := Q;                                        (* vs = Q *)
9               generate initializing code;
10              while Qtail ≠ '() do
11              begin
12                  Instruction := hd(Qtail); Qtail := tl(Qtail);
13                  case Instruction of
14                  right:    code := extend(code, left := cons(firstsym(right), left),
                                                   right := tl(right))
15                  left:     code := extend(code, right := cons(firstsym(left), right),
                                                   left := tl(left))
16                  write s:  code := extend(code, right := cons(s, tl(right)))
17                  goto lbl: Qtail := new_tail(lbl, Q);
18                  if s goto lbl:
                              pending := pending ∪ {('cont, Qtail)} \ marked;
19                            pending := pending ∪ {('jump, lbl)} \ marked;
20                            code := extend(code, if s = firstsym(right)
                                                   goto ('jump, lbl)
                                                   else ('cont, Qtail));
21,22               otherwise: error
23              end;
24      cont:   if Qtail ≠ '() goto line 11                        (* vs = Qtail *)
25      jump:   Qtail := new_tail(lbl, Q); if Qtail ≠ '() goto line 11
                                                                   (* vs = lbl *)
26      otherwise: error;
27      residual := extend(residual, code)
28  end;

Figure 4.8: A mix-generated compiler.

variables. The point is that the only static variables whose values can be referenced after the program points init, cont, and jump are Q, Qtail, and lbl. This is detected by a simple live static variable analysis described later on.

The variables pending and marked have two roles. First, pending keeps track of the advance of the compilation, in a way corresponding to the recursion stack in a recursive descent compiler. Secondly, pending and marked take care of the correspondence between labels in the source and target programs, as the symbol table does in a traditional compiler.

As to efficiency, computer runs show that target = [[compiler]]L source is computed about 9 times as fast as target = [[mix]]L [int, source].