
Jones N.D. Partial evaluation and automatic program generation. 1999

Higher-order binding-time analysis

321

 

 

 

 

 

 

 

 

 

 

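\[
\begin{array}{lcl}
\mathcal{B}_e[\![c]\!]\beta\delta\tau &=& S\\
\mathcal{B}_e[\![x]\!]\beta\delta\tau &=& \tau\,x\\
\mathcal{B}_e[\![(\mathtt{if}\ e_1\ e_2\ e_3)]\!]\beta\delta\tau &=& \mathcal{B}_e[\![e_1]\!]\beta\delta\tau \sqcup \mathcal{B}_e[\![e_2]\!]\beta\delta\tau \sqcup \mathcal{B}_e[\![e_3]\!]\beta\delta\tau\\
\mathcal{B}_e[\![(\mathtt{call}\ f_i\ e_1 \ldots e_a)]\!]\beta\delta\tau &=& \beta\,f_i\\
\mathcal{B}_e[\![(op\ e_1 \ldots e_a)]\!]\beta\delta\tau &=& \bigsqcup_{j=1}^{a} \mathcal{B}_e[\![e_j]\!]\beta\delta\tau\\
\mathcal{B}_e[\![(\lambda x_\ell.e)]\!]\beta\delta\tau &=& \delta\,\ell\\
\mathcal{B}_e[\![(e_1\ e_2)]\!]\beta\delta\tau &=& \mathcal{B}_e[\![e_1]\!]\beta\delta\tau \sqcup \left(\bigsqcup \{\, \beta\,\ell \mid \ell \in \mathcal{P}_e[\![e_1]\!] \,\}\right)\\
\mathcal{B}_e[\![(\mathtt{let}\ (x\ e_1)\ e)]\!]\beta\delta\tau &=& \mathcal{B}_e[\![e_1]\!]\beta\delta\tau \sqcup \mathcal{B}_e[\![e]\!]\beta\delta\tau
\end{array}
\]

Figure 15.3: The Scheme1 binding-time analysis function $\mathcal{B}_e$.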
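As a rough illustration, the cases of Figure 15.3 can be transcribed almost literally into Python. The sketch below rests on several assumptions made only for this example: Scheme1 expressions are encoded as tuples, the three binding-time maps are dictionaries named `beta`, `delta` and `tau`, and `pe` is a function returning the closure analysis result for an expression as a set of lambda labels. None of these names or encodings is taken from Similix.

```python
S, D = "S", "D"

def lub(*bts):
    """Least upper bound in the two-point lattice with S below D."""
    return D if D in bts else S

def be(e, beta, delta, tau, pe):
    """Binding time of expression e, one branch per case of Figure 15.3."""
    kind = e[0]
    if kind == "const":                      # ("const", c)
        return S
    if kind == "var":                        # ("var", x)
        return tau[e[1]]
    if kind == "if":                         # ("if", e1, e2, e3)
        return lub(*(be(ei, beta, delta, tau, pe) for ei in e[1:]))
    if kind == "call":                       # ("call", f, [e1, ..., ea])
        return beta[e[1]]
    if kind == "op":                         # ("op", name, [e1, ..., ea])
        return lub(*(be(ej, beta, delta, tau, pe) for ej in e[2]))
    if kind == "lam":                        # ("lam", label, x, body)
        return delta[e[1]]
    if kind == "app":                        # ("app", e1, e2)
        closures = pe(e[1])                  # labels e1 may evaluate to
        return lub(be(e[1], beta, delta, tau, pe),
                   *(beta[label] for label in closures))
    if kind == "let":                        # ("let", x, e1, body)
        return lub(be(e[2], beta, delta, tau, pe),
                   be(e[3], beta, delta, tau, pe))
    raise ValueError(f"unknown expression: {e!r}")

# Hypothetical example: a lambda in a static context whose body uses
# the dynamic variable d, so the result of applying it is dynamic.
lam = ("lam", "l1", "y", ("op", "+", [("var", "y"), ("var", "d")]))
app = ("app", lam, ("const", 1))
tau = {"x": S, "y": S, "d": D}
beta = {"l1": D}     # result of the body of lambda l1
delta = {"l1": S}    # lambda l1 is in a static context

def pe(e):
    """Stand-in for the closure analysis: only lam evaluates to closure l1."""
    return {"l1"} if e == lam else set()
```

With these maps the lambda itself is classified S while its application is classified D, which is the situation of a static closure with a dynamic free variable.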

 

 

 

 

 

 

 

 

 

 

 

 

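\[
\begin{array}{lcl}
\mathcal{B}_v[\![c]\!]\beta\delta\tau\,y &=& S\\
\mathcal{B}_v[\![x]\!]\beta\delta\tau\,y &=& S\\
\mathcal{B}_v[\![(\mathtt{if}\ e_1\ e_2\ e_3)]\!]\beta\delta\tau\,y &=& \mathcal{B}_v[\![e_1]\!]\beta\delta\tau\,y \sqcup \mathcal{B}_v[\![e_2]\!]\beta\delta\tau\,y \sqcup \mathcal{B}_v[\![e_3]\!]\beta\delta\tau\,y\\
\mathcal{B}_v[\![(\mathtt{call}\ f_i\ e_1 \ldots e_a)]\!]\beta\delta\tau\,y &=&
  \begin{cases}
  t \sqcup \mathcal{B}_e[\![e_j]\!]\beta\delta\tau & \text{if } y \text{ is } x_{ij}\\
  t & \text{otherwise}
  \end{cases}\\
 && \text{where } t = \bigsqcup_{j=1}^{a} \mathcal{B}_v[\![e_j]\!]\beta\delta\tau\,y\\
\mathcal{B}_v[\![(op\ e_1 \ldots e_a)]\!]\beta\delta\tau\,y &=& \bigsqcup_{j=1}^{a} \mathcal{B}_v[\![e_j]\!]\beta\delta\tau\,y\\
\mathcal{B}_v[\![(\lambda x_\ell.e)]\!]\beta\delta\tau\,y &=& \mathcal{B}_v[\![e]\!]\beta\delta\tau\,y\\
\mathcal{B}_v[\![(e_1\ e_2)]\!]\beta\delta\tau\,y &=&
  \begin{cases}
  t \sqcup \mathcal{B}_e[\![e_2]\!]\beta\delta\tau & \text{if } y \text{ is } x_\ell \text{ and } \ell \in \mathcal{P}_e[\![e_1]\!]\\
  t & \text{otherwise}
  \end{cases}\\
 && \text{where } t = \mathcal{B}_v[\![e_1]\!]\beta\delta\tau\,y \sqcup \mathcal{B}_v[\![e_2]\!]\beta\delta\tau\,y\\
\mathcal{B}_v[\![(\mathtt{let}\ (x\ e_1)\ e)]\!]\beta\delta\tau\,y &=&
  \begin{cases}
  t \sqcup \mathcal{B}_e[\![e_1]\!]\beta\delta\tau & \text{if } y \text{ is } x\\
  t & \text{otherwise}
  \end{cases}\\
 && \text{where } t = \mathcal{B}_v[\![e_1]\!]\beta\delta\tau\,y \sqcup \mathcal{B}_v[\![e]\!]\beta\delta\tau\,y
\end{array}
\]

Figure 15.4: The Scheme1 binding-time propagation function $\mathcal{B}_v$.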

 

 

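in e, where t is the context of e. An application $\mathcal{B}_d[\![e]\!]\beta\delta\tau\,\ell$ returns the context (S or D) of the lambda $\lambda x_\ell.\ldots$ in e, where e is in a static context, so $\mathcal{B}_d[\![e]\!]\beta\delta\tau\,\ell = \mathcal{B}_{dd}[\![e]\!]\beta\delta\tau\,\ell\,S$.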

Except for the complications due to lifting of lambda abstractions (see Section 10.1.4), the binding-time analysis functions $\mathcal{B}_e$ and $\mathcal{B}_v$ for Scheme1 are rather similar to those for Scheme0. Also, the results of the closure analysis $\mathcal{P}_e[\![e]\!]$ are used only in the higher-order applications $(e_1\ e_2)$. Closure analysis gives a very simple, essentially first-order, extension of analyses to higher-order languages. This approach works reasonably well for monovariant binding-time analysis, as in Similix, but may be very imprecise for other program analyses.

322 Program Analysis

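\[
\begin{array}{lcl}
\mathcal{B}_d[\![c]\!]\beta\delta\tau\,\ell &=& S\\
\mathcal{B}_d[\![x]\!]\beta\delta\tau\,\ell &=& S\\
\mathcal{B}_d[\![(\mathtt{if}\ e_1\ e_2\ e_3)]\!]\beta\delta\tau\,\ell &=& \mathcal{B}_d[\![e_1]\!]\beta\delta\tau\,\ell \sqcup \mathcal{B}_d[\![e_2]\!]\beta\delta\tau\,\ell \sqcup \mathcal{B}_d[\![e_3]\!]\beta\delta\tau\,\ell\\
\mathcal{B}_d[\![(\mathtt{call}\ f_i\ e_1 \ldots e_a)]\!]\beta\delta\tau\,\ell &=& \bigsqcup_{j=1}^{a} \mathcal{B}_{dd}[\![e_j]\!]\beta\delta\tau\,\ell\,(\tau\,x_{ij})\\
\mathcal{B}_d[\![(op\ e_1 \ldots e_a)]\!]\beta\delta\tau\,\ell &=& \bigsqcup_{j=1}^{a} \mathcal{B}_d[\![e_j]\!]\beta\delta\tau\,\ell\\
\mathcal{B}_d[\![(\lambda x_{\ell'}.e)]\!]\beta\delta\tau\,\ell &=& \mathcal{B}_{dd}[\![e]\!]\beta\delta\tau\,\ell\,(\delta\,\ell')\\
\mathcal{B}_d[\![(e_1\ e_2)]\!]\beta\delta\tau\,\ell &=& \mathcal{B}_{dd}[\![e_1]\!]\beta\delta\tau\,\ell\,(\mathcal{B}_e[\![e_1]\!]\beta\delta\tau) \sqcup \mathcal{B}_{dd}[\![e_2]\!]\beta\delta\tau\,\ell\,t\\
 && \text{where } t = \bigsqcup \{\, \tau\,x_\ell \mid \ell \in \mathcal{P}_e[\![e_1]\!] \,\}\\
\mathcal{B}_d[\![(\mathtt{let}\ (x\ e_1)\ e)]\!]\beta\delta\tau\,\ell &=& \mathcal{B}_{dd}[\![e_1]\!]\beta\delta\tau\,\ell\,(\tau\,x) \sqcup \mathcal{B}_d[\![e]\!]\beta\delta\tau\,\ell\\[1ex]
\mathcal{B}_{dd}[\![e]\!]\beta\delta\tau\,\ell\,t &=&
  \begin{cases}
  D & \text{if } t = D \text{ and } \ell \in \mathcal{P}_e[\![e]\!]\\
  \mathcal{B}_d[\![e]\!]\beta\delta\tau\,\ell & \text{otherwise}
  \end{cases}
\end{array}
\]

Figure 15.5: The Scheme1 dynamic context function $\mathcal{B}_d$.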

15.3.3 Comparison with the real Similix

The present binding-time lattice is $\{S, D\}$ with S as least element. The binding-time lattice used in Similix is $\{\bot, S, Cl, D\}$, which distinguishes static first-order values from static closure values [27, Section 5.7]. The new bottom element $\bot$, which describes non-terminating expressions, is needed because the elements S and Cl are incomparable.

However, Cl plays the role of a type rather than a binding time in Similix. It allows the specializer to distinguish static closures from other static data without using type tags during specialization. Thus the distinction between S and Cl is not important for pure binding-time reasons and has been left out here. Indeed, in a binding-time analysis for Similix developed recently, the type and binding-time aspects have been separated into two different analyses.

15.3.4 Binding-time analysis of a Scheme1 program

Binding-time analysis of a program should produce a safe description $(\beta, \delta, \tau)$ which is as precise as possible. This is obtained as the least solution to a set of equations specifying the safety requirement.

First, $\beta$ must map a lambda label $\ell$ to the binding time $\mathcal{B}_e[\![e]\!]\beta\delta\tau$ of the body of $\lambda x_\ell.e$.

Secondly, $\tau$ should map a function variable x to its binding time, that is, the least upper bound (lub) of the binding times of the values that it can be bound to. But this is the lub of $\mathcal{B}_v[\![body_i]\!]\beta\delta\tau\,x$ over all function bodies $body_i$ in the program. Also, $\tau$ should map a lambda variable $x_\ell$ to its binding time, that is, the lub of $\mathcal{B}_v[\![body_i]\!]\beta\delta\tau\,x_\ell$ over all function bodies $body_i$.

Third, $\delta$ should map a lambda label $\ell$ to D if the lambda $\lambda x_\ell.e$ is in a dynamic context; that is, $\delta\,\ell$ is the lub of $\mathcal{B}_d[\![body_i]\!]\beta\delta\tau\,\ell$ over all function bodies $body_i$ in the program.

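These requirements are embodied in the equations below:

\[
\begin{array}{lcll}
\beta\,f_i &=& \mathcal{B}_e[\![body_i]\!]\beta\delta\tau & \text{where } body_i \text{ is the body of } f_i\\
\beta\,\ell &=& \mathcal{B}_e[\![e]\!]\beta\delta\tau & \text{where } e \text{ is the body of } \lambda x_\ell.e\\
\tau\,x &=& \bigsqcup_{i=1}^{n} \mathcal{B}_v[\![body_i]\!]\beta\delta\tau\,x & \text{for every function variable } x\\
\tau\,x_\ell &=& \delta\,\ell \sqcup \left(\bigsqcup_{i=1}^{n} \mathcal{B}_v[\![body_i]\!]\beta\delta\tau\,x_\ell\right) & \text{for every lambda variable } x_\ell\\
\tau\,x &=& \bigsqcup_{i=1}^{n} \mathcal{B}_v[\![body_i]\!]\beta\delta\tau\,x & \text{for every let-variable } x\\
\delta\,\ell &=& \bigsqcup_{i=1}^{n} \mathcal{B}_d[\![body_i]\!]\beta\delta\tau\,\ell &
\end{array}
\]

Note that if lambda $\ell$ is dynamic ($\delta\,\ell = D$), then so is its variable $x_\ell$ (that is, $\tau\,x_\ell = D$), by virtue of the equation for lambda variables.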

Any solution to these simultaneous equations is a safe analysis result. The least solution is the most precise one.
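Because the lattice is finite and the right-hand sides are monotone, the least solution can be found by fixed-point iteration: start with every unknown at S and re-evaluate the equations until nothing changes. The following is a minimal sketch, assuming each equation is supplied as a Python function from the current assignment to a binding time. The unknown names and the toy system are hypothetical and do not come from a real Scheme1 program.

```python
S, D = "S", "D"

def lub(*bts):
    """Least upper bound in the two-point lattice with S below D."""
    return D if D in bts else S

def least_solution(equations):
    """Kleene iteration. `equations` maps each unknown to a monotone
    function of the current assignment (a dict from unknowns to S or D)."""
    env = {name: S for name in equations}     # start at the least element
    changed = True
    while changed:
        changed = False
        for name, rhs in equations.items():
            new = lub(env[name], rhs(env))    # values only move upwards
            if new != env[name]:
                env[name] = new
                changed = True
    return env

# Toy system: x is static, y is dynamic, z and w depend on each other,
# and u depends only on x and on itself.
equations = {
    "x": lambda env: S,
    "y": lambda env: D,
    "z": lambda env: lub(env["x"], env["w"]),
    "w": lambda env: lub(env["y"], env["z"]),
    "u": lambda env: lub(env["x"], env["u"]),
}
solution = least_solution(equations)
```

In this toy system u = D would also satisfy the equation for u, but the iteration settles on u = S, which is the more precise of the two safe answers.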

15.4 Projections and partially static data

In the binding-time analyses shown so far, only functional values have been partially static, whereas first-order values have been considered either completely static, or else dynamic. However, as outlined in Section 10.6, it is possible to allow partially static first-order data structures also.

For instance, a value may be a pair whose first component is static and whose second component is dynamic. Another typical case is an association list used to represent the environment in an interpreter. This is a list of (name, value)-pairs, each with static left component and dynamic right component.

Below we explain a projection-based approach to binding-time analysis of partially static data structures in strongly typed languages. The method is due to Launchbury, and this description is based on his thesis and book [167]. Essentially we shall exemplify Launchbury's approach, and for simplicity our rendering will be less precise than his.

15.4.1 Static projections

Let X be a domain of values, equipped with an ordering $\sqsubseteq$ and a least element $\bot$ (meaning 'undefined' or 'not available'). A projection $\gamma$ on X is a function $\gamma: X \to X$ such that for $x, y \in X$ the following three conditions are satisfied: (1) $\gamma\,x \sqsubseteq x$, (2) $\gamma(\gamma\,x) = \gamma\,x$, and (3) $x \sqsubseteq y$ implies $\gamma\,x \sqsubseteq \gamma\,y$.

If we read $y \sqsubseteq x$ as 'y is a part of x' (where $\bot$ is the 'empty' or 'void' part), then requirement (1) says that $\gamma$ maps a value x to a part of x.

Now think of $\gamma\,x$ as the static part of x. Then requirement (1) says that the static part of x must indeed be a part of x, requirement (2) says that the static part of the static part $\gamma\,x$ of x is precisely the static part of x, and requirement (3) says that when x is a part of y, then the static part of x is a part of the static part of y. These are all intuitively reasonable requirements.

In this case, we call $\gamma$ a static projection: the projection picking out the static part of a value.

15.4.2 Projections on atomic type domains

First consider a type of atomic data, such as int or bool. If X is the domain of values of type int, then $X = \{\bot, 0, 1, -1, \ldots\}$. The ordering on X is very simple: $y \sqsubseteq x$ if and only if $y = \bot$ or $y = x$, so the static part y of an integer x must either be void or else x itself. This reflects the fact that integers are atomic values.

There are infinitely many projections on X. For instance, for each integer i, there is a projection which maps i to itself and everything else to $\bot$.

However, for binding-time analysis, only two projections on X are particularly useful: ABS and ID, where for all $x \in X$,

\[
\begin{array}{lcl}
ABS\ x &=& \bot\\
ID\ x &=& x
\end{array}
\]

The absent projection ABS says that no part of x is static, and the identity projection ID says that the whole of x is static. It is clear that ABS and ID are precisely the binding-times D and S from the Scheme0 binding-time analysis (Section 5.2).

It is conventional to define the ordering on projections pointwise, as for other functions. Thus ABS < ID, which is just the opposite of the ordering S < D on $\{S, D\}$, and also contrasts with the usual situation in abstract interpretation, where the smaller abstract values are the more informative. This difference is purely a formality, though. We just have to remember that a larger static projection is better (more informative) than a smaller one.

Usually, only a few of the projections on a domain, such as ABS and ID above, are useful for binding-time analysis. These 'useful' projections are here called the admissible static projections. We shall define the set of admissible static projections as we proceed, by induction on the structure of the types they work on (hence the requirement of strong typing).

We define: an admissible static projection on an atomic type domain is ABS or ID. Thus it corresponds to one of the binding times D and S previously used.
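The three projection requirements are easy to check mechanically on a finite sample of a flat domain. The sketch below is only an illustration: it assumes Python's `None` stands for the undefined value, and the names `is_projection`, `only` and `sample` are invented for this example.

```python
BOT = None  # stands for the undefined value (bottom)

def leq(y, x):
    """Ordering on a flat domain: y is below x iff y is bottom or y equals x."""
    return y is BOT or y == x

def is_projection(p, domain):
    """Check requirements (1)-(3) on a finite sample of a flat domain."""
    part = all(leq(p(x), x) for x in domain)                   # (1) p x is a part of x
    idem = all(p(p(x)) == p(x) for x in domain)                # (2) p is idempotent
    mono = all(leq(p(x), p(y))
               for x in domain for y in domain if leq(x, y))   # (3) p is monotone
    return part and idem and mono

def ABS(x):
    """No part of x is static."""
    return BOT

def ID(x):
    """The whole of x is static."""
    return x

def only(i):
    """The projection keeping the integer i and mapping everything else to bottom."""
    return lambda x: x if x == i else BOT

sample = [BOT, 0, 1, -1, 2]

# ABS lies below ID in the pointwise ordering on projections.
abs_below_id = all(leq(ABS(x), ID(x)) for x in sample)
```

On this sample `ABS`, `ID` and each `only(i)` pass the check, whereas a constant function such as `lambda x: 0` fails requirement (1), since 0 is not a part of 1.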

15.4.3 Projections on product type domains

The above example shows that projections can describe binding times of atomic values, but their real utility is with composite data, such as pairs, which can be partially static.

Consider a product type, such as int * bool. If X and Y are the value domains corresponding to int and bool, then the domain of values of type int * bool is

\[
X \times Y = \{\, (x, y) \mid x \in X,\ y \in Y \,\}
\]

A value v in the product domain $X \times Y$ has form $v = (x, y)$, and the values are ordered componentwise, so $(\bot, \bot)$ is the least element. The following four projections on the domain are particularly useful:

\[
\begin{array}{l|l}
v & (x, y)\\ \hline
a(v) & (\bot, \bot)\\
b(v) & (x, \bot)\\
c(v) & (\bot, y)\\
d(v) & (x, y)
\end{array}
\]

Projection a says that none of the components is static; b says that the left component is static; c the right component; and d that both are static. The four projections could be given the more telling names ABS, LEFT, RIGHT, and ID.

The static projections on $X \times Y$ can often be written as products of projections on X and Y. Namely, whenever $\gamma_1$ is a projection on X and $\gamma_2$ is a projection on Y, their product $\gamma_1 \times \gamma_2$ is a projection on $X \times Y$, defined by

\[
(\gamma_1 \times \gamma_2)(x, y) = (\gamma_1\,x,\ \gamma_2\,y)
\]

In particular, the four projections listed above are

\[
\begin{array}{lcl}
a &=& ABS \times ABS\\
b &=& ID \times ABS\\
c &=& ABS \times ID\\
d &=& ID \times ID
\end{array}
\]

We define: an admissible static projection on a product type is the product of admissible projections on the components.
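The product construction can be sketched in a few lines of Python, assuming pairs are represented as Python tuples and `None` stands for the undefined value. The helper name `product` is invented for this illustration.

```python
BOT = None  # stands for the undefined value (bottom)

def ABS(x):
    return BOT

def ID(x):
    return x

def product(p1, p2):
    """Product of projections: p1 acts on the left component, p2 on the right."""
    return lambda v: (p1(v[0]), p2(v[1]))

a = product(ABS, ABS)      # nothing static
LEFT = product(ID, ABS)    # projection b: left component static
RIGHT = product(ABS, ID)   # projection c: right component static
d = product(ID, ID)        # everything static

v = (7, True)              # a sample value of type int * bool
```

Here `LEFT(v)` keeps 7 and blanks out the boolean, `RIGHT(v)` does the opposite, and applying any of the four projections twice gives the same result as applying it once.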

15.4.4 Projections on data type domains

Consider the non-recursive data type

datatype union = Int of int | Bl of bool

where Int and Bl are the constructors or tags of the data type. If X and Y are the value domains corresponding to types int and bool, then the value domain corresponding to union is the tagged sum domain

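\[
\mathrm{Int}\ X + \mathrm{Bl}\ Y = \{\bot\} \cup \{\, \mathrm{Int}(x) \mid x \in X \,\} \cup \{\, \mathrm{Bl}(y) \mid y \in Y \,\}
\]

A value v of the sum domain either is $\bot$ or has one of the forms $v = \mathrm{Int}(x)$ and $v = \mathrm{Bl}(y)$. The new value $\bot$ is less than all others, two values of the same form are compared by the second component, and values of different forms are incomparable. There are five particularly useful projections on the sum domain:

\[
\begin{array}{l|lll}
v & \bot & \mathrm{Int}(x) & \mathrm{Bl}(y)\\ \hline
a(v) & \bot & \bot & \bot\\
b(v) & \bot & \mathrm{Int}(\bot) & \mathrm{Bl}(\bot)\\
c(v) & \bot & \mathrm{Int}(x) & \mathrm{Bl}(\bot)\\
d(v) & \bot & \mathrm{Int}(\bot) & \mathrm{Bl}(y)\\
e(v) & \bot & \mathrm{Int}(x) & \mathrm{Bl}(y)
\end{array}
\]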

Projection a says that no part of the value is static; b says that the tags are static but nothing else is; c says that the tags are static, and if the tag is Int, then the argument is static too; d says the tags are static, and if the tag is Bl, then the argument is static; and e says that the entire value is static. Thus a and e really are ABS and ID on the sum domain, and a suitable name for b would be TAG.

Note that there exist other projections on the union type, for instance $b'$ with $b'(\mathrm{Int}(x)) = \mathrm{Int}(\bot)$ and $b'(\mathrm{Bl}(y)) = \bot$. However, such projections are unlikely to be useful as binding times, since a specializer cannot easily exploit a static tag such as Int unless all tags are static.

Some of the projections on $\mathrm{Int}\ X + \mathrm{Bl}\ Y$ can be written as sums of projections on X and Y. Whenever $\gamma_1$ is a projection on X and $\gamma_2$ is a projection on Y, their tagged sum $\mathrm{Int}\ \gamma_1 + \mathrm{Bl}\ \gamma_2$ is a projection on $\mathrm{Int}\ X + \mathrm{Bl}\ Y$, defined by

\[
\begin{array}{lcl}
(\mathrm{Int}\ \gamma_1 + \mathrm{Bl}\ \gamma_2)(\bot) &=& \bot\\
(\mathrm{Int}\ \gamma_1 + \mathrm{Bl}\ \gamma_2)(\mathrm{Int}(x)) &=& \mathrm{Int}(\gamma_1\,x)\\
(\mathrm{Int}\ \gamma_1 + \mathrm{Bl}\ \gamma_2)(\mathrm{Bl}(y)) &=& \mathrm{Bl}(\gamma_2\,y)
\end{array}
\]

Projection a above cannot be written as a sum of projections, but the other four can:

\[
\begin{array}{lcl}
a &=& ABS\\
b &=& \mathrm{Int}\ ABS + \mathrm{Bl}\ ABS\\
c &=& \mathrm{Int}\ ID + \mathrm{Bl}\ ABS\\
d &=& \mathrm{Int}\ ABS + \mathrm{Bl}\ ID\\
e &=& \mathrm{Int}\ ID + \mathrm{Bl}\ ID
\end{array}
\]

We define: an admissible static projection on a data type is ABS, or the sum of admissible projections on the constructor argument types.

The sum of projections $\gamma_1, \ldots, \gamma_n$ over constructors $c_1, \ldots, c_n$ is written $\sum_{i=1}^{n} c_i\,\gamma_i$.
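A tagged sum of projections can be sketched as follows, assuming a value of the union type is either `None` (the undefined value) or a `(tag, argument)` tuple. The helper `tagged_sum` and the dictionary encoding of the per-constructor projections are choices made for this illustration only.

```python
BOT = None  # stands for the undefined value (bottom)

def ABS(x):
    return BOT

def ID(x):
    return x

def tagged_sum(projs):
    """Sum of projections: `projs` maps each constructor tag to the projection
    applied to that constructor's argument. Bottom is mapped to bottom."""
    def p(v):
        if v is BOT:
            return BOT
        tag, arg = v
        return (tag, projs[tag](arg))   # the tag itself is always kept
    return p

TAG = tagged_sum({"Int": ABS, "Bl": ABS})   # projection b: only the tags are static
c = tagged_sum({"Int": ID, "Bl": ABS})      # tags and Int arguments are static
d = tagged_sum({"Int": ABS, "Bl": ID})      # tags and Bl arguments are static
e = tagged_sum({"Int": ID, "Bl": ID})       # everything is static
```

Every projection built this way keeps all tags, which is why the projection mapping Bl(y) to the undefined value while keeping the Int tag cannot be expressed as a tagged sum.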

15.4.5 Projections on recursive data type domains

Consider the recursive data type

datatype intlist = Nil | Cons of (int * intlist)

defining the type of lists of integers. It is rather similar to the datatype definition in the preceding section, except for the recursion: the fact that intlist is used to define itself.

Writing instead datatype intlist = $\mu T$. Nil | Cons of (int * T), we can emphasize the recursion, using the recursion operator $\mu$. If X is the value domain corresponding to int, then the value domain corresponding to intlist is the recursively defined domain

\[
\mu V.\ \mathrm{Nil} + \mathrm{Cons}\,(X \times V) = \bigcup_{k=0}^{\infty} F^k(\{\bot\})
\]

where $F(V) = \mathrm{Nil} + \mathrm{Cons}\,(X \times V)$. That is, the values of this type are $\{\bot, \mathrm{Nil}, \mathrm{Cons}(\bot, \bot), \mathrm{Cons}(1, \bot), \mathrm{Cons}(1, \mathrm{Nil}), \mathrm{Cons}(1, \mathrm{Cons}(2, \mathrm{Nil})), \ldots\}$, namely the finite and partial lists of values of type int.

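There are three particularly useful projections on the recursively defined domain:

\[
\begin{array}{l|lll}
v & \bot & \mathrm{Nil} & \mathrm{Cons}(x, v')\\ \hline
a(v) & \bot & \bot & \bot\\
b(v) & \bot & \mathrm{Nil} & \mathrm{Cons}(\bot, b\,v')\\
c(v) & \bot & \mathrm{Nil} & \mathrm{Cons}(x, c\,v')
\end{array}
\]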

Projection a says that no part of the value is static; b says that the tags are static and that the part of the list tail picked out by b is static; and c says that the tags, the list head, and the part of the list tail picked out by c are static. Thus a is ABS on the recursively defined domain. Projection b says that the tags are static and that the same holds for the list tail, so the structure of the entire list must be known. An appropriate name for b therefore is STRUCT. Note that when the structure is static, in particular the length of the list will be known during partial evaluation. Projection c says that the tags and the list head are static and that the same holds for the list tail, so everything is static. Formally, c = ID, the identity on the recursively defined domain.

When considering projections over recursively defined datatypes, we require them to be uniform projections in the same manner as b and c above. They must treat the recursive component of type intlist the same as the entire list: we want to consider only binding-time properties which are the same at every level of recursion in the data structure.

A non-ABS uniform projection over intlist has form $\mu\gamma.\ \mathrm{Nil} + \mathrm{Cons}(\gamma' \times \gamma)$, defined by

\[
\mu\gamma.\ \mathrm{Nil} + \mathrm{Cons}(\gamma' \times \gamma) = \bigsqcup_{k=0}^{\infty} G^k(ABS)
\]

where $G(\gamma) = \mathrm{Nil} + \mathrm{Cons}\,(\gamma' \times \gamma)$ and where $\gamma'$ is a projection over type int. The projections b and c above do have this form:

\[
\begin{array}{lcl}
a &=& ABS\\
b &=& \mu\gamma.\ \mathrm{Nil} + \mathrm{Cons}\,(ABS \times \gamma)\\
c &=& \mu\gamma.\ \mathrm{Nil} + \mathrm{Cons}\,(ID \times \gamma)
\end{array}
\]

Note that ABS and ID are precisely the admissible static projections on the component type int.

We define: an admissible static projection on a recursive datatype is ABS, or the uniform sum of admissible projections on the components. This completes the inductive definition of admissible projections.
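On finite and partial lists a uniform projection can be computed by plain recursion, which coincides with the limit construction for such values. The sketch below assumes lists are encoded as nested tuples, with `None` for the undefined value and the string `"Nil"` for the empty list. The names `uniform`, `WHOLE` and `length` are invented for this illustration.

```python
BOT = None   # stands for the undefined value (bottom)
NIL = "Nil"

def cons(head, tail):
    return ("Cons", head, tail)

def ABS(x):
    return BOT

def ID(x):
    return x

def uniform(head_proj):
    """Uniform projection on lists: apply head_proj to every head and the
    same projection, recursively, to every tail."""
    def g(v):
        if v is BOT or v == NIL:
            return v
        _, head, tail = v
        return cons(head_proj(head), g(tail))
    return g

STRUCT = uniform(ABS)   # projection b: only the list structure is static
WHOLE = uniform(ID)     # projection c: the identity on lists

def length(v):
    """Length of a list whose structure is completely defined."""
    n = 0
    while v != NIL:
        _, _, v = v
        n += 1
    return n

xs = cons(1, cons(2, NIL))
partial = cons(1, BOT)   # a partial list: the tail is undefined
```

For `xs` the static part under `STRUCT` has undefined heads but still has length 2, which illustrates why the length of a list with static structure is available during partial evaluation.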

15.4.6 An example: Association lists

In Section 10.5.5 we considered a Scheme function mkenv building an association list: a list of (name, value)-pairs. We also saw how one could use grammars to say that all the name components were static and that the value components were not. In a typed language the association list would belong to a recursive datatype

datatype assoc = End | More of ((name * value) * assoc)

where name and value are the types of names and values, assumed to be atomic. The admissible projections on these component types (name and value) are ABS and ID, and the admissible projections on name * value are ABS, RIGHT, LEFT, and ID shown in Section 15.4.3.

Following the section on recursive datatypes, we use the following five uniform projections on the assoc type:

\[
\begin{array}{lcl}
a &=& ABS\\
b &=& \mu\gamma.\ \mathrm{End} + \mathrm{More}\,(ABS \times \gamma)\\
c &=& \mu\gamma.\ \mathrm{End} + \mathrm{More}\,(LEFT \times \gamma)\\
d &=& \mu\gamma.\ \mathrm{End} + \mathrm{More}\,(RIGHT \times \gamma)\\
e &=& \mu\gamma.\ \mathrm{End} + \mathrm{More}\,(ID \times \gamma)
\end{array}
\]

Projection a = ABS says that nothing is static; b says the structure is static; c says the structure and all the name (that is, left) components are static; d says the structure and all the value (that is, right) components are static; and e = ID says everything is static. Suitable names for b, c, and d would be STRUCT, STRUCT(LEFT), and STRUCT(RIGHT).

The binding time of mkenv's result, which was described by a grammar in Section 10.5.5, can now be described simply as STRUCT(LEFT): the structure of the list and all the names are static, while the values are not.
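To make the association-list case concrete, the sketch below applies projection c from the list above, under which the structure and every name are static and every value is not, to a small environment. It assumes the tuple encoding used for illustration here, with `None` for the undefined value. The sample environment and the helper names `more`, `product` and `struct` are hypothetical.

```python
BOT = None   # stands for the undefined value (bottom)
END = "End"

def more(pair, rest):
    return ("More", pair, rest)

def ABS(x):
    return BOT

def ID(x):
    return x

def product(p1, p2):
    """Product of projections on (name, value) pairs."""
    return lambda v: (p1(v[0]), p2(v[1]))

LEFT = product(ID, ABS)    # name static, value not
RIGHT = product(ABS, ID)   # value static, name not

def struct(pair_proj):
    """Uniform projection on association lists: pair_proj on every pair,
    the same projection recursively on the rest of the list."""
    def g(v):
        if v is BOT or v == END:
            return v
        _, pair, rest = v
        return more(pair_proj(pair), g(rest))
    return g

env = more(("x", 10), more(("y", 20), END))
static_part = struct(LEFT)(env)   # names and structure kept, values blanked
```

The resulting `static_part` still lists the names x and y in order, so an interpreter specialized with respect to it can resolve variable lookups to positions even though the values are unknown.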

15.5 Projection-based binding-time analysis

We have introduced projections and have shown how they describe the binding times of partially static data in a typed language. Now we outline a monovariant projection-based binding-time analysis for such a language. This analysis should be compared to the Scheme0 binding-time analysis in Section 5.2.

15.5.1 The example language PEL

The example language is Launchbury's PEL ('partial evaluation language'). A program consists of datatype definitions and simply typed first-order function definitions. Each function has exactly one argument (which may be a tuple):

datatype T1 = ...
...
datatype Tm = ...
fun f1 x1 = e1
...
and fn xn = en

 

The syntax of expressions is given in Figure 15.6.

 

 

 

 

 

⟨Expr⟩  ::= ⟨Var⟩                                   Variable
         |  (⟨Expr⟩, ..., ⟨Expr⟩)                   Tuple
         |  ⟨Constr⟩ ⟨Expr⟩                         Constructor application
         |  ⟨FuncName⟩ ⟨Expr⟩                       Function application
         |  case ⟨Expr⟩ of ⟨Match⟩ ... ⟨Match⟩      Case expression
⟨Match⟩ ::= ⟨Constr⟩ ⟨Var⟩ => ⟨Expr⟩                Case match

Figure 15.6: Syntax of PEL, a typed first-order functional language.

15.5.2 Binding-time analysis maps

Let FuncName (abbreviated Fun below) be the set of function names, Var the set of variable names, and Proj the set of admissible projections (on all types). The binding-time analysis uses two maps $\phi$ and $\tau$ to compute a third, namely the monovariant division $\pi$, where

\[
\begin{array}{lclcl}
\phi &:& FunEnv &=& Fun \to (Proj \to Proj)\\
\tau &:& BTEnv &=& Var \to Proj\\
\pi &:& Monodivision &=& Fun \to Proj
\end{array}
\]

When projection $\gamma$ describes how much of f's argument is static, then $(\phi\,f)\,\gamma$ is a projection describing how much of f's result is static; $\tau\,x$ describes how much of the value of variable x is static; and $(\pi\,f)$ describes how much of f's argument is static.

Compare this with the Scheme0 binding-time analysis. The role of $\tau$ is the same: describing the binding time of variables. The Scheme0 binding-time analysis needs no map corresponding to $\phi$, since the binding time of the result of f may safely be equated to that of its argument. If the argument is static, then surely the result is static too, and if the argument is dynamic, then the result may safely be assumed to be dynamic too.

On the other hand, the Scheme1 binding-time analysis (Section 15.3) does use such a map, recording the binding time of each function's result. This is because in Scheme1 the result of a function may be dynamic although its arguments are static. The reason is that Scheme1 allows partially static functions: an argument may be a (partially) static lambda with dynamic free variables.

15.5.3 Binding-time analysis functions

The projection-based binding-time analysis for PEL consists of two functions $\mathcal{B}_{pe}$ and $\mathcal{B}_{pv}$, analogous to $\mathcal{B}_e$ and $\mathcal{B}_v$ in the Scheme0 binding-time analysis. That is, $\mathcal{B}_{pe}[\![e]\!]\phi\tau$ is a projection describing how much of e's value is static, and $\mathcal{B}_{pv}[\![e]\!]\phi\tau\,g$ is a projection describing how much of g's argument is static in the applications of g found in e. Function $\mathcal{B}_{pe}$ is shown in Figure 15.7 and $\mathcal{B}_{pv}$ is shown in Figure 15.8.

In the figures, $[x \mapsto \gamma] \in BTEnv$ maps x to $\gamma$ and everything else to ABS, and $\tau[x \mapsto \gamma]$ denotes $\tau$ updated to map x to $\gamma$.

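\[
\begin{array}{lcl}
\multicolumn{3}{l}{\mathcal{B}_{pe}[\![e]\!] : FunEnv \to BTEnv \to Proj}\\[1ex]
\mathcal{B}_{pe}[\![x]\!]\phi\tau &=& \tau\,x\\
\mathcal{B}_{pe}[\![(e_1, \ldots, e_m)]\!]\phi\tau &=& \mathcal{B}_{pe}[\![e_1]\!]\phi\tau \times \cdots \times \mathcal{B}_{pe}[\![e_m]\!]\phi\tau\\
\mathcal{B}_{pe}[\![c_i\ e]\!]\phi\tau &=& c_1\,ID + \cdots + c_i\,(\mathcal{B}_{pe}[\![e]\!]\phi\tau) + \cdots + c_m\,ID\\
\mathcal{B}_{pe}[\![f\ e]\!]\phi\tau &=& (\phi\,f)\,(\mathcal{B}_{pe}[\![e]\!]\phi\tau)\\
\mathcal{B}_{pe}[\![\mathtt{case}\ e\ \mathtt{of}\ c_1\ x_1 \Rightarrow e_1 \mid \ldots \mid c_n\ x_n \Rightarrow e_n]\!]\phi\tau &=& \mathtt{case}\ \mathcal{B}_{pe}[\![e]\!]\phi\tau\ \mathtt{of}\\
 && \quad \sum_{i=1}^{n} c_i\,\gamma_i \ \Rightarrow\ \sqcap_{i=1}^{n}\, \mathcal{B}_{pe}[\![e_i]\!]\phi\,(\tau[x_i \mapsto \gamma_i])\\
 && \mid\ ABS \ \Rightarrow\ ABS
\end{array}
\]

Figure 15.7: The PEL binding-time analysis function $\mathcal{B}_{pe}$.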

15.5.4 Safety

When e is an expression with free variable x, let $\mathcal{E}[\![e]\!][x \mapsto v]$ denote the (standard) result of evaluating e with x bound to the value v. Also, recall that when $\pi$ is a division, then $(\pi\,f)$ denotes the static part of f's formal parameter.