Добавил:
Upload Опубликованный материал нарушает ваши авторские права? Сообщите нам.
Вуз: Предмет: Файл:

Discrete math with computers_3

.pdf
Скачиваний:
87
Добавлен:
16.03.2016
Размер:
2.29 Mб
Скачать

12.6. BALANCED TREES

337

z

 

easy right

z

 

 

 

 

 

 

 

 

 

rotation

 

x

 

 

y

 

 

 

 

 

 

 

 

zL

 

 

zL

 

y

n

x

 

n

 

 

 

 

 

yR

 

xL

 

 

 

n

 

 

 

 

 

 

n−1

 

 

 

xR

 

n

 

 

xL

 

still

 

 

n−1

 

xR

yR

n−1

 

unbalanced

n

 

n

 

 

n−1

n

 

 

 

 

n

 

Cel z d zL (Cel y b (Cel x a xL xR) yR)

 

 

 

 

Cel z d zL (Cel x a xL (Cel y b xR yR))

 

 

 

 

 

x

 

z

 

 

 

z

 

y

 

 

easy left

 

 

 

 

 

 

 

 

 

x

rotation

 

 

 

 

zL

 

 

zL

 

 

 

n

y

 

 

 

 

 

 

 

n

xL

 

 

 

 

 

 

 

 

xL

 

 

 

n−1

xR

yR

 

 

 

n

n−1

 

 

 

n−1

n

 

 

 

 

n

 

 

 

 

n

 

still

 

 

 

 

 

xR

yR

balanced

 

 

unbalanced

 

 

 

 

n−1

 

 

 

 

 

n

 

 

 

 

 

n

 

 

 

 

 

 

 

Cel x a (Cel z d zL xL) (Cel y b xR yR)

Cel z d zL (Cel x a xL (Cel y b xR yR))

 

 

 

Figure 12.6: Rebalancing an Inside Right-Heavy Tree

Because the easy cases (outside heavy trees) include the possibility that the left and right subtrees of the tallest part have the same height, we can be more explicit about subtree heights in the inside heavy cases. Figure 12.5 uses the same conventions as our early tree diagrams to display an inside right-heavy tree.

The tallest part of the tree in Figure 12.5 occurs at the yL subtree. It has height n + 1, which is at least one. Therefore, it is not a Nub tree. Its Celconstructed structure can be further exposed as a key, x, with two subtrees, xL and xR, as shown in Figure 12.6. At least one of the subtrees of yL must have height n. Otherwise, the height of yL could not be n + 1. The other subtree of yL could have either height n or height n − 1. The diagram indicates that these subtrees can have either height.

The right subtree of an inside right-heavy is not out of balance, but the height of its left subtree is greater (by one) than that of its right subtree.

338

CHAPTER 12. THE AVL TREE MIRACLE

One might guess that there could be some benefit in applying an easy right rotation to that subtree. And, it turns out that this interior rotation makes it easier to complete the rebalancing of the tree. It comes back into balance when we apply an easy left rotation to the whole tree (after performing the easy right rotation on the right subtree). The diagram in Figure 12.6 illustrates the process, and could serve as a guide to a proof that these rotations preserve order and restore balance in the inside right-heavy case. We are not going to carry out these proofs because they are similar to proofs we have already seen, but we will cite the resulting theorems as if we had proved them.

12.6.3Rebalancing Left-Heavy and Right-Heavy Trees

We now have the apparatus necessary to rebalance trees that are too tall by one on one side or the other, assuming that the interior trees are balanced. We do so by using easy left and easy right rotations in combinations that depend on whether the imbalance is on the outside or the inside of the tree. The rotations preserve order, and they also restore balance when applied in correctly chosen combinations.

A left rotation, rotL, as specified in the following formula, restores balance to a tree that has a right subtree whose height is two greater than the height of its left subtree. Similarly, a right rotation, rotR, restores balance to a tree that is one too tall on the left side.

rotR (Cel z d (Cel x a xL xR) zR) = if (height xL) < (height xR)

then easyR (Cel z d (easyL (Cel x a xL xR)) zR) else easyR (Cel z d (Cel x a xL xR) zR)

rotL (Cel z d zL (Cel y b yL yR)) = if (height yR) < (height yL)

then easyL (Cel z d zL (easyR (Cel y b yL yR))) else easyL (Cel z d zL (Cel y b yL yR))

The following theorems are straightforward consequences of theorems we already know about the easy rotations.

Theorem 90. (rotations preserve order)

z. d. x. a. xL. xR. zR.

ordered (Cel z d (Cel x a xL xR) zR)

ordered (rotR (Cel z d (Cel x a xL xR) zR))z. d. x. a. xL. xR. zR.

ordered (Cel z d zL (Cel y b yL yR))

→ ordered (rotL (Cel z d zL (Cel y b yL yR)))

12.6. BALANCED TREES

339

Theorem 91. (rotations restore balance)

z. d. x. a. xL. xR. zR.

balanced (xL) balanced (xR) balanced (zR) height (Cel x a xL xR) > height (zR) + 1 → balanced (rotR (Cel z d (Cel x a xL xR) zR))

z. d. y. b. zL. yL. yR.

balanced (zL) balanced (yL) balanced (yR) height (Cel y b yL yR) > height (zL) + 1 → balanced (rotL (Cel z d zL (Cel y b yL yR)))

Theorem 92. (rotations conserve keys)

z. d. x. a. xL. xR. zR.

k (Cel z d (Cel x a xL xR) zR))

k rotR (Cel z d (Cel x a xL xR) zR) and

z. d. x. a. xL. xR. zR.

k (Cel z d (Cel x a xL xR) zR)))

k rotR (Cel z d (Cel x a xL xR) zR))

We are going to assume that we could prove these theorems if someone asked us to, and we will cite them as if we had constructed their proofs.

12.6.4Inductive Equations for Insertion

Insertion of a new element in a tree is a matter of deciding which side of the tree it goes in and inserting it on that side. If there is no room at the top level, simply apply the same idea, inductively, to the subtree where the insertion is to take place.

After the insertion, it may happen that the tree is out of balance. If so, the height of the side where the insertion occurred will be exactly two more than the height of the other side, and a rotation will restore balance.

In formal terms, the operation (ˆ:) satisfies the following equations.

(k,

d)

ˆ:

Nub = (Cel k d Nub Nub)

{ˆ:

N}

(k,

d)

ˆ:

(Cel x a lf rt) =

{ˆ:

C}

if z < x then

if (height newL) > (height rt) + 1 then rotR (Cel k d newL rt) else (Cel x a newL rt)

else if z > x then

if (height newR) > (height lf) + 1 then rotL (Cel x a lf newR) else (Cel x a lf newR)

else (Cel x d lf rt) where

newL = (k, d) ˆ: lf newR = (k, d) ˆ: rt

340

CHAPTER 12. THE AVL TREE MIRACLE

The “where” clause in the second equation names the tree delivered by insertion into the subtree that preserves order. Only one of these insertions will take place, of course. The new (key, data) pair goes into the left subtree if the key is smaller than the key at the root. If it is greater, it goes into the right subtree.

Otherwise, the key to be inserted must be the same as the key at the root. There are many choices for things to do at this point. It might be considered an error to attempt to insert a duplicate key into a tree, in which case it would be best to deliver a value that could signal the problem. One choice of such a value would be the Nub tree, since it could not be the result of inserting a new key into a tree. On the other hand, it might be appropriate to simply associate the new data with the key that matches the key that was to have been inserted. We made the latter choice because it makes our theorems about conservation of keys easier to state.

The insertion equations imply three important properties of the operation: insertion preserves order, preserves balance, and conserves keys. That is, if the right-hand operand of an insertion is an ordered and balanced tree, then insertion will deliver a tree that is ordered and balanced. Furthermore, the tree delivered by insertion will contain all the keys that occur in the right-hand operand, plus the key in the left-hand operand, and it will contain only those keys and no others.

Theorems 93, 94, and 95 state these properties. Tree induction can be used to prove all of these theorems. We are going to carry out only one of them. The pattern of the others is similar, and we will assume we could construct those proofs if someone asked.

Theorem 93. (insertion preserves order).

s. k. d. ordered (s) → ordered ((k, d) ˆ: s)

Theorem 94. (insertion preserves balance).

s. k. d. balanced (s) → balanced ((k, d) ˆ: s)

Theorem 95. (insertion conserves keys)

s. k. d. k ((k, d) ˆ: s)

s. k. d. x. x s → x ((k, d) ˆ: s)

s. k. d. x. (x s x = k) x ((k, d) ˆ: s)

Proof. of Theorem 94 (by tree induction)

12.6. BALANCED TREES

341

Base Case.

k. d. balanced (Nub) → balanced ((k, d) ˆ: Nub)

={ˆ: N }

k. d. balanced (Nub) → balanced (Cel k d Nub Nub) = {bal N }

k. d. True → balanced (Cel k d Nub Nub) = {bal C}

k. d. True True

= {→}

True

Inductive Case.

In this part of the proof, we will apply the {ˆ: C} equation, which takes three forms, depending on the relationship between the key being inserted and the key associated with the tree’s cell constructor (k < x) (k > x) (k = x). To prove an implication with a disjunction in the hypothesis, it is su cient to prove the conclusion separately for each of the terms in the disjunction. (We know this from the following equation of propositional logic: ((a b c) → d) = ((a → d) (b → d) (c → d))

The proof takes di erent forms for each of these separate parts, but the reasoning is similar enough that we are going to carry out just one of the parts, and leave the others to the imagination.

The part we are going to prove is when the key to be inserted is smaller than the key associated with the Cel constructor for the tree. A proof strategy for an implication is to start from the hypothesis and derive the conclusion. The hypothesis of the implication, in the inductive case, is the formula balanced (Cel x a lf rt). Since lf is a proper subtree of (Cel x a lf rt), the induction hypothesis leads to the conclusion that the formula balanced ((k, d)ˆ: lf) is true.

We are proving three implications, and in the first of these we assume that the key, k, to be inserted is less than the key, x, associated with the Cel constructor for the tree. To complete this part of the proof, we must verify that the conclusion of the implication is true when the hypothesis is true and k < x.

Here, there are two cases because the equation {ˆ : C} specifies that (Cel x a ((k, d) ˆ : lf) rt) is the value of ((k, d) ˆ : (Cel x a lf rt)) when | height ((k, d) ˆ: lf) height (rt) | ≤ 1 and specifies rotR ((k, d) ˆ: (Cel x a lf rt)) as the value when height ((k, d) ˆ: lf) > height (rt) + 1.

We reason as follows:

342

CHAPTER 12. THE AVL TREE MIRACLE

balanced ((k, d) ˆ:

(Cel x a lf rt))

 

= {ˆ: C, k < x}

 

balanced (Cel x a ((k, d) ˆ: lf) rt)

 

= {bal }

(Cel x a ((k, d) ˆ:

 

(Cel y a yL yR)

lf) rt).

| height (yL) height (yR) | ≤

1

This “forall” predicate can be partitioned into three logic formulas, all of which must be true:

| height ((k, d) ˆ: lf) height (rt) | ≤ 1

{formula 1}

(Cel y a yL yR)

((k, d) ˆ: lf).

 

{formula

2}

| height (yL) height (yR) |

1

(Cel y a yL yR)

rt.

 

{formula

3}

| height (yL) height(yR) |

1

We have already verified, using the induction hypothesis, that ((k, d) ˆ: lf) is balanced. Therefore, formula 2 must be true. With regard to formula 3, the hypothesis of the implication we are trying to prove is that (Cel x a lf rt) is balanced. Therefore, all its subtrees are balanced, and since rt is one of those, all of the non-empty subtrees of rt must have left and right subtrees with heights that di er by one or less.

With regard to formula 1, we must consider two situations because the equation {ˆ: C} specifies (Cel x a ((k, d) ˆ: lf) rt) as the value of ((k, d) ˆ: (Cel x a lf rt)) when | height ((k, d) ˆ: lf) height (rt) | ≤ 1 and specifies rotR ((k, d) ˆ: (Cel x a lf rt)) as the value when height ((k, d) ˆ: lf) > height (rt) + 1. The first situation verifies formula 1 directly. In the second situation, Theorem 91 (rotations restore balance) implies that formula 1 is true.

This completes the proof of the inductive case of Theorem 94 when k < x. When k > x, the reasoning goes the same way, but down the right-hand side of the tree instead of the left. When k = x, the insertion operation delivers a tree identical to the input, except for the data item in the Cel constructor for the tree. The input tree is balanced, and swapping the data item in its Cel constructor for a di erent one cannot a ect the balance of the tree. By this route, the proof of the inductive case can be completed.

12.6.5Insertion in Logarithmic Time

In the same way that we reasoned, by induction, that the number of computational steps required for retrieval of a data item from a search tree is proportional to the height, we can prove that the number of steps required for insertion is also proportional to the height. The induction for insertion proceeds in the same way as the induction for retrieval, except that some additional basic operations are encountered, such as the Cel and Nub constructors, the rotation operators, and the height computation. If we can convince ourselves

12.6. BALANCED TREES

343

that each of these operations takes a fixed amount of time, regardless of what their operands are, the outcome of the induction will be that the number of steps required for insertion is proportional to tree height.

Nobody will have di culty believing that the Nub constructor takes a fixed amount of time, as it has no operands. The Cel constructor calls for more careful reasoning because it involves building a tree from components, and we want to argue that building the tree does not depend on the size of the components. We can assume that the components have already been constructed, so we just need to verify that if the computer is given a key k, data item d, and two pre-constructed subtrees lf and rt, the number of computational steps required for the computer to deliver the tree (Cel k d lf rt) does not depend on the size of the key, data item, or subtrees.

Compilers for languages like Haskell refer to constructed entities by address. For example, if the subtree lf, involved in the construction of (Cel k d lf rt), has been previously constructed, then it has been recorded somewhere in the computer’s memory. To construct the tree (Cel k d lf rt), the computer simply places, into the structure delivered by the Cel constructor, the address where lf has been recorded. Similarly, it puts the addresses for k, d, and rt in the structure. Because addresses have a fixed size, all of this construction is completed in a fixed amount of time, regardless of the values represented by k, d, lf, and rt. As for the rotation operators, they simply combine a fixed number of Cel and Nub constructions, so they also can be completed in a fixed number of steps.

The height operator is another matter. The equations we have given for it, when used according to the procedures of the equational model of computation, produce a computation in which the number of steps is proportional to the height of the tree. If we were to use a version of the height operator satisfying equations {height N } and {height C}, we would fail to accomplish our goal of producing an insertion operator that can be carried out in a number of steps proportional to the height. Therefore, we need a di erent height operator.

The trick is to record the height directly in the tree, and insert in incrementally, as the tree is being constructed. This requires a new definition for search trees:

data SearchTree d = Nub |

Cel Integer Integer d (SearchTree d) (SearchTree d)

This new definition has a extra Integer parameter for the Cel constructor, and this parameter is used to record the height of the tree. To construct a tree, its height must be specified, along with the key, data item, and two subtrees. The construction (Cel h k d lf rt) denotes a tree of height h with the key k, data item d, and left and right subtrees lf and rt. The new height operator could be specified by the following equations.

height Nub = 0

344

CHAPTER 12. THE AVL TREE MIRACLE

height (Cel h k d lf rt) = h

Delivering the height of a tree simply amounts to pulling it out of an existing structure, an operation that requires a fixed amount of e ort, regardless of the size of the tree.

Of course, all of the Cel constructions that appear in equations must be modified to include the height, but it turns out that this can be done because the height to be recorded by a Cel construction can be computed from the heights of the subtrees supplied for the construction. For example, the easyL rotation

easyL(Cel hz z d zL (Cel hy y b yL yR)) = (Cel hNewy y b (Cel hNewz z d zL yL) yR)

where

hNewy = max hNewz (height yR) hNewz = max (height zL) (height yL)

Similar modifications to the other rotations and to the Cel constructions in the insertion operator lead to an insertion operator that can be computed in an amount of time proportional to the height of the tree in which the insertion occurs.

Since the height of a balanced tree is proportional to the logarithm of the number of data items in the tree (together with all subtrees contained in the tree), insertion can be done in an amount of time proportional to the logarithm of the number of data items in the tree. This is the insertion part of the AVL miracle. The deletion part, which we are going to turn to now, is a little more complicated, but not much.

12.6.6Deletion

The process of deleting a key and its associated data item from a search tree, while maintaining order and balance, depends on where the deleted item occurs in the tree. Wherever it occurs, the Cel constructor at that point forms an entire tree (a subtree of the one we started with). If we can find a way to delete the item associated with a particular Cel constructor, we can build a new subtree containing all the items of the original subtree except the one associated with its Cel constructor. We can then replace the portion of the original tree corresponding to that Cel constructor with the new subtree.

If the new subtree occurs as a proper subtree of the tree as a whole, we need to make sure the resulting tree is balanced. (It will automatically be ordered because all of the keys occurring in the subtree occurred in the original subtree, so they must have stood in the proper relationship to other keys in the tree.) We have deleted a node, so it may happen that the new subtree is shorter than the old one, and this could disrupt the balance of the whole tree. Therefore, we compare the height of the new subtree to the height of the other subtree in the Cel constructor where the new subtree resides. If the heights di er by

12.6. BALANCED TREES

345

z

 

 

 

 

 

deliver

0

 

 

right subtree

 

 

1

 

 

 

ordered

0

 

ordered

balanced

 

1

 

balanced

 

 

 

 

 

Figure 12.7: Deleting Root When Left Subtree Is Nub

more than one, we rotate right or left as necessary, then carry out a similar check and possible rotation one level up in the tree. We continue this process until we come to the Cel constructor for the tree as a whole.

That is the plan. The first step is to figure out how to delete an item associated with the Cel constructor for a tree. That is, how to delete a key that occurs at the root of a search tree. We break this into two cases. As usual, one of the cases is easy to deal with, and the other one isn’t. We will take care of the easy case first.

The easy case is when the left subtree is a Nub. Since the tree is balanced, the height of the right subtree must be either zero or one. There are no items in the left subtree to worry about, so all items (except the one to be deleted) are retained if we deliver the right subtree as the result. Furthermore, the right subtree is ordered and balanced. It has to be because it is a subtree of the original tree, which was ordered and balanced. Figure 12.7 charts the transformation. As in previous diagrams, a triangle represents an ordered and balanced tree, and the numbers near the bottom of the triangle specify the possibilities for the height of the tree represented by the triangle.

When the left subtree is not a Nub tree, the tree may have any height, and there is more to do. Visualizing the tree transformations as diagrammed in Figure 12.8 may make it easier to follow the deletion procedure in this case. The idea is to march down the spine on the right side of the left subtree until we come to a node whose right subtree is Nub. In keeping with the spine metaphor, we call this node the sacrum. (That is the tailbone. You can look it up.) The sacrum is the node with the key x in the leftmost tree shown in Figure 12.8, and the spine consists of the Cel constructor of the left subtree, the Cel constructor of its right subtree, the Cel constructor of the right subtree of that subtree, and so on down to the sacrum.

The left subtree of the sacrum will have height zero or one (otherwise, the tree would have been out of balance). We cache the key and data item of the sacrum, replace it by its left subtree, and perform right rotations as needed for

346

CHAPTER 12. THE AVL TREE MIRACLE

ordered

cache x

 

balanced

 

 

shrink spine

 

z

 

z

 

 

x

 

 

rotR

 

spine

rotR

rotR

x

 

 

 

 

zR

xL

zR

xL

0

 

1

 

0

 

 

 

 

 

1

 

 

 

 

 

 

ordered

z

toss z

x

balanced

x

throw in x

 

 

rotL if nec

 

 

 

 

 

rotR

 

 

 

rotR

 

 

 

rotR

 

 

 

xL

zR

xL

 

0

0

zR

1

 

1

 

Figure 12.8: Deleting Root When Left Subtree Is Cell

balance all the way back up the spine.

Right rotations are needed where the height of the left subtree is two more than that of the right subtree. The left subtrees along the spine will always have heights exceeding that of the corresponding right subtrees by no more than two because the tree was originally balanced, which implies height di erences of one or less. When we replace the subtree rooted at the sacrum (a subtree that has a Nub tree as its right subtree) by its left subtree, the height of the new tree at the position of the sacrum may be one less than it was before. This new tree is the right subtree of another subtree, so that subtree may be too tall by one on the left. If it is, we know from Theorem 91 that a right rotation will rebalance it. We also know, from Theorem 90, that a right rotation will preserve order. Therefore, at each stage the subtree produced is ordered and

Соседние файлы в предмете Дискретная математика