
558 SENSOR NETWORKS

spanning all the sensors, i.e. a schedule has one tree for each round. The lifetime of a schedule equals the lifetime of the system under that schedule. The objective is to find a schedule that maximizes the system lifetime T .

Data aggregation performs in-network fusion of data packets coming from different sensors en route to the sink, in an attempt to minimize the number and size of data transmissions and thus save sensor energy. Such aggregation can be performed when the data from different sensors are highly correlated. As usual, we make the simplifying assumption that an intermediate sensor can aggregate multiple incoming packets into a single outgoing packet.

The problem is to find a data gathering schedule with maximum lifetime for a given collection of sensors and a sink, with known locations and the energy of each sensor, where sensors are permitted to aggregate incoming data packets.

Consider a schedule S with lifetime T cycles. Let f_{i,j} be the total number of packets that node i (a sensor) transmits to node j (a sensor or the sink) in S. The energy constraint at each sensor imposes

    Σ_{j=1}^{n+1} f_{i,j}·Tx_{i,j} + Σ_{j=1}^{n} f_{j,i}·Rx_i ≤ E_i,    i = 1, 2, …, n.
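To make the constraint concrete, a small feasibility check can be written directly from the inequality. The network, costs, and batteries below are hypothetical, not values from the text:

```python
# Check the per-sensor energy constraint of a candidate schedule:
#   sum_j f[i][j]*Tx[i][j] + sum_j f[j][i]*Rx[i] <= E[i]
# Node indices 1..n are sensors, n+1 is the sink; all numbers are invented.

def energy_feasible(f, Tx, Rx, E, n):
    """f[i][j]: packets i sends to j; Tx[i][j]: energy per packet on (i, j);
    Rx[i]: energy per received packet at i; E[i]: battery of sensor i."""
    for i in range(1, n + 1):
        spent = sum(f[i][j] * Tx[i][j] for j in range(1, n + 2) if j != i)
        spent += sum(f[j][i] * Rx[i] for j in range(1, n + 1) if j != i)
        if spent > E[i]:
            return False
    return True

# Two sensors (1, 2) and a sink (3); sensor 2 routes through sensor 1.
n = 2
f = {1: {2: 0, 3: 10}, 2: {1: 10, 3: 0}}
Tx = {i: {j: 1.0 for j in range(1, 4)} for i in (1, 2)}
Rx = {1: 0.5, 2: 0.5}
E = {1: 16.0, 2: 10.0}
print(energy_feasible(f, Tx, Rx, E, n))  # True: sensor 1 spends 10*1 + 10*0.5 = 15 <= 16
```

Tightening sensor 1's battery below 15 makes the same schedule infeasible.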

The schedule S induces a flow network G = (V, E): a directed graph whose nodes are all the sensors and the sink, with an edge (i, j) of capacity f_{i,j} whenever f_{i,j} > 0.

If S is a schedule with lifetime T, and G is the flow network induced by S, then, for each sensor s, the maximum flow from s to the sink t in G is at least T. This is due to the fact that each data packet transmitted from a sensor must reach the base station. The packets from s could possibly be aggregated with one or more packets from other sensors in the network. Intuitively, we need to guarantee that each of the T values from s influences the final value(s) received at the sink. In terms of network flows, this implies that sensor s must have a maximum s–t flow of size at least T to the sink in the flow network G. Thus, a necessary condition for a schedule to have lifetime T is that each node in the induced flow network can push flow T to the sink.
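This necessary condition can be verified with any max-flow routine. Below is a minimal Edmonds-Karp sketch applied to a hypothetical two-sensor induced flow network with lifetime T = 50; the capacities are invented for illustration:

```python
from collections import deque

def max_flow(cap, s, t):
    """Edmonds-Karp on a capacity dict {u: {v: c}}; returns the max s->t flow."""
    # Build the residual graph, adding zero-capacity reverse edges.
    res = {u: dict(nb) for u, nb in cap.items()}
    for u, nb in cap.items():
        for v in nb:
            res.setdefault(v, {}).setdefault(u, 0)
    flow = 0
    while True:
        # BFS for an augmenting path from s to t.
        parent, q = {s: None}, deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return flow
        # Push the bottleneck capacity along the path.
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        aug = min(res[u][v] for u, v in path)
        for u, v in path:
            res[u][v] -= aug
            res[v][u] += aug
        flow += aug

# Hypothetical induced flow network: sensors 1 and 2, sink 0.
G = {2: {1: 30, 0: 20}, 1: {0: 50}, 0: {}}
T = 50
print(all(max_flow(G, s, 0) >= T for s in (1, 2)))  # True
```

Here sensor 2 reaches the sink through the paths 2→0 (20 units) and 2→1→0 (30 units), so both sensors can push 50 units.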

Now, we consider the problem of finding a flow network G with maximum T , that allows each sensor to push flow T to the base station, while respecting the energy constraints at all the sensors. What needs to be found are the capacities of the edges of G. Such a flow network G will be referred to as an admissible flow network with lifetime T . An admissible flow network with maximum lifetime is called an optimal admissible flow network.

An optimal admissible flow network can be found using an integer program with linear constraints. If, for each sensor k = 1, 2, …, n, π^{(k)}_{i,j} is a flow variable indicating the flow that k sends to the sink t over the edge (i, j), the integer program is given by:

Maximize T subject to

    Σ_{j=1}^{n+1} f_{i,j}·Tx_{i,j} + Σ_{j=1}^{n} f_{j,i}·Rx_i ≤ E_i,    i = 1, 2, …, n

    Σ_{j=1}^{n} π^{(k)}_{j,i} = Σ_{j=1}^{n+1} π^{(k)}_{i,j},    for all i = 1, 2, …, n and i ≠ k

    T + Σ_{j=1}^{n} π^{(k)}_{j,k} = Σ_{j=1}^{n+1} π^{(k)}_{k,j}

    0 ≤ π^{(k)}_{i,j} ≤ f_{i,j},    for all i = 1, 2, …, n and j = 1, 2, …, n + 1

    Σ_{i=1}^{n} π^{(k)}_{i,n+1} = T,    k = 1, 2, …, n

The first line imposes the energy constraint per node; the next two enforce flow conservation at each sensor; the next ensures that the capacity constraints on the edges of the flow network are respected; and the last ensures that T units of flow from sensor k reach the sink.
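For a toy instance, the program above can be sketched as a linear-programming relaxation with scipy.optimize.linprog. Two sensors and one sink, with unit Tx/Rx energies and batteries E = 10, are assumptions for illustration, not values from the text:

```python
# LP relaxation of the lifetime-maximization program: sensors 1 and 2, sink 3.
import numpy as np
from scipy.optimize import linprog

edges = [(1, 2), (1, 3), (2, 1), (2, 3)]       # candidate directed edges
var = {"T": 0}                                 # variable indexing
for e in edges:
    var[("f", e)] = len(var)
for k in (1, 2):
    for e in edges:
        var[("p", k, e)] = len(var)            # p = pi^(k) commodity flows
N = len(var)

A_ub, b_ub, A_eq, b_eq = [], [], [], []
row = lambda: np.zeros(N)

# Energy: sum_j f[i,j]*Tx + sum_j f[j,i]*Rx <= E_i   (Tx = Rx = 1, E_i = 10)
for i in (1, 2):
    r = row()
    for e in edges:
        if e[0] == i:
            r[var[("f", e)]] += 1.0            # transmit cost
        if e[1] == i:
            r[var[("f", e)]] += 1.0            # receive cost
    A_ub.append(r); b_ub.append(10.0)

for k in (1, 2):
    # Conservation at the other sensor i != k: inflow - outflow = 0.
    i = 2 if k == 1 else 1
    r = row()
    for e in edges:
        if e[1] == i:
            r[var[("p", k, e)]] += 1.0
        if e[0] == i:
            r[var[("p", k, e)]] -= 1.0
    A_eq.append(r); b_eq.append(0.0)
    # Source k: T + inflow - outflow = 0.
    r = row(); r[var["T"]] = 1.0
    for e in edges:
        if e[1] == k:
            r[var[("p", k, e)]] += 1.0
        if e[0] == k:
            r[var[("p", k, e)]] -= 1.0
    A_eq.append(r); b_eq.append(0.0)
    # Sink receives T units of commodity k.
    r = row(); r[var["T"]] = -1.0
    for e in edges:
        if e[1] == 3:
            r[var[("p", k, e)]] += 1.0
    A_eq.append(r); b_eq.append(0.0)
    # Capacity: pi^(k) on (i, j) cannot exceed f[i, j].
    for e in edges:
        r = row()
        r[var[("p", k, e)]] = 1.0
        r[var[("f", e)]] = -1.0
        A_ub.append(r); b_ub.append(0.0)

c = row(); c[var["T"]] = -1.0                  # maximize T
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, method="highs")
print(round(res.x[var["T"]], 4))
```

With unit energies and E = 10 each sensor can afford 10 transmissions, so the optimum here is T = 10 (each sensor sends directly to the sink).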

Now we can get a schedule from an admissible flow network. A schedule is a collection of directed trees rooted at the sink that span all the sensors, with one such tree for each cycle. Each such tree specifies how data packets are gathered and transmitted to the sink. These trees are referred to as aggregation trees. An aggregation tree may be used for one or more cycles. The number of cycles f for which an aggregation tree is used is indicated by associating the value f with each one of its edges. In the following, f is referred to as the lifetime of the aggregation tree. The depth of a sensor v is the average of its depths in each of the aggregation trees, and the depth of the schedule is max{depth(v) : v ∈ V}.

Figure 14.12 shows an admissible flow network G with lifetime T = 50 and two aggregation trees A1 and A2, with lifetimes 30 and 20, respectively. By looking at one of these trees, say A1, we see that, for each of 30 cycles, sensor 2 transmits one packet to sensor 1, which in turn aggregates it with its own data packet and then sends one data packet to the base station.

Figure 14.12 An admissible flow network G with lifetime 50 rounds and two aggregation trees A1 and A2 with lifetimes 30 and 20 rounds, respectively. The depth of the schedule with aggregation trees A1 and A2 is 2.

Given an admissible flow network G with lifetime T and a directed tree A rooted at the sink t with lifetime f, we define the (A, f)-reduction G′ of G to be the flow network that results from G after reducing the capacities of all of its edges that are also in A by f. We call G′ the (A, f)-reduced G. An (A, f)-reduction G′ of G is feasible if the maximum flow from v to the sink t in G′ is at least T − f for each vertex v in G′. Note that A does not have to span all the vertices of G, and thus it is not necessarily an aggregation tree. Moreover, if A is an aggregation tree with lifetime f for an admissible flow network G with lifetime T, and the (A, f)-reduction of G is feasible, then the (A, f)-reduced flow network G′ of G is an admissible flow network with lifetime T − f. Therefore, we can devise a simple iterative algorithm to construct a schedule for an admissible flow network G with lifetime T, provided we can find such an aggregation tree A.

Aggretree(G, T, t)
    f ← 1
    let A = (Vo, Eo) where Vo = {t} and Eo = ∅
    while A does not span all the nodes of G do
        for each edge e = (i, j) ∈ G such that i ∉ Vo and j ∈ Vo do
            let A′ be A together with the edge e
            // check whether the (A′, 1)-reduction of G is feasible
            let Gr be the (A′, 1)-reduction of G
            if MAXFLOW(v, t, Gr) ≥ T − 1 for all nodes v of G then
                // replace A with A′
                Vo ← Vo ∪ {i}, Eo ← Eo ∪ {e}
                break
    let cmin be the minimum capacity of the edges in A
    let Gr be the (A, cmin)-reduction of G
    if MAXFLOW(v, t, Gr) ≥ T − cmin for all nodes v of G then
        f ← cmin
    replace G with the (A, f)-reduction of G
    return f, G, A

The Aggretree(G, T, t) algorithm can be used to obtain an aggregation tree A with lifetime f from an admissible flow network G with lifetime T ≥ f. The tree A is formed as follows. Initially A contains just the sink t. While A does not span all the sensors, we find and add to A an edge e = (i, j), where i ∉ A and j ∈ A, provided that the (A′, f′)-reduction of G is feasible; here A′ is the tree A together with the edge e, and f′ is the minimum of the capacities of the edges in A′. Given a flow network G and sink t such that each sensor s has a minimum s–t cut of size at least T (i.e. the maximum flow from s to t in G is at least T), we can prove that it is always possible to find a sequence of aggregation trees, via the algorithm, that


can be used to aggregate T data packets from each of the sensors. The proof of correctness is based on minimax theorems in graph theory [116, 117].
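A loose Python sketch of this construction, with a minimal max-flow helper and a hypothetical two-sensor flow network, might look as follows; it mirrors the pseudocode only approximately:

```python
from collections import deque

def max_flow(cap, s, t):
    """Minimal Edmonds-Karp on a capacity dict {u: {v: c}}."""
    res = {u: dict(nb) for u, nb in cap.items()}
    for u, nb in cap.items():
        for v in nb:
            res.setdefault(v, {}).setdefault(u, 0)
    flow = 0
    while True:
        parent, q = {s: None}, deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v, c in res.get(u, {}).items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return flow
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v)); v = parent[v]
        aug = min(res[u][v] for u, v in path)
        for u, v in path:
            res[u][v] -= aug; res[v][u] += aug
        flow += aug

def reduced(cap, tree_edges, amount):
    """Return a copy of cap with tree-edge capacities reduced by amount."""
    r = {u: dict(nb) for u, nb in cap.items()}
    for u, v in tree_edges:
        r[u][v] -= amount
    return r

def aggretree(G, T, t):
    """Grow a tree from the sink, keeping the (A', 1)-reduction feasible."""
    nodes = set(G) | {v for nb in G.values() for v in nb}
    Vo, Eo = {t}, []
    while Vo != nodes:
        for u in sorted(nodes - Vo):
            for v in list(G.get(u, {})):
                if v in Vo and G[u][v] >= 1:
                    Gr = reduced(G, Eo + [(u, v)], 1)
                    if all(max_flow(Gr, s, t) >= T - 1 for s in nodes - {t}):
                        Vo.add(u); Eo.append((u, v))
                        break
            else:
                continue
            break
    f = min(G[u][v] for u, v in Eo)
    Gr = reduced(G, Eo, f)
    if all(max_flow(Gr, s, t) >= T - f for s in nodes - {t}):
        return f, Gr, Eo
    return 1, reduced(G, Eo, 1), Eo

# Hypothetical admissible flow network: sensors 1 and 2, sink 0, lifetime 50.
G = {2: {1: 30, 0: 20}, 1: {0: 50}, 0: {}}
f, Gr, A = aggretree(G, 50, 0)
print(f, sorted(A))
```

On this instance the sketch extracts the tree {(1, 0), (2, 1)} with lifetime 30; the reduced network then still supports the remaining 20 rounds.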

Experimental results show [118] that, for a network with 60 nodes, the above algorithm can improve the network lifetime by a factor of more than 20.

14.7 BOUNDARY ESTIMATION

An important problem in sensor networking applications is boundary estimation [79, 80, 127, 129]. Consider a network sensing a field composed of two or more regions of distinct behavior (e.g. differing mean values for the sensor measurements). An example of such a field is depicted in Figure 14.13(a). In practice this may represent the boundary of an area on fire or of a contaminated area. Boundary estimation is the process of determining the delineation between homogeneous regions. By transmitting to the sink only the information about the boundary, instead of the transmissions from each sensor, a significant aggregation effect can be achieved. There are two fundamental limitations in the boundary estimation problem. First, the accuracy of a boundary estimate is limited by the spatial density of sensors in the network and by the amount of noise associated with the measurement process. Second, energy constraints may limit the complexity of the boundary estimate that is ultimately transmitted to a desired destination.

The objective is to consider measurements from a collection of sensors and determine the boundary between two fields of relatively homogeneous measurements.

(a)

(c)

(b)

(d)

Figure 14.13 Sensing an inhomogeneous field. (a) Points are sensor locations. The environment has two conditions indicated by the gray and white regions of the square.

(b) The sensor network domain is partitioned into square cells. (c) Sensors within the network operate collaboratively to determine a pruned partition that matches the boundary. (d) Final approximation to the boundary between the two regions which is transmitted to a remote point.


We presume a hierarchical structure of 'clusterheads' which manage measurements from nodes below them in the hierarchy. Thus, the nodes in each square of the partition communicate their measurements to a clusterhead in the square. Index the squares at the finest scale by row and column (i, j). The clusterhead in square (i, j) computes the average of these measurements to obtain a value x_{i,j} ∼ N(μ_{i,j}, σ²/m_{i,j}), where μ_{i,j} is the mean value, σ² is the noise variance of each sensor measurement, and m_{i,j} is the number of nodes in square (i, j). Thus we assume sensor measurements that have a Gaussian distribution. For simplicity, we assume m_{i,j} = 1. The random distribution accounts for noise in the system as well as for the small probability of node failure.
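A quick Monte Carlo illustration of the averaging model: the mean of m i.i.d. N(μ, σ²) readings is distributed N(μ, σ²/m), so the clusterhead's value has variance shrinking as 1/m. The text then sets m_{i,j} = 1; here m = 25 is used (all parameters hypothetical) to make the variance reduction visible:

```python
# Clusterhead averaging: mean of m i.i.d. N(mu, sigma^2) readings ~ N(mu, sigma^2/m).
import random

random.seed(7)
mu, sigma, m, trials = 2.0, 1.0, 25, 4000
cell_means = [
    sum(random.gauss(mu, sigma) for _ in range(m)) / m for _ in range(trials)
]
avg = sum(cell_means) / trials
var = sum((v - avg) ** 2 for v in cell_means) / trials
# Empirical mean should be near mu; empirical variance near sigma^2/m = 0.04.
print(abs(avg - mu) < 0.05, abs(var - sigma**2 / m) < 0.01)
```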

A possible approach to the boundary estimation problem is to devise a hierarchical processing strategy that enables the nodes to collaboratively determine a nonuniform rectangular partition of the sensor domain that is adapted to the boundaries [119–125]. The partition will have high, fine resolution along the boundary, and low, coarse resolution in homogeneous regions of the field, as depicted in Figure 14.13. The partition effectively provides a ‘staircase’-like approximation to the boundary.

The estimation process partitions the sensor domain, a normalized unit square, into n sub-squares of sidelength 1/√n, as shown in Figure 14.13(b). The sidelength 1/√n is the finest resolution in the analysis. In principle, this initial partition can be generated by a recursive dyadic partition (RDP). First divide the domain into four sub-squares of equal size. Repeat this process again on each sub-square. Repeat this J = (1/2) log₂ n times. This gives rise to a complete RDP of resolution 1/√n (the rectangular partition of the sensing domain shown in Figure 14.13(b)). The RDP process can be represented with a quadtree structure. The quadtree can be pruned back to produce an RDP with nonuniform resolution, as shown in Figure 14.13(c). The key issues are: (1) how to implement the pruning process in the sensor network; and (2) how to determine the best pruned tree.

Let Pn denote the set of all RDPs, including the initial complete RDP and all possible prunings. For a certain RDP P ∈ Pn, on each square of the partition, the estimator of the field averages the measurements from the sensors in that square and sets the estimate of the field to that average value. This results in a piecewise constant estimate of the field, denoted by θ. This estimator is compared with the data x = {x_{i,j}}. The empirical measure of performance is the sum of squared errors between θ = θ(P) and the data x = {x_{i,j}}:

    Δ(θ, x) = Σ_{i,j=1}^{√n} [θ(i, j) − x_{i,j}]²    (14.1)

The complexity penalized estimator is defined by References [119–125]:

    θ̂_n = arg min_{θ(P), P ∈ Pn} { Δ[θ(P), x] + 2σ²p(n)N_{θ(P)} }    (14.2)

where σ² is the noise variance, N_{θ(P)} denotes the total number of squares in the partition P, and p(n) is a monotonically increasing function of n that discourages unnecessarily high-resolution partitions [appropriate choices of p(n) are discussed below]. The optimization in Equation (14.2) can be solved using a bottom-up tree pruning algorithm in O(n) operations [122, 126, 128]. At each level of the hierarchy, a clusterhead receives the best sub-partition/subtree estimates from the four clusterheads below it, and compares the total cost of these estimates with the cost of the estimate equal to the average of all sensors in that cluster, to make the pruning decision.
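The bottom-up pruning step can be sketched as a recursion that, for each dyadic square, compares the penalized cost of keeping the square as a single leaf against the summed cost of its four children. The grid values and the penalty 2σ²p(n) below are illustrative:

```python
# Bottom-up tree pruning for the complexity-penalized estimator of Equation (14.2),
# sketched on a 2^J x 2^J grid x of clusterhead averages; pen = 2*sigma^2*p(n).
def prune(x, r, c, size, pen):
    """Return (penalized cost, number of leaves) for the square at (r, c)."""
    cells = [x[i][j] for i in range(r, r + size) for j in range(c, c + size)]
    mean = sum(cells) / len(cells)
    leaf_cost = sum((v - mean) ** 2 for v in cells) + pen
    if size == 1:
        return leaf_cost, 1
    h = size // 2
    split_cost, split_leaves = 0.0, 0
    for dr in (0, h):                      # recurse into the four quadrants
        for dc in (0, h):
            cost, leaves = prune(x, r + dr, c + dc, h, pen)
            split_cost += cost
            split_leaves += leaves
    if split_cost < leaf_cost:             # keep the split only if it pays off
        return split_cost, split_leaves
    return leaf_cost, 1

# Homogeneous field: pruning collapses to a single leaf.
flat = [[1.0] * 8 for _ in range(8)]
print(prune(flat, 0, 0, 8, pen=0.5)[1])    # 1

# A sharp boundary down the middle forces a finer partition.
step = [[0.0] * 4 + [1.0] * 4 for _ in range(8)]
cost, leaves = prune(step, 0, 0, 8, pen=0.5)
print(leaves > 1)                          # True
```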


14.7.1 Number of RDPs in P

The set P of RDPs consists of all RDPs resulting from pruning P_J, the uniform partition of the unit square into n squares of sidelength 1/√n. We need to determine how many RDPs there are in P or, more specifically, how many partitions there are with exactly ℓ squares/leafs. Since the RDP is based on recursive splits into four, the number of leafs in every partition in P is of the form ℓ = 3m + 1, for some integer 0 ≤ m ≤ (n − 1)/3. The integer m corresponds to the number of recursive splits. For each RDP having 3m + 1 leafs there is a corresponding partially ordered sequence of m split points (at dyadic positions in the plane). In general, there are

    (n choose m) = n!/[(n − m)!m!]

possible selections of m points from n (n corresponding to the vertices of the finest resolution partition P_J). This number is an upper bound on the number of partitions in P with ℓ = 3m + 1 leafs (since RDPs can only have dyadic split points).
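A quick numeric sanity check of this count, together with the bound C(n, m) ≤ n^m/m! used in the Kraft-inequality argument; the check is exact in integers because C(n, m)·m! = n(n − 1)···(n − m + 1) ≤ n^m:

```python
# Verify C(n, m) * m! <= n^m for all relevant m at a sample n.
from math import comb, factorial

n = 64
for m in range((n - 1) // 3 + 1):
    assert comb(n, m) * factorial(m) <= n**m
print("bound holds for n =", n)
```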

14.7.2 Kraft inequality

Let Θ_n denote the set of all possible models of the field. This set contains piecewise constant models (constant on the dyadic squares corresponding to one of the partitions in Pn). The constant values lie in a prescribed range [−R, R] and are quantized to k bits. The range corresponds to the upper and lower limits of the amplitude range of the sensors. The set Θ_n thus consists of a finite number of models derived in the previous section. Here we show that, with the number of bits k employed per transmission and p(n) properly calibrated, we have

    Σ_{θ ∈ Θ_n} e^{−p(n)|θ|} ≤ 1    (14.3)

where, for simplicity of notation, N_{θ(P)} = |θ| is used. If Θ_n^{(m)} denotes the subset of Θ_n consisting of models based on ℓ = 3m + 1 leaf partitions, then we have

    Σ_{θ ∈ Θ_n} e^{−p(n)|θ|} = Σ_{m=0}^{(n−1)/3} Σ_{θ ∈ Θ_n^{(m)}} e^{−p(n)|θ|}
        ≤ Σ_{m=0}^{(n−1)/3} (n choose m) (2^k)^{3m+1} e^{−(3m+1)p(n)}
        ≤ Σ_{m=0}^{(n−1)/3} (n^m/m!) (2^k)^{3m+1} e^{−(3m+1)p(n)}
        = Σ_{m=0}^{(n−1)/3} (1/m!) e^{m log n + (3m+1) log(2^k) − (3m+1)p(n)}

If A ≡ m log n + (3m + 1) log(2^k) − (3m + 1)p(n) < −1 (so that e^A < e^{−1}), then we have

    Σ_{θ ∈ Θ_n} e^{−p(n)|θ|} ≤ (1/e) Σ_{m=0}^{(n−1)/3} 1/m! ≤ 1    (14.4)

To guarantee A < −1, we must have p(n) growing at least like log n. Therefore, set p(n) = γ log n, for some γ > 0. Also, as we will see in the next section, to guarantee that the quantization of our models is sufficiently fine to contribute a negligible amount to the overall error, we must select 2^k ≍ n^{1/4}. With these calibrations we have A = [(7/4 − 3γ)m + (1/4 − γ)] log n. In order to guarantee that the MSE converges to zero, we will see in the next section that m must be a monotonically increasing function of n. Therefore, for sufficiently large n, the term involving (1/4 − γ) is negligible, and the condition A < −1 is satisfied by γ > 7/12. In References [119–125] γ = 2/3 is used.
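The calibration can be checked numerically. A sketch that evaluates the bounding sum from the derivation in log space (to avoid overflow), with 2^k = n^{1/4} and p(n) = (2/3) log n:

```python
# Numeric check that the calibrated Kraft-type bounding sum stays below 1.
from math import comb, exp, log

def kraft_sum(n, gamma=2.0 / 3.0):
    """sum_m C(n,m) (2^k)^(3m+1) e^{-(3m+1)p(n)}, computed in log space."""
    p = gamma * log(n)
    log_two_k = 0.25 * log(n)      # calibration 2^k = n^{1/4}
    total = 0.0
    for m in range((n - 1) // 3 + 1):
        log_term = log(comb(n, m)) + (3 * m + 1) * (log_two_k - p)
        total += exp(log_term)
    return total

print(kraft_sum(256) <= 1.0)
```

For n = 256 the sum is roughly 0.1, comfortably inside the Kraft bound.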

14.7.3 Upper bounds on achievable accuracy

Assume that p(n) satisfies the condition defined by Equation (14.4), where again |θ| denotes the number of squares (alternatively, we shall call this the number of leafs in the pruned-tree description of the boundary) in the partition θ. It was shown in the section above that p(n) = γ log n satisfies Equation (14.4). Let θ̂_n denote the solution to

    θ̂_n = arg min_{θ ∈ Θ_n} { Δ(θ, x) + 2σ²p(n)|θ| }    (14.5)

where, as before, x denotes the array of measurements at the finest scale {x_{i,j}}, and |θ| denotes the number of squares in the partition associated with θ. This is essentially the same estimator as defined in Equation (14.2), except that the values of the estimate are quantized in this case.

 

If θ*_n denotes the true value of the field at resolution 1/√n [i.e. θ*_n(i, j) = E[x_{i,j}]], then, applying Theorem 7 in References [119, 124], the MSE of the estimator θ̂_n is bounded above as

    (1/n) Σ_{i,j=1}^{√n} E{[θ̂_n(i, j) − θ*_n(i, j)]²} ≤ min_{θ ∈ Θ_n} { (1/n) Σ_{i,j=1}^{√n} 2[θ(i, j) − θ*_n(i, j)]² + 8σ²p(n)|θ|/n }    (14.6)

The upper bound involves two terms. The first term, (2/n) Σ_{i,j=1}^{√n} [θ(i, j) − θ*_n(i, j)]², is a bound on the bias or approximation error. The second term, 8σ²p(n)|θ|/n, is a bound on the variance or estimation error. The bias term, which measures the squared error between the best possible model in the class and the true field, is generally unknown. However, if we make certain assumptions on the smoothness of the boundary, then the rate at which this term decays as a function of the partition size |θ| can be determined.

If the field being sensed is composed of homogeneous regions separated by a one-dimensional boundary, and if the boundary is a Lipschitz function [122, 128], then by carefully calibrating quantization and penalization [taking k ≍ (1/4) log n and setting p(n) = (2/3) log n] we have [119, 125]

    (1/n) Σ_{i,j=1}^{√n} E{[θ̂_n(i, j) − θ*_n(i, j)]²} ≤ O(√[(log n)/n])    (14.7)

This result shows that the MSE decays to zero at a rate of √[(log n)/n].
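A trivial numeric check that the bound in Equation (14.7) shrinks with network size:

```python
# sqrt(log(n)/n) decreases monotonically for the sizes checked, so denser
# networks yield smaller MSE bounds.
from math import log, sqrt

bounds = [sqrt(log(n) / n) for n in (10**2, 10**3, 10**4, 10**5)]
print(all(a > b for a, b in zip(bounds, bounds[1:])))  # True
```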

14.7.4 System optimization

The system optimization includes an energy-accuracy trade-off. Energy consumption is defined by two communication costs: the cost of communication due to the construction of the tree (processing cost) and the cost of communicating the final boundary estimate (communication cost). We will show that the expected number of leafs produced by the algorithm is O(√n), and that the processing and communication energy consumption is proportional to this number. Having in mind MSE ≍ √[(log n)/n] and ignoring the logarithmic factor, the accuracy-energy trade-off required to achieve this optimal MSE is roughly MSE ≍ 1/energy. If each of the n sensors transmits its data, directly or by multiple hops, to an external point, the processing and communication energy costs are O(n), which leads to the trade-off MSE ≍ 1/energy, since we know that no estimator exists that can result in an MSE decaying faster than O(1/n). Thus, the hierarchical boundary estimation method offers substantial savings over the naive approach while optimizing the trade-off between accuracy and complexity of the estimate.

Communication cost is proportional to the final description of the boundary; thus it is of interest to compute the expected size of the tree, E[|θ̂|]. We construct an upper bound for E[|θ̂|] under the assumption of a homogeneous field with no boundary. Let P denote the tree-structured partition associated with θ̂. Note that, because P is an RDP, it can have d + 1 leafs (pieces in the partition), where d = 3m, m = 0, …, (n − 1)/3. Therefore, the expected number of leafs is given by

    E[|θ̂|] = Σ_{m=0}^{(n−1)/3} (3m + 1) Pr(|θ̂| = 3m + 1)

The probability Pr(|θ̂| = 3m + 1) can be bounded from above by the probability that one of the possible partitions with 3m + 1 leafs, m > 0, is chosen in favor of the trivial partition with just a single leaf. That is, the event that one of the partitions with 3m + 1 leafs is selected implies that partitions of all other sizes were not selected, including the trivial partition, from which the upper bound follows. This upper bound allows us to bound the expected number of leafs as follows:

    E[|θ̂|] ≤ Σ_{m=0}^{(n−1)/3} (3m + 1) N_m p_m

where N_m denotes the number of different (3m + 1)-leaf partitions, and p_m denotes the probability that a particular (3m + 1)-leaf partition is chosen in favor of the trivial partition (under the homogeneity assumption). The number N_m can be bounded above by (n choose m), just as in the verification of the Kraft inequality. The probability p_m can be bounded as follows. Note this is the probability of a particular outcome of a comparison of two models. The comparison is made between their respective sum-of-squared errors plus complexity penalty, as given by Equation (14.2). The single-leaf model has a single degree of freedom (the mean value of the entire region), and the alternate model, based on the (3m + 1)-leaf partition, has 3m + 1 degrees of freedom. Thus, under the assumption that the data are i.i.d. zero-mean Gaussian distributed with variance σ², it is easy to verify that the difference between the sum-of-squared errors of the models [single-leaf model sum-of-squares minus (3m + 1)-leaf model sum-of-squares] is distributed as σ²W_{3m}, where W_{3m} is a chi-square distributed random variable with 3m degrees of freedom (precisely the difference between the degrees of freedom of the two models). This follows from the fact that the difference of the sum-of-squared errors is equal to the sum-of-squares of an orthogonal projection of the data onto a 3m-dimensional subspace.

The single-leaf model is rejected if σ²W_{3m} is greater than the difference between the complexity penalties associated with the two models; that is, if σ²W_{3m} > (3m + 1)2σ²p(n) − 2σ²p(n) = 6mσ²p(n), where 2σ²p(n) is the penalty associated with each additional leaf in P. According to the MSE analysis in the previous section, we require p(n) = γ log n, with γ > 7/12. In References [119–125] γ = 2/3, in which case the rejection of the single-leaf model is equivalent to W_{3m} > 4m log n. The probability of this condition, p_m = Pr(W_{3m} > 4m log n), is bounded from above using Lemma 1 of Laurent and Massart [130]: 'If W_d is chi-square distributed with d degrees of freedom, then for s > 0, Pr(W_d ≥ d + s√(2d) + s²) ≤ e^{−s²/2}.' Making the identification d + s√(2d) + s² = 4m log n produces the bound

    p_m = Pr(W_{3m} > 4m log n) ≤ e^{−2m log n + m√[3/2(4 log n − 3/2)]}

Combining the upper bounds above, we have

    E[|θ̂|] ≤ Σ_{m=0}^{(n−1)/3} (3m + 1) (n choose m) e^{−2m log n + m√[3/2(4 log n − 3/2)]}    (14.8)
           ≤ Σ_{m=0}^{(n−1)/3} (3m + 1) (n^m/m!) e^{−2m log n + m√[3/2(4 log n − 3/2)]}
           = Σ_{m=0}^{(n−1)/3} [(3m + 1)/m!] e^{−m log n + m√[3/2(4 log n − 3/2)]}

For n ≥ 270 the exponent −log n + √[3/2(4 log n − 3/2)] ≤ 0, and therefore

    E[|θ̂|] ≤ Σ_{m=0}^{(n−1)/3} (3m + 1)/m! < 11

Furthermore, note that, as n → ∞, the exponent −log n + √[3/2(4 log n − 3/2)] → −∞. This fact implies that the factor e^{−m log n + m√[3/2(4 log n − 3/2)]} tends to zero when m > 0. Therefore, the expected number of leafs E[|θ̂|] → 1 as n → ∞.

Thus, for large sensor networks, the expected number of leafs (partition pieces) in the case where there is no boundary (simply a homogeneous field) is one. To consider the inhomogeneous case where a boundary does exist: if the boundary is a Lipschitz function or has a box counting dimension of 1, there exists a pruned RDP with at most C√n squares (leafs) that includes the O(√n) squares of sidelength 1/√n that the boundary passes through. Thus an upper bound on the number of leafs required to describe the boundary in the noiseless case is given by C√n.

In the presence of noise, we can use the results above for the homogeneous case to bound the number of spurious leafs due to noise (vanishing as n grows); as a result, for large sensor networks, we can expect at most C√n leafs in total. Thus, the expected energy required to transmit the final boundary description is energy = O(√n).

The processing cost is intimately tied to the expected size of the final tree, as this value determines how much pruning will occur. We have seen above that the communication cost is proportional to √n, and here we shall show that the processing cost is also O(√n). At each scale 2^j/√n, j = 0, …, (1/2) log₂ n − 1, the hierarchical algorithm passes a certain number of data or averages, n_j, corresponding to the number of squares in the best partition (up to that scale), up the tree to the next scale. We assume that a constant number of bits k is transmitted per measurement. These k·n_j bits must be transmitted approximately 2^j/√n meters (assuming the sensor domain is normalized to 1 square meter). Thus, the total in-network communication energy in bit-meters is:

 

    ε = k Σ_{j=0}^{(1/2) log₂ n − 1} n_j 2^j/√n


[Figure 14.14 panels: 65536 observations; Est., p = 2/3 log(65536); Partition, |θ| = 1111. 1024 observations; Est., p = 2/3 log(1024); Partition, |θ| = 172. 256 observations; Est., p = 2/3 log(256); Partition, |θ| = 70.]

Figure 14.14 Effect of sensor network density (resolution) on boundary estimation. Column 1, noisy set of measurements; column 2, estimated boundary; and column 3, associated partition. (Reproduced by permission of IEEE [121].)

In the naive approach, n_j = n for all j, and therefore ε ≈ kn. In the hierarchical approach, first consider the case when there is no boundary. We have already seen that in such cases the tree will be pruned at each stage with high probability. Therefore, n_j = n/4^j and ε ≤ 2k√n. Now if a boundary of length C√n is present, then n_j ≈ n/4^j + C√n. This produces ε ≤ k(C + 2)√n. Thus, we see that the hierarchical algorithm results in ε = O(√n). Finally, a performance example is shown in Figure 14.14 [121].
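The n_j models above can be plugged into the energy formula numerically; k and C are illustrative constants:

```python
# Compare in-network communication energy (bit-meters) of the naive and
# hierarchical schemes: n_j = n (naive), n_j = n/4^j (no boundary),
# n_j = n/4^j + C*sqrt(n) (boundary present).
from math import log2, sqrt

def energy(n, k, nj):
    J = int(0.5 * log2(n))
    return k * sum(nj(j) * 2**j / sqrt(n) for j in range(J))

n, k, C = 4**8, 1.0, 2.0
naive = energy(n, k, lambda j: n)
no_boundary = energy(n, k, lambda j: n / 4**j)
boundary = energy(n, k, lambda j: n / 4**j + C * sqrt(n))
print(naive <= k * n,
      no_boundary <= 2 * k * sqrt(n),
      boundary <= k * (C + 2) * sqrt(n))  # True True True
```

With n = 65536 the naive cost is near kn while the hierarchical cost stays within (C + 2)k√n, matching the O(√n) claim.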

14.8 OPTIMAL TRANSMISSION RADIUS IN SENSOR NETWORKS

In this section we discuss the problem of finding an optimal transmission radius for flooding in sensor networks. On one hand, a large transmission radius implies that fewer retransmissions will be needed to reach the outlying nodes in the network; therefore, the message will be heard by all nodes in less time. On the other hand, a larger transmission radius involves a higher number of neighbors competing to access the medium, and therefore each node has a longer contention delay for packet transmissions. In this section we discuss this tradeoff in CSMA/CA wireless MAC protocols.

Even though flooding has some unique advantages (it maximizes the probability that all reachable nodes inside a network will receive the packet), it has several disadvantages as well. Several works have proposed mechanisms to improve flooding efficiency. The broadcast storm paper by Ni et al. [131] suggests a way to improve flooding by trading off some robustness. The authors propose to limit the number of nodes that retransmit the flooded packet: the main idea is to have some nodes refrain from forwarding a packet if its transmission would not contribute to larger coverage. Nevertheless, the basic flooding technique is in wide use for a number of querying techniques in sensor networks (in large part because of its guarantee of maximal robustness), and in this section we focus on analyzing its MAC-layer effects and improving its performance by minimizing the settling time of flooding.