
5. Systolic Adaptive Beamforming


Now from (5.5) it follows that the vector of residuals may be written in the form

e(n) = X(n)w + y(n) ,

(5.10)

where

X(n) = B(n) \begin{bmatrix} x^T(t_1) \\ x^T(t_2) \\ \vdots \\ x^T(t_n) \end{bmatrix}

(5.11)

and

y(n) = B(n) [y(t_1), y(t_2), \ldots, y(t_n)]^T .

(5.12)

X(n) is simply the n x (p - 1) matrix of all data received in the auxiliary channels up to time t_n, and y(n) is the corresponding n-element vector of data in the primary or reference channel. The matrix B(n) = diag(β^{n-1}, β^{n-2}, ..., β, 1) takes account of the exponential time window and, for convenience, it has simply been absorbed into the definition of e(n), y(n) and X(n).
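To make the construction concrete, the following sketch (an editorial illustration, assuming NumPy; the dimensions, forgetting factor, and variable names are ours) forms the windowed quantities of (5.11) and (5.12):

import numpy as np

rng = np.random.default_rng(0)
n, p, beta = 100, 5, 0.99                 # illustrative sizes and forgetting factor

# Raw snapshots: auxiliary rows x^T(t_k) and primary samples y(t_k)
snaps = rng.standard_normal((n, p - 1)) + 1j * rng.standard_normal((n, p - 1))
prim = rng.standard_normal(n) + 1j * rng.standard_normal(n)

# B(n) = diag(beta^(n-1), ..., beta, 1) absorbed into X(n) and y(n)
b = beta ** np.arange(n - 1, -1, -1)
X = b[:, None] * snaps                    # X(n) of (5.11)
y = b * prim                              # y(n) of (5.12)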

Determination of the (p - 1)-element weight vector w(n) which minimizes E²(n) is referred to as least-squares estimation [5.13]. The conventional approach to this problem is to derive an analytical expression for the complex gradient of the quantity E²(n) and determine the weight vector w(n) for which it vanishes. Now from (5.7) and (5.10) we have for the complex gradient

∇E²(n) = 2X^H(n)[X(n)w + y(n)] ,

(5.13)

(where the superscript H denotes matrix Hermitian conjugation), and setting the right-hand side of this equation equal to zero leads to the well-known Gauss normal equation

M(n)w(n) + ρ(n) = 0 ,

(5.14)

where

M(n) = X^H(n)X(n)

(5.15)

and

ρ(n) = X^H(n)y(n) .

(5.16)

M(n) is the estimated (p - 1) x (p - 1) covariance matrix and ρ(n) is the estimated (p - 1)-element cross-correlation vector between the auxiliary signals and the primary signal. The solution to (5.14) for nonsingular M(n) is clearly given by

w(n) = -M^{-1}(n)ρ(n) ,

(5.17)

and this provides an analytic expression for the optimum weight vector at time t_n.
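Continuing the sketch begun above (NumPy assumed), the sample-matrix route of (5.14)-(5.17) takes only a few lines:

M = X.conj().T @ X                        # estimated covariance matrix, (5.15)
rho = X.conj().T @ y                      # cross-correlation vector, (5.16)
w = -np.linalg.solve(M, rho)              # optimum weight vector, (5.17)

e = X @ w + y                             # residual vector, (5.10)
assert np.allclose(X.conj().T @ e, 0)     # at the optimum, X^H(n)e(n) = 0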

In their classic paper, Reed et al. [5.14] suggested that the weight vector be obtained by solving (5.14) directly and showed that the problems of poor convergence associated with closed loop algorithms may be avoided in this way. This approach leads directly to the type of signal processing architecture which is illustrated schematically in Fig. 5.3. It comprises a number of distinct components: one to form and store the covariance matrix estimate, one to compute the solution to (5.14), and one to apply the resulting weight vector to the received signal data. These data must be stored in a suitable memory while the weight vector is being computed. The system also requires a number of high-speed data communication buses and a sophisticated control unit to deliver the appropriate sequence of instructions to each component. This type of architecture is obviously complicated, extremely difficult to design, and not very suitable for VLSI.

Not only does the analytic solution to (5.14) lead to a complicated circuit architecture, but it is also very poor from the numerical point of view. The problem of solving a system of linear equations like those defined in (5.14) can be ill-conditioned and hence numerically unstable. Ill-conditioning occurs if the

 

[Figure: block schematic. The primary and auxiliary channel data are collected in a data memory; one block forms the covariance matrix estimate, another solves the resulting linear system, and the weights are applied to the stored data to produce the output signal, all under a central system control unit.]

Fig. 5.3. Schematic of operations required for the Sample Matrix Inversion method


matrix has a very small determinant, in which case the true solution can be subject to large perturbations and still satisfy the equation quite accurately. The degree to which a system of linear equations is ill-conditioned is determined by the condition number of the coefficient matrix. The condition number C(A) of a matrix A is defined by

C(A) = λ_1/λ_N ,

(5.18)

where λ_1 and λ_N are the largest and smallest singular values, respectively, of the matrix A. The larger C(A), the more ill-conditioned is the system of equations. It follows from (5.15) that

C(M(n)) = C(X^H(n)X(n)) = C²(X(n))

(5.19)

and so the condition number of the estimated covariance matrix M(n) is much greater than that of the corresponding data matrix X(n). Any numerical algorithm which avoids forming the covariance matrix explicitly and operates directly on the data is likely to be much better conditioned.
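The squaring in (5.19) is easy to verify numerically on the sketch's data:

# numpy's cond() uses the singular-value ratio of (5.18) by default
assert np.isclose(np.linalg.cond(M), np.linalg.cond(X) ** 2)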

5.4 QR Decomposition by Givens Rotations

5.4.1 QR Decomposition

An alternative approach to the least-squares estimation problem which is particularly good in the numerical sense is that of orthogonal triangularization [5.15]. This is typified by the method known as QR decomposition, which we generalize here to the case of complex data. An n x n unitary matrix Q(n) is generated such that

Q(n)X(n) = \begin{bmatrix} R(n) \\ 0 \end{bmatrix} ,

(5.20)

where R(n) is a (p - 1) x (p - 1) upper triangular matrix. Applying the same unitary transformation to the vector y(n), we define

Q(n)y(n) = \begin{bmatrix} u(n) \\ v(n) \end{bmatrix} ,

(5.21)

i.e.,

 

 

u(n) = P(n)y(n)

(5.22)

and

 

 

v(n) = S(n)y(n) ,

(5.23)

where P(n) and S(n) are the matrices of dimension (p - 1) x n and (n - p + 1) x n, respectively, which partition Q(n) in the form

Q(n) = \begin{bmatrix} P(n) \\ S(n) \end{bmatrix} .

(5.24)
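In the running sketch, the decomposition and partition can be realized with NumPy's complete QR factorization. Note that numpy returns X = Q_np R, so the Q(n) of (5.20) corresponds to the conjugate transpose of numpy's factor:

Qnp, Rfull = np.linalg.qr(X, mode='complete')   # X = Qnp @ Rfull
Q = Qnp.conj().T                                # Q @ X = [R; 0], as in (5.20)
R = Rfull[:p - 1, :]                            # (p-1) x (p-1) upper triangular R(n)

P, S = Q[:p - 1, :], Q[p - 1:, :]               # the partition (5.24)
u, v = P @ y, S @ y                             # (5.22) and (5.23)
assert np.allclose(Q @ y, np.concatenate([u, v]))   # consistency with (5.21)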

From (5.10), (5.20), and (5.21) it is clear that

 

Q(n)e(n) = \begin{bmatrix} R(n) \\ 0 \end{bmatrix} w + \begin{bmatrix} u(n) \\ v(n) \end{bmatrix} ,

(5.25)

and, since the matrix Q(n) is unitary, we have

 

E(n) = ‖e(n)‖ = ‖Q(n)e(n)‖ .

(5.26)

It follows that the least-squares weight vector w(n) must satisfy the equation

R(n)w(n) + u(n) = 0 ,

(5.27)

in which case the minimum norm residual vector is given by

 

Q(n)e(n) = \begin{bmatrix} 0 \\ v(n) \end{bmatrix} .

(5.28)

Hence we have

 

e(n) = S^H(n)v(n)

(5.29)

and

 

E(n) = ‖v(n)‖ .

(5.30)

Since the matrix R(n) is upper triangular, (5.27) is much easier to solve than the Gauss normal equation described earlier. The weight vector w(n) may be derived quite simply by a process of back-substitution. Equation (5.27) is also much better conditioned since the condition number of R(n) is given by

C[R(n)] = C[Q(n)X(n)] = C[X(n)] .

(5.31)

This property follows directly from the fact that Q(n) is unitary.
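Back-substitution on (5.27) then reproduces the weight vector of (5.17) without ever forming M(n) (continuing the sketch; the helper name is ours):

def back_substitute(R, b):
    # Solve the upper-triangular system R w = b by back-substitution
    m = R.shape[0]
    w = np.zeros(m, dtype=complex)
    for i in range(m - 1, -1, -1):
        w[i] = (b[i] - R[i, i + 1:] @ w[i + 1:]) / R[i, i]
    return w

w_qr = back_substitute(R, -u)             # R(n)w(n) + u(n) = 0, (5.27)
assert np.allclose(w_qr, w)               # agrees with the normal-equation solution
assert np.isclose(np.linalg.norm(X @ w_qr + y), np.linalg.norm(v))   # (5.30)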

5.4.2 Givens Rotations

The triangularization process may be carried out using either Householder transformations [5.15] or Givens rotations [5.16]. However, the Givens rotation method is particularly suitable for the adaptive antenna application since it leads to a very efficient algorithm whereby the triangularization process is recursively updated as each new row of data enters the problem. A complex Givens rotation is an elementary transformation of the form

\begin{bmatrix} c & s^* \\ -s & c \end{bmatrix} \begin{bmatrix} 0 & \cdots & 0 & βr_i & \cdots & βr_k & \cdots \\ 0 & \cdots & 0 & x_i & \cdots & x_k & \cdots \end{bmatrix} = \begin{bmatrix} 0 & \cdots & 0 & r_i′ & \cdots & r_k′ & \cdots \\ 0 & \cdots & 0 & 0 & \cdots & x_k′ & \cdots \end{bmatrix} ,

(5.32)

 

where we have, for generality, included an explicit scaling factor β. The rotation coefficients, c and s, satisfy

-sβr_i + cx_i = 0 ,    s*s + c*c = 1 ,    c* = c

(5.33)

 

and are a generalization of the cosine and sine for an angular rotation in a two-dimensional complex space. These relationships uniquely specify the rotation coefficients as

c = βr_i/r_i′

(5.34)

and

s = x_i/r_i′ ,

(5.35)

where r_i′ = (β²r_i² + |x_i|²)^{1/2}, and r_i is defined to be real and nonnegative; as a consequence, r_i′ is also real and nonnegative.
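The coefficients (5.33)-(5.35) translate directly into a small helper (a sketch; the function name and the zero-handling convention are ours):

def givens(r_i, x_i, beta=1.0):
    # Rotation coefficients eliminating x_i against beta*r_i, per (5.34)-(5.35)
    r_new = np.sqrt((beta * r_i) ** 2 + np.abs(x_i) ** 2)   # r_i', real and nonnegative
    if r_new == 0.0:
        return 1.0, 0.0, 0.0
    return beta * r_i / r_new, x_i / r_new, r_new           # c (real), s, r_i'

# The rotation [[c, s*], [-s, c]] annihilates the second component:
c, s, r_new = givens(2.0, 1.0 + 1.0j)
out = np.array([[c, np.conj(s)], [-s, c]]) @ np.array([2.0, 1.0 + 1.0j])
assert np.isclose(out[1], 0.0) and np.isclose(out[0], r_new)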

A sequence of such elimination operations may be used to triangularize the matrix X(n) in the following recursive manner. Assume that the matrix X(n - 1) has already been reduced to upper triangular form by the unitary transformation

Q(n - 1)X(n - 1) = \begin{bmatrix} R(n-1) \\ 0 \end{bmatrix} ,

(5.36)

where it is assumed without loss of generality that the diagonal elements of R(n - 1) are real and nonnegative. Now define the unitary matrix

Q̄(n - 1) = \begin{bmatrix} Q(n-1) & 0 \\ 0 & 1 \end{bmatrix} .

(5.37)

Clearly,

Q̄(n - 1)X(n) = \begin{bmatrix} βR(n-1) \\ 0 \\ x^T(t_n) \end{bmatrix} ,

(5.38)


and so the triangularization process may be completed by the following sequence of operations. Rotate the (p - 1)-element vector x^T(t_n) with the first row of βR(n - 1) so that the leading element of x^T(t_n) is eliminated, producing a reduced vector x′^T(t_n). The first row of R(n - 1) will, of course, be modified in the process. Then rotate the (p - 2)-element reduced vector x′^T(t_n) with the second row of βR(n - 1) so that the leading element of x′^T(t_n) is eliminated, and so on until every element associated with the data vector has been rotated away. The resulting triangular matrix R(n) then corresponds to a complete triangularization of the matrix X(n) as defined in (5.20). The corresponding unitary matrix Q(n) is simply given by the recursive expression

Q(n) = Q̂(n)Q̄(n - 1) ,

(5.39)

where Q̂(n) is a unitary matrix representing the sequence of Givens rotation operations described above, i.e.,

Q̂(n) \begin{bmatrix} βR(n-1) \\ 0 \\ x^T(t_n) \end{bmatrix} = \begin{bmatrix} R(n) \\ 0 \end{bmatrix} .

(5.40)
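Stripped of the systolic timing, the update (5.36)-(5.40) is the following loop (a sketch using the givens helper above): each pass rotates one row of βR(n - 1) against the shrinking data row, eliminating its leading element.

def qr_update_R(R, x, beta=1.0):
    # Rotate the data row x^T(t_n) into R(n-1), returning R(n), per (5.40)
    R, x = R.astype(complex), x.astype(complex)
    for i in range(R.shape[0]):
        c, s, r_new = givens(R[i, i].real, x[i], beta)
        row = beta * R[i, i:]
        R[i, i:] = c * row + np.conj(s) * x[i:]   # updated row i of R(n)
        x[i:] = -s * row + c * x[i:]              # leading element eliminated
        R[i, i] = r_new                           # kept real and nonnegative
    return R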

It is not difficult to deduce in addition that

 

Q̂(n) \begin{bmatrix} βu(n-1) \\ βv(n-1) \\ y(t_n) \end{bmatrix} = \begin{bmatrix} u(n) \\ βv(n-1) \\ α(n) \end{bmatrix} = \begin{bmatrix} u(n) \\ v(n) \end{bmatrix} ,

(5.41)

and this shows how the vector u(n) can be updated recursively using the same sequence of Givens rotations. The least-squares weight vector w(n) may then be derived by solving (5.27). The solution is not defined, of course, if n < (p - 1) but the recursive triangularization procedure may, nonetheless, be initialized by setting R(0) = 0 and u(0) = 0.
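Carrying the right-hand-side column along, per (5.41), and initializing with R(0) = 0 and u(0) = 0 gives a complete recursive solver (same assumptions as the sketches above):

def qr_update(R, u, x, y_n, beta=1.0):
    # Joint update of (R, u) with the snapshot (x, y_n); returns R(n), u(n), alpha(n)
    R, u, x = R.astype(complex), u.astype(complex), x.astype(complex)
    for i in range(R.shape[0]):
        c, s, r_new = givens(R[i, i].real, x[i], beta)
        row, u_i = beta * R[i, i:], beta * u[i]
        R[i, i:] = c * row + np.conj(s) * x[i:]
        x[i:] = -s * row + c * x[i:]
        R[i, i] = r_new
        u[i], y_n = c * u_i + np.conj(s) * y_n, -s * u_i + c * y_n
    return R, u, y_n                              # y_n has become alpha(n), cf. (5.41)

Rr = np.zeros((p - 1, p - 1), dtype=complex)      # R(0) = 0
ur = np.zeros(p - 1, dtype=complex)               # u(0) = 0
for t in range(n):
    Rr, ur, alpha = qr_update(Rr, ur, snaps[t], prim[t], beta)

assert np.allclose(back_substitute(Rr, -ur), w)   # same weights as the block solution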

5.4.3 Systolic Array Implementation

Gentleman and Kung [5.1] have shown how the Givens rotation algorithm described above may be implemented in a very efficient pipelined manner using a triangular systolic array. The implementation of a 5-channel adaptive beamforming network based on this architecture is shown in Fig. 5.4. It may be considered to comprise three distinct sections: the basic triangular array labelled ABC, the right-hand column of cells labelled DE, and the final processing cell labelled F. The entire array is controlled by a single clock and comprises three types of processing cell. The function of the boundary and internal cells is specified in Fig. 5.4, the final cell being a simple two-input multiplier. Each cell receives its input data from the directions indicated on one clock cycle, performs the specified function and delivers the appropriate output values to neighbouring

[Figure: triangular systolic array for the 5-channel beamformer. The time-skewed auxiliary inputs x_1(n), ..., x_4(n) enter the triangular section ABC from above, the primary inputs y(n) enter the right-hand column DE, and the residual emerges from the final cell F.]

BOUNDARY CELL (stores r; receives x_in and γ_in; outputs (c, s) and γ_out):
if x_in = 0 then
  (c ← 1; s ← 0; r ← βr; γ_out ← γ_in)
otherwise
  (r′ ← (β²r² + |x_in|²)^{1/2};
   c ← βr/r′; s ← x_in/r′;
   r ← r′; γ_out ← cγ_in)

INTERNAL CELL (stores r; receives x_in and (c, s); outputs x_out):
x_out ← cx_in - sβr
r ← s*x_in + cβr

Fig. 5.4. Basic QR decomposition array. Data x_i(t_n) are denoted x_i(n)

cells as indicated on the next clock cycle. Each cell within the basic triangular array stores one element of the recursively evolving triangular matrix R(n) which is initialized to zero at the outset of the least-squares calculation and then updated every clock cycle. As a result of this initialization the value of r within each boundary cell is entirely real. Cells in the right-hand column store one element of the evolving vector u(n) which is also initialized to zero and updated every clock cycle.

Each row of cells within the array performs a basic Givens rotation between one row of the stored triangular matrix and a vector of data received from above so that the leading element of the received vector is eliminated as detailed in (5.32). The boundary cell computes the appropriate rotation parameters as defined in (5.33) and passes them on to the right on the next clock cycle. (The additional parameter γ will be explained in Sect. 5.5.) The internal cells are subsequently used to apply the same rotation to all other elements in the received data vector. Since a delay of one clock cycle per cell is incurred in passing the rotation parameters along the row, it is necessary to impose a corresponding time skew on the input data vectors as shown in Fig. 5.4. Having had its leading element eliminated, the reduced data vector is passed down to the next row of the array and so on. This arrangement ensures that as each row x^T(t_n) of the matrix X moves down through the array it interacts with the previously stored triangular matrix R(n - 1) and undergoes the sequence of rotations Q̂(n) described in the earlier analysis. All of its elements are thereby eliminated (one on each row of the array) and an updated triangular matrix R(n) is generated and stored in the process.

As each element of the vector y moves down through the right-hand column of cells it undergoes the same sequence of Givens rotations interacting with the previously stored vector u(n - 1) and generating an updated vector u(n) in the process. The resulting output, which emerges from the bottom cell in the right-hand column eight (in general 2p - 2) clock cycles after the first element x_1(t_1) enters the array, is simply the value of the parameter α(n) in (5.41).

Having generated both the triangular matrix R(n) and the corresponding vector u(n), the weight vector w(n) may be obtained in a very straightforward manner by solving (5.27) using the method of back-substitution. This could be accomplished in practice using an additional linear systolic array as originally proposed by Gentleman and Kung [5.1] and explained in some detail by Haykin [5.6]. It could also be achieved by means of an additional triangular systolic array of the type suggested by Schreiber and Kuekes [5.17] or, alternatively, by defining a distinct back-substitution mode of operation for the main triangular QR decomposition array. The first two of these methods clearly require some additional processing hardware, not just to perform the back-substitution, but also to unload the elements of the triangular matrix R(n) at the appropriate time and in the required sequence. The third method only requires the function of the existing hardware to be extended, but suffers (in common with the other methods) from the basic fact that the first element of the matrix R(n) which is required for the purposes of back-substitution is the last one to be formed within the systolic QR decomposition array. As a result, the processing wavefront would have to flow from bottom to top during the back-substitution process, whereas it flows from top to bottom during the QR decomposition stage. Accordingly, it is impossible to merge the two processes in an efficient pipelined manner and, whichever approach is adopted, the QR decomposition procedure must be interrupted for O(p) clock cycles every time a back-substitution is to be performed. This renders the use of back-substitution prohibitively slow for any application in which the optimum weight vector is required at every time iteration. It should be noted in particular that the least-squares residual e(t_n), which constitutes the noise-reduced output signal from an adaptive beamformer, is normally required at every time epoch t_n and is defined in terms of the weight vector w(n) according to (5.5).

In Sect. 5.5 we will show how the least-squares residual e(t_n) may be obtained directly from the triangular systolic array without any need to compute the weight vector explicitly, thereby avoiding the problems associated with back-substitution. In Sect. 5.6, it will then be shown how, as a consequence, the weight vector itself may be obtained from the triangular array without unloading the stored elements or employing any additional hardware. This technique allows the weight vector to be output at the rate of one element every clock cycle and is referred to as "serial weight flushing". A technique for extracting all elements of the updated weight vector every clock cycle without interrupting the QR decomposition process is briefly described in Sect. 5.10 and termed "parallel weight extraction". Before describing alternative techniques for residual and weight vector extraction, however, we continue our discussion of the systolic QR decomposition process.

5.4.4 Square-Root-Free Algorithm

Gentleman [5.18] derived a modified version of the Givens rotation algorithm for QR decomposition which requires no square-root operations. This algorithm, which has obvious advantages, can also be implemented in a highly pipelined manner on a triangular array of the type depicted in Fig. 5.5a. The essence of the square-root-free algorithm is the following factorization of R(n),

R(n) = D^{1/2}(n)R̄(n) ,

(5.42)

where

D^{1/2}(n) = diag{r_{11}(n), r_{22}(n), ..., r_{p-1,p-1}(n)} .

(5.43)

Here, r_{ij}(n) denotes the (i, j)th element of R(n), and R̄(n) is a unit upper triangular matrix containing elements which are rational functions of the elements of X(n). This latter property can be proved by construction. Introducing a corresponding scaling factor δ^{1/2} into each data row vector, the elementary Givens rotation in (5.32) may be written in the form

 

\begin{bmatrix} c & s^* \\ -s & c \end{bmatrix} \begin{bmatrix} 0 & \cdots & 0 & βd^{1/2} & \cdots & βd^{1/2}r̄_k & \cdots \\ 0 & \cdots & 0 & δ^{1/2}x̄_i & \cdots & δ^{1/2}x̄_k & \cdots \end{bmatrix} = \begin{bmatrix} 0 & \cdots & 0 & d′^{1/2} & \cdots & d′^{1/2}r̄_k′ & \cdots \\ 0 & \cdots & 0 & 0 & \cdots & δ′^{1/2}x̄_k′ & \cdots \end{bmatrix} ,

(5.44)

where d^{1/2} represents the appropriate diagonal element of R(n) in the diagonal matrix D^{1/2}(n), r̄_k denotes a general upper off-diagonal element of the matrix R̄(n), and x̄_k is the value of a data row element after division by δ^{1/2}. From (5.34)


and (5.35) it follows that the rotation parameters are simply given by

c = β(d/d′)^{1/2}

(5.45)

and

s = (δ/d′)^{1/2} x̄_i ,

(5.46)

where

d′ = β²d + δ|x̄_i|² .

(5.47)
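In code, one square-root-free elimination step becomes (a sketch; the names and the zero-weight guard are ours, and the stored row r̄ is assumed to carry its unit leading element explicitly):

def srf_rotate(d, rbar, delta, xbar, beta=1.0):
    # One square-root-free rotation, per (5.44)-(5.47); z is the element eliminated
    z = xbar[0]
    d_new = beta**2 * d + delta * np.abs(z) ** 2    # d', (5.47)
    if d_new == 0.0:                                # nothing to eliminate
        return d_new, rbar, xbar[1:], delta
    cbar = beta**2 * d / d_new                      # the cell quantity c = beta^2 d / d'
    sbar = delta * np.conj(z) / d_new               # conjugate of the cell quantity s
    rbar_new = cbar * rbar + sbar * xbar            # new row of Rbar; leading 1 preserved
    xbar_new = xbar - z * rbar                      # reduced data row; leading 0
    return d_new, rbar_new, xbar_new[1:], cbar * delta   # delta' = c * delta

No square roots appear anywhere; the true row of R(n) can always be recovered as d′^{1/2} times the stored row, as (5.42) guarantees.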

 

 

[Figure: triangular array as in Fig. 5.4, with time-skewed inputs x̄_1(n), ..., x̄_4(n) and y(n), and the residual emerging below.]

BOUNDARY CELL (stores d; receives x̄_in and δ_in; outputs (c, s, z) and δ_out):
if x̄_in = 0 or δ_in = 0 then
  (d ← β²d; c ← 1; s ← 0; z ← x̄_in; δ_out ← δ_in)
otherwise
  (z ← x̄_in; d′ ← β²d + δ_in|z|²;
   c ← β²d/d′; s ← δ_in z/d′;
   d ← d′; δ_out ← cδ_in)

INTERNAL CELL (stores r̄; receives x̄_in and (c, s, z); outputs x̄_out):
x̄_out ← x̄_in - z r̄
r̄ ← s*x̄_in + c r̄

Fig. 5.5a. QR decomposition array using square-root-free algorithm. Data x̄_i(t_n) are denoted x̄_i(n)
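As a consistency check (same assumptions as the earlier sketches), one square-root-free step reproduces the explicit Givens update of Fig. 5.4 after rescaling by d′^{1/2}:

d, delta = 2.0, 1.0
rbar = np.array([1.0, 0.5 - 0.25j, -0.1j])        # stored row of Rbar, leading 1
xbar = np.array([0.8 + 0.3j, -0.2j, 0.4])         # data row already divided by delta^(1/2)

d_new, rbar_new, _, delta_new = srf_rotate(d, rbar, delta, xbar)

r_row = np.sqrt(d) * rbar                         # r = d^(1/2) * rbar, cf. (5.42)
x_row = np.sqrt(delta) * xbar
c, s, _ = givens(r_row[0].real, x_row[0])
assert np.allclose(np.sqrt(d_new) * rbar_new, c * r_row + np.conj(s) * x_row)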