- •1 Introduction
- •1.1 What makes eigenvalues interesting?
- •1.2 Example 1: The vibrating string
- •1.2.1 Problem setting
- •1.2.2 The method of separation of variables
- •1.3.3 Global functions
- •1.3.4 A numerical comparison
- •1.4 Example 2: The heat equation
- •1.5 Example 3: The wave equation
- •1.6 The 2D Laplace eigenvalue problem
- •1.6.3 A numerical example
- •1.7 Cavity resonances in particle accelerators
- •1.8 Spectral clustering
- •1.8.1 The graph Laplacian
- •1.8.2 Spectral clustering
- •1.8.3 Normalized graph Laplacians
- •1.9 Other sources of eigenvalue problems
- •Bibliography
- •2 Basics
- •2.1 Notation
- •2.2 Statement of the problem
- •2.3 Similarity transformations
- •2.4 Schur decomposition
- •2.5 The real Schur decomposition
- •2.6 Normal matrices
- •2.7 Hermitian matrices
- •2.8 Cholesky factorization
- •2.9 The singular value decomposition (SVD)
- •2.10 Projections
- •2.11 Angles between vectors and subspaces
- •Bibliography
- •3 The QR Algorithm
- •3.1 The basic QR algorithm
- •3.1.1 Numerical experiments
- •3.2 The Hessenberg QR algorithm
- •3.2.1 A numerical experiment
- •3.2.2 Complexity
- •3.3 The Householder reduction to Hessenberg form
- •3.3.2 Reduction to Hessenberg form
- •3.4 Improving the convergence of the QR algorithm
- •3.4.1 A numerical example
- •3.4.2 QR algorithm with shifts
- •3.4.3 A numerical example
- •3.5 The double shift QR algorithm
- •3.5.1 A numerical example
- •3.5.2 The complexity
- •3.6 The symmetric tridiagonal QR algorithm
- •3.6.1 Reduction to tridiagonal form
- •3.6.2 The tridiagonal QR algorithm
- •3.7 Research
- •3.8 Summary
- •Bibliography
- •4.1 The divide and conquer idea
- •4.2 Partitioning the tridiagonal matrix
- •4.3 Solving the small systems
- •4.4 Deflation
- •4.4.1 Numerical examples
- •4.6 Solving the secular equation
- •4.7 A first algorithm
- •4.7.1 A numerical example
- •4.8 The algorithm of Gu and Eisenstat
- •4.8.1 A numerical example [continued]
- •Bibliography
- •5 LAPACK and the BLAS
- •5.1 LAPACK
- •5.2 BLAS
- •5.2.1 Typical performance numbers for the BLAS
- •5.3 Blocking
- •5.4 LAPACK solvers for the symmetric eigenproblems
- •5.6 An example of a LAPACK routine
- •Bibliography
- •6 Vector iteration (power method)
- •6.1 Simple vector iteration
- •6.2 Convergence analysis
- •6.3 A numerical example
- •6.4 The symmetric case
- •6.5 Inverse vector iteration
- •6.6 The generalized eigenvalue problem
- •6.7 Computing higher eigenvalues
- •6.8 Rayleigh quotient iteration
- •6.8.1 A numerical example
- •Bibliography
- •7 Simultaneous vector or subspace iterations
- •7.1 Basic subspace iteration
- •7.2 Convergence of basic subspace iteration
- •7.3 Accelerating subspace iteration
- •7.4 Relation between subspace iteration and QR algorithm
- •7.5 Addendum
- •Bibliography
- •8 Krylov subspaces
- •8.1 Introduction
- •8.3 Polynomial representation of Krylov subspaces
- •8.4 Error bounds of Saad
- •Bibliography
- •9 Arnoldi and Lanczos algorithms
- •9.2 Arnoldi algorithm with explicit restarts
- •9.3 The Lanczos basis
- •9.4 The Lanczos process as an iterative method
- •9.5 An error analysis of the unmodified Lanczos algorithm
- •9.6 Partial reorthogonalization
- •9.7 Block Lanczos
- •9.8 External selective reorthogonalization
- •Bibliography
- •10 Restarting Arnoldi and Lanczos algorithms
- •10.2 Implicit restart
- •10.3 Convergence criterion
- •10.4 The generalized eigenvalue problem
- •10.5 A numerical example
- •10.6 Another numerical example
- •10.7 The Lanczos algorithm with thick restarts
- •10.8 Krylov–Schur algorithm
- •10.9 The rational Krylov space method
- •Bibliography
- •11 The Jacobi-Davidson Method
- •11.1 The Davidson algorithm
- •11.2 The Jacobi orthogonal component correction
- •11.2.1 Restarts
- •11.2.2 The computation of several eigenvalues
- •11.2.3 Spectral shifts
- •11.3 The generalized Hermitian eigenvalue problem
- •11.4 A numerical example
- •11.6 Harmonic Ritz values and vectors
- •11.7 Refined Ritz vectors
- •11.8 The generalized Schur decomposition
- •11.9.1 Restart
- •11.9.3 Algorithm
- •Bibliography
- •12 Rayleigh quotient and trace minimization
- •12.1 Introduction
- •12.2 The method of steepest descent
- •12.3 The conjugate gradient algorithm
- •12.4 Locally optimal PCG (LOPCG)
- •12.5 The block Rayleigh quotient minimization algorithm (BRQMIN)
- •12.7 A numerical example
- •12.8 Trace minimization
- •Bibliography
72 CHAPTER 3. THE QR ALGORITHM
1: $Q_{1:n,\,p-1:p} := Q_{1:n,\,p-1:p}\,P$;
which costs another $12n^3$ flops.
We earlier gave the estimate of $6n^3$ flops for a Hessenberg QR step, see Algorithm 3.2. If the latter has to be spent in complex arithmetic, then the single shift Hessenberg QR algorithm is more expensive than the double shift Hessenberg QR algorithm that is executed in real arithmetic.
Remember that the reduction to Hessenberg form costs $\frac{10}{3}n^3$ flops without forming the transformation matrix and $\frac{14}{3}n^3$ flops if this matrix is formed.
3.6 The symmetric tridiagonal QR algorithm
The QR algorithm can be applied straight to Hermitian or symmetric matrices. By (3.1) we see that the QR algorithm generates a sequence $\{A_k\}$ of symmetric matrices. Taking into account the symmetry, the performance of the algorithm can be improved considerably. Furthermore, from Theorem 2.14 we know that Hermitian matrices have a real spectrum. Therefore, we can restrict ourselves to single shifts.
3.6.1 Reduction to tridiagonal form
The reduction of a full Hermitian matrix to Hessenberg form produces a Hermitian Hessenberg matrix, which (up to rounding errors) is a real symmetric tridiagonal matrix. To see how symmetry can be taken into account, consider the first reduction step, which introduces $n-2$ zeros into the first column (and the first row) of $A = A^* \in \mathbb{C}^{n\times n}$. Let
\[
P_1 = \begin{bmatrix} 1 & 0^T \\ 0 & I_{n-1} - 2u_1u_1^* \end{bmatrix}, \qquad u_1 \in \mathbb{C}^n, \quad \|u_1\| = 1.
\]
Here the first component of $u_1$ is zero; in the block form, $u_1$ stands for its last $n-1$ entries, so that $P_1 = I - 2u_1u_1^*$.
Then,
\[
\begin{aligned}
A_1 := P_1^* A P_1 &= (I - 2u_1u_1^*)\,A\,(I - 2u_1u_1^*) \\
&= A - u_1\underbrace{\bigl(2u_1^*A - 2(u_1^*Au_1)\,u_1^*\bigr)}_{v_1^*} - \underbrace{\bigl(2Au_1 - 2u_1(u_1^*Au_1)\bigr)}_{v_1}\,u_1^* \\
&= A - u_1v_1^* - v_1u_1^*.
\end{aligned}
\]
In the $k$-th step of the reduction we similarly have
\[
A_k = P_k^* A_{k-1} P_k = A_{k-1} - u_k v_k^* - v_k u_k^*,
\]
where only the last $n-k$ elements of $u_k$ are nonzero, and $v_k$ has at most $n-k+1$ nonzero elements. Forming
\[
v_k = 2A_{k-1}u_k - 2u_k\,(u_k^* A_{k-1} u_k)
\]
costs $2(n-k)^2 + O(n-k)$ flops. This complexity results from the product $A_{k-1}u_k$. The rank-2 update of $A_{k-1}$,
\[
A_k = A_{k-1} - u_k v_k^* - v_k u_k^*,
\]
requires another $2(n-k)^2 + O(n-k)$ flops, taking into account symmetry. Consequently, the transformation to tridiagonal form can be accomplished in
\[
\sum_{k=1}^{n-1} \bigl(4(n-k)^2 + O(n-k)\bigr) = \frac{4}{3}n^3 + O(n^2)
\]
floating point operations.
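The rank-2 update can be tried out numerically. What follows is a minimal NumPy sketch for the real symmetric case, not an optimized implementation. The function name `tridiagonalize` and the sign convention of the Householder vector are assumptions made for illustration. For readability the update is applied to the full matrix in every step, so the sketch does not attain the $\frac{4}{3}n^3$ flop count derived above.

```python
import numpy as np

def tridiagonalize(A):
    """Reduce a real symmetric matrix to tridiagonal form.

    Each step applies T := T - u v^T - v u^T with
    v = 2 T u - 2 u (u^T T u), as in the text.
    (Hypothetical helper, written for illustration only.)
    """
    T = np.array(A, dtype=float)
    n = T.shape[0]
    for k in range(n - 2):
        x = T[k + 1:, k]                 # part of column k below the diagonal
        norm_x = np.linalg.norm(x)
        if norm_x == 0.0:                # nothing to eliminate in this column
            continue
        u = np.zeros(n)                  # leading k+1 entries stay zero
        u[k + 1:] = x
        u[k + 1] += np.copysign(norm_x, x[0])   # sign chosen to avoid cancellation
        u /= np.linalg.norm(u)
        Tu = T @ u
        v = 2.0 * Tu - 2.0 * u * (u @ Tu)
        T -= np.outer(u, v) + np.outer(v, u)    # symmetric rank-2 update
    return T

# Usage: a random symmetric test matrix (arbitrary seed and size).
rng = np.random.default_rng(0)
B = rng.standard_normal((6, 6))
A = B + B.T
T = tridiagonalize(A)
```

A production code would restrict the matrix-vector product and the update to the trailing $(n-k)\times(n-k)$ block and would store only one triangle of the matrix.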
3.6.2 The tridiagonal QR algorithm
In the symmetric case the Hessenberg QR algorithm becomes a tridiagonal QR algorithm. This can be executed in an explicit or an implicit way. In the explicit form, a QR step is essentially
1: Choose a shift $\mu$
2: Compute the QR factorization $A - \mu I = QR$
3: Update $A$ by $A = RQ + \mu I$.
Of course, this is done by means of plane rotations and by respecting the symmetric tridiagonal structure of $A$.
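The three steps above can be sketched in a few lines of NumPy. This is only an illustration: it calls the dense factorization `numpy.linalg.qr` instead of plane rotations, so it ignores the tridiagonal structure and costs $O(n^3)$ per step. The function name `explicit_qr_step` and the choice of the last diagonal entry as shift are assumptions.

```python
import numpy as np

def explicit_qr_step(T, mu):
    """One explicit shifted QR step: T - mu*I = Q R, return R Q + mu*I."""
    n = T.shape[0]
    I = np.eye(n)
    Q, R = np.linalg.qr(T - mu * I)
    return R @ Q + mu * I

# Usage: a small symmetric tridiagonal matrix (entries chosen arbitrarily).
T = (np.diag([4.0, 3.0, 2.0, 1.0])
     + np.diag([1.0, 1.0, 1.0], 1)
     + np.diag([1.0, 1.0, 1.0], -1))
T1 = explicit_qr_step(T, T[-1, -1])   # shift: last diagonal entry
```

Since $RQ + \mu I = Q^*AQ$, the result is again symmetric and tridiagonal up to rounding errors, and it has the same eigenvalues as the input.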
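In the more elegant implicit form of the algorithm we first compute the first Givens rotation $G_0 = G(1, 2, \vartheta_0)$ of the QR factorization that zeros the $(2, 1)$ element of $A - \mu I$,
\[
\begin{bmatrix} c & s \\ -s & c \end{bmatrix}^{*}
\begin{bmatrix} a_{11} - \mu \\ a_{21} \end{bmatrix}
=
\begin{bmatrix} * \\ 0 \end{bmatrix},
\qquad c = \cos(\vartheta_0), \quad s = \sin(\vartheta_0).
\tag{3.11}
\]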
Performing a similarity transformation with $G_0$ we have ($n = 5$)
\[
G_0^* A G_0 = A' =
\begin{bmatrix}
\times & \times & + & & \\
\times & \times & \times & & \\
+ & \times & \times & \times & \\
 & & \times & \times & \times \\
 & & & \times & \times
\end{bmatrix}.
\]
As with the double step Hessenberg QR algorithm, we chase the bulge down the diagonal. In the $5 \times 5$ example this becomes
\[
\begin{aligned}
A \;&\xrightarrow[\;=\,G(1,2,\vartheta_0)\;]{G_0}\;
\begin{bmatrix}
\times & \times & + & & \\
\times & \times & \times & & \\
+ & \times & \times & \times & \\
 & & \times & \times & \times \\
 & & & \times & \times
\end{bmatrix}
\;\xrightarrow[\;=\,G(2,3,\vartheta_1)\;]{G_1}\;
\begin{bmatrix}
\times & \times & 0 & & \\
\times & \times & \times & + & \\
0 & \times & \times & \times & \\
 & + & \times & \times & \times \\
 & & & \times & \times
\end{bmatrix} \\[1ex]
&\xrightarrow[\;=\,G(3,4,\vartheta_2)\;]{G_2}\;
\begin{bmatrix}
\times & \times & & & \\
\times & \times & \times & 0 & \\
 & \times & \times & \times & + \\
 & 0 & \times & \times & \times \\
 & & + & \times & \times
\end{bmatrix}
\;\xrightarrow[\;=\,G(4,5,\vartheta_3)\;]{G_3}\;
\begin{bmatrix}
\times & \times & & & \\
\times & \times & \times & & \\
 & \times & \times & \times & 0 \\
 & & \times & \times & \times \\
 & & 0 & \times & \times
\end{bmatrix}
= \bar{A}.
\end{aligned}
\]
The full step is given by
\[
\bar{A} = Q^* A Q, \qquad Q = G_0 G_1 \cdots G_{n-2}.
\]
Because $G_k e_1 = e_1$ for $k > 0$ we have
\[
Q e_1 = G_0 G_1 \cdots G_{n-2}\, e_1 = G_0 e_1.
\]
Both the explicit and the implicit QR step form the same first plane rotation $G_0$. By referring to the Implicit Q Theorem 3.5 we see that the explicit and the implicit QR step compute essentially the same $\bar{A}$.
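This equivalence can be checked numerically. The sketch below is an illustration under stated assumptions, not the author's code: `givens`, `implicit_qr_step` and `explicit_qr_step` are hypothetical helper names, the rotations are formed as dense $n \times n$ matrices for readability (a real implementation touches only a few entries per rotation), and the input is assumed to be unreduced, that is, to have nonzero off-diagonal entries. "Essentially the same" means equal up to the signs of the off-diagonal entries, so the comparison is made on absolute values.

```python
import numpy as np

def givens(x, y):
    """Return (c, s) such that [[c, -s], [s, c]] @ [x, y] = [r, 0]."""
    r = np.hypot(x, y)
    if r == 0.0:
        return 1.0, 0.0
    return x / r, -y / r

def implicit_qr_step(T, mu):
    """One implicit QR step on a symmetric tridiagonal T (bulge chasing)."""
    A = np.array(T, dtype=float)
    n = A.shape[0]
    x, y = A[0, 0] - mu, A[1, 0]          # first column of T - mu*I
    for k in range(n - 1):
        c, s = givens(x, y)
        G = np.eye(n)
        G[k:k + 2, k:k + 2] = [[c, s], [-s, c]]
        A = G.T @ A @ G                   # similarity with G_k
        if k < n - 2:
            # subdiagonal entry and the bulge it will be used to annihilate
            x, y = A[k + 1, k], A[k + 2, k]
    return A

def explicit_qr_step(T, mu):
    """One explicit QR step using a dense QR factorization."""
    I = np.eye(T.shape[0])
    Q, R = np.linalg.qr(T - mu * I)
    return R @ Q + mu * I

# Usage: compare both variants on an arbitrary unreduced 5 x 5 example.
off = [1.0, 0.5, 2.0, 1.5]
T = np.diag([2.0, 3.0, 1.0, 4.0, 2.5]) + np.diag(off, 1) + np.diag(off, -1)
A_impl = implicit_qr_step(T, T[-1, -1])
A_expl = explicit_qr_step(T, T[-1, -1])
same_up_to_signs = np.allclose(np.abs(A_impl), np.abs(A_expl), atol=1e-8)
```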
Algorithm 3.6 Symmetric tridiagonal QR algorithm with implicit Wilkinson shift
1: Let $T \in \mathbb{R}^{n\times n}$ be a symmetric tridiagonal matrix with diagonal entries $a_1, \ldots, a_n$ and off-diagonal entries $b_2, \ldots, b_n$.
   This algorithm computes the eigenvalues $\lambda_1, \ldots, \lambda_n$ of $T$ and corresponding eigenvectors $q_1, \ldots, q_n$. The eigenvalues are stored in $a_1, \ldots, a_n$. The eigenvectors are stored in the matrix $Q$, such that $TQ = Q\,\mathrm{diag}(a_1, \ldots, a_n)$.
2: $m = n$ /* Actual problem dimension. m is reduced in the convergence check. */
3: while $m > 1$ do
4:   $d := (a_{m-1} - a_m)/2$; /* Compute Wilkinson's shift */
5:   if $d = 0$ then
6:     $s := a_m - |b_m|$;
7:   else
8:     $s := a_m - b_m^2 / \bigl(d + \mathrm{sign}(d)\sqrt{d^2 + b_m^2}\bigr)$;
9:   end if
10:  $x := a_1 - s$; /* Implicit QR step begins here */
11:  $y := b_2$;
12:  for $k = 1$ to $m - 1$ do
13:    if $m > 2$ then
14:      $[c, s] := \mathrm{givens}(x, y)$;
15:    else
16:      Determine $[c, s]$ such that $\begin{bmatrix} c & -s \\ s & c \end{bmatrix} \begin{bmatrix} a_1 & b_2 \\ b_2 & a_2 \end{bmatrix} \begin{bmatrix} c & s \\ -s & c \end{bmatrix}$ is diagonal
17:    end if
18:    $w := cx - sy$;
19:    $d := a_k - a_{k+1}$; $z := (2cb_{k+1} + ds)s$;
20:    $a_k := a_k - z$; $a_{k+1} := a_{k+1} + z$;
21:    $b_{k+1} := dcs + (c^2 - s^2)b_{k+1}$;
22:    $x := b_{k+1}$;
23:    if $k > 1$ then
24:      $b_k := w$;
25:    end if
26:    if $k < m - 1$ then
27:      $y := -sb_{k+2}$; $b_{k+2} := cb_{k+2}$;
28:    end if
29:    $Q_{1:n;\,k:k+1} := Q_{1:n;\,k:k+1} \begin{bmatrix} c & s \\ -s & c \end{bmatrix}$;
30:  end for /* Implicit QR step ends here */
31:  if $|b_m| < \varepsilon(|a_{m-1}| + |a_m|)$ then /* Check for convergence */
32:    $m := m - 1$;
33:  end if
34: end while
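Algorithm 3.6 translates almost line by line into NumPy. The following is a hedged sketch rather than a reference implementation. The names `tridiag_qr`, `givens`, `tol` and `max_steps` are assumptions, and the shift is called `mu` to keep it apart from the sine `s` of the rotation. The sketch deviates from the pseudocode in two places: the convergence test uses `<=` instead of `<`, and the number of QR steps is bounded, because a matrix with zero off-diagonal entries in its leading part would otherwise never terminate. The arrays are padded so that the indices match the 1-based pseudocode.

```python
import numpy as np

def givens(x, y):
    """Return (c, s) such that [[c, -s], [s, c]] @ [x, y] = [r, 0]."""
    r = np.hypot(x, y)
    if r == 0.0:
        return 1.0, 0.0
    return x / r, -y / r

def tridiag_qr(diag, offdiag, tol=np.finfo(float).eps, max_steps=None):
    """Eigenvalues and eigenvectors of a symmetric tridiagonal matrix.

    diag holds a_1..a_n, offdiag holds b_2..b_n (assumed nonzero).
    Returns (a, Q) with T @ Q = Q @ diag(a) up to rounding.
    """
    n = len(diag)
    a = np.concatenate(([0.0], np.asarray(diag, dtype=float)))          # a[1..n]
    b = np.concatenate(([0.0, 0.0], np.asarray(offdiag, dtype=float)))  # b[2..n]
    Q = np.eye(n)
    if max_steps is None:
        max_steps = 100 * n          # assumption: generous safety bound
    m, steps = n, 0
    while m > 1:
        steps += 1
        if steps > max_steps:
            raise RuntimeError("tridiagonal QR iteration did not converge")
        d = (a[m - 1] - a[m]) / 2.0                  # Wilkinson's shift
        if d == 0.0:
            mu = a[m] - abs(b[m])
        else:
            mu = a[m] - b[m] ** 2 / (d + np.copysign(np.hypot(d, b[m]), d))
        x, y = a[1] - mu, b[2]                       # implicit QR step
        for k in range(1, m):
            if m > 2:
                c, s = givens(x, y)
            else:
                # rotation that diagonalizes the remaining 2 x 2 block
                theta = 0.5 * np.arctan2(-2.0 * b[2], a[1] - a[2])
                c, s = np.cos(theta), np.sin(theta)
            w = c * x - s * y
            d = a[k] - a[k + 1]
            z = (2.0 * c * b[k + 1] + d * s) * s
            a[k] -= z
            a[k + 1] += z
            b[k + 1] = d * c * s + (c * c - s * s) * b[k + 1]
            x = b[k + 1]
            if k > 1:
                b[k] = w
            if k < m - 1:
                y = -s * b[k + 2]                    # the bulge
                b[k + 2] = c * b[k + 2]
            Q[:, k - 1:k + 1] = Q[:, k - 1:k + 1] @ np.array([[c, s], [-s, c]])
        if abs(b[m]) <= tol * (abs(a[m - 1]) + abs(a[m])):   # convergence check
            m -= 1
    return a[1:], Q

# Usage on an arbitrary 6 x 6 example.
diag = [2.0, -1.0, 3.0, 0.5, 4.0, 1.5]
off = [1.0, 0.5, 2.0, -1.5, 0.7]
evals, Q = tridiag_qr(diag, off)
```

The eigenvalues are returned in the order in which they are stored in $a_1, \ldots, a_n$, not sorted. A library routine such as `numpy.linalg.eigvalsh` applied to the dense matrix can serve as a cross-check.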
