
Chapter 8 THE UBIQUITY OF PRIME NUMBERS

Let us give an example of the application of such an algorithm. To assess the volume of the unit D-ball, which is the ball of radius 1, we can take f in terms of the Heaviside function θ (which is 1 for positive arguments, 0 for negative arguments, and 1/2 at 0),

f(x) = θ(1/4 − (x − y) · (x − y)),

with y = (1/2, 1/2, . . . , 1/2), so that f vanishes everywhere outside a ball of radius 1/2. (This is the largest ball that fits inside the cube R.) The estimate of the unit D-ball volume will thus be 2DI, where I is the output of Algorithm 8.3.7 for the given, sphere-defining function f .
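Since Algorithm 8.3.7 itself is not reproduced in this excerpt, the following Python sketch substitutes a plain Halton-sequence generator (an assumption on our part) for the algorithm's qMC vectors, and estimates the unit 3-ball volume as 2^D · I:

```python
def radical_inverse(n, base):
    """Van der Corput radical inverse: reflect the base-b digits of n
    about the radix point, giving a point in [0, 1)."""
    inv, denom = 0.0, 1.0
    while n > 0:
        denom *= base
        n, digit = divmod(n, base)
        inv += digit / denom
    return inv

def halton(n, primes=(2, 3, 5)):
    """n-th vector of the Halton sequence on the given prime bases."""
    return tuple(radical_inverse(n, p) for p in primes)

def f(x, y=(0.5, 0.5, 0.5)):
    """Heaviside-style indicator of the radius-1/2 ball centered at y."""
    s = 0.25 - sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return 1.0 if s > 0 else (0.5 if s == 0 else 0.0)

N = 100000
I = sum(f(halton(n)) for n in range(1, N + 1)) / N
volume = 2 ** 3 * I      # 2^D * I estimates the unit 3-ball volume
print(volume)
```

With N = 10^5 points the estimate typically agrees with the exact value 4π/3 ≈ 4.18879 to roughly three decimal places.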

As we have intimated before, it is a wondrous thing to see firsthand how much better a qMC algorithm of this type can do, when compared to a direct Monte Carlo trial. One beautiful aspect of the fundamental qMC concept is that parallelism is easy: In Algorithm 8.3.7, just start each of, say, M machines at a different starting seed, ideally in such a way that some contiguous sequence of NM total vectors is realized. This option is, of course, the point of having a seed function in the first place. Explicitly, to obtain a one-billion-point integration, each of 100 machines would use the above algorithm as is with N = 10^7, except that machine 0 would start with n = 0 (and hence start by calling seed(0)), machine 1 would start with n = 10^7, and so on through machine 99, which would start with n = 99 · 10^7. The final integral would be the average of the 100 machine estimates.
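The machine-splitting arithmetic is easy to check in miniature. In this sketch the seed mechanism is replaced by explicit starting indices (a simplification of ours, not the book's seed() routine): the average of the per-machine estimates equals a single long run over all NM indices.

```python
def radical_inverse(n, base):
    """Van der Corput radical inverse of n in the given base."""
    inv, denom = 0.0, 1.0
    while n > 0:
        denom *= base
        n, digit = divmod(n, base)
        inv += digit / denom
    return inv

def estimate(start, count):
    """Mean of a ball-indicator integrand over Halton indices
    start .. start + count - 1 (stand-in for one machine's run)."""
    total = 0.0
    for n in range(start, start + count):
        x = [radical_inverse(n, p) for p in (2, 3, 5)]
        total += 1.0 if sum((c - 0.5) ** 2 for c in x) < 0.25 else 0.0
    return total / count

M, N = 10, 1000                      # 10 "machines", 1000 points each
per_machine = [estimate(m * N, N) for m in range(M)]
combined = sum(per_machine) / M      # average of the machine estimates
full_run = estimate(0, M * N)        # one machine doing all M*N points
assert abs(combined - full_run) < 1e-12
```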

Here is a typical numerical comparison: We shall calculate the number π with qMC methods, and compare with direct Monte Carlo. Noting that the exact volume of the unit D-ball is

V_D = π^{D/2} / Γ(1 + D/2),

let us denote by V_D(N) the calculated volume after N vectors are generated, and denote by π_N the “experimental” value for π obtained by solving the volume formula for π in terms of V_D. We shall do two things at once: display the typical convergence and convey a notion of the inherent parallelism. For primes p = 2, 3, 5, so that we are assessing the 3-ball volume, the result of Algorithm 8.3.7 is displayed in Table 8.1.

What is displayed in the left-hand column is the total number of points “dropped” into the unit D-cube, while the second column is the associated, cumulative approximation to π. We say cumulative because one may have run each interval of 10^6 counts on a separate machine, yet we display the right-hand column as the answer obtained by combining the machines up to that N value inclusive. For example, the result π_5 can be thought of either as the result after 5 · 10^6 points are generated, or equivalently, after 5 separate machines each do 10^6 points. In the latter instance, one would have called the seed(n) procedure with 5 different initial seeds to start each respective machine’s interval. How do these data compare with direct Monte Carlo? The rough answer is that one can expect the error in the last (N = 10^7) row of


N/10^6    π_N
 1        3.14158
 2        3.14154
 3        3.14157
 4        3.14157
 5        3.14158
 6        3.14158
 7        3.14158
 8        3.141590
 9        3.14158
10        3.1415929

Table 8.1. Approximations to π via the prime-based qMC (Halton) sequence. Using primes p = 2, 3, 5, the volume of the unit 3-ball is assessed for various cumulative numbers of qMC points, N = 10^6 through N = 10^7. We have displayed decimal digits only through the first incorrect digit.

a similar Monte Carlo table to be in the third or so digit to the right of the decimal (because 1/√N ≈ 10^{−3.5} ≈ 3 · 10^{−4} in this case). This superiority of qMC to direct methods (an advantage of several orders of magnitude) is typical for “millions” of points and moderate dimensions.
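A quick experiment in this spirit, using a Halton stand-in for Algorithm 8.3.7 and a fixed pseudorandom seed (so the particular errors observed are illustrative only):

```python
import math, random

def radical_inverse(n, base):
    """Van der Corput radical inverse of n in the given base."""
    inv, denom = 0.0, 1.0
    while n > 0:
        denom *= base
        n, digit = divmod(n, base)
        inv += digit / denom
    return inv

def pi_from_points(points):
    """Infer pi from the fraction of points inside the radius-1/2 ball:
    that fraction estimates vol = pi/6, so pi is about 6 * fraction."""
    pts = list(points)
    hits = sum(1 for x in pts if sum((c - 0.5) ** 2 for c in x) < 0.25)
    return 6.0 * hits / len(pts)

N = 200000
pi_qmc = pi_from_points(
    [radical_inverse(n, p) for p in (2, 3, 5)] for n in range(1, N + 1))
rng = random.Random(1)
pi_mc = pi_from_points([rng.random() for _ in range(3)] for _ in range(N))
print(abs(pi_qmc - math.pi), abs(pi_mc - math.pi))
```

In runs of this kind the Halton error is typically one to two orders of magnitude below the pseudorandom error at the same N, consistent with the discussion above.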

Now to the matter of Wall Street, meaning the phenomenon of computational finance. If the notion of very large dimensions D for integration has seemed fanciful, one need only cure that skepticism by observing the kind of calculation that has been attempted in connection with risk management theory and other aspects of computational finance. For example, 25-dimensional integrals relevant to financial computation, of the form

I = ∫ ··· ∫_{x ∈ R^D} cos |x| · e^{−x·x} d^D x,
were analyzed in [Papageorgiu and Traub 1997], with the conclusion that, surprisingly enough, qMC methods (in their case, using the Faure sequences) would outperform direct Monte Carlo methods, in spite of the asymptotic error estimate O((ln^D N)/N), which does not fare too well in practice against O(1/√N) when D = 25. In other treatments, for example [Paskov and Traub 1995], integrals with dimension as high as D = 360 are tested. As those authors astutely point out, their integrals (involving collateralized mortgage obligations, or CMOs in the financial language) are good test cases because the integrand has a certain computational complexity and so, in their words, “it is crucial to sample the integrand as few times as possible.” As intimated in [Boyle et al. 1995] and by various other researchers, whether or not a qMC is superior to a direct Monte Carlo in some high dimension D depends very much on the actual calculation being performed. The general sentiment is that numerical analysts not from the financial world per se tend to use


integrals that present the more difficult challenge for the qMC methods. That is, financial integrands are often “smoother” in practice.

Just as interesting as the qMC technique itself is the controversy that has simmered in the qMC literature. Some authors believe that the Halton sequence—the one on which we have focused as an example of prime-based qMC—is inferior to, say, the Sobol [Bratley and Fox 1988] or Faure [Niederreiter 1992] sequences. And as we have indicated above, this assessment tends to depend strongly on the domain of application. Yet there is some theoretical motivation for the inferiority claims; namely, it is a theorem [Faure 1982] that the star discrepancy of a Faure sequence satisfies

D_N* ≤ (1/D!) ((p − 1)/(2 ln p))^D · (ln^D N)/N,

where p is the least prime greater than or equal to D. Whereas a D-dimensional Halton sequence can be built from the first D primes, and this Faure bound involves the next prime, still the bound of Theorem 8.3.5 is considerably worse. What is likely is that both bounding theorems are not best-possible results. In any case, the prime numbers once again enter into discrepancy theory and its qMC applications.

As has been pointed out in the literature, there is the fact that qMC’s error growth of O((ln^D N)/N) is, for sufficiently large D and sufficiently small N, or practical combinations of D, N magnitudes, worse than direct Monte Carlo’s O(1/√N). Thus, some researchers do not recommend qMC

methods unconditionally. One controversial problem is that in spite of various theorems such as Theorem 8.3.5 and the Faure bound above, we still do not know how the “real-world” constants in front of the big-O terms really behave. Some recent developments address this controversy. One such development is the discovery of “leaped” Halton sequences. In this technique, one can “break” the unfortunate correlation between coordinates for the D-dimensional Halton sequence. This is done in two possible ways. First, one adopts a permutation on the inverse-radix digits of integers, and second, if the base primes are denoted by p_0, . . . , p_{D−1}, then one chooses yet another distinct prime p_D and uses only every p_D-th vector of the usual Halton sequence. This is claimed to improve the Halton sequence dramatically for high dimension, say D = 40 to 400 [Kocis and Whiten 1997]. It is of interest that these authors found a markedly good distinct prime p_D to be 409, a phenomenon having no explanation. Another development, from [Crandall 1999a], involves the use of a reduced set of primes—even when D is large—and using the resulting lower-dimensional Halton sequence as a vector parameter for a D-dimensional space-filling curve. In view of the sharply base-dependent bound of Theorem 8.3.5, there is reason to believe that this technique of involving only small primes carries a distinct statistical advantage in higher dimensions.
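A minimal sketch of the leaped, digit-permuted Halton construction; the digit permutation and the leap value below are illustrative choices of ours, not the ones recommended in [Kocis and Whiten 1997]:

```python
def permuted_radical_inverse(n, base, perm):
    """Radical inverse with a permutation applied to each inverse-radix digit."""
    inv, denom = 0.0, 1.0
    while n > 0:
        denom *= base
        n, digit = divmod(n, base)
        inv += perm[digit] / denom
    return inv

def leaped_halton(count, primes=(2, 3, 5), leap=7):
    """Keep only every leap-th vector of a digit-permuted Halton sequence.
    (Kocis and Whiten reported the leap 409 as markedly good; a small
    leap is used here just so the example runs instantly.)"""
    # Illustrative permutation for base p: digit -> digit * (p - 1) mod p,
    # which fixes 0 and reverses the nonzero digits.
    perms = {p: [(d * (p - 1)) % p for d in range(p)] for p in primes}
    return [[permuted_radical_inverse(k * leap, p, perms[p]) for p in primes]
            for k in range(1, count + 1)]

points = leaped_halton(1000)
assert all(0.0 <= c < 1.0 for x in points for c in x)
```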

While the notion of discrepancy is fairly old, there always seem to appear new ideas pertaining to the generation of qMC sets. One promising new approach involves the so-called (t, m, s)-nets [Owen 1995, 1997a, 1997b],


[Tezuka 1995], [Veach 1997]. These are point clouds that have “minimal fill” properties. For example, a set of N = b^m points in s dimensions is called a (t, m, s)-net if every justified box of volume b^{t−m} contains exactly b^t points. Yet another intriguing connection between primes and discrepancy appears in the literature (see [Joe 1999] and references therein). This notion of “number-theoretical rules” involves approximations of the form

 

∫_{[0,1]^D} f(x) d^D x ≈ (1/p) Σ_{j=0}^{p−1} f({jK/p}),

where {y} denotes the vector composed of the fractional parts of y, and K is some chosen constant vector having each component coprime to p. Actually, composite numbers can be used in place of p, but the analysis of what is called L2 discrepancy, and the associated typical integration error, goes especially smoothly for p prime. We have mentioned these new approaches to underscore the notion that qMC is continually undergoing new development. And who knows when or where number theory or prime numbers in particular will appear in qMC theories of the future?
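Such a number-theoretical rule is only a few lines of code. In this sketch, f, p, and the vector K are example choices of ours; the test integrand cos(2πx_0)·cos(2πx_1) has exact integral 0 over [0,1]^2, and the rule reproduces it to rounding error because no relevant frequency vector h satisfies h · K ≡ 0 (mod p):

```python
from math import cos, pi

def lattice_rule(f, p, K):
    """Approximate the integral of f over [0,1]^D by (1/p) * sum_j f({j*K/p})."""
    total = 0.0
    for j in range(p):
        x = [((j * k) % p) / p for k in K]   # the fractional-part vector {jK/p}
        total += f(x)
    return total / p

# Example: f(x) = cos(2 pi x_0) cos(2 pi x_1) integrates to 0 over [0,1]^2.
p, K = 1009, (1, 482)        # p prime; each component of K coprime to p
approx = lattice_rule(lambda x: cos(2 * pi * x[0]) * cos(2 * pi * x[1]), p, K)
assert abs(approx) < 1e-9    # the rule reproduces the exact value 0
```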

In closing this section, we mention a new result that may explain why qMC experiments sometimes do “so well.” Take the result in [Sloan and Wozniakowski 1998], in which the authors remark that some errors (such as those in Traub’s qMC for finance in D = 360 dimensions) appear to have O(1/N) behavior, i.e., independent of dimension D. What the authors actually prove is that there exist classes of integrand functions for which suitable low-discrepancy sequences provide overall integration errors of order O(1/N^ρ) for some real ρ ∈ [1, 2].

8.4 Diophantine analysis

Herein we discuss Diophantine analysis, which loosely speaking is the practice of discovering integer solutions to various equations. We have mentioned elsewhere Fermat’s last theorem (FLT), for which one seeks solutions to

x^p + y^p = z^p,

and how numerical attacks alone have raised the lower bound on p into the millions (Section 1.3.3, Exercise 9.68). This is a wonderful computational problem—speaking independently, of course, of the marvelous FLT proof by A. Wiles—but there are many other similar explorations. Many such adventures involve a healthy mix of theory and computation.

For instance, there is the Catalan equation for p, q prime and x, y positive integers,

x^p − y^q = 1,

of which the only known solution is the trivial yet attractive

3^2 − 2^3 = 1.


Observe that in seeking Diophantine solutions here we are simply addressing the problem of whether there exist higher instances of consecutive powers. An accessible treatment of the history of the Catalan problem to the date of its publication is [Ribenboim 1994], while more recent surveys are [Mignotte 2001] and [Metsänkylä 2004]. Using the theory of linear forms in logarithms of algebraic numbers, R. Tijdeman showed in 1976 that the Catalan equation has at most finitely many solutions; in fact,

y^q < e^{e^{e^{e^{730}}}},

as discussed in [Guy 1994]. Thus, the complete resolution of the Catalan problem is reduced to a (huge!) computation. Shortly after Tijdeman’s great theorem, M. Langevin showed that any solution must have the exponents p, q < 10^110. Over the years, this bound on the exponents continued to fall, with other results pushing up from below. For example, at the time the first edition of the present book was published, it was known that min{p, q} > 10^7 and max{p, q} < 7.78 × 10^16. Further, explicit easily checkable criteria on allowable exponent pairs were known, for example the double Wieferich condition of Mihăilescu: if p, q are Catalan exponents other than the pair

2, 3, then

p^{q−1} ≡ 1 (mod q^2)   and   q^{p−1} ≡ 1 (mod p^2).
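The double Wieferich condition is cheap to test with three-argument modular exponentiation. The pair (83, 4871) used below is one of the few known double Wieferich prime pairs:

```python
def double_wieferich(p, q):
    """Test p^(q-1) == 1 (mod q^2) and q^(p-1) == 1 (mod p^2)."""
    return pow(p, q - 1, q * q) == 1 and pow(q, p - 1, p * p) == 1

assert double_wieferich(83, 4871)     # a known double Wieferich pair
assert not double_wieferich(5, 7)     # a typical prime pair fails
```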

It was hoped that such advances together with sufficiently robust calculations might finish off the Catalan problem. In fact, the problem was indeed finished off, but using much more cleverness than computation.

In [Mihăilescu 2004] a complete proof of the Catalan problem is presented, and yes, 8 and 9 are the only pair of nontrivial consecutive powers. It is interesting that we still don’t know whether there are infinitely many pairs of consecutive powers that differ by 2, or any other fixed number larger than 1, though it is conjectured that there are not. In this regard, see Exercise 8.20.

Related both to Fermat’s last theorem and the Catalan problem is the Diophantine equation

x^p + y^q = z^r,    (8.2)

where x, y, z are positive coprime integers and exponents p, q, r are positive integers with 1/p + 1/q + 1/r ≤ 1. The Fermat–Catalan conjecture asserts that there are at most finitely many such powers x^p, y^q, z^r in (8.2). The following are the only known examples:

 

1^p + 2^3 = 3^2   (p ≥ 7),
2^5 + 7^2 = 3^4,
13^2 + 7^3 = 2^9,
2^7 + 17^3 = 71^2,
3^5 + 11^4 = 122^2,
33^8 + 1549034^2 = 15613^3,
1414^3 + 2213459^2 = 65^7,
9262^3 + 15312283^2 = 113^7,
17^7 + 76271^3 = 21063928^2,
43^8 + 96222^3 = 30042907^2.

(The latter five examples were found by F. Beukers and D. Zagier.) There is a cash prize (the Beal Prize) for a proof of the conjecture of Tijdeman and Zagier that (8.2) has no solutions at all when p, q, r ≥ 3; see [Bruin 2003] and [Mauldin 2000]. It is known [Darmon and Granville 1995] that for p, q, r fixed with 1/p+1/q +1/r ≤ 1, the equation (8.2) has at most finitely many coprime solutions x, y, z. We also know that in some cases for p, q, r the only solutions are those that appear in our small table. In particular, all of the triples with exponents {2, 3, 7}, {2, 3, 8}, {2, 3, 9}, and {2, 4, 5} are in the above list. In addition, there are many other triples of exponents for which it has been proved that there are no nontrivial solutions. These results are due to many people, including Bennett, Beukers, Bruin, Darmon, Ellenberg, Kraus, Merel, Poonen, Schaefer, Skinner, Stoll, Taylor, and Wiles. For some recent papers from which others may be tracked down, see [Bruin 2003] and [Beukers 2004].
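The identities in the table above are easy to confirm with exact integer arithmetic:

```python
# Tuples (x, p, y, q, z, r) with x^p + y^q = z^r and 1/p + 1/q + 1/r <= 1.
examples = [
    (2, 5, 7, 2, 3, 4),
    (13, 2, 7, 3, 2, 9),
    (2, 7, 17, 3, 71, 2),
    (3, 5, 11, 4, 122, 2),
    (33, 8, 1549034, 2, 15613, 3),
    (1414, 3, 2213459, 2, 65, 7),
    (9262, 3, 15312283, 2, 113, 7),
    (17, 7, 76271, 3, 21063928, 2),
    (43, 8, 96222, 3, 30042907, 2),
]
for x, p, y, q, z, r in examples:
    assert x ** p + y ** q == z ** r
    assert 1 / p + 1 / q + 1 / r <= 1

# The 1^p + 2^3 = 3^2 family holds for every p >= 7.
assert all(1 ** p + 2 ** 3 == 3 ** 2 for p in range(7, 50))
```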

The Fermat–Catalan conjecture is a special case of the notorious ABC conjecture of Masser. Let γ(n) denote the largest squarefree divisor of n. The ABC conjecture asserts that for each fixed ε > 0 there are at most finitely many coprime positive integer triples a, b, c with

a + b = c,   γ(abc) < c^{1−ε}.

A recent survey of the ABC conjecture, including many marvelous consequences, may be found in [Granville and Tucker 2002].
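The radical γ(n) is simple to compute by trial division. The sketch below checks the classic triple 1 + 8 = 9 and the well-known high-quality triple 2 + 3^10 · 109 = 23^5, both of which have γ(abc) far below c:

```python
def radical(n):
    """gamma(n): the product of the distinct primes dividing n."""
    g, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            g *= d
            while n % d == 0:
                n //= d
        d += 1
    return g * n if n > 1 else g

# Classic example: 1 + 8 = 9, with gamma(1*8*9) = 2*3 = 6 well below c = 9.
assert radical(1 * 8 * 9) == 6

# A famous high-quality triple: 2 + 3^10 * 109 = 23^5.
a, b = 2, 3 ** 10 * 109
c = a + b
assert c == 23 ** 5
assert radical(a * b * c) == 2 * 3 * 109 * 23   # = 15042, far smaller than c
```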

Though much work in Diophantine equations is extraordinarily deep, there are many satisfying exercises that use such concepts as quadratic reciprocity to limit Diophantine solutions. For example, one can prove that

y^2 = x^3 + k    (8.3)

has no integral solutions whatever if k = (4n − 1)^3 − 4m^2, m ≠ 0, and no prime dividing m is congruent to 3 (mod 4) (see Exercise 8.13).
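The theorem asserts that no integral solutions exist at all; a brute-force search can of course only illustrate this over a finite window. Taking n = m = 1 gives k = (4 · 1 − 1)^3 − 4 · 1^2 = 23, which satisfies the hypotheses:

```python
from math import isqrt

# n = m = 1 gives k = 23, of the required form (m = 1 has no prime factors).
k = (4 * 1 - 1) ** 3 - 4 * 1 ** 2
assert k == 23

solutions = []
for x in range(-2, 2001):        # x >= -2 is forced, since x^3 + 23 >= 0
    t = x ** 3 + k
    y = isqrt(t)
    if y * y == t:
        solutions.append((x, y))
assert solutions == []           # consistent with the theorem: no integral points
```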

Aside from interesting analyses of specific equations, there is a profound general theory of Diophantine equations. The saga of this decades-long investigation is fascinating. A fundamental question, posed at the turn of the last century as Hilbert’s “tenth problem,” asks for a general algorithm that will determine the solutions to an arbitrary Diophantine equation. In the attack on this problem, a central notion was that of a Diophantine set, which is a set S of positive integers such that some multivariate polynomial P(X, Y_1, . . . , Y_l) exists with coefficients in Z with the property that x ∈ S if and only if P(x, y_1, . . . , y_l) = 0 has a positive integer solution in the y_j. It is not hard to prove the theorem of H. Putnam from 1960, see [Ribenboim 1996, p. 189], that a set S of positive integers is Diophantine if and only if there is a multivariate polynomial Q with integer coefficients such that the set of its positive values at nonnegative integer arguments is exactly the set S.


Armed with this definition of a Diophantine set, formal mathematicians led by Putnam, Davis, Robinson, and Matijasevič established the striking result that the set of prime numbers is Diophantine. That is, they showed that there exists a polynomial P—with integer coefficients in some number of variables—such that as its variables range over the nonnegative integers, the set of positive values of P is precisely the set of primes.

One such polynomial given explicitly by Jones, Sato, Wada, and Wiens in 1976 (see [Ribenboim 1996]) is

(k + 2){1 − (wz + h + j − q)^2 − ((gk + 2g + k + 1)(h + j) + h − z)^2
− (2n + p + q + z − e)^2 − (16(k + 1)^3(k + 2)(n + 1)^2 + 1 − f^2)^2
− (e^3(e + 2)(a + 1)^2 + 1 − o^2)^2 − (a^2y^2 − y^2 + 1 − x^2)^2
− (16r^2y^4(a^2 − 1) + 1 − u^2)^2
− (((a + u^4 − u^2a)^2 − 1)(n + 4dy)^2 + 1 − (x + cu)^2)^2
− (n + l + v − y)^2 − (a^2l^2 − l^2 + 1 − m^2)^2 − (ai + k + 1 − l − i)^2
− (p + l(a − n − 1) + b(2an + 2a − n^2 − 2n − 2) − m)^2
− (q + y(a − p − 1) + s(2ap + 2a − p^2 − 2p − 2) − x)^2
− (z + pl(a − p) + t(2ap − p^2 − 1) − pm)^2}.

This polynomial has degree 25, and it conveniently has 26 variables, so that the letters of the English alphabet can each be used! An amusing consequence of such a prime-producing polynomial is that any prime p can be presented with a proof of primality that uses only O(1) arithmetic operations. Namely, supply the 26 values of the variables used in the above polynomial that gives the value p. However, the number of bit operations for this verification can be enormous.

Hilbert’s “tenth problem” was eventually solved—with the answer being that there can be no algorithm as sought—with the final step being Matijasevič’s proof that every listable set is Diophantine. But along the way, for more than a half century, the set of primes was at center stage in the drama [Matijasevič 1971], [Davis 1973].

Diophantine analysis, though amounting to the historical underpinning of all of number theory, is still today a fascinating, dynamic topic among mathematicians and recreationalists. One way to glimpse the generality of the field is to make use of network resources such as [Weisstein 2005]. A recommended book on Diophantine equations from a computational perspective is [Smart 1998].

8.5 Quantum computation

It seems appropriate to have in this applications chapter a brief discussion of what may become a dominant computational paradigm for the 21st century.


We speak of quantum computation, which is to be thought of as a genuine replacement for computer processes as we have previously understood them. The first basic notion is a distinction between classical Turing machines (TMs) and quantum Turing machines (QTMs). The older TM model is the model of every prevailing computer of today, with the possible exception of very minuscule, tentative and experimental QTMs, in the form of small atomic experiments and so on. (Although one could argue that nature has been running a massive QTM for billions of years.) The primary feature of a TM is that it processes “serially,” in following a recipe of instructions (a program) in a deterministic fashion. (There is such a notion as a probabilistic TM behaving statistically, but we wish to simplify this overview and will avoid that conceptual pathway.) On the other hand, a QTM would be a device in which a certain “parallelism” of nature would be used to effect computations with truly unprecedented efficiency. That parallelism is, of course, nature’s way of behaving according to laws of quantum mechanics. These laws involve many counterintuitive concepts. As students of quantum theory know, the microscopic phenomena in question do not occur as in the macroscopic world. There is the particle–wave duality (is an electron a wave or a particle or both?), the notion of amplitudes, probability, interference—not just among waves but among actual parcels of matter—and so on. The next section is a very brief outline of quantum computation concepts, intended to convey some qualitative features of this brand new science.

8.5.1 Intuition on quantum Turing machines (QTMs)

Because QTMs are still overwhelmingly experimental, not having solved a single “useful” problem so far, we think it appropriate to sketch, mainly by analogy, what kind of behavior could be expected from a QTM. Think of holography, that science whereby a solid three-dimensional object is cast onto a planar “hologram.” What nature does is actually to “evaluate” a 3-dimensional Fourier transform whose local power fluctuations determine what is actually developed on the hologram. Because light moves about one foot in a nanosecond (10^−9 seconds), one can legitimately say that when a laser light beam strikes an object (say a chess piece) and the reflections are mixed with a reference beam to generate a hologram, “nature performed a huge FFT in a couple of nanoseconds.” In a qualitative but striking sense, a known O(N ln N) algorithm (where N would be sufficiently many discrete spatial points to render a high-fidelity hologram, say) has turned into more like an O(1) one. Though it is somewhat facetious to employ our big-O notation in this context, we wish only to make the point that there is parallelism in the light-wave-interference model that underlies holography. On the film plane of the hologram, the final light intensity depends on every point on the chess piece. This is the holographic, one could say “parallel,” aspect. And QTM proposals are reminiscent of this effect.

We are not saying that a laboratory hologram setup is a QTM, for some ingredients are missing in that simplistic scenario. For one thing, modern QTM


theory has two other important elements beyond the principle of quantum interference; namely, probabilistic behavior, and a theoretical foundation involving operators such as unitary matrices. For another thing, we would like any practical QTM to bear not just on optical experiments, but also on some of the very difficult tasks faced by standard TMs—tasks such as the factoring of large integers. Like a great many new ideas, the QTM notion was pioneered in large measure by the eminent R. Feynman, who observed that quantum-mechanical model calculations tend, on a conventional TM, to suffer an exponential slowdown. Feynman even devised an explicit model of a QTM based on individual quantum registers [Feynman 1982, 1985]. The first formal definition was provided by [Deutsch 1982, 1985], to which current formal treatments more or less adhere. An excellent treatment—which sits conveniently between a lay perspective and a mathematical one—is [Williams and Clearwater 1998]. On the more technical side of the physics, and some of the relevant number-theoretical ideas, a good reference is [Ekert and Jozsa 1996]. For a very accessible lay treatment of quantum computation, see [Hey 1999], and for course-level material see [Preskill 1999].

Let us add a little more quantum flavor to the idea of laser light calculating an FFT, nature’s way. There is in quantum theory an ideal system called the quantum oscillator. Given a potential function V(x) = x^2, the Schrödinger equation amounts to a prescription for how a wave packet ψ(x, t), where t denotes time, moves under the potential’s influence. The classical analogue is a simple mass-on-a-spring system, giving smooth oscillations of period τ, say. The quantum model also has oscillations, but they exhibit the following striking phenomenon: After one quarter of the classical period τ, an initial wave packet evolves into its own Fourier transform. This suggests that you could somehow load data into a QTM as an initial function ψ(x, 0), and later read off ψ(x, τ/4) as an FFT. (Incidentally, this idea underlies the discussion around the Riemann-ζ representation (8.5).) What we are saying is that the laser hologram scenario has an analogue involving particles and dynamics. We note also that wave functions ψ are complex amplitudes, with |ψ|^2 being probability density, so this is how statistical features of quantum theory enter into the picture.

Moving now somewhat more toward the quantitative, and to prepare for the rest of this section, we presently lay down a few specific QTM concepts. It is important right at the outset, especially when number-theoretical algorithms are involved, to realize that an exponential number of quantities may be “polynomially stored” on a QTM. For example, here is how we can store in some fashion—in a so-called quantum register—every integer a ∈ [0, q − 1], in only lg q so-called qbits. At first this seems impossible, but recall our admission that the quantum world can be notoriously counterintuitive. A mental picture will help here. Let q = 2^d, so that we shall construct a quantum register having d qbits. Now imagine a line of d individual ammonia molecules, each molecule being NH3 in chemical notation, thought of as a tetrahedron formed by the three hydrogens and a nitrogen apex. The N apex is to be thought of as “up” or “down,” 1 or 0, i.e., either above or below the


three H’s. Thus, any d-bit binary number can be represented by a collective orientation of the molecules. But what about representing all possible binary strings of length d? This turns out to be easy, because of a remarkable quantum property: An ammonia molecule can be in both 1, 0 states at the same time. One way to think of this is that lowest-energy states—called ground states— are symmetrical when the geometry is. A container of ammonia in its ground state has each molecule somehow “halfway present” at each 0, 1 position. In theoretical notation we say that the ground state of one ammonia qbit (molecule, in this model) is given by:

φ = (1/√2) ( |0⟩ + |1⟩ ),

where the “bra-ket” notation | · ⟩ is standard (see the aforementioned quantum-theoretical references). The notation reminds us that a state belongs to an abstract Hilbert space, and only an inner product can bring this back to a measurable number. For example, given the ground state φ here, the probability that we find the molecule in state |0⟩ is the squared inner product

|⟨0 | φ⟩|^2 = |(1/√2) ⟨0 | 0⟩|^2 = 1/2,

i.e., 50 per cent chance that the nitrogen atom is measured to be “down.” Now back to the whole quantum register of d qbits (molecules). If each molecule is in the ground state φ, then in some sense every single d-bit binary string is represented. In fact, we can describe the state of the entire register as [Shor 1999]

ψ = (1/2^{d/2}) Σ_{a=0}^{2^d − 1} |a⟩,

where now |a⟩ denotes the composite state given by the molecular orientations corresponding to the binary bits of a; for example, for d = 5 the state |10110⟩ is the state in which the nitrogens are oriented “up, down, up, up, down.” This is not so magical as it sounds, when one realizes that now the probability of finding the entire register in a particular state a ∈ [0, 2^d − 1] is just 1/2^d. It is this sense in which every integer a is stored—the collection of all a values is a “superposition” in the register.

Given a state that involves every integer a ∈ [0, q − 1], we can imagine acting on the qbits with unitary operators. For example, we might alter the 0th and 7th qbits by acting on the two states with a matrix operator. An immediate physical analogy here would be the processing of two input light beams, each possibly polarized up or down, via some slit interference experiment (having polaroid filters within) in which two beams are output. Such a unitary transformation preserves overall probabilities by redistributing amplitudes between states.
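A toy state-vector simulation (our own illustration, not a QTM) makes these statements concrete: the uniform superposition assigns probability 1/2^d to each basis state, and a unitary operator, here a Hadamard matrix applied to one qbit, redistributes amplitudes while preserving total probability:

```python
import math

d = 5
# psi = 2^(-d/2) * sum over a of |a>: one real amplitude per basis state.
amp = [1 / 2 ** (d / 2)] * 2 ** d
assert all(abs(x * x - 1 / 2 ** d) < 1e-12 for x in amp)   # prob 1/2^d each

def hadamard(amp, qbit):
    """Apply H = (1/sqrt 2) [[1, 1], [1, -1]] to one qbit of the register."""
    out = list(amp)
    s = 1 / math.sqrt(2)
    for a in range(len(amp)):
        if not a & (1 << qbit):          # visit each 0/1 pair of states once
            b = a | (1 << qbit)
            out[a] = s * (amp[a] + amp[b])
            out[b] = s * (amp[a] - amp[b])
    return out

amp2 = hadamard(amp, 0)
# Unitarity: the probabilities still sum to 1 after the transformation.
assert abs(sum(x * x for x in amp2) - 1.0) < 1e-12
```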
