
Prime Numbers


Chapter 7 ELLIPTIC CURVE ARITHMETIC

2. [Begin Montgomery adding/doubling ladder]
   [U : V] = [X : Z];    // Copy coordinate.
   [T : W] = doubleh([X : Z]);
3. [Loop over bits of n, starting with next-to-highest]
   for(B − 2 ≥ j ≥ 1) {
      if(n_j == 1) {
         [U : V] = addh([T : W], [U : V], [X : Z]);
         [T : W] = doubleh([T : W]);
      } else {
         [T : W] = addh([U : V], [T : W], [X : Z]);
         [U : V] = doubleh([U : V]);
      }
   }
4. [Final calculation]
   if(n_0 == 1) return addh([U : V], [T : W], [X : Z]);
   return doubleh([U : V]);

Montgomery's rules when B = 0 make for an efficient algorithm, as can be seen from the simplification of the addh() and doubleh() function forms. In particular, the addh() and doubleh() functions can each be done in 9 multiplications. In the case B = 0, A = 1, the operation count drops further.

We have noted that to get the affine x-coordinate of [n]P, one must compute $XZ^{-1}$ in the field. When n is very large, the single inversion is, of course, not expensive in comparison. But such inversion can sometimes be avoided entirely. For example, if, as in factoring studies covered later, we wish to know whether [n]P = [m]P in the elliptic-curve group, it is enough to check whether the cross product $X_nZ_m - X_mZ_n$ vanishes, and this is yet another inversion-free task. Similarly, there is a very convenient fact: If the point at infinity has been attained by some multiple [n]P = O, then the Z denominator will have vanished, and any further multiples [mn]P will also have vanishing Z denominator. Because of this, one need not find the precise multiple when O is attained; the fact of Z = 0 propagates nicely through successive applications of the elliptic multiply functions.
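The ladder and the inversion-free comparison described above can be sketched in Python as follows. This is only an illustration under stated assumptions: it covers the special case B = 0, A = 1 (the cubic $y^2 = x^3 + Cx^2 + x$), it works over a prime field with modulus `p`, and the small cases n = 0, 1, 2 are handled directly because the initial step of the algorithm is not shown in this excerpt. The names `ladder` and `same_x` are hypothetical.

```python
def doubleh(P, C, p):
    """Double the x-only point [X : Z] on y^2 = x^3 + C x^2 + x (case A = 1, B = 0)."""
    X, Z = P
    X2 = (X * X - Z * Z) ** 2 % p
    Z2 = 4 * X * Z * (X * X + C * X * Z + Z * Z) % p
    return (X2, Z2)


def addh(P1, P2, Pdiff, p):
    """x-only sum P1 + P2, given the x-only difference Pdiff = P1 - P2."""
    X1, Z1 = P1
    X2, Z2 = P2
    Xd, Zd = Pdiff
    X3 = Zd * (X1 * X2 - Z1 * Z2) ** 2 % p
    Z3 = Xd * (X1 * Z2 - X2 * Z1) ** 2 % p
    return (X3, Z3)


def ladder(n, P, C, p):
    """Return [n]P as a pair (X, Z), following the adding/doubling ladder."""
    if n == 0:
        return (1, 0)                    # assumed stand-in for O: any [X : 0]
    if n == 1:
        return P
    if n == 2:
        return doubleh(P, C, p)
    U, T = P, doubleh(P, C, p)           # invariant: T - U = P
    bits = bin(n)[2:]                    # bits[0] is the highest bit
    for bit in bits[1:-1]:               # next-to-highest bit down to bit 1
        if bit == "1":
            U, T = addh(T, U, P, p), doubleh(T, C, p)
        else:
            T, U = addh(U, T, P, p), doubleh(U, C, p)
    if bits[-1] == "1":                  # final calculation uses bit 0
        return addh(U, T, P, p)
    return doubleh(U, C, p)


def same_x(P1, P2, p):
    """Inversion-free test that two projective x-coordinates agree."""
    return (P1[0] * P2[1] - P2[0] * P1[1]) % p == 0
```

A call such as `same_x(ladder(6, P, C, p), ladder(3, ladder(2, P, C, p), C, p), p)` compares two routes to [6]P by cross product alone, so no field inversion is needed.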

We have observed that only x-coordinates of multiples [n]P are processed in Algorithm 7.2.7, and that ignorance of y values is acceptable in certain implementations. It is not easy to add two arbitrary points with the homogeneous coordinate approach above, because of the suppression of y coordinates. But all is not lost: There is a useful result that tells very quickly whether the sum of two points can possibly be a given third point. That is, given merely the x-coordinates of two points $P_1, P_2$, the following algorithm can be used to determine the two x-coordinates for the pair $P_1 \pm P_2$, although which of the coordinates goes with the + and which with the − will be unknown.


Algorithm 7.2.8 (Sum/difference without y-coordinates (Crandall)). For an elliptic curve E determined by the cubic

$$y^2 = x^3 + Cx^2 + Ax + B,$$

we are given the unequal x-coordinates $x_1, x_2$ of two respective points $P_1, P_2$. This algorithm returns a quadratic polynomial whose roots are (in unspecified order) the x-coordinates of $P_1 \pm P_2$.

1. [Form coefficients]
   G = x_1 − x_2;
   α = (x_1 x_2 + A)(x_1 + x_2) + 2(C x_1 x_2 + B);
   β = (x_1 x_2 − A)^2 − 4B(x_1 + x_2 + C);
2. [Return quadratic polynomial]
   return G^2 X^2 − 2αX + β;
   // This polynomial vanishes for x_+, x_−, the x-coordinates of P_1 ± P_2.

It turns out that the discriminant $4(\alpha^2 - \beta G^2)$ must always be square in the field, so that if one requires the explicit pair of x-coordinates for $P_1 \pm P_2$, one may calculate

$$\frac{\alpha \pm \sqrt{\alpha^2 - \beta G^2}}{G^2}$$

in the field, to obtain $x_+, x_-$, although again, which sign of the radical goes with which coordinate is unspecified (see Exercise 7.11). The algorithm thus offers a test of whether $P_3 = P_1 \pm P_2$ for a set of three given points with missing y-coordinates; this test has value in certain cryptographic applications, such as digital signature [Crandall 1996b]. Note that the missing case of the algorithm, $x_1 = x_2$, is immediate: One of $P_1 \pm P_2$ is O, the other has x-coordinate as in the last part of Theorem 7.2.6. For more on elliptic arithmetic, see [Cohen et al. 1998]. The issue of efficient ladders for elliptic arithmetic is discussed later, in Section 9.3.
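As a rough illustration of Algorithm 7.2.8, the following Python sketch forms the quadratic and uses it to test whether a candidate $x_3$ can belong to $P_1 \pm P_2$. It assumes a prime modulus `p` and $x_1 \ne x_2$; the function names are hypothetical and not taken from the text.

```python
def sum_diff_quadratic(x1, x2, A, B, C, p):
    """Coefficients (c2, c1, c0) mod p of G^2 X^2 - 2*alpha*X + beta."""
    G = (x1 - x2) % p
    alpha = ((x1 * x2 + A) * (x1 + x2) + 2 * (C * x1 * x2 + B)) % p
    beta = ((x1 * x2 - A) ** 2 - 4 * B * (x1 + x2 + C)) % p
    return (G * G % p, -2 * alpha % p, beta)


def is_sum_or_difference(x3, x1, x2, A, B, C, p):
    """True if x3 is a root of the quadratic, i.e. a possible x(P1 + P2) or x(P1 - P2)."""
    c2, c1, c0 = sum_diff_quadratic(x1, x2, A, B, C, p)
    return (c2 * x3 * x3 + c1 * x3 + c0) % p == 0
```

The test needs no y-coordinates and no square root; extracting the two roots explicitly would additionally require a modular square root of $\alpha^2 - \beta G^2$.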

7.3 The theorems of Hasse, Deuring, and Lenstra

A fascinating and difficult problem is that of finding the order of an elliptic curve group defined over a finite field, i.e., the number of points including O on an elliptic curve $E_{a,b}(F)$ for a finite field F. For the field $\mathbf{F}_p$, with prime p > 3, we can immediately write out an exact expression for the order #E by observing, as we did in the simple Algorithm 7.2.1, that for (x, y) to be a point, the cubic form in x must be a square in the field. Using the Legendre symbol we can write

$$\#E(\mathbf{F}_p) = p + 1 + \sum_{x \in \mathbf{F}_p} \left(\frac{x^3 + ax + b}{p}\right) \qquad (7.8)$$

as the required number of points (x, y) (mod p) that solve the cubic (mod p), with of course 1 added for the point at infinity. This equation may be


generalized to fields $\mathbf{F}_{p^k}$ as follows:

$$\#E\left(\mathbf{F}_{p^k}\right) = p^k + 1 + \sum_{x \in \mathbf{F}_{p^k}} \chi(x^3 + ax + b),$$

where χ is the quadratic character for $\mathbf{F}_{p^k}$. (That is, χ(u) = 1, −1, 0, respectively, depending on whether u is a nonzero square in the field, not a square, or 0.) A celebrated result of H. Hasse is the following:

Theorem 7.3.1 (Hasse). The order #E of $E_{a,b}(\mathbf{F}_{p^k})$ satisfies

$$\left|(\#E) - (p^k + 1)\right| \le 2\sqrt{p^k}.$$

This remarkable result strikes to the very heart of elliptic curve theory and applications thereof. Looking at the Hasse inequality for $\mathbf{F}_p$, we see that

$$p + 1 - 2\sqrt{p} < \#E < p + 1 + 2\sqrt{p}.$$
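For small primes, relation (7.8) and the Hasse inequality can be checked directly. The sketch below is a minimal illustration, not a practical point-counting method: it costs O(p) Legendre-symbol evaluations, it assumes p is an odd prime, and the function names are hypothetical.

```python
def legendre(u, p):
    """Legendre symbol (u/p) for an odd prime p, via Euler's criterion."""
    r = pow(u % p, (p - 1) // 2, p)      # r is 0, 1, or p - 1
    return -1 if r == p - 1 else r


def curve_order(a, b, p):
    """#E_{a,b}(F_p) from relation (7.8), including the point at infinity."""
    return p + 1 + sum(legendre(x * x * x + a * x + b, p) for x in range(p))


def in_hasse_interval(m, p):
    """True if |m - (p + 1)| < 2*sqrt(p), tested in exact integer arithmetic."""
    return (m - p - 1) ** 2 < 4 * p
```

For example, `curve_order(2, 2, 17)` gives 19 for the curve $y^2 = x^3 + 2x + 2$ over $\mathbf{F}_{17}$, and 19 lies in the Hasse interval for p = 17.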

There is an attractive heuristic connection between this inequality and the alternative relation (7.8). Namely, think of the Legendre symbol $\left(\frac{x^3 + ax + b}{p}\right)$ as a “random walk,” i.e., a walk driven by coin flips of value ±1 except for possible symbols $\left(\frac{0}{p}\right) = 0$. It is known from statistical theory that the expected absolute distance from the origin after summation of n such random ±1 flips is proportional to $\sqrt{n}$. Certainly, the Hasse theorem gives the “right” order of magnitude for the excursions away from p for the possible orders of $\#E_{a,b}(\mathbf{F}_p)$. At a deeper heuristic level one must have caution, however: As mentioned in Section 1.4.2, the ratio of such a random walk's position to $\sqrt{n}$ can be expected to diverge something like $\sqrt{\ln \ln n}$. The Hasse theorem says this cannot happen: the stated ratio is bounded by 2. Indeed, there are certain subtle features of Legendre-symbol statistics that reveal departure from randomness (see Exercise 2.41).

Less well known is a theorem from [Deuring 1941], saying that for any integer $m \in (p + 1 - 2\sqrt{p},\ p + 1 + 2\sqrt{p})$, there exists some pair (a, b) in the set

$$\{(a, b) : a, b \in \mathbf{F}_p;\ 4a^3 + 27b^2 \ne 0\}$$

such that $\#E_{a,b}(\mathbf{F}_p) = m$. What the Deuring theorem actually says is that the number of curves, up to isomorphism, of order m is the so-called Kronecker class number of $(p + 1 - m)^2 - 4p$. In [Lenstra 1987], these results of Hasse and Deuring are exploited to say something about the statistics of curve orders over a given field $\mathbf{F}_p$, as we shall now see.

In applications to factoring, primality testing, and cryptography, we are concerned with choosing a random elliptic curve and then asking for the likelihood of the curve order possessing a particular arithmetic property, such as being smooth, being easily factorable, or being prime. However, there are two possible ways of choosing a random curve. One is to just choose a, b at random and be done with it. But sometimes we also would like to have


a random point on the curve. If one is working with a true elliptic curve over a finite field, points on it can easily be found via Algorithm 7.2.1. But if one is working over $\mathbf{Z}_n$ with n composite, the call to the square root in this algorithm is not likely to be useful. However, it is possible to completely bypass Algorithm 7.2.1 and find a random curve and a point on it by choosing the point before the curve is fully defined! Namely, choose a at random, then choose a point $(x_0, y_0)$ at random, then choose b such that $(x_0, y_0)$ is on the curve $y^2 = x^3 + ax + b$; that is, $b = y_0^2 - x_0^3 - ax_0$.
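The point-first construction can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, the random source is whatever generator the caller passes in, and the discriminant check is returned as a gcd so that the caller can decide what to do with a value other than 1.

```python
import random
from math import gcd


def random_curve_with_point(n, rng=random):
    """Pick a, x0, y0 at random mod n, then solve for b so (x0, y0) lies on the curve."""
    a, x0, y0 = (rng.randrange(n) for _ in range(3))
    b = (y0 * y0 - x0 ** 3 - a * x0) % n
    return a, b, (x0, y0)


def discriminant_gcd(a, b, n):
    """gcd(4a^3 + 27b^2, n); the value 1 means the discriminant condition holds."""
    return gcd(4 * a ** 3 + 27 * b * b, n)
```

No square root modulo n is ever taken, which is the point of choosing $(x_0, y_0)$ before b.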

With these two approaches to finding a random curve, we can formalize the question of the likelihood of the curve order having a particular property. Suppose p is a prime larger than 3, and let S be a set of integers in the Hasse interval $(p + 1 - 2\sqrt{p},\ p + 1 + 2\sqrt{p})$. For example, S might be the set of B-smooth numbers in the interval for some appropriate value of B (see Section 1.4.5), or S might be the set of prime numbers in the interval, or the set of doubles of primes. Let $N_1(S)$ be the number of pairs $(a, b) \in \mathbf{F}_p^2$ with $4a^3 + 27b^2 \ne 0$ and with $\#E_{a,b}(\mathbf{F}_p) \in S$. Let $N_2(S)$ be the number of triples $(a, x_0, y_0) \in \mathbf{F}_p^3$ such that for $b = y_0^2 - x_0^3 - ax_0$, we have $4a^3 + 27b^2 \ne 0$ and $\#E_{a,b}(\mathbf{F}_p) \in S$. What would we expect for the counts $N_1(S)$, $N_2(S)$? For the first count, there are $p^2$ choices for a, b to begin with, and each number $\#E_{a,b}(\mathbf{F}_p)$ falls in an interval of length $4\sqrt{p}$, so we might expect $N_1(S)$ to be about $\frac{1}{4}(\#S)p^{3/2}$. Similarly, we might expect $N_2(S)$ to be about $\frac{1}{4}(\#S)p^{5/2}$. That is, in each case we expect the probability that the curve order lands in the set S to be about the same as the probability that a random integer chosen from $(p + 1 - 2\sqrt{p},\ p + 1 + 2\sqrt{p})$ lands in S. The following theorem says that this is almost the case.

Theorem 7.3.2 (Lenstra). There is a positive number c such that if p > 3 is prime and S is a set of integers in the interval $(p + 1 - 2\sqrt{p},\ p + 1 + 2\sqrt{p})$ with at least 3 members, then

$$N_1(S) > c(\#S)p^{3/2}/\ln p, \qquad N_2(S) > c(\#S)p^{5/2}/\ln p.$$

This theorem is proved in [Lenstra 1987], where also upper bounds, of the same approximate order as the lower bounds, are given.
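For a very small prime the count $N_1(S)$ can be computed by brute force and compared with the heuristic value $\frac{1}{4}(\#S)p^{3/2}$. The sketch below is only meant for such toy sizes (it does on the order of $p^3$ work), and its function names are hypothetical.

```python
def curve_order(a, b, p):
    """#E_{a,b}(F_p) by direct counting, including the point at infinity."""
    count = 1
    for x in range(p):
        v = (x * x * x + a * x + b) % p
        if v == 0:
            count += 1                          # one point, with y = 0
        elif pow(v, (p - 1) // 2, p) == 1:
            count += 2                          # two square roots y and -y
    return count


def hasse_interval(p):
    """All integers m with |m - (p + 1)| < 2*sqrt(p)."""
    return [m for m in range(1, 2 * p + 3) if (m - p - 1) ** 2 < 4 * p]


def count_N1(S, p):
    """Number of pairs (a, b) with nonzero discriminant and curve order in S."""
    S = set(S)
    return sum(
        1
        for a in range(p)
        for b in range(p)
        if (4 * a ** 3 + 27 * b * b) % p != 0 and curve_order(a, b, p) in S
    )
```

With p = 13 and S the whole Hasse interval (15 integers), every nonsingular pair is counted, so `count_N1` returns $p^2 - p = 156$, while the heuristic value $\frac{1}{4} \cdot 15 \cdot 13^{3/2}$ is roughly 176. Passing a smaller S, such as the primes in the interval, shows how unevenly the orders are actually distributed.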

7.4 Elliptic curve method

A subexponential factorization method of great elegance and practical importance is the elliptic curve method (ECM) of H. Lenstra. The elegance will be self-evident. The practical importance lies in the fact that unlike QS or NFS, ECM complexity to factor a number n depends strongly on the size of the least prime factor of n, and only weakly on n itself. For this reason, many factors of truly gigantic numbers have been uncovered in recent years, with many of these numbers lying well beyond the range of QS or NFS.

Later in this section we exhibit some explicit modern ECM successes that exemplify the considerable power of this method.


7.4.1 Basic ECM algorithm

The ECM algorithm uses many of the concepts of elliptic arithmetic developed in the preceding sections. However, we shall be applying this arithmetic to a construct Ea,b(Zn), something that is not a true elliptic curve, when n is a composite number.

Definition 7.4.1. For elements a, b in the ring $\mathbf{Z}_n$, with gcd(n, 6) = 1 and discriminant condition $\gcd(4a^3 + 27b^2, n) = 1$, an elliptic pseudocurve over the ring is a set

$$E_{a,b}(\mathbf{Z}_n) = \{(x, y) \in \mathbf{Z}_n \times \mathbf{Z}_n : y^2 = x^3 + ax + b\} \cup \{O\},$$

where O is the point at infinity. (Thus an elliptic curve over Fp = Zp from Definition 7.1.1 is also an elliptic pseudocurve.)

(Curves given in the form (7.5) are also considered as pseudocurves, with the appropriate discriminant condition holding.) We have seen in Section 7.1 that when n is prime, the point at infinity refers to the one extra projective point on the curve that does not correspond to an affine point. When n is composite, there are additional projective points not corresponding to affine points, yet in our definition of pseudocurve, we still allow only the one extra point, corresponding to the projective solution [0, 1, 0]. Because of this (intentional) shortchanging in our definition, the pseudocurve $E_{a,b}(\mathbf{Z}_n)$, together with the operations of Definition 7.1.2, does not form a group (when n is composite). In particular, there are pairs of points P, Q for which “P + Q” is undefined. This would be detected in the construction of the slope m in Definition 7.1.2; since $\mathbf{Z}_n$ is not a field when n is composite, one would be called upon to invert a nonzero member of $\mathbf{Z}_n$ that is not invertible. This group-law failure is the motive for the name “pseudocurve,” yet, happily, there are powerful applications of the pseudocurve concept. In particular, Algorithm 2.1.4 (the extended Euclid algorithm), if called upon to find the inverse of a nonzero member of $\mathbf{Z}_n$ that is in fact noninvertible, will instead produce a nontrivial factor of n. It is Lenstra's ingenious idea that through this failure of finding an inverse, we shall be able to factor the composite number n.

We note in passing that the concept of elliptic multiplication on a pseudocurve depends on the addition chain used. For example, [5]P may be perfectly well computable if one computes it via P → [2]P → [4]P → [5]P, but the elliptic addition may break down if one tries to compute it via P → [2]P → [3]P → [5]P. Nevertheless, if two different addition chains to arrive at [k]P both succeed, they will give the same answer.

Algorithm 7.4.2 (Lenstra elliptic curve method (ECM)). Given a composite number n to be factored, gcd(n, 6) = 1, and n not a proper power, this algorithm attempts to uncover a nontrivial factor of n. There is a tunable parameter B1 called the “stage-one limit” in view of further algorithmic stages in the modern ECM to follow.

1. [Choose B1 limit]
   B1 = 10000;    // Or whatever is a practical initial “stage-one limit” B1.
2. [Find curve E_{a,b}(Z_n) and point (x, y) ∈ E]
   Choose random x, y, a ∈ [0, n − 1];
   b = (y^2 − x^3 − ax) mod n;
   g = gcd(4a^3 + 27b^2, n);
   if(g == n) goto [Find curve . . .];
   if(g > 1) return g;    // Factor is found.
   E = E_{a,b}(Z_n); P = (x, y);    // Elliptic pseudocurve and point on it.
3. [Prime-power multipliers]
   for(1 ≤ i ≤ π(B1)) {    // Loop over primes p_i.
      Find largest integer a_i such that p_i^{a_i} ≤ B1;
      for(1 ≤ j ≤ a_i) {    // j is just a counter.
         P = [p_i]P, halting the elliptic algebra if the computation of some d^{−1} for addition-slope denominator d signals a nontrivial g = gcd(n, d), in which case return g;    // Factor is found.
      }
   }
4. [Failure]
   Possibly increment B1;    // See text.
   goto [Find curve . . .];
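A compact Python rendering of the stage-one algorithm may help make the "failed inversion" mechanism concrete. It is a sketch under several assumptions that are not part of the algorithm as stated: affine chord-and-tangent arithmetic, binary double-and-add for each multiplication by $p_i$, a seeded random generator, a fixed budget `max_curves` instead of the goto loop, and no incrementing of B1. The exception class and all function names are hypothetical. The modular inverse `pow(d, -1, n)` needs Python 3.8 or later.

```python
import random
from math import gcd


class FactorFound(Exception):
    """Raised when a slope denominator is not invertible mod n."""
    def __init__(self, g):
        super().__init__(g)
        self.g = g


def inverse(d, n):
    g = gcd(d % n, n)
    if g != 1:
        raise FactorFound(g)                 # g == n is the trivial case
    return pow(d, -1, n)


def add(P, Q, a, n):
    """Addition on the pseudocurve y^2 = x^3 + ax + b mod n; None stands for O."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % n == 0:
        return None
    if x1 == x2 and y1 == y2:
        m = (3 * x1 * x1 + a) * inverse(2 * y1, n) % n
    else:
        m = (y2 - y1) * inverse(x2 - x1, n) % n
    x3 = (m * m - x1 - x2) % n
    return (x3, (m * (x1 - x3) - y1) % n)


def multiply(k, P, a, n):
    """[k]P by binary double-and-add."""
    R = None
    while k:
        if k & 1:
            R = add(R, P, a, n)
        k >>= 1
        if k:
            P = add(P, P, a, n)
    return R


def primes_up_to(B):
    sieve = [True] * (B + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(B ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, B + 1, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]


def ecm_stage_one(n, B1=1000, max_curves=200, seed=1):
    """Try up to max_curves random pseudocurves; return a nontrivial factor or None."""
    rng = random.Random(seed)
    primes = primes_up_to(B1)
    for _ in range(max_curves):
        x, y, a = (rng.randrange(n) for _ in range(3))
        b = (y * y - x ** 3 - a * x) % n
        g = gcd(4 * a ** 3 + 27 * b * b, n)
        if g == n:
            continue                         # singular mod every prime factor
        if g > 1:
            return g
        P = (x, y)
        try:
            for p in primes:
                q = p
                while q <= B1:               # apply [p] once per power p^j <= B1
                    P = multiply(p, P, a, n)
                    q *= p
        except FactorFound as hit:
            if hit.g < n:
                return hit.g                 # nontrivial factor
    return None
```

Calling `ecm_stage_one(455839)` on the composite 455839 = 599 · 761 should return one of the two primes; which one, and on which curve, depends on the seed.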

What we hope with basic ECM is that even though the composite n allows only a pseudocurve, an illegal elliptic operation—specifically the inversion required for slope calculation from Definition 7.1.2—is a signal that for some prime p|n we have

$$[k]P = O, \quad \text{where } k = \prod_{p_i^{a_i} \le B_1} p_i^{a_i},$$

with this relation holding on the legitimate elliptic curve $E_{a,b}(\mathbf{F}_p)$. Furthermore, we know from the Hasse Theorem 7.3.1 that the order $\#E_{a,b}(\mathbf{F}_p)$ is in the interval $(p + 1 - 2\sqrt{p},\ p + 1 + 2\sqrt{p})$. Evidently, we can expect a factor if the multiplier k is divisible by $\#E(\mathbf{F}_p)$, which should, in fact, happen if this order is $B_1$-smooth. (This is not entirely precise, since for the order to be $B_1$-smooth it is required only that each of its prime factors be at most $B_1$, but in the above display, we have instead the stronger condition that each prime power divisor of the order is at most $B_1$. We could change the inequality defining $a_i$ to $p_i^{a_i} \le n + 1 + 2\sqrt{n}$, but in practice the cost of doing so is too high for the meager benefit it may provide.) We shall thus think of the stage-one limit $B_1$ as a smoothness bound on actual curve orders in the group determined by the hidden prime factor p.
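The gap between "$B_1$-smooth" and "divides the multiplier k" is easy to see numerically. The following sketch, with hypothetical function names and deliberately naive primality testing, builds k for a small bound and tests a candidate order against both conditions.

```python
def stage_one_multiplier(B1):
    """k = product over primes p <= B1 of the largest power p^a with p^a <= B1."""
    k = 1
    for p in range(2, B1 + 1):
        if all(p % d for d in range(2, int(p ** 0.5) + 1)):   # p is prime
            q = p
            while q * p <= B1:
                q *= p
            k *= q
    return k


def is_smooth(m, B):
    """True if every prime factor of m is at most B."""
    for p in range(2, B + 1):
        while m % p == 0:
            m //= p
    return m == 1
```

For $B_1 = 10$ the multiplier is $8 \cdot 9 \cdot 5 \cdot 7 = 2520$. A hypothetical curve order of 16 is 10-smooth, yet 16 does not divide 2520, so stage one with this bound would not be guaranteed to reach O on such a curve.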

It is instructive to compare ECM with the Pollard p − 1 method (Algorithm 5.4.1). In the p − 1 method one has only the one group $\mathbf{Z}_p^*$ (with order p − 1), and one is successful if this group order is B-smooth. With ECM one has


a host of elliptic-curve groups to choose from randomly, each giving a fresh chance at success.

With these ideas, we may perform a heuristic complexity estimate for ECM. Suppose the number n to be factored is composite, coprime to 6, and not a proper power. Let p denote the least prime factor of n and let q denote another prime factor of n. Algorithm 7.4.2 will be successful in splitting n if we choose a, b, P in Step [Find curve . . .] and if for some value of k of the form

$$k = p_l^{a} \prod_{i < l} p_i^{a_i},$$

where $l \le \pi(B_1)$ and $a \le a_l$, we have

$$[k]P = O \ \text{on } E_{a,b}(\mathbf{F}_p), \qquad [k]P \ne O \ \text{on } E_{a,b}(\mathbf{F}_q).$$

The likelihood of these two events occurring is dominated by the first, and so we shall ignore the second. As mentioned above, the first event will occur if #Ea,b(Fp) is B1-smooth. From Theorem 7.3.2, the probability prob(B1) of success is greater than

$$c\,\frac{\psi(p + 1 + 2\sqrt{p},\, B_1) - \psi(p + 1 - 2\sqrt{p},\, B_1)}{\sqrt{p}\,\ln p}.$$

Here the notation ψ(x, y) is as in (1.42). Since it takes about B1 arithmetic steps to perform the trial for one curve in Step [Prime-power multipliers], we would like to choose B1 so as to minimize the expression B1/prob(B1). Assuming that prob(B1) is about the same as

$$c\,\frac{\psi\left(\tfrac{3}{2}p,\, B_1\right) - \psi\left(\tfrac{1}{2}p,\, B_1\right)}{p \ln p},$$

so that we can use the estimates discussed in Section 1.4.5, we have that this minimum occurs when

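$$B_1 = \exp\left(\left(\sqrt{2}/2 + o(1)\right)\sqrt{\ln p \,\ln\ln p}\right),$$

and for this value of $B_1$, the complexity estimate $B_1/\mathrm{prob}(B_1)$ is given by

$$\exp\left(\left(\sqrt{2} + o(1)\right)\sqrt{\ln p \,\ln\ln p}\right);$$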

see Exercise 7.12. Of course, we do not know p to begin with, and so it would only be a divination to choose an appropriate value of B1 to begin with in Step [Choose B1 limit]. Thus, the algorithm instructs us to start with a low B1 value of 10000, and then possibly to raise this value in Step [Failure]. In practice, what is done is that one value of B1 is run sufficiently many times without success for one to become convinced that a higher value is called for, perhaps double the prior value, and this procedure is iterated. Of course, another option in Step [Failure] is to abort and so give up on the


factorization attempt completely. When the B1 value is gradually increased in ECM, one then expects success when B1 finally reaches the critical range displayed above, and that the time spent unsuccessfully with smaller B1’s is negligible in comparison.
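To get a feel for the sizes involved, the asymptotic expressions can be evaluated with the o(1) terms simply dropped. That is a crude assumption, since the o(1) terms are not negligible at practical sizes, so the numbers below indicate only the shape of the growth and are not tuning advice. The function names are hypothetical.

```python
from math import exp, log, sqrt


def heuristic_b1(p):
    """exp((sqrt(2)/2) * sqrt(ln p * ln ln p)), with the o(1) term dropped."""
    return exp(sqrt(2) / 2 * sqrt(log(p) * log(log(p))))


def heuristic_work(p):
    """exp(sqrt(2) * sqrt(ln p * ln ln p)), the estimate for B1 / prob(B1)."""
    return exp(sqrt(2) * sqrt(log(p) * log(log(p))))


if __name__ == "__main__":
    for digits in (10, 20, 30, 40):
        p = 10 ** digits
        print(digits, round(heuristic_b1(p)), f"{heuristic_work(p):.3g}")
```

With the o(1) terms ignored, the work estimate is exactly the square of the $B_1$ estimate, which reflects the balance between the cost of one curve and the expected number of curves.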

So, in summary, the heuristic expected complexity of ECM to give a nontrivial factorization of n with least prime factor p is $L(p)^{\sqrt{2}+o(1)}$ arithmetic steps with integers the size of n, using the notation from (6.1). (Note that the error expression “o(1)” tends to 0 as p tends to infinity.) Thus, the larger the least prime factor of n, the more arithmetic steps are expected. The worst case occurs when n is the product of two roughly equal primes, in which case the expected number of steps can be expressed as $L(n)^{1+o(1)}$, which is exactly the same as the heuristic complexity of the quadratic sieve; see Section 6.1.1. However, due to the higher precision of a typical step in ECM, we generally prefer to use the QS method, or the NFS method, for worst-case numbers. If we are presented with a number n that is unknown to be in the worst case, it is usually recommended to try ECM first, and only after a fair amount of time is spent with this method should QS or NFS be initiated. But if the number n is so large that we know beforehand that QS or NFS would be out of the question, it leaves ECM as the only current option. Who knows, we may get lucky! Here, “luck” can play either of two roles: The number under consideration may indeed have a small enough prime factor to discover with ECM, or upon implementing ECM, we may hit upon a fortunate choice of parameters sooner than expected and find an impressive factor. In fact, one interesting feature of ECM is that the variance in the expected number of steps is large since we are counting on just one successful event to occur.

It is interesting that the heuristic complexity estimate for the ECM may be made completely rigorous except for the one assumption we made that integers in the Hasse interval are just as likely to be smooth as typical integers in the larger interval (p/2, 3p/2); see [Lenstra 1987].

In the discussion following we describe some optimizations of ECM. These improvements do not materially affect the complexity estimate, but they do help considerably in practice.

7.4.2 Optimization of ECM

As with the Pollard (p − 1) method (Section 5.4), on which the ECM is based, there is a natural, second stage continuation. In view of the remarks following Algorithm 7.4.2, assume that the order #Ea,b(Fp) is not B1-smooth for whatever practical choice of B1 has been made, so that the basic algorithm can be expected to fail to find a factor. But we might just happen to have

$$\#E(\mathbf{F}_p) = q \prod_{p_i^{a_i} \le B_1} p_i^{a_i},$$

where q is a prime exceeding B1. When such a single outlying prime is part of the unknown factorization of the order, one need not have multiplied the


current point by every prime in (B1, q]. Instead, one can use the point

$$Q = \Big[\prod_{p_i^{a_i} \le B_1} p_i^{a_i}\Big] P,$$

which is the point actually “surviving” the stage-one ECM Algorithm 7.4.2, and check the points

[q0]Q, [q0 + ∆0]Q, [q0 + ∆0 + ∆1]Q, [q0 + ∆0 + ∆1 + ∆2]Q, . . . ,

where $q_0$ is the least prime exceeding $B_1$, and $\Delta_i$ are the differences between subsequent primes after $q_0$. The idea is that one can store some points

Ri = [∆i]Q,

once and for all, then quickly process the primes beyond B1 by successive elliptic additions of appropriate Ri. The primary gain to be realized here is that to multiply a point by a prime such as q requires O(ln q) elliptic operations, while addition of a precomputed Ri is, of course, one operation.
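The bookkeeping of this stage-two idea is independent of the group in which it is carried out, so it can be sketched with the group operations passed in as functions. In the sketch below, `add` and `mul` are assumed callables supplied by the caller (for ECM they would be elliptic addition and elliptic multiplication on the pseudocurve), and the usage note exercises the code with the additive group of integers modulo m purely as a stand-in. All names are hypothetical.

```python
def primes_between(lo, hi):
    """Primes q with lo < q <= hi, by trial division (adequate for a sketch)."""
    return [
        q
        for q in range(max(lo + 1, 2), hi + 1)
        if all(q % d for d in range(2, int(q ** 0.5) + 1))
    ]


def stage_two_multiples(Q, B1, B2, add, mul):
    """Yield (q, [q]Q) for primes B1 < q <= B2 using one group addition per prime."""
    primes = primes_between(B1, B2)
    if not primes:
        return
    gaps = [r - q for q, r in zip(primes, primes[1:])]
    table = {d: mul(d, Q) for d in set(gaps)}     # each [delta]Q is stored once
    point = mul(primes[0], Q)                     # [q0]Q, the only full multiply
    yield primes[0], point
    for q, d in zip(primes[1:], gaps):
        point = add(point, table[d])              # step from one prime to the next
        yield q, point
```

For instance, with `add = lambda x, y: (x + y) % m` and `mul = lambda k, x: k * x % m`, the generator reproduces $qQ \bmod m$ for each prime q in range, while the table holds only as many entries as there are distinct prime gaps.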

Beyond this “stage-two” optimization and variants thereupon, one may invoke other enhancements such as

(1) Special parameterization to easily obtain random curves.

(2) Choice of curves with order known to be divisible by 12 or 16 [Montgomery 1992a], [Brent et al. 2000].

(3) Enhancements of large-integer arithmetic and of the elliptic algebra itself, say by FFT.

(4) Fast algorithms applied to stage two, such as “FFT extension” which is actually a polynomial-evaluation scheme applied to sets of precomputed x-coordinates.

Rather than work through such enhancements with incremental algorithm exhibitions, we instead adopt a specific strategy: We shall discuss the above enhancements briefly, then exhibit a single, practical algorithm containing many of said enhancements.

On enhancement (1) above, a striking feature our eventual algorithm will enjoy is that one need not involve y-coordinates at all. In fact, the algorithm will use the Montgomery parameterization

$$gy^2 = x^3 + Cx^2 + x,$$

with elliptic multiplication carried out via Algorithm 7.2.7. Thus a point will have the general homogeneous form P = [X, any, Z] = [X : Z] (see Section 7.2 for a discussion of the notation), and we need only track the residues X, Z (mod n). As we mentioned subsequent to Algorithm 7.2.7, the appearance of the point-at-infinity O during calculation on a curve over Fp, where p|n, is signified by the vanishing of denominator Z, and such vanishing


propagates forever afterward during further evaluations of functions addh() and doubleh(). Thus, the parameterization in question allows us to continually check gcd(n, Z), and if this is ever greater than 1, it may well be the hidden factor p. In practice, we “accumulate” Z-coordinates, and take the gcd only rarely, for example after stage one, and as we shall see, one final time after a stage two.
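The practice of accumulating Z-coordinates and taking a single gcd can be sketched as follows. The function name is hypothetical, and the sketch assumes the Z values have already been collected into a list.

```python
from math import gcd


def accumulated_gcd(z_values, n):
    """Multiply the Z-coordinates together mod n, then take one gcd with n."""
    acc = 1
    for z in z_values:
        acc = acc * z % n
    return gcd(acc, n)
```

For n = 91 = 7 · 13, the values [3, 14, 5] give gcd 7, since only 14 shares a factor with n. One caveat of batching is that if the accumulated product becomes 0 modulo n, for example with the values [7, 13], the gcd is n itself and the individual Z values have to be re-examined to separate the factors.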

On enhancement (2), it is an observation of Suyama that under Montgomery parameterization the group order #E is divisible by 4. But one can press further, to ensure that the order be divisible by 8, 12, or even 16. Thus, in regard to enhancement (2) above, we can make good use of a convenient result [Brent et al. 2000]:

Theorem 7.4.3 (ECM curve construction). Define an elliptic curve $E_\sigma(\mathbf{F}_p)$ to be governed by the cubic

$$y^2 = x^3 + C(\sigma)x^2 + x,$$

where C depends on field parameter $\sigma \ne 0, 1, 5$ according to

$$u = \sigma^2 - 5,$$

$$v = 4\sigma,$$

$$C(\sigma) = \frac{(v - u)^3(3u + v)}{4u^3v} - 2.$$

Then the order of $E_\sigma$ is divisible by 12, and moreover, either on E or a twist E′ (see Definition 7.2.5) there exists a point whose x-coordinate is $u^3v^{-3}$.

Now we can ignite any new curve attempt by simply choosing a random σ. We use, then, Algorithm 7.2.7 with homogeneous x-coordinatization starting in the form $X/Z = u^3/v^3$, proceeding to ignore all y-coordinates throughout the factorization run. What is more, we do not even care whether an initial point is on E or its twist, again because y-coordinate ignorance is allowed.
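Setting up a curve and starting point from σ takes only a few modular operations. The sketch below works modulo n and needs one inversion of $4u^3v$; how to react when that inversion fails is left to the caller, and raising an exception here is merely an assumption of this sketch. The function name is hypothetical, and `pow(d, -1, n)` requires Python 3.8 or later.

```python
from math import gcd


def suyama_curve(sigma, n):
    """Return (C, (X, Z)) with C = (v - u)^3 (3u + v) / (4 u^3 v) - 2 and X : Z = u^3 : v^3."""
    u = (sigma * sigma - 5) % n
    v = 4 * sigma % n
    denom = 4 * u ** 3 * v % n
    g = gcd(denom, n)
    if g != 1:
        # For composite n, a gcd strictly between 1 and n would itself be a factor.
        raise ValueError(f"4u^3v is not invertible mod n (gcd = {g})")
    C = ((v - u) ** 3 * (3 * u + v) * pow(denom, -1, n) - 2) % n
    return C, (pow(u, 3, n), pow(v, 3, n))
```

The returned pair (X, Z) can be fed directly to an x-only ladder, since no y-coordinate is ever formed.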

On enhancements (3), there are ideas that can reduce stage-two computations. One trick that some researchers enjoy is to use a “birthday paradox” second stage, which amounts to using semirandom multiples for two sets of coordinates, and this can sometimes yield performance advantages [Brent et al. 2000]. But there are some ideas that apply in the scenario of simply checking all outlying primes q up to some “stage-two limit” $B_2 > B_1$; that is, without any special list-matching schemes. Here is a very practical method that reduces the computational effort asymptotically down to just two (or fewer) multiplies (mod n) for each outlying prime candidate. We have already argued above that if $q_n, q_{n+1}$ are consecutive primes, one can add some stored multiple $[\Delta_n]Q$ to any current calculation $[q_n]Q$ to get the next point $[q_{n+1}]Q$, and that this involves just one elliptic operation per prime $q_m$. Though that may be impressive, we recall that an elliptic operation is a handful, say, of multiplies (mod n). We can bring the complexity down simply, yet dramatically, as follows. If we know, for some prime r, the multiple $[r]Q = [X_r : Z_r]$ and we have
