Primality Testing in Polynomial Time (Ⅰ)
Fred Akalin
1. Introduction
Exactly ten years ago, Agrawal, Kayal, and Saxena published “PRIMES is in P”, which described an algorithm that could provably determine whether a given number was prime or composite in polynomial time.
The AKS algorithm is quite short, but understanding how it works via the proofs in the paper requires some mathematical sophistication. Also, some results in the last decade have simplified both the algorithm and its accompanying proofs. In this article I will explain in detail the main result of the AKS paper, and in a follow-up article I will strengthen the main result, use it to get a polynomial-time primality testing algorithm, and implement that algorithm in Javascript. If you've understood my introduction to primality testing, you should be able to follow along.
Let's get started! The basis for the AKS primality test is the following generalization of Fermat's little theorem to polynomials:
(Fermat's little theorem for polynomials, strong version.) If \(n \ge 2\) and \(a\) is relatively prime to \(n\), then \(n\) is prime if and only if \[ (X + a)^n \equiv X^n + a \pmod{n}\text{.} \]
The form of the equation above may be unfamiliar. The polynomials in question are formal polynomials. That is, we care only about the coefficients of the polynomial and not how it behaves as a function. In this case, we restrict ourselves to polynomials with integer coefficients. Then we can meaningfully compare two polynomials modulo \(n\): we consider two polynomials congruent modulo \(n\) if their respective coefficients are all congruent modulo \(n\). (Equivalently, two polynomials \(f(X)\) and \(g(X)\) are congruent modulo \(n\) if \(f(X) - g(X) = n \cdot h(X)\) for some polynomial \(h(X)\).) This definition is consistent with how they behave as functions; if two polynomials \(f(X)\) and \(g(X)\) are congruent modulo \(n\), then treating them as functions, \(f(x)\ \equiv g(x) \pmod{n}\) for any integer \(x\).[1]
Unfortunately, this test by itself cannot give a polynomial-time algorithm as testing even one value of \(a\) may require looking at \(n\) coefficients of the left-hand side. (Remember that we're interested in algorithms with time polynomial not in the input \(n\), but in its bit length \(\lg n\). Such an algorithm is described as having time polylog in \(n\).) However, we can reduce the number of coefficients we have to look at by taking the powers of \(X\) modulo some number \(r\). This is equivalent to taking the modulo of the polynomials themselves by \(X^r - 1\); you can see this for yourself by picking some polynomial and some value for \(r\) and doing long division by \(X^r - 1\) to find the remainder. (It may seem weird to talk about taking the modulo of one polynomial with another, but it's entirely analogous to integers.) This gives us a weaker version of the theorem above:
(Fermat's little theorem for polynomials, weak version.) If \(n\) is prime and \(a\) is not a multiple of \(n\), then for any \(r \ge 2\) \[ (X + a)^n \equiv X^n + a \pmod{X^r - 1, n}\text{.} \]
The “double mod” notation above may be unfamiliar, but in this case its meaning is simple. We consider two polynomials congruent modulo \(X^r - 1, n\) when they are congruent modulo \(n\) after you reduce the powers of \(X\) modulo \(r\) and combine like terms. More generally, two polynomials \(f(X)\) and \(g(X)\) are congruent modulo \(n(X), n\) if \(f(X) - g(X) \equiv n(X) \cdot h(X) \pmod{n}\) for some polynomial \(h(X)\).
With this theorem, we only have to compare \(r\) coefficients, but we introduce the possibility of the condition above being met even when \(n\) is composite. But can we impose conditions on \(r\) and \(a\) such that if the condition holds for a polynomial number of pairs of \(r\) and \(a\), we can be sure that \(n\) is prime? The answer is “yes”; in particular, we can find a single \(r\) and an upper bound \(M\) polylog in \(n\) such that if the condition holds for \(r\) and \(0 \le a \lt M\), then \(n\) is prime.
In the remainder of this article, we'll work backwards. That is, we'll first assume we have some \(n \ge 2\), \(r \ge 2\), and \(M \ge 1\) such that for all \(0 \le a \lt M\) \[ (X + a)^n \equiv X^n + a \pmod{X^r - 1, n}\text{.} \] Then we'll assume that \(n\) is not a power of one of its prime divisors \(p\) and try to deduce the conditions that imposes on \(n\), \(r\), \(M\), and \(p\). Then we can take the contrapositive to find the inverse conditions on \(n\), \(r\), \(M\), and \(p\) that would then force \(n\) to be a power of \(p\). Since we can easily test whether \(n\) is a perfect power, if it's not one, we can immediately conclude that \(n = p^1\) and thus prime. (Of course, if it does turn out to be a perfect power, then it is trivially composite.)
To understand the conditions that we will derive, we must first
talk about introspective numbers.
Given a base \(b\), a polynomial \(g(X)\) and a number \(q\), we
call \(q\) introspective[2] for \(g(X)\) modulo \(b\) if
\[
g(X)^q = g(X^q) \pmod{b}\text{.}
\] We also say that \(g(X)\) is introspective under \(q\)
modulo \(b\). A basic property of introspective numbers and polynomials is that
they are closed under multiplication. That is, if \(q_1\) and \(q_2\)
are introspective for \(g(X)\) modulo \(b\), then \(q_1 \cdot q_2\) is
also introspective for \(g(X)\) modulo \(b\), and if \(g_1(X)\) and
\(g_2(X)\) are introspective under \(q\) modulo \(b\), then \(g_1(X)
\cdot g_2(X)\) is also introspective under \(q\) modulo \(b\). In particular, given our assumptions above, we can easily see that
\(1\), \(p\), and \(n\) are introspective for \(X + a\) modulo \(p\)
for any \(0 \le a \lt M\). We can also show that \(n/p\) is also
introspective for \(X + a\) modulo \(p\). Using closure under
multiplication, we can talk about the set of numbers generated by
\(p\) and \(n/p\), which are all introspective for \(X + a\) modulo
\(p\). Call this set \(I\): We can also take the closure of all \(X + a\) to get a set of
polynomials which are all introspective under \(p\), \(n/p\), or any
number in \(I\). Call this set \(P\):
\[
P = \left\{ 0 \right\} \cup
\left\{ X^{e_0} \cdot (X + 1)^{e_1} \cdot \ldots \cdot (X + M -
1)^{e_{M - 1}} \mid e_0, e_1, \ldots, e_{M - 1} \ge 0 \right\}\text{.}
\]
To summarize, \(I\) is a set of numbers and \(P\) is a set of
polynomials such that for any \(i \in I\) and \(g(X) \in P\), \(i\) is
introspective for \(g(X)\) modulo \(p\). Of course, it's still not
clear what these two sets have to do with whether \(n\) is prime or
not. But we will examine certain finite sets related to \(I\) and
\(P\) and their sizes, and we will see that we can deduce their
properties depending on the relation of \(n\) to \(p\). Now we're ready to work towards finding our restrictions on \(n\),
\(r\), \(M\), and \(p\). We'll slowly build them up such that when
the last one falls into place, we know that \(n\) is a perfect power
of \(p\). Here's what we're starting with: Let us restrict \(I\) to a finite set by bounding the exponents of
\(p\) and \(n/p\):
\[
I_k = \left\{ p^i (n/p)^j \mid 0 \le i, j \lt k \right\} \subset I\text{.}
\] Notice that if \(n\) is not a power of \(p\), then all members of
\(I_k\) are distinct, and therefore we can easily calculate its
size:[3]
\[
|I_k| = k^2\text{.}
\] Let's also restrict \(P\) to a finite set by bounding the degrees
of its polynomials:
\[
P_d = \left\{ g \in P \mid \deg(g) \lt d \right\} \subset P\text{.}
\] We can calculate \(|P_d|\) exactly,[4] but
we only need a lower bound for when \(d \le M\). Consider \(P_d^{\{0,
1\}}\), the subset of \(P_d\) where each \(X + a\) is present at most
once. Since each \(X + a\) is either present or not present, but not
all of them can be present at the same time, there are \(2^d - 1\)
distinct polynomials in \(P_d^{\{0, 1\}}\). Adding back the zero
polynomial yields \(|P_d^{\{0, 1\}}| = 2^d\). Since \(P_d^{\{0,
1\}}\) is a subset of \(P_d\), \(|P_d| \ge |P_d^{\{0, 1\}}| = 2^d\).
Therefore, if \(d \le M\), then[5]
\[ |P_d| \ge 2^d\text{.} \]
This will be useful later (for a particular value of \(d\)), so let's
add the restriction to \(M\):
Let us restrict \(I\) in a different way, by reducing modulo \(r\):
\[
J = \left\{ x \bmod r \mid x \in I \right\}
\]
and let \(t = |J|\). (This size will play an important role
later.) Our final set that we're interested in needs some background to
define. We want to find a subset of \(P\) that lies in some field
\(F\) because fields have some convenient properties that we will use
later.[6] Consider \(\mathbb{Z}/p\mathbb{Z}\), the ring
of integers
modulo \(p\). Since \(p\) is prime, it is also a field. In
particular, it is
the finite
field \(\mathbb{F}_p\) of order \(p\). Then consider
\(\mathbb{F}_p[X]\),
its polynomial
ring, which is the set of polynomials with coefficients in
\(\mathbb{F}_p\). Given some polynomial \(q(X) \in \mathbb{F}_p[X]\),
we can further reduce modulo \(q(X)\) to get \(\mathbb{F}_p[X] /
q(X)\). Finally, if \(q(X)\) is
irreducible
over \(\mathbb{F}_p\), then \(\mathbb{F}_p[X] / q(X)\) is also a
field. (We can show that both \(\mathbb{F}_p = \mathbb{Z}/p\mathbb{Z}\)
and \(\mathbb{F}_p[X] / q(X)\) are fields from the same general
theorem of rings: if \(R\) is
a principal
ideal domain and \((c)\) is
the two-sided
ideal generated by \(c\), then
the quotient
ring \(R / (c)\) is a field if and only if \(c\) is
a prime
element of \(R\).)[7] So we just need to find a polynomial that's irreducible over
\(\mathbb{F}_p\). We know that \(X^r - 1\) has \(\Phi_r(X)\), the
\(r\)th cyclotomic
polynomial, as a factor. \(\Phi_r(X)\) is irreducible over
\(\mathbb{Z}\), but not necessarily over \(\mathbb{F}_p\). But if
\(r\) is relatively prime to \(p\), then \(\Phi_r(X)\) factors into
irreducible polynomials all of degree \(o_r(p)\)
(the multiplicative
order of \(p\) modulo \(r\)) over \(\mathbb{F}_p\).[8] Then we can
just require that \(r\) be relatively prime to \(p\). If we do so,
then we can let \(h(X)\) be one of the factors of \(\Phi_r(X)\) over
\(\mathbb{F}_p\) and we have our field \(F = \mathbb{F}_p[X] /
h(X)\). Finally, we can define our last set. Let
\[
Q = \left\{ f(X) \bmod (h(X), p) \mid f(X) \in P \right\} \subseteq F\text{.}
\] We can map elements of \(P\) into \(Q\) via reduction modulo
\((h(X), p)\). But we're interested in only the elements of \(P\)
that map to distinct elements of \(Q\), since that will let us find a
lower bound for \(|Q|\). A simple example would be the set of \(X +
a\) for \(0 \le a \lt M\); if the degree of \(h(X)\) is greater than
\(1\) and \(p \ge M\), then each \(X + a\) is distinct in \(Q\). Another interesting set is \(X^k\) for \(1 \le k \le r\). Since
\(h(X) \equiv 0 \pmod{h(X}, p)\), we can say that \(X\) is a root of
the polynomial function \(h(y)\) over the field \(F\). But since
\(h(y)\) is a factor of \(\Phi_r(y)\), \(X\) is then a primitive
\(r\)th root of unity in \(Q\).[9] But the powers of a primitive \(r\)th
root of unity (from \(1\) to \(r\)) are all distinct. Therefore all
\(X^k\) for \(1 \le k \le r\) are distinct in \(Q\). Most importantly, we can show that distinct elements in \(P_d\) map
to distinct elements in \(Q\) if \(d \le t\). Let \(f(X)\) and
\(g(X)\) be two different elements of \(P_d\). Assume that \(f(X)
\equiv g(X) \pmod{h(x}, p)\). Then, for \(m \in I\):
\[
f(X^m) \equiv f(X)^m \pmod{X^r - 1, p}
\]
and
\[
g(X^m) \equiv g(X)^m \pmod{X^r - 1, p}
\]
by introspection modulo \(p\), and therefore
\[
f(X^m) \equiv g(X^m) \pmod{X^r - 1, p}
\]
which immediately leads to
\[
f(X^m) \equiv g(X^m) \pmod{h(X}, p)\text{.}
\]
Therefore, all \(X^m\) for \(m \in I\) are roots of the polynomial
function \(u(y) = f(y) - g(y)\) over the field \(F\), and in
particular all \(X^m\) for \(m \in J\). But all \(t\) such \(X^m\)
are distinct in \(Q\) by the argument above. Therefore, \(u(y)\) must
have degree at least \(t\) since a polynomial over a field cannot have
more roots than its degree. But the degree of \(u(y)\) is less than
\(d\) since both \(f(y)\) and \(g(y)\) have degree less than \(d\).
Since \(d \le t\), this is a contradiction, so therefore \(f(X)
\neq g(X) \pmod{h(x}, p)\). But since \(f(X)\) and \(g(X)\)
were arbitrary, that implies that distinct elements of \(P_d\) map to
distinct elements of \(Q\) for \(d \le t\). Given the above, we can conclude that as long as we require that
\(d \le t\), \(p \ge M\), and \(o_r(p) = \deg(h(X)) \gt 1\), then
\[
|Q| \ge |P_d| \ge 2^d\text{.}
\] We're finally ready to put it all together. Again assume \(n\) is
not a power of \(p\), and recall that \(|J| = t\). Let \(s \gt
\sqrt{t}\). Then \(|I_s| = s^2 \gt t\). By
the pigeonhole
principle, there must be two elements \(m_1, m_2 \in I_s\) that
map to the same element in \(J\); that is, there must be \(m_1, m_2
\in I_s\) such that \(m_1 \equiv m_2 \pmod{r}\). Now pick some
\(g(X)\) from \(P\). Then
\[
g(X)^{m_1} \equiv g(X^{m_1}) \pmod{X^r - 1, p}
\]
and
\[
g(X)^{m_2} \equiv g(X^{m_2}) \pmod{X^r - 1, p}
\]
by introspection modulo \(p\). But \(X^{m_1} \equiv X^{m_2} \pmod{X^r - 1}\) since \(m_1 \equiv m_2 \pmod{r}\), so
\[
g(X^{m_1}) \equiv g(X^{m_2}) \pmod{X^r - 1, p}\text{.}
\]
Chaining all these congruences together lets us deduce that
\[
g(X)^{m_1} \equiv g(X)^{m_2} \pmod{X^r - 1, p}\text{,}
\]
which immediately leads to
\[
g(X)^{m_1} \equiv g(X)^{m_2} \pmod{h(X}, p)\text{.}
\]
That means that \(g(X) \bmod (h(X), p) \in Q\) is a root of the
polynomial function \(u(y) = y^{m_1} - y^{m_2}\) over the field \(F\).
But \(g(X)\) was picked arbitrarily from \(P\), so \(u(y)\) has at
least \(|Q|\) roots. \(\deg(u(y)) = \max(m_1, m_2) \le p^{s-1} \cdot
(n/p)^{s-1} = n^{s-1}\), and \(u(y)\), being a polynomial over a
field, cannot have more roots than its degree, so if \(n\) is not a
power of \(p\), then \(|Q| \le n^{s-1}\). Equivalently, if \(|Q| \gt
n^{s-1}\), then \(n\) must be a power of \(p\).[10] But
we've shown above that \(|Q| \ge 2^d\) for \(d \le t\), so if we can
pick \(d\) and \(s\) such that \(2^d \gt n^{s-1}\), then we can force
\(n\) to be a power of \(p\). Taking logs, we see that this is
equivalent to picking \(d\) and \(s\) such that \(d \gt (s - 1) \lg
n\). Since \(d \le t\), this imposes \(t \gt (s - 1) \lg n\) in order
for there to be room to pick \(d\). Rearranging, we get \(s \lt
\frac{t}{\lg n} + 1\). But \(s \gt \sqrt{t}\), so this imposes
\(\sqrt{t} \lt \frac{t}{\lg n} + 1\) in order for there to be room to
pick \(s\). Rearranging again, we get \(\frac{t}{\sqrt{t} - 1} \gt
\lg n\). Since \(\frac{t}{\sqrt{t} - 1} \gt \sqrt{t}\), it suffices
to require that \(t \gt \lg^2 n\) in order for there to be room to
pick \(d\) and \(s\). Furthermore, since \(s\) has to be an integer,
then \(s \ge \lfloor \sqrt{t} \rfloor + 1\), and therefore \(d \gt
\lfloor \sqrt{t} \rfloor \lg n\). Let's update our assumptions: So to summarize, if we make the above assumptions, we can pick
\(d\) and \(s\) such that \(|Q| \ge 2^d \gt n^{s - 1}\), which implies
that \(n\) must be a power of \(p\), which was our goal. Now we just
have to express all assumptions in terms of \(n\), \(r\), and \(M\),
strengthening them if necessary. \(J\) is generated by \(p\) and
\(n/p\), so its order (i.e., \(t\)) is at least \(o_r(p)\), which is
in turn at least \(o_r(n)\), since \(p\) is a prime factor of \(n\)
(this brings along the assumption that \(r\) and \(n\) are relatively
prime). Therefore, we can replace the assumptions \(t \gt \lg^2 n\)
and \(o_r(p) \gt 1\) with \(o_r(n) \gt \lg^2 n\). We can remove the
reference to \(d\) by finding the maximum value of \(t\). Since \(r\)
is relatively prime to \(n\), \(J\) is a subgroup of \(Z_r\), and
therefore its order divides (and therefore is at most) \(\varphi(r)\).
So we can replace \(M \ge d \gt \lfloor \sqrt{t} \rfloor \lg n\) with
\(M \gt \lfloor \sqrt{\varphi(r)} \rfloor \lg n\). Finally, we can
remove the reference to \(p\) by mandating that \(n\) has no prime
factor less than \(M\). Here are our final assumptions: We can summarize the above discussion in the following theorem:
(AKS theorem, weak version.) Let
\(n \ge 2\), \(r\) be relatively prime to \(n\) with \(o_r(n) \gt
\lg^2 n\), and \(M \gt \lfloor \sqrt{\varphi(r)} \rfloor \lg n\).
Furthermore, let \(n\) have no prime factor less than \(M\) and let
\[
(X + a)^n \equiv X^n + a \pmod{X^r - 1, n}
\]
for \(0 \le a \lt M\). Then \(n\) is the power of some prime \(p \ge
M\). And that's it for now! In the follow-up article we will strengthen
this theorem to further show that \(n\) is equal to \(p\), and
therefore prime. Then we will use this result to get a
primality-testing algorithm that we will prove to be polynomial
time. [1] We use uppercase letters for variables when we treat
polynomials as formal polynomials and lowercase letters when we
treat them as functions. ↩ [2] The term “introspection”, which comes
from the original AKS paper, was probably chosen to invoke the idea
that the exponent \(q\) can be pushed into and pulled out of \(g(X)\).
Here we generalize it a bit. ↩ [3] This condition is too weak to be useful by itself,
but we will parlay it into something we can use later.
↩ [4] Using the ideas
on this page,
we can show that \(|P_d| = {M + d \choose d - 1} + 1\) by
considering each \(X + a\) a labeled urn (plus a
“dummy” urn) and each unit of power an unlabeled
ball. (This was used in the AKS paper.)
↩ [5] This lower bound, as well as other ideas that simplify the
proof, was taken
from Prime
Numbers: A Computational Perspective.
↩ [6] You may first want to brush up on the definitions
of group,
ring,
and field,
and the differences between them.
↩ [7] This is Theorem 1.47(iv) from
“Introduction
to finite fields and their applications”.
↩ [8] The reducibility of \(\Phi_r(X)\) over
\(\mathbb{F}_p\) given \(r\) relatively prime to \(p\) is Theorem
2.47(ii) from
“Introduction
to finite fields and their applications”.
↩ [9] It's a bit weird to talk about a polynomial being
the root of other polynomials, but recall that we can form a
polynomial ring over any field, even a field of polynomials. We
keep track of which polynomials belong to which domains by using
different variables.
↩ [10] Here's where we force \(n\) to be a prime power.
↩2. Introspective numbers
3. Bounds on finite sets related to \(I\) and \(P\)
\(r \ge 2\),
\(M \ge 1\),
\(p\) is a prime divisor of \(n\).
\(r \ge 2\),
\(M \ge d\),
\(p\) is a prime divisor of \(n\).
\(r \ge 2\), \(r\) relatively prime to \(p\),
\(M \ge d\),
\(p\) is a prime divisor of \(n\).
\(o_r(p) \gt 1\),
\(M \ge d\),
\(t \ge d\),
\(p \ge M\), \(p\) is a prime divisor of \(n\).
4. The AKS theorem (weak version)
\(o_r(p) \gt 1\)
\(M \ge d \gt \lfloor \sqrt{t} \rfloor \lg n\),
\(t \gt \lg^2 n\),
\(p \ge M\), \(p\) is a prime divisor of \(n\).
\(o_r(n) \gt \lg^2 n\),
\(M \gt \lfloor \sqrt{\varphi(r)} \rfloor \lg n\).