Primality Testing in Polynomial Time (Ⅰ)

Fred Akalin

1. Introduction

Exactly ten years ago, Agrawal, Kayal, and Saxena published “PRIMES is in P”, which described an algorithm that could provably determine whether a given number was prime or composite in polynomial time.

The AKS algorithm is quite short, but understanding how it works via the proofs in the paper requires some mathematical sophistication. Also, some results in the last decade have simplified both the algorithm and its accompanying proofs. In this article I will explain in detail the main result of the AKS paper, and in a follow-up article I will strengthen the main result, use it to get a polynomial-time primality testing algorithm, and implement that algorithm in Javascript. If you've understood my introduction to primality testing, you should be able to follow along.

Let's get started! The basis for the AKS primality test is the following generalization of Fermat's little theorem to polynomials:

(Fermat's little theorem for polynomials, strong version.) If \(n \ge 2\) and \(a\) is relatively prime to \(n\), then \(n\) is prime if and only if \[ (X + a)^n \equiv X^n + a \pmod{n}\text{.} \]

The form of the equation above may be unfamiliar. The polynomials in question are formal polynomials. That is, we care only about the coefficients of the polynomial and not how it behaves as a function. In this case, we restrict ourselves to polynomials with integer coefficients. Then we can meaningfully compare two polynomials modulo \(n\): we consider two polynomials congruent modulo \(n\) if their respective coefficients are all congruent modulo \(n\). (Equivalently, two polynomials \(f(X)\) and \(g(X)\) are congruent modulo \(n\) if \(f(X) - g(X) = n \cdot h(X)\) for some polynomial \(h(X)\).) This definition is consistent with how they behave as functions; if two polynomials \(f(X)\) and \(g(X)\) are congruent modulo \(n\), then treating them as functions, \(f(x)\ \equiv g(x) \pmod{n}\) for any integer \(x\).[1]

Unfortunately, this test by itself cannot give a polynomial-time algorithm as testing even one value of \(a\) may require looking at \(n\) coefficients of the left-hand side. (Remember that we're interested in algorithms with time polynomial not in the input \(n\), but in its bit length \(\lg n\). Such an algorithm is described as having time polylog in \(n\).) However, we can reduce the number of coefficients we have to look at by taking the powers of \(X\) modulo some number \(r\). This is equivalent to taking the modulo of the polynomials themselves by \(X^r - 1\); you can see this for yourself by picking some polynomial and some value for \(r\) and doing long division by \(X^r - 1\) to find the remainder. (It may seem weird to talk about taking the modulo of one polynomial with another, but it's entirely analogous to integers.) This gives us a weaker version of the theorem above:

(Fermat's little theorem for polynomials, weak version.) If \(n\) is prime and \(a\) is not a multiple of \(n\), then for any \(r \ge 2\) \[ (X + a)^n \equiv X^n + a \pmod{X^r - 1, n}\text{.} \]

The “double mod” notation above may be unfamiliar, but in this case its meaning is simple. We consider two polynomials congruent modulo \(X^r - 1, n\) when they are congruent modulo \(n\) after you reduce the powers of \(X\) modulo \(r\) and combine like terms. More generally, two polynomials \(f(X)\) and \(g(X)\) are congruent modulo \(n(X), n\) if \(f(X) - g(X) \equiv n(X) \cdot h(X) \pmod{n}\) for some polynomial \(h(X)\).

With this theorem, we only have to compare \(r\) coefficients, but we introduce the possibility of the condition above being met even when \(n\) is composite. But can we impose conditions on \(r\) and \(a\) such that if the condition holds for a polynomial number of pairs of \(r\) and \(a\), we can be sure that \(n\) is prime? The answer is “yes”; in particular, we can find a single \(r\) and an upper bound \(M\) polylog in \(n\) such that if the condition holds for \(r\) and \(0 \le a \lt M\), then \(n\) is prime.

In the remainder of this article, we'll work backwards. That is, we'll first assume we have some \(n \ge 2\), \(r \ge 2\), and \(M \ge 1\) such that for all \(0 \le a \lt M\) \[ (X + a)^n \equiv X^n + a \pmod{X^r - 1, n}\text{.} \] Then we'll assume that \(n\) is not a power of one of its prime divisors \(p\) and try to deduce the conditions that imposes on \(n\), \(r\), \(M\), and \(p\). Then we can take the contrapositive to find the inverse conditions on \(n\), \(r\), \(M\), and \(p\) that would then force \(n\) to be a power of \(p\). Since we can easily test whether \(n\) is a perfect power, if it's not one, we can immediately conclude that \(n = p^1\) and thus prime. (Of course, if it does turn out to be a perfect power, then it is trivially composite.)

To understand the conditions that we will derive, we must first talk about introspective numbers.

2. Introspective numbers

Given a base \(b\), a polynomial \(g(X)\) and a number \(q\), we call \(q\) introspective[2] for \(g(X)\) modulo \(b\) if \[ g(X)^q = g(X^q) \pmod{b}\text{.} \]

We also say that \(g(X)\) is introspective under \(q\) modulo \(b\).

A basic property of introspective numbers and polynomials is that they are closed under multiplication. That is, if \(q_1\) and \(q_2\) are introspective for \(g(X)\) modulo \(b\), then \(q_1 \cdot q_2\) is also introspective for \(g(X)\) modulo \(b\), and if \(g_1(X)\) and \(g_2(X)\) are introspective under \(q\) modulo \(b\), then \(g_1(X) \cdot g_2(X)\) is also introspective under \(q\) modulo \(b\).

In particular, given our assumptions above, we can easily see that \(1\), \(p\), and \(n\) are introspective for \(X + a\) modulo \(p\) for any \(0 \le a \lt M\). We can also show that \(n/p\) is also introspective for \(X + a\) modulo \(p\). Using closure under multiplication, we can talk about the set of numbers generated by \(p\) and \(n/p\), which are all introspective for \(X + a\) modulo \(p\). Call this set \(I\):

\[ I = \left\{ p^i \left( n/p \right)^j \mid i, j \ge 0 \right\}\text{.} \]

We can also take the closure of all \(X + a\) to get a set of polynomials which are all introspective under \(p\), \(n/p\), or any number in \(I\). Call this set \(P\): \[ P = \left\{ 0 \right\} \cup \left\{ X^{e_0} \cdot (X + 1)^{e_1} \cdot \ldots \cdot (X + M - 1)^{e_{M - 1}} \mid e_0, e_1, \ldots, e_{M - 1} \ge 0 \right\}\text{.} \] To summarize, \(I\) is a set of numbers and \(P\) is a set of polynomials such that for any \(i \in I\) and \(g(X) \in P\), \(i\) is introspective for \(g(X)\) modulo \(p\). Of course, it's still not clear what these two sets have to do with whether \(n\) is prime or not. But we will examine certain finite sets related to \(I\) and \(P\) and their sizes, and we will see that we can deduce their properties depending on the relation of \(n\) to \(p\).

3. Bounds on finite sets related to \(I\) and \(P\)

Now we're ready to work towards finding our restrictions on \(n\), \(r\), \(M\), and \(p\). We'll slowly build them up such that when the last one falls into place, we know that \(n\) is a perfect power of \(p\). Here's what we're starting with:

\(n \ge 2\),
\(r \ge 2\),
\(M \ge 1\),
\(p\) is a prime divisor of \(n\).

Let us restrict \(I\) to a finite set by bounding the exponents of \(p\) and \(n/p\): \[ I_k = \left\{ p^i (n/p)^j \mid 0 \le i, j \lt k \right\} \subset I\text{.} \]

Notice that if \(n\) is not a power of \(p\), then all members of \(I_k\) are distinct, and therefore we can easily calculate its size:[3] \[ |I_k| = k^2\text{.} \]

Let's also restrict \(P\) to a finite set by bounding the degrees of its polynomials: \[ P_d = \left\{ g \in P \mid \deg(g) \lt d \right\} \subset P\text{.} \]

We can calculate \(|P_d|\) exactly,[4] but we only need a lower bound for when \(d \le M\). Consider \(P_d^{\{0, 1\}}\), the subset of \(P_d\) where each \(X + a\) is present at most once. Since each \(X + a\) is either present or not present, but not all of them can be present at the same time, there are \(2^d - 1\) distinct polynomials in \(P_d^{\{0, 1\}}\). Adding back the zero polynomial yields \(|P_d^{\{0, 1\}}| = 2^d\). Since \(P_d^{\{0, 1\}}\) is a subset of \(P_d\), \(|P_d| \ge |P_d^{\{0, 1\}}| = 2^d\). Therefore, if \(d \le M\), then[5] \[ |P_d| \ge 2^d\text{.} \] This will be useful later (for a particular value of \(d\)), so let's add the restriction to \(M\):

\(n \ge 2\),
\(r \ge 2\),
\(M \ge d\),
\(p\) is a prime divisor of \(n\).

Let us restrict \(I\) in a different way, by reducing modulo \(r\): \[ J = \left\{ x \bmod r \mid x \in I \right\} \] and let \(t = |J|\). (This size will play an important role later.)

Our final set that we're interested in needs some background to define. We want to find a subset of \(P\) that lies in some field \(F\) because fields have some convenient properties that we will use later.[6]

Consider \(\mathbb{Z}/p\mathbb{Z}\), the ring of integers modulo \(p\). Since \(p\) is prime, it is also a field. In particular, it is the finite field \(\mathbb{F}_p\) of order \(p\). Then consider \(\mathbb{F}_p[X]\), its polynomial ring, which is the set of polynomials with coefficients in \(\mathbb{F}_p\). Given some polynomial \(q(X) \in \mathbb{F}_p[X]\), we can further reduce modulo \(q(X)\) to get \(\mathbb{F}_p[X] / q(X)\). Finally, if \(q(X)\) is irreducible over \(\mathbb{F}_p\), then \(\mathbb{F}_p[X] / q(X)\) is also a field.

(We can show that both \(\mathbb{F}_p = \mathbb{Z}/p\mathbb{Z}\) and \(\mathbb{F}_p[X] / q(X)\) are fields from the same general theorem of rings: if \(R\) is a principal ideal domain and \((c)\) is the two-sided ideal generated by \(c\), then the quotient ring \(R / (c)\) is a field if and only if \(c\) is a prime element of \(R\).)[7]

So we just need to find a polynomial that's irreducible over \(\mathbb{F}_p\). We know that \(X^r - 1\) has \(\Phi_r(X)\), the \(r\)th cyclotomic polynomial, as a factor. \(\Phi_r(X)\) is irreducible over \(\mathbb{Z}\), but not necessarily over \(\mathbb{F}_p\). But if \(r\) is relatively prime to \(p\), then \(\Phi_r(X)\) factors into irreducible polynomials all of degree \(o_r(p)\) (the multiplicative order of \(p\) modulo \(r\)) over \(\mathbb{F}_p\).[8] Then we can just require that \(r\) be relatively prime to \(p\). If we do so, then we can let \(h(X)\) be one of the factors of \(\Phi_r(X)\) over \(\mathbb{F}_p\) and we have our field \(F = \mathbb{F}_p[X] / h(X)\).

\(n \ge 2\),
\(r \ge 2\), \(r\) relatively prime to \(p\),
\(M \ge d\),
\(p\) is a prime divisor of \(n\).

Finally, we can define our last set. Let \[ Q = \left\{ f(X) \bmod (h(X), p) \mid f(X) \in P \right\} \subseteq F\text{.} \]

We can map elements of \(P\) into \(Q\) via reduction modulo \((h(X), p)\). But we're interested in only the elements of \(P\) that map to distinct elements of \(Q\), since that will let us find a lower bound for \(|Q|\). A simple example would be the set of \(X + a\) for \(0 \le a \lt M\); if the degree of \(h(X)\) is greater than \(1\) and \(p \ge M\), then each \(X + a\) is distinct in \(Q\).

Another interesting set is \(X^k\) for \(1 \le k \le r\). Since \(h(X) \equiv 0 \pmod{h(X}, p)\), we can say that \(X\) is a root of the polynomial function \(h(y)\) over the field \(F\). But since \(h(y)\) is a factor of \(\Phi_r(y)\), \(X\) is then a primitive \(r\)th root of unity in \(Q\).[9] But the powers of a primitive \(r\)th root of unity (from \(1\) to \(r\)) are all distinct. Therefore all \(X^k\) for \(1 \le k \le r\) are distinct in \(Q\).

Most importantly, we can show that distinct elements in \(P_d\) map to distinct elements in \(Q\) if \(d \le t\). Let \(f(X)\) and \(g(X)\) be two different elements of \(P_d\). Assume that \(f(X) \equiv g(X) \pmod{h(x}, p)\). Then, for \(m \in I\): \[ f(X^m) \equiv f(X)^m \pmod{X^r - 1, p} \] and \[ g(X^m) \equiv g(X)^m \pmod{X^r - 1, p} \] by introspection modulo \(p\), and therefore \[ f(X^m) \equiv g(X^m) \pmod{X^r - 1, p} \] which immediately leads to \[ f(X^m) \equiv g(X^m) \pmod{h(X}, p)\text{.} \] Therefore, all \(X^m\) for \(m \in I\) are roots of the polynomial function \(u(y) = f(y) - g(y)\) over the field \(F\), and in particular all \(X^m\) for \(m \in J\). But all \(t\) such \(X^m\) are distinct in \(Q\) by the argument above. Therefore, \(u(y)\) must have degree at least \(t\) since a polynomial over a field cannot have more roots than its degree. But the degree of \(u(y)\) is less than \(d\) since both \(f(y)\) and \(g(y)\) have degree less than \(d\). Since \(d \le t\), this is a contradiction, so therefore \(f(X) \neq g(X) \pmod{h(x}, p)\). But since \(f(X)\) and \(g(X)\) were arbitrary, that implies that distinct elements of \(P_d\) map to distinct elements of \(Q\) for \(d \le t\).

Given the above, we can conclude that as long as we require that \(d \le t\), \(p \ge M\), and \(o_r(p) = \deg(h(X)) \gt 1\), then \[ |Q| \ge |P_d| \ge 2^d\text{.} \]

\(n \ge 2\),
\(o_r(p) \gt 1\),
\(M \ge d\),
\(t \ge d\),
\(p \ge M\), \(p\) is a prime divisor of \(n\).

4. The AKS theorem (weak version)

We're finally ready to put it all together. Again assume \(n\) is not a power of \(p\), and recall that \(|J| = t\). Let \(s \gt \sqrt{t}\). Then \(|I_s| = s^2 \gt t\). By the pigeonhole principle, there must be two elements \(m_1, m_2 \in I_s\) that map to the same element in \(J\); that is, there must be \(m_1, m_2 \in I_s\) such that \(m_1 \equiv m_2 \pmod{r}\). Now pick some \(g(X)\) from \(P\). Then \[ g(X)^{m_1} \equiv g(X^{m_1}) \pmod{X^r - 1, p} \] and \[ g(X)^{m_2} \equiv g(X^{m_2}) \pmod{X^r - 1, p} \] by introspection modulo \(p\). But \(X^{m_1} \equiv X^{m_2} \pmod{X^r - 1}\) since \(m_1 \equiv m_2 \pmod{r}\), so \[ g(X^{m_1}) \equiv g(X^{m_2}) \pmod{X^r - 1, p}\text{.} \] Chaining all these congruences together lets us deduce that \[ g(X)^{m_1} \equiv g(X)^{m_2} \pmod{X^r - 1, p}\text{,} \] which immediately leads to \[ g(X)^{m_1} \equiv g(X)^{m_2} \pmod{h(X}, p)\text{.} \]

That means that \(g(X) \bmod (h(X), p) \in Q\) is a root of the polynomial function \(u(y) = y^{m_1} - y^{m_2}\) over the field \(F\). But \(g(X)\) was picked arbitrarily from \(P\), so \(u(y)\) has at least \(|Q|\) roots. \(\deg(u(y)) = \max(m_1, m_2) \le p^{s-1} \cdot (n/p)^{s-1} = n^{s-1}\), and \(u(y)\), being a polynomial over a field, cannot have more roots than its degree, so if \(n\) is not a power of \(p\), then \(|Q| \le n^{s-1}\). Equivalently, if \(|Q| \gt n^{s-1}\), then \(n\) must be a power of \(p\).[10] But we've shown above that \(|Q| \ge 2^d\) for \(d \le t\), so if we can pick \(d\) and \(s\) such that \(2^d \gt n^{s-1}\), then we can force \(n\) to be a power of \(p\). Taking logs, we see that this is equivalent to picking \(d\) and \(s\) such that \(d \gt (s - 1) \lg n\). Since \(d \le t\), this imposes \(t \gt (s - 1) \lg n\) in order for there to be room to pick \(d\). Rearranging, we get \(s \lt \frac{t}{\lg n} + 1\). But \(s \gt \sqrt{t}\), so this imposes \(\sqrt{t} \lt \frac{t}{\lg n} + 1\) in order for there to be room to pick \(s\). Rearranging again, we get \(\frac{t}{\sqrt{t} - 1} \gt \lg n\). Since \(\frac{t}{\sqrt{t} - 1} \gt \sqrt{t}\), it suffices to require that \(t \gt \lg^2 n\) in order for there to be room to pick \(d\) and \(s\). Furthermore, since \(s\) has to be an integer, then \(s \ge \lfloor \sqrt{t} \rfloor + 1\), and therefore \(d \gt \lfloor \sqrt{t} \rfloor \lg n\). Let's update our assumptions:

\(n \ge 2\),
\(o_r(p) \gt 1\)
\(M \ge d \gt \lfloor \sqrt{t} \rfloor \lg n\),
\(t \gt \lg^2 n\),
\(p \ge M\), \(p\) is a prime divisor of \(n\).

So to summarize, if we make the above assumptions, we can pick \(d\) and \(s\) such that \(|Q| \ge 2^d \gt n^{s - 1}\), which implies that \(n\) must be a power of \(p\), which was our goal. Now we just have to express all assumptions in terms of \(n\), \(r\), and \(M\), strengthening them if necessary. \(J\) is generated by \(p\) and \(n/p\), so its order (i.e., \(t\)) is at least \(o_r(p)\), which is in turn at least \(o_r(n)\), since \(p\) is a prime factor of \(n\) (this brings along the assumption that \(r\) and \(n\) are relatively prime). Therefore, we can replace the assumptions \(t \gt \lg^2 n\) and \(o_r(p) \gt 1\) with \(o_r(n) \gt \lg^2 n\). We can remove the reference to \(d\) by finding the maximum value of \(t\). Since \(r\) is relatively prime to \(n\), \(J\) is a subgroup of \(Z_r\), and therefore its order divides (and therefore is at most) \(\varphi(r)\). So we can replace \(M \ge d \gt \lfloor \sqrt{t} \rfloor \lg n\) with \(M \gt \lfloor \sqrt{\varphi(r)} \rfloor \lg n\). Finally, we can remove the reference to \(p\) by mandating that \(n\) has no prime factor less than \(M\). Here are our final assumptions:

\(n \ge 2\), \(n\) has no prime factors less than \(M\),
\(o_r(n) \gt \lg^2 n\),
\(M \gt \lfloor \sqrt{\varphi(r)} \rfloor \lg n\).

We can summarize the above discussion in the following theorem:

(AKS theorem, weak version.) Let \(n \ge 2\), \(r\) be relatively prime to \(n\) with \(o_r(n) \gt \lg^2 n\), and \(M \gt \lfloor \sqrt{\varphi(r)} \rfloor \lg n\). Furthermore, let \(n\) have no prime factor less than \(M\) and let \[ (X + a)^n \equiv X^n + a \pmod{X^r - 1, n} \] for \(0 \le a \lt M\). Then \(n\) is the power of some prime \(p \ge M\).

And that's it for now! In the follow-up article we will strengthen this theorem to further show that \(n\) is equal to \(p\), and therefore prime. Then we will use this result to get a primality-testing algorithm that we will prove to be polynomial time.

[1] We use uppercase letters for variables when we treat polynomials as formal polynomials and lowercase letters when we treat them as functions.

[2] The term “introspection”, which comes from the original AKS paper, was probably chosen to invoke the idea that the exponent \(q\) can be pushed into and pulled out of \(g(X)\). Here we generalize it a bit.

[3] This condition is too weak to be useful by itself, but we will parlay it into something we can use later.

[4] Using the ideas on this page, we can show that \(|P_d| = {M + d \choose d - 1} + 1\) by considering each \(X + a\) a labeled urn (plus a “dummy” urn) and each unit of power an unlabeled ball. (This was used in the AKS paper.)

[5] This lower bound, as well as other ideas that simplify the proof, was taken from Prime Numbers: A Computational Perspective.

[6] You may first want to brush up on the definitions of group, ring, and field, and the differences between them.

[7] This is Theorem 1.47(iv) from “Introduction to finite fields and their applications”.

[8] The reducibility of \(\Phi_r(X)\) over \(\mathbb{F}_p\) given \(r\) relatively prime to \(p\) is Theorem 2.47(ii) from “Introduction to finite fields and their applications”.

[9] It's a bit weird to talk about a polynomial being the root of other polynomials, but recall that we can form a polynomial ring over any field, even a field of polynomials. We keep track of which polynomials belong to which domains by using different variables.

[10] Here's where we force \(n\) to be a prime power.