# Computing the Integer Square Root

## Fred Akalin

## 1. The algorithm

Today I’m going to talk about a fast algorithm to compute
the *integer
square root* of a non-negative integer \(n\),
\(\isqrt(n) = \lfloor \sqrt{n} \rfloor\), or in words,
the greatest integer whose square is less than or equal to \(n\).^{[1]} Most
sources that describe the algorithm take it for granted that it is
correct and fast. This is far from obvious! So I will prove both
correctness and speed below.

One simple fact is that \(\isqrt(n) \le n/2\) for \(n \ge 2\), so a
straightforward algorithm is just to test every non-negative integer
up to \(\max(1, n/2)\). This takes \(O(n)\) arithmetic operations, which is bad
since it’s exponential in the *size* of the input. That
is, letting \(\Bits(n)\) be the number of bits required
to store \(n\) and letting \(\lg n\) be the base-\(2\) logarithm of
\(n\), \(\Bits(n) = O(\lg n)\), and thus this algorithm
takes \(O(2^{\Bits(n)})\) arithmetic operations.

We can do better by doing binary search; start with the range \([0, n/2]\) and adjust it based on comparing the square of an integer in the middle of the range to \(n\). This takes \(O(\lg n) = O(\Bits(n))\) arithmetic operations.
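As a sketch, this binary search can be written directly with JavaScript's built-in `BigInt` (unlike the `BigInteger`-based implementation below, this needs no library; `isqrtBinary` is a name chosen here for illustration):

```javascript
// isqrtBinary returns the greatest x such that x^2 <= n, by binary
// search over [0, max(1, n/2)]. n must be a non-negative BigInt.
function isqrtBinary(n) {
  if (n < 0n) {
    throw new Error('negative radicand');
  }
  if (n < 2n) {
    return n;
  }
  var lo = 0n;
  var hi = n / 2n;
  while (lo < hi) {
    // Bias the midpoint up so the loop always makes progress.
    var mid = (lo + hi + 1n) / 2n;
    if (mid * mid <= n) {
      lo = mid;
    } else {
      hi = mid - 1n;
    }
  }
  return lo;
}
```

Each iteration halves the range \([lo, hi]\), so the loop runs \(O(\lg n)\) times.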

But it turns out we can do even better, using a variant of Newton’s method that uses only integer operations:^{[2]}

1. If \(n = 0\), return \(0\).
2. Otherwise, set \(i\) to \(0\) and set \(x_0\) to \(2^{\lceil \Bits(n) / 2\rceil}\).
3. Repeat:
    1. Set \(x_{i+1}\) to \(\lfloor (x_i + \lfloor n/x_i \rfloor) / 2 \rfloor\).
    2. If \(x_{i+1} \ge x_i\), return \(x_i\). Otherwise, increment \(i\).

Here is an implementation in JavaScript:^{[3]}

```
// isqrt returns the greatest number x such that x^2 <= n. The type of
// n must behave like BigInteger (e.g.,
// https://github.com/akalin/jsbn ), and n must be non-negative.
//
// Example (open up the JS console on this page and type):
//
//   isqrt(new BigInteger("64")).toString()
function isqrt(n) {
  var s = n.signum();
  if (s < 0) {
    throw new Error('negative radicand');
  }
  if (s == 0) {
    return n;
  }
  // x = 2^ceil(Bits(n)/2)
  var x = n.constructor.ONE.shiftLeft(Math.ceil(n.bitLength()/2));
  while (true) {
    // y = floor((x + floor(n/x))/2)
    var y = x.add(n.divide(x)).shiftRight(1);
    if (y.compareTo(x) >= 0) {
      return x;
    }
    x = y;
  }
}
```

## 2. Correctness

The core of the algorithm is the iteration rule: \[ x_{i+1} = \left\lfloor \frac{x_i + \lfloor \frac{n}{x_i} \rfloor}{2} \right\rfloor \] where the floor functions appear only because we’re using integer division. Define the integer-valued function \(f(x)\) to be the right-hand side. Using basic properties of the floor function, you can show that the inner floor can be removed: \[ f(x) = \left\lfloor \frac{1}{2} (x + n/x) \right\rfloor\text{,} \] which makes \(f\) a bit easier to analyze. Furthermore, the properties of \(f(x)\) are closely related to those of the corresponding real-valued function: \[ g(x) = \frac{1}{2} (x + n/x)\text{.} \]
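One way to convince yourself that removing the inner floor is harmless: since \(g(x) = (x^2 + n)/(2x)\) exactly, \(\lfloor g(x) \rfloor\) can be computed with pure integer arithmetic and compared against the double-floored rule. A quick `BigInt` sketch (function names made up for illustration):

```javascript
// fInner is the iteration rule as implemented, with the inner floor:
// floor((x + floor(n/x))/2). fExact is floor(g(x)), computed exactly
// as floor((x^2 + n)/(2x)). BigInt division truncates, which equals
// the floor for non-negative operands.
function fInner(n, x) {
  return (x + n / x) / 2n;
}
function fExact(n, x) {
  return (x * x + n) / (2n * x);
}

// Spot-check the identity over a small range.
for (var n = 1n; n <= 200n; n++) {
  for (var x = 1n; x <= n; x++) {
    if (fInner(n, x) !== fExact(n, x)) {
      throw new Error('floors disagree at n=' + n + ', x=' + x);
    }
  }
}
```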

For starters, again using basic properties of the floor function, you can show that \(f(x) \le g(x)\), and for any integer \(m\), \(m \le f(x)\) if and only if \(m \le g(x)\).

Finally, let’s give a name to our desired output: let \(s =
\isqrt(n) = \lfloor \sqrt{n} \rfloor\).^{[4]}

**Lemma 1.** For every positive integer \(x\), \(f(x) \ge s\).

Proof. By the basic properties of \(f(x)\) and \(g(x)\) above, it suffices to show that \(g(x) \ge s\). \(g'(x) = (1 - n/x^2)/2\) and \(g''(x) = n/x^3\). Therefore, \(g(x)\) is concave-up for \(x \gt 0\); in particular, its single positive extremum at \(x = \sqrt{n}\) is a minimum. But \(g(\sqrt{n}) = \sqrt{n} \ge s\). ∎

**Lemma 2.** \(x_0 \ge s\).

Proof. \(\Bits(n) = \lfloor \lg n \rfloor + 1 \gt \lg n\). Therefore, \[ \begin{aligned} x_0 &= 2^{\lceil \Bits(n) / 2 \rceil} \\ &\ge 2^{\Bits(n) / 2} \\ &\gt 2^{\lg n / 2} \\ &= \sqrt{n} \\ &\ge s\text{.} \; \blacksquare \end{aligned} \]

(Note that any number greater than \(s\), say \(n\) or \(\lceil n/2 \rceil\), can be chosen for our initial guess without affecting correctness. However, the expression above is necessary to guarantee performance. Another possibility is \(2^{\lceil \lceil \lg n \rceil / 2 \rceil}\), which has the advantage that if \(n\) is an even power of \(2\), then \(x_0\) is immediately set to \(\sqrt{n}\). However, this is usually not worth the cost of checking that \(n\) is a power of \(2\), as is required to compute \(\lceil \lg n \rceil\).)

**Theorem 1.** If \(\NewtonSqrt\) terminates, it returns \(s\).

Proof. Assume it terminates. If it terminates in step \(1\), then \(n = 0\) and it correctly returns \(0\). Otherwise, it can only terminate in step \(3.2\), where it returns \(x_i\) such that \(x_{i+1} = f(x_i) \ge x_i\). This implies that \(g(x_i) = (x_i + n/x_i) / 2 \ge x_i\). Rearranging yields \(n \ge x_i^2\), and combining with the invariant \(x_i \ge s\) (which holds by Lemmas 1 and 2) we get \(\sqrt{n} \ge x_i \ge s\). But \(s + 1 \gt \sqrt{n}\), so that forces \(x_i\) to be \(s\), and thus \(\NewtonSqrt\) returns \(s\) if it terminates. ∎

**Theorem 2.** \(\NewtonSqrt\) terminates.

Proof. Assume it doesn’t terminate. Then the loop condition guarantees \(x_{i+1} \lt x_i\) for every \(i\), so we have a strictly decreasing infinite sequence of integers \(\{ x_0, x_1, \dotsc \}\). But this sequence is bounded below by \(s\), so it cannot decrease indefinitely. This is a contradiction, so \(\NewtonSqrt\) must terminate. ∎

We are done proving correctness, but you might wonder if the check \(x_{i+1} \ge x_i\) in step \(3.2\) is necessary. That is, can it be weakened to the check \(x_{i+1} = x_i\)? The answer is “no”; to see that, let \(k = n - s^2\). Since \(n \lt (s+1)^2\), \(k \lt 2s + 1\). On the other hand, consider the inequality \(f(x_i) \gt x_i\). Since that would cause the algorithm to terminate and return \(x_i\), that implies that \(x_i = s\). Therefore, that inequality is equivalent to \(f(s) \gt s\), which is equivalent to \(f(s) \ge s + 1\), which is equivalent to \(g(s) = (s + n/s) / 2 \ge s + 1\). Rearranging yields \(n \ge s^2 + 2s\). Substituting in \(n = s^2 + k\), we get \(s^2 + k \ge s^2 + 2s\), which is equivalent to \(k \ge 2s\). But since \(k \lt 2s + 1\), that forces \(k\) to equal \(2s\). That is the maximum value \(k\) can be, so therefore \(n\) must be one less than a perfect square. Indeed, for such numbers, weakening the check would cause the algorithm to oscillate between \(s\) and \(s + 1\). For example, \(n = 99\) would yield the sequence \(\{ 16, 11, 10, 9, 10, 9, \dotsc \}\).
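You can watch this oscillation directly by running the iteration rule with no termination check at all; here's a small `BigInt` sketch (`iterate` is a name chosen for illustration):

```javascript
// iterate applies x -> floor((x + floor(n/x))/2) to the starting
// guess 2^ceil(Bits(n)/2) the given number of times, with no
// termination check, and returns the whole sequence.
function iterate(n, count) {
  var bits = BigInt(n.toString(2).length);  // Bits(n)
  var x = 1n << ((bits + 1n) / 2n);         // 2^ceil(Bits(n)/2)
  var seq = [x];
  for (var i = 0; i < count; i++) {
    x = (x + n / x) / 2n;
    seq.push(x);
  }
  return seq;
}
```

`iterate(99n, 5)` produces the sequence \(16, 11, 10, 9, 10, 9\) from above.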

## 3. Run-time

We will show that \(\NewtonSqrt\) takes \(O(\lg \lg n)\) arithmetic operations. Since each loop iteration does only a fixed number of arithmetic operations (with the division of \(n\) by \(x\) being the most expensive), it suffices to show that our algorithm performs \(O(\lg \lg n)\) loop iterations.

It is well known that Newton’s method converges quadratically sufficiently close to a simple root. We can’t actually use this result directly, since it’s not clear that the convergence properties of Newton’s method are preserved when using integer operations, but we can do something similar.

Define \(\Err(x) = x^2/n - 1\) and let \(ϵ_i = \Err(x_i)\). Intuitively, \(\Err(x)\) is a conveniently-scaled measure of the error of \(x\): it is less than \(1\) for most of the values we care about, and it is bounded below for integers greater than our target \(s\). We will also show that the \(ϵ_i\) shrink quadratically. These facts will then let us show our bound on the iteration count.

**Lemma 3.** If \(x_i \ge s + 1\), then \(ϵ_i \ge 1/n\).

Proof. \(n \lt (s + 1)^2\), so \(n + 1 \le (s + 1)^2\), and therefore \((s + 1)^2/n - 1 \ge 1/n\). But the expression on the left side is just \(\Err(s + 1)\). Since \(\Err(x)\) is increasing for \(x \gt 0\), \(x_i \ge s + 1\) if and only if \(ϵ_i \ge \Err(s + 1)\), so the result immediately follows. ∎

**Lemma 4.** If \(x_i \ge s + 1\), then \(ϵ_{i+1} \lt (ϵ_i/2)^2\).

Proof. \(ϵ_{i+1}\) is just \(\Err(f(x_i)) \le \Err(g(x_i))\), so it suffices to show that \(\Err(g(x_i)) \lt (ϵ_i/2)^2\). Inverting \(\Err(x)\), we get \(x_i = \sqrt{(ϵ_i + 1) \cdot n}\). Expressing \(g(x_i)\) in terms of \(ϵ_i\), we get \[ g(x_i) = \frac{\sqrt{n}}{2} \left( \frac{ϵ_i + 2}{\sqrt{ϵ_i + 1}} \right) \] and \[ \Err(g(x_i)) = \frac{(ϵ_i/2)^2}{ϵ_i+1}\text{.} \] Therefore, it suffices to show that the denominator \(ϵ_i + 1\) is greater than \(1\). But \(x_i \ge s + 1\) implies \(ϵ_i \gt 0\) by Lemma 3, so that follows immediately and the result is proved. ∎

**Lemma 5.** \(ϵ_1 \le 1\).

Proof. Let’s start with \(x_0\): \[ \begin{aligned} x_0 &= 2^{\lceil \Bits(n) / 2 \rceil} \\ &= 2^{\lfloor (\lfloor \lg n \rfloor + 1 + 1)/2 \rfloor} \\ &= 2^{\lfloor \lg n / 2 \rfloor + 1} \\ &= 2 \cdot 2^{\lfloor \lg n / 2 \rfloor}\text{.} \end{aligned} \] Then \(x_0/2 = 2^{\lfloor \lg n / 2 \rfloor} \le 2^{\lg n / 2} = \sqrt{n}\). Since \(x_0/2\) is an integer, \(x_0/2 \le \sqrt{n}\) if and only if \(x_0/2 \le \lfloor \sqrt{n} \rfloor = s\). Therefore, \(x_0 \le 2s\).

As for \(ϵ_0\): \[ \begin{aligned} ϵ_0 &= \Err(x_0) \\ &\le \Err(2s) \\ &= (2s)^2/n - 1 \\ &= 4s^2/n - 1\text{.} \end{aligned} \] Since \(s^2 \le n\), \(4s^2/n \le 4\) and thus \(ϵ_0 \le 3\).

Finally, \(ϵ_1\) is just \(\Err(f(x_0))\). Using calculations from Lemma 4, \[ \begin{aligned} ϵ_1 &\le \Err(g(x_0)) \\ &= (ϵ_0/2)^2/(ϵ_0 + 1) \\ &\le (3/2)^2/(3 + 1) \\ &= 9/16\text{.} \end{aligned} \] Therefore, \(ϵ_1 \le 1\). ∎

**Theorem 3.** \(\NewtonSqrt\) performs \(O(\lg \lg n)\) loop iterations.

Proof. Let \(k\) be the number of loop iterations performed when running the algorithm for \(n\) (i.e., \(x_k \ge x_{k-1}\)) and assume \(k \ge 4\). Then \(x_i \ge s + 1\) for \(i \lt k - 1\). Since \(ϵ_1 \le 1\) by Lemma 5, \(ϵ_2 \le 1/2\) and \(ϵ_i \le (ϵ_2)^{2^{i-2}}\) for \(2 \le i \lt k - 1\) by Lemma 4, so \(ϵ_{k-2} \le 2^{-2^{k-4}}\). But \(1/n \le ϵ_{k-2}\) by Lemma 3, so \(1/n \le 2^{-2^{k-4}}\). Taking logs twice to bring down the \(k\) yields \(k - 4 \le \lg \lg n\). Then \(k \le \lg \lg n + 4\), and thus \(k = O(\lg \lg n)\). ∎
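The bound \(k \le \lg \lg n + 4\) is easy to check empirically. Here's a `BigInt` sketch that counts loop iterations (`countIterations` is a name made up for illustration):

```javascript
// countIterations runs the loop of NewtonSqrt on a positive BigInt n
// and returns the number of loop iterations performed.
function countIterations(n) {
  var bits = BigInt(n.toString(2).length);  // Bits(n)
  var x = 1n << ((bits + 1n) / 2n);         // 2^ceil(Bits(n)/2)
  var k = 0;
  while (true) {
    var y = (x + n / x) / 2n;
    k++;
    if (y >= x) {
      return k;
    }
    x = y;
  }
}
```

For a \(1024\)-bit \(n\), \(\lg \lg n \approx 10\), so the bound guarantees at most \(14\) iterations.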

Since each loop iteration is dominated by the division of \(n\) by \(x_i\), \(\NewtonSqrt\) therefore runs in \(O(D(n) \lg \lg n)\) time, where \(D(n)\) is the cost of dividing \(n\) by another integer.^{[5]}

## 4. The Initial Guess

It’s also useful to show that if the initial guess \(x_0\) is bad, then the run-time degrades to \(Θ(\lg n)\). To do this, consider a variant of \(\NewtonSqrt\) that takes a function \(\mathrm{I{\small NITIAL}\text{-}G{\small UESS}}\), calls it with \(n\), and assigns the result to \(x_0\) in step \(2\). Then we can treat \(ϵ_0\) as a function of \(n\) and analyze how long \(ϵ_i\) stays above \(1\). If the initial guess is such that \(ϵ_0(n) = Θ(1)\), then the analysis above still applies, and the algorithm performs \(O(\lg \lg n)\) iterations. However, if \(x_0\) is chosen to be \(Θ(n)\), e.g. the initial guess is just \(n\) or \(n/k\) for some constant \(k\), then \(ϵ_0(n)\) will also be \(Θ(n)\); the iteration then roughly halves \(x_i\) each step until \(ϵ_i\) drops below \(1\), so the run time degrades to \(Θ(\lg n)\). So having a good initial guess is important for the performance of \(\NewtonSqrt\)!
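To see the degradation concretely, here's a `BigInt` sketch that runs the iteration from an arbitrary initial guess (`countIterationsFrom` is a name made up for illustration):

```javascript
// countIterationsFrom runs the Newton iteration on n starting from an
// arbitrary initial guess x0 (which must be >= isqrt(n)) and returns
// the number of loop iterations performed.
function countIterationsFrom(n, x0) {
  var x = x0;
  var k = 0;
  while (true) {
    var y = (x + n / x) / 2n;
    k++;
    if (y >= x) {
      return k;
    }
    x = y;
  }
}

var n = (1n << 256n) - 1n;
var good = countIterationsFrom(n, 1n << 128n);  // 2^ceil(Bits(n)/2)
var bad = countIterationsFrom(n, n);            // initial guess Θ(n)
```

For this \(256\)-bit \(n\), the good guess finishes in a handful of iterations, while starting from \(n\) itself takes on the order of \(\lg n\) iterations, since each early step roughly halves \(x_i\).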


## Footnotes

[1] Aside from the Wikipedia article, the algorithm is described as Algorithm 9.2.11 in Prime Numbers: A Computational Perspective. ↩

[2] Note that only integer operations are used, which makes this algorithm suitable for arbitrary-precision integers. ↩

[3] Go and JS implementations are available on my GitHub. ↩

[4] Here, and in most of the article, we’ll implicitly assume that \(n \gt 0\). ↩

[5] \(D(n)\) is \(Θ(\lg^2 n)\) using long division, but fancier division algorithms have better run-times. ↩