Introduction

Determining if a number is prime is very important for many applications. While the main application is in cryptography, there are other applications which require prime numbers.

Probabilistic tests are most useful for large prime numbers. Small prime numbers can be tested quickly enough using a naive primality test. The Miller-Rabin primality test is one of the most popular probabilistic tests to determine if a number is composite or probably prime.

In this tip, code for a naive primality test as well as a Miller-Rabin primality test are provided.

The Naive Primality Test

The naive primality test works by testing if that number is divisible by any other number. There are 2 main optimizations which are used to speed up the primality tests:

1. Every even integer greater than 2 is not prime.

2. The maximum number you have to test goes up to sqrt(number).

/** * Uses a naive primality test * * @param number * @return true if prime, false otherwise */ public static boolean naivePrime(int number) { if (number <= 1 || (number & 1) == 0) { if (number == 2) { return true; } return false; } int limit = (int) (Math.sqrt(number) + 1); for (int i = 3; i < limit; i += 2) { if (number % i == 0) { return false; } } return true; }

Miller-Rabin Primality Test

The Miller-Rabin primality test works using some properties of advanced number theory which goes above a lot of other people's head (definitely above mine). If you want to learn how it works, seeWikipedia: Miller-Rabin Primality Test

Here's the code:

/** * Determines if a number is probably prime using the Miller-Rabin primality * test. * * @param number * @param iterations * How accurate the test needs to be. Accuracy ~= 1 - * O(4^-iterations) * @return false if definitely composite. true if probably prime. */ public static boolean millerRabinPrime(int number, int iterations) { if (number <= 1 || (number & 1) == 0) { if (number == 2) { return true; } // numbers less than or equal to 1 are not prime // even numbers are not prime return false; } else if (number == 3) { // 3 is prime return true; } // write number - 1 as 2^s * d, with d odd by factoring powers of 2 from // n-1 long s = 1; while ((number - 1 & 1 << s) == 0) { ++s; } long d = (number - 1) / (1 << s); // System.out.println("2^" + s + " * " + d); Random generator = new Random(); // if (iterations > number - 4) // { // iterations = number - 3; // } for (int i = 1; i <= iterations; ++i) { // pick a random integer a in the range [2, n-2] long a = generator.nextInt(number - 3) + 2; // test alternative: use an even a distribution // long a = (number - 3) * i / iterations; // compute x=a^d % number, check to see if x==1 or x==number-1 long x = safe_pow(a, d, number); if (x == 1 || x == number - 1) { continue; } boolean gotoLoop = false; for (int r = 1; r < s && !gotoLoop; ++r) { // x = x^2 % n x = x * x % number; if (x == 1) { return false; } else if (x == number - 1) { gotoLoop = true; break; } } if (!gotoLoop) { // definately composite return false; } } // probably prime return true; } /** * Computes base^power % mod. Protects against over-flow * * @param base * @param power * @param mod * @return */ public static long safe_pow(long base, long power, long mod) { if (power == 0) { return 1; } else if (power == 1) { return base; } else if ((power & 1) == 0) { // even power // base^power % mod = ((base * base % mod) ^ (power/2)) % mod return safe_pow((base * base % mod), power / 2, mod) % mod; } else { // odd // base^power % mod = ((base * base % mod) ^ (power/2) * base) % mod return safe_pow((base * base % mod), power / 2, mod) * base % mod; } }

Performance (accuracy/speed)

Here's a simple test rig for testing the accuracy and speed of the Miller-Rabin test vs. the naive test.

public static void main(String[] args) { long start = System.currentTimeMillis(); int startP = 0x0; int endP = 0x1FFFFF; for (int i = startP; i < endP; ++i) { naiveTest(i); } long end = System.currentTimeMillis(); System.out.println("Low-range Naive: " + (end - start)); System.out.println("Low-range Naive average: " + ((double) end - start) / (endP - startP)); start = System.currentTimeMillis(); for (int i = startP; i < endP; ++i) { millerRabinTest(i, 0x5); } end = System.currentTimeMillis(); System.out.println("Low-range miller-rabin: " + (end - start)); System.out.println("Low-range miller-rabin average: " + ((double) end - start) / (endP - startP)); for (int i = startP; i < endP; ++i) { boolean naive = naiveTest(i); boolean miller = millerRabinTest(i, 0x5); if (naive != miller) { System.out.println("inaccurate for " + i + " naive: " + naive + " miller: " + miller); } } start = System.currentTimeMillis(); startP = 0x7FF0FFFF; endP = 0x7FF2FFFF; for (int i = startP; i < endP; ++i) { naiveTest(i); } end = System.currentTimeMillis(); System.out.println("High-range Naive: " + (end - start)); System.out.println("High-range Naive average: " + ((double) end - start) / (endP - startP)); start = System.currentTimeMillis(); for (int i = startP; i < endP; ++i) { millerRabinTest(i, 0x5); } end = System.currentTimeMillis(); System.out.println("High-range miller-rabin: " + (end - start)); System.out.println("High-range miller-rabin average: " + ((double) end - start) / (endP - startP)); for (int i = startP; i < endP; ++i) { boolean naive = naiveTest(i); boolean miller = millerRabinTest(i, 0x5); if (naive != miller) { System.out.println("inaccurate for " + i + " naive: " + naive + " miller: " + miller); } } }

This tests numbers in the low range (0-2097151) and numbers in the high range (2146500607-2146631679)

Running the test, I got the following results (your results may vary, times are in milliseconds):

Low-range Naive: 573

Low-range Naive average: 2.732278219355688E-4

Low-range miller-rabin: 1487

Low-range miller-rabin average: 7.090571923528635E-4

High-range Naive: 1015

High-range Naive average: 0.00774383544921875

High-range miller-rabin: 142

High-range miller-rabin average: 0.0010833740234375

A summary of these results:

For small numbers, using the Naive test is faster (~3 times faster). Even so, the Miller-Rabin test was able to test small numbers quickly and efficiently. The interesting results are those for large numbers. The Miller-Rabin test was ~7 times faster for moderately large numbers. To put this into perspective:

Generally for cryptography, most algorithms use ~256 bits or more. Here do the limitations of how I implemented the algorithm, we only have 32 bits (technically 31 because Java doesn't let you have unsigned values). These performance gains will grow exponentially with with larger and larger numbers. Just for fun, I ran the Miller-Rabin test for the full range of all integers (0 to 2^31-1) and it took 34.44 minutes. I didn't want to run the same range using the Naive test, but using the above results I think it's reasonable to conclude that it would have taken ~4 hours to complete that test.

So how does the Miller-Rabin test fair on accuracy?

1. If the test marks a number composite, it is composite (no questions asked).

2. The theoretical chance that it will incorrectly guess that a number is prime is 4^(-k), where k is the number of iterations. It's difficult to derive a direct k-value from my code because I used a random number generator and didn't check for duplicates. However, for large numbers being tested, it's reasonable to conclude that the k-value is ~5 for the parameters I used. This results in a theoretical chance of incorrectly labeling something as prime at 0.09765625%. In practice, this chance is actually a lot smaller. I have been able to run this code over billions, perhaps even trillions of numbers and have yet to encounter the algorithm labeling the number as prime incorrectly.

Conclusion

In conclusion, I hope this code will be useful to someone (likely just as a reference). The major pit-fall of this code is that it doesn't support big integers. There is also one major speed improvement that can be made, and that's replacing the current integer power code with a FFT (Fast-Fourier Transform) algorithm. This last suggestion can speed up the algorithm by an order of magnitude.

The default Java BigInteger class also supports a probabilistic primality test (likely implemented using the Solovay-Strassen primality test). This test has the problem of having an accuracy of 2^-k. While still quite good, it's not nearly as good as the the Miller-Rabin test. Also, the Miller-Rabin test will never incorrectly label composite numbers while the Solovay-Strassen test has a chance of incorrectly labeling prime or composite numbers. I believe both tests have the same running time if comparing to my implementation. I'm not sure if the Solovay-Strassen could also use a FFT to improve it's performance, though I wouldn't doubt it.

Happy coding