## Monday, March 26, 2012

### Eratosthenes Prime Sieve: a Java Implementation

One of the classical problems in the Sphere Online Judge consists in generating prime numbers. Such a problem can be solved using a prime sieve, a technique that allows finding new prime numbers by discarding all the multiples of known prime numbers. For example, we know that 2 is a prime, so we would automatically discard all even numbers, then we would go ahead and discard the multiples of 3, 5, 7 and so on. Eratosthenes came up with this algorithm roughly 2250 years ago (Wikipedia provides a good description of the sieve).
I wrote down a very straightforward implementation of Eratosthenes sieve in java. Since the algorithm can be used to find a considerable amount of prime numbers, usually it's implemented so that the amount of memory used is as small as possible. A single bit of information is used to store whether or not a number is prime. Thus each byte contains information about 8 numbers. Discarding all even numbers beforehand cuts in half the number of bits needed. Other techniques can be used to reduce the amount of memory used, but I decided to stop here for simplicity. Here's my source code:
`````` public class PrimeSieve {
private final byte sieve[];
/**
* Creates a sieve of integers up to n.
*
* @param n
*/
public PrimeSieve(int n) {
// using one bit per number, skipping even numbers
int sieveSize = n / 16 + 1;
// round up to the next multiple of 16
n = sieveSize * 16;
System.out.println("Sieving numbers up to " + n);
// initialize the array of bytes. Each bit corresponds to an odd integer
// between 1 and n, starting with the rightmost bit of the first byte.
// If the bit is 0, the number is prime. Initially, all numbers are
// assumed to be prime and some will be sifted out.
sieve = new byte[sieveSize];
// 1 is composite
sieve = 0x01;
// this is the maximum starting number to search for primes in the form
// 2*k+1
int maxK = (int) Math.floor(Math.sqrt(n / 2));
int nHalf = n / 2;
// loop on numbers of the form 2*k+1
for (int k = 1; k <= maxK; k++) {
// if 2*k+1 is marked as prime, sift all its multiples
if (get(k)) {
// start from (2*k+1)^2: must divide this by two since the array
// doesn't contain multiples of two. Thus the starting number is
// (2*k+1)^2/2 = 2*k*(k+1). Note that this is odd.
// the increment is 2*k+1 (since the sieve contains only odd
// numbers, using this increment automatically skips to the next
// odd multiple).
final int increment = 2 * k + 1;
for (int composite = 2 * k * (k + 1); composite < nHalf; composite += increment)
// the index in the array is obtained by discarding the
// rightmost 3 bits (or divide by 8); likewise, the position
// in the byte is obtained by right shifting one as many
// time as the number represented by the same 3 bits.
// Note that the function get() is implemented similarly.
sieve[composite >> 3] |= (1 << (composite & 7));
}
}
}
/**
* Checks if the number 2*n+1 is marked as prime.
*
* @param n
*      An integer number.
* @return True if 2*n+1 is marked as prime, false otherwise.
*/
boolean get(int n) {
return ((sieve[n >> 3] >> (n & 7)) & 1) == 0;
}
/**
*
* @param n
*      An integer.
* @return True if n is prime.
*/
public boolean isPrime(int n) {
if (n == 2)
return true;
if (n == 1 || n % 2 == 0)
return false;
int i = n / 16;
if (i >= sieve.length)
throw new RuntimeException("The number " + n
+ " exceeds the values in the sieve.");
return ((sieve[i] >> ((n / 2) & 7)) & 1) == 0;
}
}
``````
Sometimes it's useful to find prime numbers in a small interval instead of all of them up to a certain value. In cases like this, it's inconvenient to use the full sieve, since its speed and memory requirements always depend on the upper primes bound. A segmented sieve allows sifting the composite numbers in a given interval, ignoring most of the previous numbers. In this case speed and memory requirements will depend (mostly) on the length of the interval. I'll give an implementation of the segmented sieve in my next blog post.