If you find this site interesting, it has learned you 2-3 tips, may be it made you laugh or cry (uh?). Please support it by sending us some coin at the following three addresses, thank you. Bitcoin adress Litecoin adress Dodgecoin adress ## The Sieve of Eratosthenes## The principle
The Sieve of Eratosthenes is a method for finding all prime numbers up to a
fixed number N. - Take the first element of the array, ie 2 and eliminate all multiples of 2 of the table.
- Then take the next number not eliminated, here 3 and remove multiples of 3.
- Then take the next number not eliminated, here 5 (because 4 was eliminated in the first step) and eliminate multiples of 5.
- and so on...
At the end all non-primes are eliminated ## First implementationFrom an algorithmic point of view, we need several things : - An array of Boolean 2 to N as to whether a number is removed or not.
- A loop over this table.
- A second nested loop eliminating multiple for a selected number.
private static final int MAX = 100000000;private static boolean[] array = new boolean[MAX];//--// for (int i = 2; i < MAX; i++) {if (!array[i]) {int j = i + i;while (j < MAX) {array[j] = true;j += i; } } }
- array represents the array of booleans with true meaning eliminated (it is a composite number) and false not eliminated (this is potentially a prime).
- MAX is the maximum size of the list of primes that we want to find (the limit set N).
Which means, we found 5.7 million prime numbers between 2 and 100 milion in 7 seconds.
source
## Second implementation
This first version of the algorithm is widely optimizable. Indeed
it is possible to work on two areas of improvement namely, find the numbers
faster by accelerating the processing speed (double loop should be faster)
and/or expand the range to find more prime number (enlarge the limit N) - Take the number 2 and eliminate multiple : 2*2, 2*3, 2*4, 2*5, 2*6...
- Take the number 3 and eliminate multiple : 3*2, 3*3, 3*4, 3*5, 3*6...
- Take the number 5 and eliminate multiple : 5*2, 5*3, 5*4, 5*5, 5*6...
- Take the number n and eliminate multiple : n*2, n*3, n*4, n*5, n*6...
private static final int MAX = 100000000;private static final int SQRT_MAX = (int) Math.sqrt(MAX) + 1;private static final int MEMORY_SIZE = (int) (MAX / 2);private static boolean[] array = new boolean[MEMORY_SIZE];//--// for (int i = 3; i < SQRT_MAX; i += 2) {if (!array[(i / 2)]) {int j = (i * i) / 2;while (j < MEMORY_SIZE) {array[j] = true;j += i; } } } Overall we divide the time by 2.
Most assiduous note that we lose one prime number compared to the first implementation (column Pi). Just because this algorithm does not store the even numbers and so 2 is omitted. source ## Third implementationWith this second implementation, we see that we reach a fundamental limit of the Sieve of Eratosthenes. Without touching java jvm parameters, if we try to reach 1 billion, we get :
java.lang.OutOfMemoryError: Java heap space at org.cpas.prime.Eratos2.<clinit>(Eratos2.java:12) Exception in thread "main"
The algorithm needs a lot of memory to run because we need to store N booleans
(or N/2 in version 2). The memory complexity of the Sieve of Eratosthenes is O(n). private static final long MAX = 1000000000L;private static final long SQRT_MAX = (long) Math.sqrt(MAX) + 1;private static final int MEMORY_SIZE = (int) (MAX >> 4);private static byte[] array = new byte[MEMORY_SIZE];//--// for (long i = 3; i < SQRT_MAX; i += 2) {if (!getBit(i)) {long j = (i * i);while (j < MAX) {setBit(j); j += (2 * i); } } } //--// public static boolean getBit(long i) {byte block = array[(int) (i >> 4)];byte mask = (byte) (1 << ((i >> 1) & 7));return ((block & mask) != 0);} public static void setBit(long i) {int index = (int) (i >> 4);byte block = array[index];byte mask = (byte) (1 << ((i >> 1) & 7));array[index] = ( byte) (block | mask);} And we reach a billion....
## Fourth implémentation
This is all well and good but this limit space is always present. As clever as we
can be, we still have to allocate an array of N elements for N large enough exceed
the memory size of the computer.
Arrays.fill(segment, ( byte) 0);long debut = numSegment * MAX_SEGMENT;long fin = debut + MAX_SEGMENT;long finSqrt = (long) Math.sqrt(fin) + 1;int p = -1;int i = 1;while (p < finSqrt) {p = amorce[i++]; long j = p - (debut % p);j += ((j & 1) == 0 ? p : 0); long p2 = p * 2;while (j < MAX_SEGMENT) {segment[( int) (j >> 4)] |= (1 << ((j >> 1) & 7));j += p2; } }
numSegment represents the segment number and MAX_SEGMENT the segment size, we conclude with
the beginning and end of the segment.
Finally it should be noted that the calculation of primes for the initiation in Eratos4 is
made with trial division. It would have been quicker to do with Eratosthenes (a handful of
milliseconds against 500ms) but you have to vary the pleasures ...
source ## Fourth implementation (bis)
By analyzing the code Sieve of Eratosthenes segmented we see that it is
expensive in computation time at the beginning of the sieve of a segment,
when we apply the elimination with small numbers from the initiation. This is
because... precisely the numbers are too small and need a lot of "i+i" to reach
the end of the segment. - On a segment of one billion, we must 333 million loop turn to eliminate all multiples of 3.
- For 5, it takes 200 million loop turn.
- For 17, we have 58 million.
## Fourth implementation (ter)Since Java is a bit slow, try to transcribe Eratos4bis in C.
Actually it is much faster ! ## And now ?
It's nice to have calculated billion prime numbers but now what do we do ? One segment
of 1 billion represents about a file of 60MB (26MB zipped in). So in less than an hour
of computing time, we find all the prime numbers from 1 to 100 billion (there are 4 billion),
but above we have 6GB of data (compressed or 2GB). |