If you find this site interesting, it has learned you 2-3 tips, may be it made you laugh or cry (uh?). Please support it by sending us some coin at the following three addresses, thank you. Bitcoin adress Litecoin adress Dodgecoin adress The Sieve of AtkinThe principle
The Sieve of Atkin is a method considered as an optimized version of the Sieve
of Eratosthenes to find all prime numbers up to a limit fixed N. The
algorithm was recently developed by A. O. L. Atkin and Daniel J. Bernstein you
can find the publication of their work "Prime sieves using binary quadratic forms"
by following this
link.
First implementationFrom a computational point of view, let's implement the version found in pseudo code in Wikipedia English in order to understand how it works. Despite the problem of content mentioned above, the site seems to be a good starting point for understanding Atkin.
private static final int MAX = 100000000; private static final int SQRT_MAX = (int) Math.sqrt(MAX) + 1; private static boolean[] array = new boolean[MAX]; //--// for (int x = 1; x < SQRT_MAX; x++) { for (int y = 1; y < SQRT_MAX; y++) { int k = 4 * x * x + y * y; if ((k < MAX) && ((k % 12 == 1) || (k % 12 == 5))) { array[k] = !array[k]; } k = 3 * x * x + y * y; if ((k < MAX) && (k % 12 == 7)) { array[k] = !array[k]; } if (x > y) { k = 3 * x * x - y * y; if ((k < MAX) && (k % 12 == 11)) { array[k] = !array[k]; } } } } array[2] = true; array[3] = true; for (int n = 5; n <= SQRT_MAX; n++) { if (array[n]) { int n2 = n * n; for (int k = n2; k < MAX; k += n2) { array[k] = false; } } }
In terms of performance we have small hope because times of not optimized Atkin are
better than Eratosthenes first implementation (and not optimized), we have 6 seconds
9 for Atkin against 7 seconds 5 for Eratosthenes (in the case of primes up to 100 million).
Implementation, analysis time
To optimize Atkin, try to analyze the program and see where it takes more time.
To do this we will first cut the first loop into 3 in order to calculate lap times.
Of course the bursting of the loop does not influence the result, we will have all
primes at the end of the program.
double debut = System.currentTimeMillis();
int xUpper = (int) Math.sqrt(MAX / 4) + 1; for (int x = 1; x < xUpper; x++) { for (int y = 1; y < Math.sqrt(MAX - k1); y++) { Si (4x² + y²) % 12 == 1 ou (4x² + y²) % 12 == 5 --> Alors array[k] = !array[k]; } } double intermediaire1 = System.currentTimeMillis(); xUpper = (int) Math.sqrt(MAX / 3) + 1; for (int x = 1; x < xUpper; x++) { for (int y = 2; y < Math.sqrt(MAX - k1); y++) { Si (3x² + y²) % 12 == 7 Alors array[k] = !array[k]; } } double intermediaire2 = System.currentTimeMillis(); xUpper = (int) Math.sqrt(MAX); for (int x = 1; x < xUpper; x++) { for (int y = 1; y < x; y++) { Si (3x² - y²) % 12 == 11 Alors array[k] = !array[k]; } } double intermediaire3 = System.currentTimeMillis(); for (int n = 5; n <= SQRT_MAX; n++) { Elimination des multiples des carrés n², 2n², 3n², ..., MAX } double fin = System.currentTimeMillis();
We see that the time of the loop 1 is greater than loop 2, itself greater than
loop 3 and itself above the loop elimination of multiple squares. More loop 1
takes 43% of the total execution time of the algorithm, the loop 2 33%, the loop 3 15%
and the loop of the squares 7%. Second implementation
To optimize the algorithm, we will address the modulo operator which is an expensive
operation in computing time.
int k1 = 3 * x * x; for (int y = 2; y < Math.sqrt(MAX - k1); y++) { int k = k1 + y * y; if (k % 12 == 7) { System.out.println("x = " + x + " y = " + y); } } } x = 55 y = 9988 x = 55 y = 9992 x = 55 y = 9994 x = 55 y = 9998 x = 57 y = 2 x = 57 y = 4 x = 57 y = 8 x = 57 y = 10 x = 57 y = 14 We see that the solutions meet the following rules :
int[] sequence2 = { 4, 2 }; for (int x = 1; x < xUpper; x += 2) { int index = 0; int k1 = 3 * x * x; for (int y = 2; y < Math.sqrt(MAX - k1); y += sequence2[(++index & 1)]) { int k = k1 + y * y; array[k] = !array[k]; } } And improvements are tangible. source
The optimization gives good results as the loop is 5 times faster than the previous implementation but do not race this loop lent itself well to this change... Third implementation
Now that we have seen how to improve the time loop 2, we can do the same on
the other 2 loops.
Enjoy ! We managed to get a best time than Eratosthenes optimized. For
comparison, the Eratosthenes "second implementation" is 3.063 seconds for
a limit 100 million while Atkin is 2.187, we have a gain of just under 50%.
source Fourth implementation
Keep on running and trying to reach 1 billion by implementing Atkin with an array
of bit, it is exactly the same approach that we had with the 3rd implementation of
Eratosthenes.
We will need 2 additional bit operations over Eratosthenes namely : unSetBit
to unset a bit of the array and toggleBit to toogle a bit of the array.
source
For the moment it is still unclear but the complexity of the algorithm leads to 2 trails :
In conclusion
Nous voyons qu'atkin peut aller plus vite qu'eratosthène lorsqu'il est
"savamment" optimisé mais que d'effort pour arriver à ce résultat.
We see that atkin can go faster than Eratosthenes when it is "cleverly" optimized
but so efforts to achieve this |