All the information we need to begin with, is that 2 is the first prime. This means, following the previous approach, that no other even number can be prime. To keep track of this information, we begin by writing a list of all the numbers between 2 and the desired limit, say N:

We now draw a circle around 2, because it is prime, and then strike out all even numbers, i.e. all multiples of 2:

Now, there is a good and a bad way to strike out all multiples of a number from a list. The wrong way is to do trial division to find out if it is a multiple. The right way is to use the fact that two consecutive multiples of a number differ by that exact number. Writing it out in python, you can check the tremendous gain for yourself:

from time import time

def strikeOut(mul, n) :

# note that list index i corresponds to number i+2

sieve = [True for j in xrange(2,n+1)]

# The bad way to sieve

t = time()

for j in xrange(2,n+1) :

if j % mul == 0 :

sieve[j-2] = False

t = time() - t

print "Sieved (bad!!!) all multiples of",mul,"below",n,"in",t,"sec."

sieve = [True for j in xrange(2,n+1)]

# The good way to sieve

t = time()

for j in xrange(mul,n+1,mul) :

sieve[j-2] = False

t = time() - t

print "Sieved (good!!!) all multiples of",mul,"below",n,"in",t,"sec."

No, I'm not taking you, dear reader, for an idiot: you can find sieving algorithms on fist pages of google searches that do sieving this very wrong way...

But lets get back to primes. We had our list, with 2 circled and all its multiples crossed out. The first unmarked number we come across is 3. Just a coincidence that 3 is a prime number? Not really: it it is unmarked because it is not a multiple of any smaller number. Or to put it another way: it is only divisible by 1 and itself, which is hte very definition of prime... So we circle 3 and cross out all its multiples:

I've underlined the multiples of three that where already crossed out. The first number which gets crossed out for the first time is 9. Will it be just a coincidence that 9 = 3

^{2}? Of course not! We have already crossed out all multiples of numbers smaller than 3, and 3

^{2}is the first multiple of 3 which is not a multiple of any smaller number.

So we keep going down our list, and find 5 to be the next unmarked number in the list, which of course is prime, so we circle it, and then cross out all of its multiples. But as we saw before, rather than starting at 10, and crossing out the already crossed out 10, 15 and 20, we proceed directly to 5

^{2}, that is 25.

This goes on until we've searched the whole list, circling numbers and crossing out its multiples. But since we start crossing out at the square of the new found prime, we only need to do the crossing out for primes smaller than the square root of N: once we get there, any uncrossed number up to N will be prime. This is what the beginning of our list will look like once we are through:

So lets try and write our very own sieve of Erathostenes in python, with all this square and square root tricks pointed out above...

from time import time

from math import sqrt

def primeListSofE(n, verbose = True) :

t = time()

sieve = [True for j in xrange(2,n+1)]

for j in xrange(2,int(sqrt(n))+1) :

i = j-2

if sieve[i]:

for k in range(j*j,n+1,j) :

sieve[k-2] = False

ret = [j for j in xrange(2,n+1) if sieve[j-2]]

if verbose :

print "Calculated primes to",n,"in",time()-t,"sec."

return ret

Play around with it and once more feel the thrill of speed. On my computer it does all primes below 1,000,000 almost 5 times faster than the best trial division approach from the previous post. But the real beauty is in how the time now scales with n, because you can no longer find a suitable fit of the form n

^{q}. We haven't simply reduced the size of the exponent, but are treading uncharted territory, as it turns out that the time required to calculate all primes below using this sieve requires (n.log(n))(log(log(n))) time. Actually, for the first 10

^{10}numbers, this is basically indistinguishable from linear time, i.e proportional to n.

This speed does come at a memory cost, though: you need to store an array of size n. And while the algorithm could produce all primes below 10

^{10}in just a few minutes, the program will very likely crash giving an out of memory error. Some memory saving can be done by not storing even numbers at all. Without boring you with the details, below is the python function I use to compute primes at work, which does 10

^{7}almost three times faster than the previous one, and uses half the memory, but still crashes when asked for all primes below 10

^{9}...

from time import time

from math import sqrt

def primeListSofE(n, verbose = False) :

"""Returns a list of prime numbers smaller than or equal to n,

using an optimized sieve of Erathostenes algorithm.

"""

t = time()

maxI = (n-3) // 2

maxP = int(sqrt(n))

sieve = [True for j in xrange(maxI+1)]

for p in xrange(3,maxP+1,2) :

i = (p-3) // 2

if sieve[i] :

i2 = (p*p-3) // 2

for k in xrange(i2,maxI+1,p) :

sieve[k] = False

ret = [2]+[2*j+3 for j in xrange(len(sieve)) if sieve[j]]

if verbose :

print "Calculated primes to",n,"in",time()-t,"sec."

return ret

There are slightly faster ways of sieving, and we could memoize precalculated primes, to speed up repeated calls to the function. But this is probably enough for today's post...

I took the last function primeListSofE and experimented making some changes to it.

ReplyDeleteYou don't need to import math just for sqrt since you can use 'int(n**.5)'.

You can create the sieve with

sieve = [True] * (maxI+1)

avoiding the need to iterate through xrange.

The 'p' loop can be translated into an 'i' loop; and the 'k' loop can be replaced by extended slice notation.

I ended up with:

def primeListSofE(n, verbose = False):

maxI = (n-3)//2

maxP = int(n**.5)

sieve = [True] * (maxI+1)

for i in xrange(0, (maxP-3)//2+1):

if sieve[i]:

p = 2*i+3

sieve[(p*p-3)//2::p] = [False] * ((n//p-p)//2+1)

return [2]+[2*j+3 for j in xrange(len(sieve)) if sieve[j]]

On my PC it runs about 100% faster than the original, but uses about 10% more memory.

PS. Is there some tag needed so that code examples are formatted properly?

The xrange thing is deliberate: for some odd reason python uses less memory with the list comprehension style. I've just tried it once more, and I get a MemoryError with [True]*10**8, but not, the way I've coded it.

ReplyDeleteYou are right about the sqrt thing. Worse than that, I had read, and have just verified, that the built-in exponentiation function is actually 3x faster! Not that it really affects this function, as we are doing it just once, but is definitely good practice to drop sqrt altogether...

The speed gains are probably from the slice notation, or so I read somewhere. I find it more confusing, though, but doubling the speed is probably worth the confusion.

As for the tags, I don't have a clue, as what works in the blog entries (pre, or code tags) is rejected by the comment editor...

Another way to reduce memory use is to use the standard Python array module instead of a list of booleans for the sieve.

ReplyDelete# 8-bit unsigned int

sieve = array.array('B', [1] * (maxI+1))

array0 = array.array('B', [0])

...

sieve[k] = array0

It reduces memory use by about a third (but runs slightly slower). I think the Python list data type has been optimised for speed, and the array data type (which is a useful alternative to the list, but only in specific situations) has been a bit neglected and is not optimised.

But if you are thinking of using the array module to reduce memory, then probably a better alternative is to use the third party numpy module/package - it makes the array module obsolete. Its a shame numpy is not a standard part of Python.

Another way to boost performance using Python 2.6 and higher is to use the new bytearray data type.

ReplyDeleteSimply change

sieve = [True for j in xrange(maxI+1)]

to

sieve = bytearray([1 for j in xrange(maxI+1)])

For me it reduced CPU time by 9% and maximum memory use by 28%.