Find next prime given all prior

Question 1

You don't need all prior primes, just those below the square root of your current production point are enough, when generating composites from primes by the sieve of Eratosthenes algorithm.

This greatly reduces the memory requirements. The primes are then simply those odd numbers which are not among the composites.

Each prime p produces a chain of its multiples, starting from its square, enumerated with the step of 2p (because we work only with odd numbers). These multiples, each with its step value, are stored in a dictionary, thus forming a priority queue. Only the primes up to the square root of the current candidate are present in this priority queue (the same memory requirement as that of a segmented sieve of E.).

Symbolically, the sieve of Eratosthenes is

P = {3,5,7,9, ...} \ ⋃ {{p², p²+2p, p²+4p, p²+6p, ...} | p in P}

Each odd prime generates a stream of its multiples by repeated addition; all these streams merged together give us all the odd composites; and primes are all the odd numbers without the composites (and the one even prime number, 2).

In Python (can be read as an executable pseudocode, hopefully),

def postponed_sieve():      # postponed sieve, by Will Ness,   
    yield 2; yield 3;       #   https://stackoverflow.com/a/10733621/849891
    yield 5; yield 7;       # original code David Eppstein / Alex Martelli 
    D = {}                  #   2002, http://code.activestate.com/recipes/117119
    ps = (p for p in postponed_sieve())  # a separate Primes Supply:
    p = ps.next() and ps.next()          #   (3) a Prime to add to dict
    q = p*p                              #   (9) when its sQuare is 
    c = 9                                #   the next Candidate
    while True:
        if c not in D:                # not a multiple of any prime seen so far:
            if c < q: yield c         #   a prime, or
            else:   # (c==q):         #   the next prime's square:
                add(D,c + 2*p,2*p)    #     (9+6,6 : 15,21,27,33,...)
                p=ps.next()           #     (5)
                q=p*p                 #     (25)
        else:                         # 'c' is a composite:
            s = D.pop(c)              #   step of increment
            add(D,c + s,s)            #   next multiple, same step
        c += 2                        # next odd candidate

def add(D,x,s):                          # make no multiple keys in Dict
    while x in D: x += s                 # increment by the given step
    D[x] = s

Once a prime is produced, it can be forgotten. A separate prime supply is taken from a separate invocation of the same generator, recursively, to maintain the dictionary. And the prime supply for that one is taken from another, recursively as well. Each needs to be supplied only up to the square root of its production point, so very few generators are needed overall (on the order of log log N generators), and their sizes are asymptotically insignificant (sqrt(N), sqrt( sqrt(N) ), etc).

Question 2

For any number k you only need to try division with primes up to and including √k. This is because any prime larger than √k would need to be multiplied with a prime smaller than √k.

Proof:

√k * √k = k so (a+√k) * √k > k (for all 0<a<(k-√k)). From this follows that (a+√k) divides k iff there is another divisor smaller than √k.

This is commonly used to speed up finding primes tremendously.

Question 3

I wrote a program that generates the prime numbers in order, without limit, and used it to sum the first billion primes at my blog. The algorithm uses a segmented Sieve of Eratosthenes; additional sieving primes are calculated at each segment, so the process can continue indefinitely, as long as you have space to store the sieving primes. Here's pseudocode:

function init(delta) # Sieve of Eratosthenes
  m, ps, qs := 0, [], []
  sieve := makeArray(2 * delta, True)
  for p from 2 to delta
    if sieve[p]
      m := m + 1; ps.insert(p)
      qs.insert(p + (p-1) / 2)
      for i from p+p to n step p
        sieve[i] := False
  return m, ps, qs, sieve

function advance(m, ps, qs, sieve, bottom, delta)
  for i from 0 to delta - 1
    sieve[i] := True
  for i from 0 to m - 1
    qs[i] := (qs[i] - delta) % ps[i]
  p := ps[0] + 2
  while p * p <= bottom + 2 * delta
    if isPrime(p) # trial division
      m := m + 1; ps.insert(p)
      qs.insert((p*p - bottom - 1) / 2)
    p := p + 2
  for i from 0 to m - 1
    for j from qs[i] to delta step ps[i]
      sieve[j] := False
  return m, ps, qs, sieve

Here ps is the list of sieving primes less than the current maximum and qs is the offset of the smallest multiple of the corresponding ps in the current segment. The advance function clears the bitarray, resets qs, extends ps and qs with new sieving primes, then sieves the next segment.

function genPrimes()
  bottom, i, delta := 0, 1, 50000
  m, ps, qs, sieve := init(delta)
  yield 2
  while True
    if i == delta # reset for next segment
      i, bottom := -1, bottom + 2 * delta
      m, ps, qs, sieve := \textbackslash
        advance(m, ps, qs, sieve, bottom, delta)
    else if sieve[i] # found prime
      yield bottom + 2*i + 1
    i := i + 1

The segment size 2 * delta is arbitrarily set to 100000. This method requires O(sqrt(n)) space for the sieving primes plus constant space for the sieve.

It is slower but saves space to generate candidates with a wheel and test the candidates for primality.

function genPrimes()
  w, wheel := 0, [1,2,2,4,2,4,2,4,6,2,6,4,2,4, \
       6,6,2,6,4,2,6,4,6,8,4,2,4,2,4,8,6,4,6, \
       2,4,6,2,6,6,4,2,4,6,2,6,4,2,4,2,10,2,10]
  p := 2; yield p
  repeat
    p := p + wheel[w]
    if w == 51 then w := 4 else w := w + 1
    if isPrime(p) yield p

It may be useful to begin with a sieve and switch to a wheel when the sieve grows too large. Even better is to continue sieving with some fixed set of sieving primes, once the set grows too large, then report only those values bottom + 2*i + 1 that pass a primality test.

Find next prime given all prior

Code Sample