When generating composites from primes with the sieve of Eratosthenes algorithm, you don't need all the prior primes: those up to the square root of your current production point are enough.
This greatly reduces the memory requirements. The primes are then simply those odd numbers which are not among the composites.
Each prime p produces a chain of its multiples, starting from its square, enumerated with the step of 2p (because we work only with odd numbers). These multiples, each with its step value, are stored in a dictionary, thus forming a priority queue. Only the primes up to the square root of the current candidate are present in this priority queue (the same memory requirement as that of a segmented sieve of E.).
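As an illustration of the chain of multiples described above, here is a minimal sketch. The helper name `odd_multiples` is hypothetical and is not part of the code further down.

```python
from itertools import count, islice

def odd_multiples(p):
    """Odd multiples of an odd prime p, starting from p*p, in steps of 2*p."""
    return count(p * p, 2 * p)

print(list(islice(odd_multiples(3), 5)))   # [9, 15, 21, 27, 33]
print(list(islice(odd_multiples(5), 4)))   # [25, 35, 45, 55]
```

In the dictionary, such a chain is represented by just one entry at a time: its next multiple as the key and the step 2p as the value.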
Symbolically, the sieve of Eratosthenes is
P = {3,5,7,9, ...} \ ⋃ {{p², p²+2p, p²+4p, p²+6p, ...} | p in P}
Each odd prime generates a stream of its multiples by repeated addition; all these streams merged together give us all the odd composites; and primes are all the odd numbers without the composites (and the one even prime number, 2).
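The set formula can be tried out directly on a finite range. This is only a bounded sketch of the definition (the function name `odd_primes_below` and the limit `n` are assumptions made for the example), not the unbounded generator itself:

```python
def odd_primes_below(n):
    """Odd primes below n: the odd numbers minus the merged streams of multiples."""
    odds = set(range(3, n, 2))
    composites = set()
    for p in sorted(odds):
        if p * p >= n:               # larger primes contribute nothing below n
            break
        if p not in composites:      # p is prime: add p*p, p*p+2p, p*p+4p, ...
            composites.update(range(p * p, n, 2 * p))
    return sorted(odds - composites)

print(odd_primes_below(50))
# [3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
```

Note that only the primes whose squares are below `n` ever contribute multiples, which is the square-root observation again.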
In Python (can be read as an executable pseudocode, hopefully),
    def postponed_sieve():                  # postponed sieve, by Will Ness,
        yield 2; yield 3                    #   https://stackoverflow.com/a/10733621/849891
        yield 5; yield 7                    # original code David Eppstein / Alex Martelli
        D = {}                              #   2002, http://code.activestate.com/recipes/117119
        ps = postponed_sieve()              # a separate Primes Supply:
        p = next(ps) and next(ps)           # (3) a Prime to add to dict
        q = p*p                             # (9) when its sQuare is
        c = 9                               #     the next Candidate
        while True:
            if c not in D:                  # not a multiple of any prime seen so far:
                if c < q: yield c           #   a prime, or
                else:   # (c==q):           #   the next prime's square:
                    add(D, c + 2*p, 2*p)    #     (9+6,6 : 15,21,27,33,...)
                    p = next(ps)            #     (5)
                    q = p*p                 #     (25)
            else:                           # 'c' is a composite:
                s = D.pop(c)                #   step of increment
                add(D, c + s, s)            #   next multiple, same step
            c += 2                          # next odd candidate

    def add(D, x, s):                       # make no multiple keys in Dict
        while x in D: x += s                # increment by the given step
        D[x] = s
Once a prime is produced, it can be forgotten. A separate prime supply is taken from a separate invocation of the same generator, recursively, to maintain the dictionary. And the prime supply for that one is taken from another, recursively as well. Each needs to be supplied only up to the square root of its production point, so very few generators are needed overall (on the order of log log N generators), and their sizes are asymptotically insignificant (sqrt(N), sqrt( sqrt(N) ), etc).
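To get a feel for how short this tower of generators is, here is a rough estimate. The function `tower_limits` is hypothetical, and it assumes each supplier has to reach about the square root of its consumer's limit; the real generator reads one prime ahead, so the actual limits are slightly larger. It needs Python 3.8+ for `math.isqrt`.

```python
from math import isqrt

def tower_limits(n):
    """Approximate limits the nested generators reach when primes up to n are produced."""
    limits = [n]
    while limits[-1] > 7:            # 2, 3, 5, 7 are yielded without any prime supply
        limits.append(isqrt(limits[-1]))
    return limits

print(tower_limits(10**6))    # [1000000, 1000, 31, 5]
print(tower_limits(10**12))   # [1000000000000, 1000000, 1000, 31, 5]
```

Squaring N adds just one more level, which is the log log N growth mentioned above.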