There's nothing magical about tee. It's just clever ;-) At any point, tee clones the iterator passed to it. That means the cloned iterator(s) will yield the values produced by the passed-in iterator from this point on. But it's impossible for them to reproduce values that were produced before tee was invoked.
Let's show it with something much simpler than your example:
>>> from itertools import tee
>>> it = iter(range(5))
>>> next(it)
0
0 is gone now - forever. tee() can't get it back:
>>> a, b = tee(it)
>>> next(a)
1
So a pushed it to produce its next value. It's that value that gets cached, so that other clones can reproduce it too:
>>> next(b)
1
To get that result, it wasn't touched - 1 was retrieved from the internal cache. And now that all of it, a and b have produced 1, 1 is gone forever too.
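Here's the same session as a small script you can run, in case that's easier to poke at. It's only a sketch of the behavior described above; the names first, from_a and from_b are mine, just for illustration:

```python
from itertools import tee

it = iter(range(5))
first = next(it)      # 0 is consumed before tee() is ever called

a, b = tee(it)
from_a = list(a)      # a drives `it` to exhaustion; each value is cached for b
from_b = list(b)      # b replays the cached values without touching `it`

print(first)          # 0
print(from_a)         # [1, 2, 3, 4]
print(from_b)         # [1, 2, 3, 4]
```

Neither clone ever sees 0, because it was produced before tee() was invoked.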
I don't know whether that answers your question - answering "Does tee() know what I want here?" seems to require telepathy ;-) That is, I don't know what you mean by "with proper caching". It would be most helpful if you gave an exact example of the input/output behavior you're hoping for.
Short of that, the Python docs give Python code that's equivalent to tee(), and perhaps studying that would answer your question:
import collections

def tee(iterable, n=2):
    it = iter(iterable)
    deques = [collections.deque() for i in range(n)]
    def gen(mydeque):
        while True:
            if not mydeque:             # when the local deque is empty
                try:
                    newval = next(it)   # fetch a new value and
                except StopIteration:   # (on Python 3.7+ a StopIteration escaping
                    return              # a generator becomes a RuntimeError)
                for d in deques:        # load it to all the deques
                    d.append(newval)
            yield mydeque.popleft()
    return tuple(gen(d) for d in deques)
You can see from that, for example, that nothing about an iterator's internal state is cached - all that's cached is the values produced by the passed-in iterator, starting from the time tee() is called. Each clone has its own deque (FIFO list) holding the passed-in iterator's values that the clone hasn't yielded yet, and that's all the clones know about the passed-in iterator. So it may be too simple for whatever you're really hoping for.
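If you want to watch the deques at work, here's a rough sketch that follows the docs' equivalent code but also hands back the deques so they can be inspected. The name tee_with_buffers is hypothetical (it's not part of itertools), and the real tee() is implemented in C and doesn't expose its buffers. The try/except around next(it) is there so the sketch runs on Python 3.7+, where a StopIteration escaping a generator turns into a RuntimeError:

```python
import collections

def tee_with_buffers(iterable, n=2):
    """Illustrative only: the docs' tee() equivalent, plus access to the deques."""
    it = iter(iterable)
    deques = [collections.deque() for _ in range(n)]

    def gen(mydeque):
        while True:
            if not mydeque:             # this clone's buffer is empty
                try:
                    newval = next(it)   # pull one value from the source
                except StopIteration:
                    return
                for d in deques:        # and append it to every clone's buffer
                    d.append(newval)
            yield mydeque.popleft()

    return tuple(gen(d) for d in deques), deques

(a, b), buffers = tee_with_buffers(iter(range(5)))

next(a); next(a)                         # a yields 0 and 1
after_a = [list(d) for d in buffers]
print(after_a)                           # [[], [0, 1]]

next(b)                                  # b yields 0 from its own buffer
after_b = [list(d) for d in buffers]
print(after_b)                           # [[], [1]]
```

The buffers only ever hold values, and only the ones a given clone hasn't yielded yet.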