Python: Turning a sequence into an iterator without iter(), keeping the sequence's properties intact
From time to time, we have come across questions like -- How can I make an iterator out of a sequence without explicitly calling the iter() function?
A follow-up question to that would be -- Can we keep the sequence's properties intact and just add what is needed to make it an iterator?
In answering these, we are also answering the obviously inferred question -- How can I extend the functionality of a class without modifying the class itself?
All of these questions have the same answer -- a simple one: By subclassing the class in question and adding/overriding attributes/methods to meet the requirements.
Let's see this in action:
I'm gonna show how we can extend the properties of a list, which is a sequence (type of iterable), to make it an iterator, while keeping the original properties of being a sequence. I'll be using Python 3.
Python's data model says that, for an object to be an iterator, it needs to implement the iterator protocol, which essentially means the object must have the following two dunder methods:
__iter__: must return the object itself
__next__: returns the next element of the object and raises StopIteration when the iterator is exhausted. Just to note, this method is called next() in Python 2.
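To make the two requirements concrete, here is a minimal sketch of an iterator written from scratch. The class name Countdown and its start parameter are made up for this illustration; only the __iter__ and __next__ methods come from the protocol.

```python
class Countdown:
    """Hypothetical iterator that counts down from `start` to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from `__iter__`
        return self

    def __next__(self):
        # Signal exhaustion once we have gone past 1
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


countdown = Countdown(3)
print(iter(countdown) is countdown)  # True
print(list(countdown))               # [3, 2, 1]
print(list(countdown))               # [] (already exhausted)
```

Once exhausted, the object keeps raising StopIteration, which is why the second list() call comes back empty.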
So, first let's see what a vanilla list class contains:
>>> list.__iter__
<slot wrapper '__iter__' of 'list' objects>
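list has __iter__, that's good, one requirement looks met. Strictly speaking, list.__iter__ returns a separate list_iterator rather than the list itself, but the method is there. Let's check for __next__: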
>>> list.__next__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: type object 'list' has no attribute '__next__'
There's no __next__. So, we need to implement the __next__ magic method, while keeping all the existing properties of list as is.
Let's create our new class, which subclasses list:
class MyList(list):

    def __init__(self, list_):
        # Calling `list`'s constructor to preserve
        # the properties of `list`
        super().__init__(list_)
        self.list_ = list_
        # Current index
        self.cur_idx = -1
        # Maximum index
        self.max_idx = len(list_) - 1

    # If `list` did not contain `__iter__`, we would
    # need to implement `__iter__` like below
    # def __iter__(self):
    #     return self

    # Implementing `__next__` for `next()`
    def __next__(self):
        self.cur_idx += 1
        # Raising `StopIteration` when the
        # iterator is exhausted
        if self.cur_idx > self.max_idx:
            raise StopIteration('Iterator exhausted!')
        return self.list_[self.cur_idx]
That's that! Now, let's see how it goes:
>>> from collections.abc import Sequence, Iterator
>>> base_list = [1, 2, 3]
>>> isinstance(base_list, Sequence)
True
>>> isinstance(base_list, Iterator)
False
>>> my_list = MyList(base_list)
>>> isinstance(my_list, Sequence)
True
>>> isinstance(my_list, Iterator)
True
Perfect. Now, let's check out the next() call for fun:
>>> for _ in range(len(base_list)+1):
...     next(my_list)
...
1
2
3
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/home/foobar/spamegg.py", line 18, in __next__
    raise StopIteration('Iterator exhausted!')
StopIteration: Iterator exhausted!
Works as expected! Now, we can also iterate over it to show that we still have the sequence's properties intact:
>>> for i in my_list:
...     print(i)
...
1
2
3
It has them intact, indeed.
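One detail worth checking here: since MyList inherits list's __iter__ unchanged, a for loop (or list()) goes through list.__iter__ and gets a fresh list_iterator, so it never touches __next__ or cur_idx. The sketch below re-declares MyList in compact form so it runs on its own; the behaviour in the comments is what I would expect on CPython 3.

```python
from collections.abc import Iterator


class MyList(list):
    # Compact re-declaration of the class from this post
    def __init__(self, list_):
        super().__init__(list_)
        self.list_ = list_
        self.cur_idx = -1
        self.max_idx = len(list_) - 1

    def __next__(self):
        self.cur_idx += 1
        if self.cur_idx > self.max_idx:
            raise StopIteration('Iterator exhausted!')
        return self.list_[self.cur_idx]


my_list = MyList([1, 2, 3])
print(next(my_list))                  # 1, advances cur_idx
print(list(my_list))                  # [1, 2, 3], iteration starts from the beginning
print(next(my_list))                  # 2, cur_idx was not moved by the line above
print(iter(my_list) is my_list)       # False, a separate list_iterator is returned
print(isinstance(my_list, Iterator))  # True, the ABC only checks that both methods exist
```

If you want the for loop to go through __next__ as well, uncommenting the __iter__ that returns self would do it, at the cost of the object becoming single-pass when looped over.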
As you can imagine, the above would be applicable for str too, basically, for any type of sequence.
In practice, one would rarely need to subclass list like this to implement the iterator protocol, as simply calling iter() on it gives you an iterator. This is just a see-it-in-action example of extending a class's attributes/methods (applicable to overriding too) by subclassing.
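For comparison, here is a short sketch of the iter() route. The iterator it returns is a separate object and is not a sequence itself, while the original list keeps its sequence behaviour. The variable names are just for illustration.

```python
from collections.abc import Iterator, Sequence

base_list = [1, 2, 3]
list_iter = iter(base_list)

print(isinstance(list_iter, Iterator))  # True
print(isinstance(list_iter, Sequence))  # False
print(next(list_iter))                  # 1
print(next(list_iter))                  # 2
print(base_list[0], len(base_list))     # 1 3
```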
Happy coding! Thanks!