Python: Turning a sequence into an iterator without iter(), keeping the sequence's properties intact
From time to time, we have come across questions like -- How can I make an iterator out of a sequence without explicitly calling the iter() function?
A follow-up question to that would be -- Can we keep the sequence's properties intact while adding only what is needed to make it an iterator?
In answering these, we are also answering the implied question -- How can I extend the functionality of a class without modifying the class itself?
All of these questions have the same simple answer: by subclassing the class in question and adding/overriding attributes/methods to meet the requirements.
Let's see this in action:
I'm gonna show how we can extend the properties of a list, which is a sequence (a type of iterable), to make it an iterator, while keeping the original properties of being a sequence. I'll be using Python 3.
The Python data model specifies that, for an object to be an iterator, it needs to implement the iterator protocol, which essentially means the object must have the following two dunder methods:
- __iter__: must return the iterator object itself
- __next__: returns the next element of the object and raises StopIteration when the iterator is exhausted. Note that this method is called just next() in Python 2.
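To make the two methods above concrete, here is a minimal sketch of a standalone iterator. The class name Countdown is made up for illustration and is not part of the example that follows.

```python
class Countdown:
    """Hypothetical iterator that counts down from `start` to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from `__iter__`
        return self

    def __next__(self):
        # Signal exhaustion with `StopIteration`
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


countdown = Countdown(3)
print(iter(countdown) is countdown)  # True
print(list(countdown))               # [3, 2, 1]
```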
So, first let's see what a vanilla list class contains:
>>> list.__iter__
<slot wrapper '__iter__' of 'list' objects>
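It contains __iter__, that's good, one requirement is met, at least in name: list's __iter__ returns a separate list_iterator object rather than the list itself. Let's check for __next__: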
>>> list.__next__
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: type object 'list' has no attribute '__next__'
Hmmmm, no __next__. So, we need to implement the __next__ magic method, while keeping all the existing properties of list as is.
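A quick sketch of what that missing method means in practice: calling next() directly on a plain list fails, while the object returned by iter() supports it. The variable names here are only for illustration.

```python
numbers = [1, 2, 3]

try:
    next(numbers)
except TypeError:
    # A plain list has no `__next__`, so `next()` rejects it
    print('TypeError raised')

# `iter()` hands back a separate iterator object over the list
it = iter(numbers)
print(type(it).__name__)  # list_iterator
print(it is numbers)      # False
print(next(it))           # 1
```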
Let's create our new class, which subclasses list:
class MyList(list):

    def __init__(self, list_):
        # Calling `list`'s constructor to preserve
        # the properties of `list`
        super().__init__(list_)
        self.list_ = list_
        # Current index
        self.cur_idx = -1
        # Maximum index
        self.max_idx = len(list_) - 1

    # If `list` did not contain `__iter__`, we would
    # need to implement `__iter__` like below
    # def __iter__(self):
    #     return self

    # Implementing `__next__` for `next()`
    def __next__(self):
        self.cur_idx += 1
        # raising `StopIteration` when the
        # iterator is exhausted
        if self.cur_idx > self.max_idx:
            raise StopIteration('Iterator exhausted!')
        return self.list_[self.cur_idx]
That's that! Now, let's see how it goes:
>>> from collections.abc import Sequence, Iterator
>>> base_list = [1, 2, 3]
>>> isinstance(base_list, Sequence)
True
>>> isinstance(base_list, Iterator)
False
>>> my_list = MyList(base_list)
>>> isinstance(my_list, Sequence)
True
>>> isinstance(my_list, Iterator)
True
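Why does the Iterator check pass? The Iterator ABC in collections.abc accepts any class that defines both __iter__ and __next__; it does not inspect what those methods return. A small sketch, using two hypothetical classes that are not part of the example above:

```python
from collections.abc import Iterator


class OnlyIter:
    """Hypothetical class that defines `__iter__` only."""

    def __iter__(self):
        return iter(())


class IterAndNext(OnlyIter):
    """Adds `__next__`, which is enough for the ABC check."""

    def __next__(self):
        raise StopIteration


print(isinstance(OnlyIter(), Iterator))     # False
print(isinstance(IterAndNext(), Iterator))  # True
```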
Perfect. Now, let's check out the next() call for fun:
>>> for _ in range(len(base_list)+1):
...     next(my_list)
...
1
2
3
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/home/foobar/spamegg.py", line 18, in __next__
    raise StopIteration('Iterator exhausted!')
StopIteration: Iterator exhausted!
Works as expected! Now, we can also iterate over it to prove that we still have the sequence's properties intact:
>>> for i in my_list:
...     print(i)
...
1
2
3
It has them intact, indeed.
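The for loop still yields every element after next() has been exhausted because it goes through list's inherited __iter__, which builds a fresh list_iterator each time and ignores the cur_idx counter. The sketch below shows this with a condensed restatement of the class (it indexes self directly instead of keeping a second reference, which is my simplification, not the original code):

```python
class MyList(list):
    """Condensed variant of the class above, for a self-contained demo."""

    def __init__(self, list_):
        super().__init__(list_)
        self.cur_idx = -1

    def __next__(self):
        self.cur_idx += 1
        if self.cur_idx >= len(self):
            raise StopIteration('Iterator exhausted!')
        return self[self.cur_idx]


my_list = MyList([1, 2, 3])
print(next(my_list))             # 1
print(list(my_list))             # [1, 2, 3], unaffected by the call above
print(next(my_list))             # 2, unaffected by the full pass above
print(iter(my_list) is my_list)  # False
```

So next() and for keep two independent positions on the same object, which is worth remembering if you mix the two.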
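As you can imagine, the same approach applies to tuple and str too, and basically to any type of sequence object, though immutable types are built in __new__ rather than __init__, so the constructor call needs a small adjustment.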
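Here is a rough sketch of the same idea for tuple. The name MyTuple is hypothetical. Since a tuple's contents are set in __new__, the sketch calls super().__init__() without arguments and only sets up the index.

```python
from collections.abc import Iterator, Sequence


class MyTuple(tuple):
    """Hypothetical tuple subclass that also supports `next()`."""

    def __init__(self, items):
        # The elements were already stored by `tuple.__new__`;
        # only the iteration state is set up here
        super().__init__()
        self.cur_idx = -1

    def __next__(self):
        self.cur_idx += 1
        if self.cur_idx >= len(self):
            raise StopIteration('Iterator exhausted!')
        return self[self.cur_idx]


my_tuple = MyTuple((1, 2, 3))
print(isinstance(my_tuple, Sequence))  # True
print(isinstance(my_tuple, Iterator))  # True
print(next(my_tuple))                  # 1
print(my_tuple[-1])                    # 3
```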
In practice, one would rarely need to subclass list like this to implement the iterator protocol, as simply calling iter() on the list gives you an iterator over it. This is just a see-it-in-action example of extending a class's attributes/methods (applicable to overriding too) by subclassing.
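For comparison, a short sketch of the usual iter() route. One difference from the subclass: the object iter() returns is an iterator, but it is not itself a sequence.

```python
from collections.abc import Iterator, Sequence

base_list = [1, 2, 3]
it = iter(base_list)

print(isinstance(it, Iterator))  # True
print(isinstance(it, Sequence))  # False

print(next(it), next(it), next(it))  # 1 2 3
# A default as the second argument avoids `StopIteration`
print(next(it, 'done'))              # done
```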
Happy coding! Thanks!