Lazy Evaluation, iterables and iterators in Python

Thread Starter

Spacerat

Joined Aug 3, 2015
36
Hello,
I am learning about iterables, iterators, generators in Python and their difference.

Iterable is the easiest to get: an iterable is any object we can iterate over in a for loop. Lists, dictionaries, strings, etc. are examples of iterables. Sequences such as lists and strings also support indexing, e.g. a = [3, 4, 5] and a[0] is 3, so we can pick elements one by one or several at once with slicing. (Not every iterable is indexable, though: a set can be looped over but has no a[0].)
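A quick sketch of that (the variable names are just illustrative):

```python
a = [3, 4, 5]

# Sequences support indexing and slicing...
assert a[0] == 3
assert a[1:3] == [4, 5]   # several elements at once

# ...and every iterable can be looped over with for:
total = 0
for x in a:
    total += x
assert total == 12

# Dictionaries and strings are iterables too:
letters = [ch for ch in "abc"]
assert letters == ["a", "b", "c"]
```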

Iterators are more sophisticated: every iterator is an iterable, but not vice versa. In Python, an iterable is converted into an iterator under the hood when the iterable is used in a for loop. We can also call the built-in next() on an iterator to get the next element out of it, and the next, and the next, etc. Once the last element has been extracted, a further call to next() raises a StopIteration exception. Iterators remember their state.
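For example, this is roughly what a for loop does behind the scenes:

```python
a = [3, 4, 5]
it = iter(a)          # the for loop does this under the hood

assert next(it) == 3  # the iterator remembers its position...
assert next(it) == 4
assert next(it) == 5

exhausted = False
try:
    next(it)          # ...and raises StopIteration when exhausted
except StopIteration:
    exhausted = True
assert exhausted
```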

Certainly, we can pick elements out of a list without problems using indexing, so I am not sure about the benefit of iterators here...

It is said that iterators take less memory than iterables... Why? When we create an iterable, it takes a certain amount of RAM (and disk space if we save it).
If we created a list and then converted the list to an iterator, memory would be required by both the list and the iterator. I guess we could also create an iterator without starting from an existing iterable. Creating an iterator that way does not occupy as much memory because of this idea of lazy evaluation. Does that mean that the iterator's elements don't live in memory until we call next()? Where are they conceptually stored, then, if they are created on the spot?
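One way to see the memory difference is to compare a list with a generator expression, which is an iterator built without any pre-existing collection (a rough sketch; exact byte counts vary by Python version):

```python
import sys

# A list holds all million elements in memory at once.
big_list = list(range(1_000_000))

# A generator expression only stores its current state;
# each element is computed on demand when next() is called.
lazy = (n * n for n in range(1_000_000))

assert sys.getsizeof(lazy) < sys.getsizeof(big_list)
assert next(lazy) == 0    # first element produced on the spot
assert next(lazy) == 1
```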

Generators are similar to iterators, but a generator is really a function that produces an iterator... A regular function, once it is done, is done, but generators are somehow different... Not clear on this either...
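A minimal sketch of what makes a generator function different (countdown is just a made-up example): instead of return, it uses yield, and execution pauses at each yield and resumes on the next call to next().

```python
def countdown(n):
    # Pauses at each yield; resumes here on the next next() call.
    while n > 0:
        yield n
        n -= 1

g = countdown(3)
assert next(g) == 3    # runs until the first yield, then pauses
assert next(g) == 2    # resumes exactly where it left off
assert list(g) == [1]  # consumes whatever remains
```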

Thanks for any clarification....
SpaceRat
 

ApacheKid

Joined Jan 12, 2015
1,619
I'm not familiar with Python but this is likely very similar to .Net's IEnumerable as used a great deal with LINQ.

The IEnumerable is an interface and provides a way to "get me the next item" without any restriction on what that might entail.

So in .Net, IEnumerable represents a potentially infinite sequence; the data elements might be local (in a list or an array) or might be remote (pulled over a network). The important point is that the 'next' item is not retrieved (that is, the work needed to actually produce that item is not performed) until the item is actually requested.

So in a .Net world we might ask for the first hundred items that meet some criteria, and once we've gotten that 100th item we're done.

Well, if the items were remote, pulled over the network, and there were actually a million of them, getting them all would be very costly, so the IEnumerable isolates the consumer from the provider in that sense.

So the IEnumerable - in the source code - looks like it contains a potentially huge set of items but it does not, it simply represents a potentially huge set of items. No items are obtained until we enumerate/iterate.


Does that help?
 