# Write Pythonic Code Like a Seasoned Developer Transcripts

Chapter: Generators and Collections

Lecture: On-demand computation with yield and generators

0:01
Here is a function called classic_fibonacci and what you do is you pass a limit to it and it will compute all the Fibonacci numbers up to that limit.

0:10
Notice we have a list called "nums" and it does all the work, fills this list up and once it's finally done, it gives you all the numbers.
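The list-building approach described here might look roughly like the sketch below. The transcript does not show the actual source, so the body is an assumption; only the names classic_fibonacci, limit, and nums come from the lecture.

```python
def classic_fibonacci(limit):
    """Return a list of every Fibonacci number below `limit`."""
    nums = []  # the whole result is held in memory at once
    current, nxt = 0, 1
    while True:
        current, nxt = nxt, current + nxt
        if current >= limit:
            break
        nums.append(current)
    # The caller sees nothing until every number has been computed.
    return nums


print(classic_fibonacci(100))
# [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
```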

0:17
Well, what if you want the first million Fibonacci numbers? What if you need the first 5 million Fibonacci numbers? How long will this method take to run?

0:26
What if you don't know how many you need, what if you want to start looking at them and you say

0:30
well, I am looking for the point, as I am going through the Fibonacci numbers, where the second one is a prime number,

0:38
the third one is the cube of the first one in the sequence, who knows when that is, you are just looking through

0:44
and you are going to decide "oh, now it matches, now I have got enough of these". What if you were looking through this

0:50
and you said "I am going to ask for 5 million Fibonacci numbers" and it was really the 5 millionth and first, right, maybe you just gave up.

0:58
So we are going to look at a different way to write exactly the same code that doesn't have these limitations, allows the consumers to process

1:05
as much of these actually infinite series as it needs and yet does this in a very much on demand, high performance way.

1:14
This concept is called generators and it has this keyword called yield. So let's look at this in code. So, here is that same function, we can run it,

1:23
it shows you the first few numbers in the Fibonacci sequence we are passing a limit here,

1:29
we are passing 100, so we want just the first set of Fibonacci numbers less than 100. So this is fine, but let's see if we can do better.

1:37
Before we move on, let's actually debug this a little bit. So I am going to put a break point here, we are going to step into this, all right,

1:44
so here we are and let's step into this method, and now we are stepping along, stepping along, and notice we are going through the list,

1:51
you can see up here it actually shows you the list being built, it shows you the numbers so PyCharm is really cool in that sense,

1:56
you can see the list growing, but notice we are staying here the whole time until we get to the limit of 100, which happens pretty soon here,

2:04
right now, and then we are going through them and processing them; you can see "m" is the various values here.

2:12
So that's fine for small numbers, but like I said in the beginning, what if we don't know what the upper bound is?

2:18
Or what if we have to put a really huge number here, what do you think happens to the memory consumption as that number grows,

2:26
obviously we have to gather all the numbers that preceded it and hold them in memory all at once and then you get the answer.

2:33
So Python has this really cool keyword called "yield", and let's come down here and let's call this a generator_fibonacci,

2:42
so we are going to do a few things, that if you have seen this before, you know it's pretty straightforward,

2:48
if you've not seen this, it'll probably blow your mind. All right, so what we are going to do is we are going to say

2:52
instead of having this limit, we would like to work on the infinite series, now if I just run this code, two things will happen,

3:00
first of all it's going to crash in a hurry, even if for some reason it wouldn't crash,

3:05
if we had like infinite memory, it will still never return, right? It's just going to keep adding this infinite series

3:11
but of course it's going to run out of memory. So in Python, we can do something both cleaner and better here,

3:16
so what we can do is we can use this yield keyword, and yield is like return but instead of returning from the method,

3:22
it just says "hey, I want to create a collection or a sequence and here is one of the items, and here is one of the items", so we'll yield "current".

3:30
So, that's cool, so that's going to actually generate - continue to yield the items,

3:35
you might wonder, well, how do we ever get a value out of it? So let's go find out.
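A minimal sketch of the generator version, assuming it keeps the same current/next bookkeeping as the list version. The body is a reconstruction, not the code from the video, and the use of itertools.islice to take the first ten values is an addition for illustration only.

```python
from itertools import islice


def generator_fibonacci():
    """Yield Fibonacci numbers one at a time, without an upper limit."""
    current, nxt = 0, 1
    while True:
        current, nxt = nxt, current + nxt
        yield current  # hand one value to the caller, then pause here


# The series is infinite, so the consumer decides how much to take.
first_ten = list(islice(generator_fibonacci(), 10))
print(first_ten)
# [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
```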

3:42
So we are going to do this, now if I run this, it won't crash or anything,

3:45
it will just keep spitting out numbers, scrolling to the right until it kind of goes crazy,

3:50
so this is an infinite sequence but as a consumer of the infinite sequence, I can decide "OK, I've had enough".

3:59
So what I will say here is let's say "if m is greater than 100", we can use the same test as we have on line 36, we can just break out of our loop,

4:08
all right, so let's run this, we should see the same output, we do, right, classic and generator have the same output but if we go into debugger here,

4:16
it's going to be all sorts of different, all right, so we step in, here we are in generator_fibonacci just like we were before

4:23
and here is our "while True", now watch what happens as soon as we get the current, which is 1 and we say "yield", immediately we are back here,

4:30
we printed it and now look where we return into that loop, we just kind of resume the method back here, see there is this back and forth,

4:38
I'll do this a few times, notice now we are going to jump back into this one and that current is 3 and next is 5,

4:45
this is like a state machine that remembers where it left off and can be resumed,

4:49
but even though it's an infinite sequence, we don't generate all of them, it's more like on demand as you pull items out of it it will compute them,

4:57
so only as much as you pull, you have to pay in terms of computation. The other really cool benefit is nowhere are we adding this to a list

5:04
so basically nowhere are we storing more than one item in memory at a time, so memory is not a problem in this situation.
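The back and forth seen in the debugger can be made visible without one. The sketch below is an assumption-laden illustration: the events list and the log parameter are hypothetical additions used only to record who ran when, and the loop stops at 5 rather than 100 to keep the trace short.

```python
def generator_fibonacci(log):
    """Infinite Fibonacci generator that records each value it produces."""
    current, nxt = 0, 1
    while True:
        current, nxt = nxt, current + nxt
        log.append(f"produce {current}")
        yield current  # execution pauses here until the next value is requested


events = []
for m in generator_fibonacci(events):
    if m > 5:  # the consumer decides when it has had enough
        break
    events.append(f"consume {m}")

for line in events:
    print(line)
```

Each "produce" line is followed immediately by its "consume" line, which shows that the generator computes one value, hands it over, and resumes only when the loop asks for another. The last entry is a "produce" with no matching "consume", because the loop broke out on that value.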

5:12
So these generators are really cool and all you have to do is use the yield keyword. If you compared against classic_fibonacci,

5:21
not only does it perform better, it is more flexible, can generate all the numbers and so on, it's actually shorter, and once you get your mind around yield,

5:29
it's actually easier to understand. So that's cool, we can also come down here and create something like an even_generator()

5:40
and if I were to pass some kind of set here, some kind of number generator like this, I could say "for n in numbers, if n % 2 == 0"

5:53
our standard even test, we will say "yield n". So given any set of numbers, whether this is a list or a generator, it doesn't matter, it doesn't care,

6:03
it's going to pull the even ones out and then down here, I can define a method called even_fibonacci and we'll say something like this:

6:12
"for n in even_generator()", and then we can give it generator_fibonacci and we can say "yield from this".

6:23
So this will let us compose these things so we can actually create pipelines from one to the next. So let's run our even Fibonacci through here

6:32
and we should get only the even numbers that are also coming from the Fibonacci set and remember,

6:37
this is an infinite sequence because we are starting out with the innermost bit, an infinite sequence, which itself is a generator

6:46
that will take as many items as are there and pass them back. But because we don't actually do the work on this part until we pull on it,

6:54
and we don't do the work on this part until we pull on it, it goes something like this,

6:58
pull here, that means go pull this, which pulls on this, which will pull on this piece,

7:03
one item at a time and then when we decide down here we are done, we'll break out. So look at this, we have the even Fibonacci numbers,

7:12
and there are not many, so 2, 8, 34, and 144. Here they are, brilliant. If you want more, we can get more.

7:21
Want up to 10 000, no problem, there they are, 10 000; up to a million, there they are up to a million. Boom, like that.
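The pipeline walked through above can be sketched as follows. The function names even_generator, even_fibonacci, and generator_fibonacci come from the lecture; the bodies are reconstructions, and the helper evens_up_to is a hypothetical wrapper around the consumer loop so the limit can be varied. Note that this helper checks the limit before keeping a value, so it never includes a number above the limit.

```python
def generator_fibonacci():
    """Infinite Fibonacci generator."""
    current, nxt = 0, 1
    while True:
        current, nxt = nxt, current + nxt
        yield current


def even_generator(numbers):
    """Yield only the even items from any iterable, list or generator."""
    for n in numbers:
        if n % 2 == 0:
            yield n


def even_fibonacci():
    """Compose the two generators into one lazy pipeline."""
    yield from even_generator(generator_fibonacci())


def evens_up_to(limit):
    """Hypothetical helper: pull from the pipeline until `limit` is passed."""
    found = []
    for m in even_fibonacci():
        if m > limit:
            break
        found.append(m)
    return found


print(evens_up_to(10_000))
# [2, 8, 34, 144, 610, 2584]
print(evens_up_to(1_000_000))
# [2, 8, 34, 144, 610, 2584, 10946, 46368, 196418, 832040]
```

Because even_generator accepts any iterable, the same filter also works on a plain list, for example even_generator([1, 2, 3, 4]).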

7:30
All right, so let's look at this in a graphic, remember, we already talked about our algorithm here,

7:36
it's a perfect implementation of Fibonacci but it has the limitations where you have to say how many you want

7:41
before you actually get a chance to look at the numbers, and you can't look at too many or you'll run out of memory

7:46
and if for some reason you had infinite memory, you'd run out of time. We can switch to a simpler version using the yield keyword,

7:53
create this as a generator and it actually does no work until you start pulling on the generator.
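One way to check the claim that no work happens until you pull is sketched below. The calls list and the parameter that carries it are hypothetical, added only to record when the function body first runs.

```python
def generator_fibonacci(calls):
    """Infinite Fibonacci generator that notes when its body starts running."""
    calls.append("body started")
    current, nxt = 0, 1
    while True:
        current, nxt = nxt, current + nxt
        yield current


calls = []
gen = generator_fibonacci(calls)  # creates the generator object only
started_before_pull = list(calls)  # snapshot: the body has not run yet

first = next(gen)  # the first pull runs the body up to the first yield

print(started_before_pull)  # []
print(calls)                # ['body started']
print(first)                # 1
```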

8:00
Moreover, we saw that you can write multiple generators and compose them in a pipeline style, which is really awesome,

8:06
especially in things like data science.