Write Pythonic Code Like a Seasoned Developer Transcripts
Chapter: Generators and Collections
Lecture: On-demand computation with yield and generators
0:01
Here is a function called classic_fibonacci and what you do is you pass a limit to it and it will compute all the Fibonacci numbers up to that limit.
0:10
Notice we have a list called "nums" and it does all the work, fills this list up and once it's finally done, it gives you all the numbers.
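
The on-screen code is not part of this transcript, so the following is only a minimal sketch of what an eager, list-building version might look like. The variable names `current` and `nxt` and the exact loop condition are assumptions; only `classic_fibonacci`, `limit`, and `nums` are named in the lecture.

```python
def classic_fibonacci(limit):
    """Return a list of the Fibonacci numbers below `limit`, built eagerly."""
    nums = []                      # every value is stored before anything is returned
    current, nxt = 0, 1            # assumed names; the lecture's code is not shown
    while nxt < limit:
        current, nxt = nxt, current + nxt
        nums.append(current)
    return nums


print(classic_fibonacci(100))
# [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
```

The caller gets nothing back until the whole list has been built, which is the limitation the rest of the lecture addresses.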
0:17
Well, what if you want the first million Fibonacci numbers? What if you need the first 5 million? How long will this method take to run?
0:26
What if you don't know how many you need, what if you want to start looking at them and you say
0:30
well, I am going through the Fibonacci numbers looking for the point where the second one is a prime number,
0:38
the third one is the cube of the first one in the sequence. Who knows when that is? You are just looking through
0:44
and you are going to decide "oh, now it matches, now I have got enough of these". What if you were looking through this
0:50
and you said "I am going to ask for 5 million Fibonacci numbers" and the one you needed was really the five million and first? Maybe you would just give up.
0:58
So we are going to look at a different way to write exactly the same code, one that doesn't have these limitations and allows the consumers to process
1:05
as much of this actually infinite series as they need, and yet does this in a very much on-demand, high-performance way.
1:14
This concept is called generators and it has this keyword called yield. So let's look at this in code. So, here is that same function, we can run it,
1:23
it shows you the first few numbers in the Fibonacci sequence; we are passing a limit here,
1:29
we are passing 100, so we want just the Fibonacci numbers less than 100. So this is fine, but let's see if we can do better.
1:37
Before we move on, let's actually debug this a little bit. So I am going to put a break point here, we are going to step into this, all right,
1:44
so here we are and let's step into this method, and now we are stepping along, stepping along, and notice we are going through the list,
1:51
you can see up here it actually shows you the list being built, it shows you the numbers so PyCharm is really cool in that sense,
1:56
you can see the list growing, but notice we stay here the whole time until we get to the limit of 100, which happens pretty soon here,
2:04
right now, and then we are going through them and processing them; you can see "m" is the various values here.
2:12
So that's fine for small numbers, but like I said in the beginning, what if we don't know what the upper bound is?
2:18
Or what if we have to put a really huge number here? What do you think happens to the memory consumption as that number grows?
2:26
Obviously we have to gather all the numbers that preceded it and hold them in memory all at once, and only then do you get the answer.
2:33
So Python has this really cool keyword called "yield", and let's come down here and let's call this a generator_fibonacci,
2:42
so we are going to do a few things, that if you have seen this before, you know it's pretty straightforward,
2:48
if you've not seen this, it'll probably blow your mind. All right, so what we are going to do is we are going to say
2:52
instead of having this limit, we would like to work on the infinite series, now if I just run this code, two things will happen,
3:00
first of all it's going to crash in a hurry, and even if for some reason it didn't crash,
3:05
if we had like infinite memory, it will still never return, right? It's just going to keep adding this infinite series
3:11
but of course it's going to run out of memory. So in Python, we can do something both cleaner and better here,
3:16
so what we can do is we can use this yield keyword, and yield is like return but instead of returning from the method,
3:22
it just says "hey, I want to create a collection or a sequence and here is one of the items, and here is one of the items", so we'll yield "current".
3:30
So, that's cool, so that's going to actually generate - continue to yield the items,
3:35
you might wonder, well, how do we ever get a value out of it? So let's go find out.
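
As a rough sketch of the generator being described (the lecture's code is not in the transcript, so `nxt` and the exact body are assumptions), the function below replaces the list with a single `yield`. One way to get values out is the built-in `next()`, which runs the body only as far as the next `yield`.

```python
def generator_fibonacci():
    """Yield Fibonacci numbers one at a time, forever."""
    current, nxt = 0, 1            # assumed names; the lecture's code is not shown
    while True:
        current, nxt = nxt, current + nxt
        yield current              # hand one value to the caller, then pause here


fib = generator_fibonacci()        # returns a generator object; the body has not run yet
first_values = [next(fib) for _ in range(5)]
print(first_values)
# [1, 1, 2, 3, 5]
```

A `for` loop calls `next()` behind the scenes, which is why the generator can be iterated like any other sequence.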
3:42
So we are going to do this, now if I run this, it won't crash or anything,
3:45
it will just keep spitting out numbers, scrolling to the right until it kind of goes crazy,
3:50
so this is an infinite sequence but as a consumer of the infinite sequence, I can decide "OK, I've had enough".
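
A consumer deciding it has had enough might look like the sketch below. The generator body and the list `seen` are assumptions made for illustration; the lecture prints each value rather than collecting it.

```python
def generator_fibonacci():
    """Yield Fibonacci numbers one at a time, forever (assumed body)."""
    current, nxt = 0, 1
    while True:
        current, nxt = nxt, current + nxt
        yield current


seen = []                          # hypothetical: collected here so the result is easy to inspect
for m in generator_fibonacci():
    if m > 100:                    # the consumer, not the generator, decides when to stop
        break
    seen.append(m)

print(seen)
# [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
```

Once the loop breaks, no further Fibonacci numbers are computed.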
3:59
So what I will say here is let's say "if m is greater than 100", we can use the same test as we have on line 36, we can just break out of our loop,
4:08
all right, so let's run this, we should see the same output, and we do, right, classic and generator have the same output, but if we go into the debugger here,
4:16
it's going to be all sorts of different, all right, so we step in, here we are in generator_fibonacci just like we were before
4:23
and here is our "while True", now watch what happens as soon as we get the current, which is 1 and we say "yield", immediately we are back here,
4:30
we printed it and now look where we return into that loop, we just kind of resume the method back here, see there is this back and forth,
4:38
I'll do this a few times, notice now we are going to jump back into this one and that current is 3 and next is 5,
4:45
this is like a state machine that remembers where it left off and can be resumed,
4:49
but even though it's an infinite sequence, we don't generate all of them; it's more like on demand: as you pull items out of it, it will compute them,
4:57
so you pay in terms of computation only for as much as you pull. The other really cool benefit is nowhere are we adding this to a list,
5:04
so basically nowhere are we storing more than one item in memory at a time, so memory is not a problem in this situation.
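
The back-and-forth seen in the debugger can also be observed without one. The sketch below uses a hypothetical `traced_fibonacci` that records each value it computes in a `log` list (neither name is from the lecture), so you can check that work happens only when the consumer pulls.

```python
def traced_fibonacci(log):
    """Like a Fibonacci generator, but records every value it actually computes."""
    current, nxt = 0, 1
    while True:
        current, nxt = nxt, current + nxt
        log.append(current)        # tracing only; the generator itself keeps just two numbers
        yield current


log = []
fib = traced_fibonacci(log)
print(log)                         # [] : creating the generator ran none of its body

pulled = [next(fib) for _ in range(3)]
print(pulled)                      # [1, 1, 2]
print(log)                         # [1, 1, 2] : exactly three values computed, no more
```

Between calls to `next()`, the generator is paused at the `yield` with `current` and `nxt` intact, which is the "state machine that remembers where it left off" behavior described above.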
5:12
So these generators are really cool and all you have to do is use the yield keyword. If you compared against classic_fibonacci,
5:21
not only does it perform better, is more flexible, can generate all the numbers and so on, it's actually shorter, and once you get your mind around yield,
5:29
it's actually easier to understand. So that's cool. We can also come down here and create something like an even_generator()
5:40
and if I were to pass some kind of set here, some kind of number generator like this, I could say "for n in numbers, if n % 2 == 0"
5:53
our standard even test, we will say "yield n". So given any set of numbers, whether this is a list or a generator, it doesn't matter, it doesn't care,
6:03
it's going to pull the even ones out and then down here, I can define a method called even_fibonacci and we'll say something like this:
6:12
"for n in even_generator()", and then we can give it generator_fibonacci and we can say "yield from this".
6:23
So this will let us compose these things so we can actually create pipelines from one to the next. So let's run our even Fibonacci through here
6:32
and we should get only the even numbers that are also coming from the Fibonacci set and remember,
6:37
this is an infinite sequence because we are starting out with the innermost bit, an infinite sequence, which itself is a generator
6:46
that will take as many items as there are and pass them back. But because we don't actually do the work on this part until we pull on it,
6:54
and we don't do the work on this part until we pull on it, it goes something like this,
6:58
pull here, that means go pull this, which pulls on this, which will pull on this piece,
7:03
one item at a time and then when we decide down here we are done, we'll break out. So look at this, we have the even Fibonacci numbers,
7:12
and there are not many: 2, 8, 34, and 144. Here they are, brilliant. If you want more, we can get more.
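
The pipeline just described could be sketched roughly as follows. The names `even_generator`, `even_fibonacci`, and `generator_fibonacci` come from the lecture; the function bodies are assumptions, and `even_fibonacci` here uses `yield from`, although an explicit `for` loop with `yield n` would behave the same way.

```python
def generator_fibonacci():
    """Yield Fibonacci numbers one at a time, forever (assumed body)."""
    current, nxt = 0, 1
    while True:
        current, nxt = nxt, current + nxt
        yield current


def even_generator(numbers):
    """Yield only the even values from any iterable, list or generator alike."""
    for n in numbers:
        if n % 2 == 0:
            yield n


def even_fibonacci():
    """Compose the two generators into one lazy pipeline."""
    yield from even_generator(generator_fibonacci())


evens = []
for m in even_fibonacci():
    if m > 100:                    # stop as soon as a value exceeds the limit
        break
    evens.append(m)

print(evens)
# [2, 8, 34]
```

With the test placed before the value is used, the loop stops at 144 without including it; testing after printing instead would show 144 as well.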
7:21
Want up to 10 000, no problem, there they are, 10 000; up to a million, there they are up to a million. Boom, like that.
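
Raising the limit is only a change on the consumer side. As an alternative to the manual `break`, the sketch below uses `itertools.takewhile` from the standard library; this is a swapped-in technique, not what the lecture shows, and `evens_up_to` is a hypothetical helper. The even filter is written as a generator expression here purely for brevity.

```python
from itertools import takewhile


def generator_fibonacci():
    """Yield Fibonacci numbers one at a time, forever (assumed body)."""
    current, nxt = 0, 1
    while True:
        current, nxt = nxt, current + nxt
        yield current


def even_fibonacci():
    """Lazily filter the infinite Fibonacci stream down to even values."""
    return (n for n in generator_fibonacci() if n % 2 == 0)


def evens_up_to(limit):
    """Hypothetical helper: collect even Fibonacci numbers not exceeding `limit`."""
    return list(takewhile(lambda n: n <= limit, even_fibonacci()))


print(evens_up_to(10_000))
# [2, 8, 34, 144, 610, 2584]
print(evens_up_to(1_000_000)[-1])
# 832040
```

Each call builds a fresh pipeline and computes only as far as the first value past the limit.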
7:30
All right, so let's look at this in a graphic, remember, we already talked about our algorithm here,
7:36
it's a perfect implementation of Fibonacci but it has the limitations where you have to say how many you want
7:41
before you actually get a chance to look at the numbers, and you can't look at too many or you'll run out of memory;
7:46
even if for some reason you had infinite memory, you'd run out of time. We can switch to a simpler version using the yield keyword,
7:53
create this as a generator and it actually does no work until you start pulling on the generator.
8:00
Beyond that, we saw that you can write multiple generators and compose them in a pipeline style, which is really awesome,
8:06
especially in things like data science.