Python Jumpstart by Building 10 Apps Transcripts
Chapter: App 8: File Searcher App
Lecture: Generator play: a simple example
0:01 Let's explore generator methods and we'll use the Fibonacci numbers, an infinite series of numbers to do this.
0:09 Here we have these Fibonacci numbers, which are defined to start with 1 and 1 (apparently I left one out), and then the next one is the sum of the previous two,
0:16 so 1+1 is 2, 1+2 is 3, 2+3 is 5, 3+5 is 8 and so on. Now this is an infinite series of numbers, so let's write a function,
0:27 a traditional function like you've seen us write with lists and building up the data and then passing them back to generate this.
0:34 So we are going to define a Fibonacci function, initially I am going to give it a limit, if there is no limit,
0:40 you'll see we'll just flat out run out of memory and crash our application, so we are going to have something like a list that holds the numbers,
0:47 and then we'll have the way you compute this is you have a current number and you have a next number, then I'll say while current is less than a limit,
0:58 and the implementation in Python is actually really fantastic, so what you do is you can use tuple unpacking and avoid having a temporary variable,
1:08 a lot of times you will sort of see 3 steps to do this we can do it in like one, it's cool,
1:12 so we'll say current, next, those are the variables, and we want to do a tuple unpacking into them, so current is going to be assigned next
1:21 and next is going to be next + current. And then, we want to add current to our numbers, and then we're just going to return the numbers,
1:32 so let's print Fibonacci and let's say we want all the Fibonacci numbers that are less than 100.
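The list-based function described above can be sketched roughly as follows. This is a minimal reconstruction from the narration, not the code shown on screen: the function name `fibonacci`, the variable name `nxt` (used so the built-in `next` is not shadowed), and the starting values `0, 1` are assumptions. Because the limit is checked before the pair is advanced, the last value appended can exceed the limit.

```python
def fibonacci(limit):
    """Return Fibonacci numbers as a list, building all of them up front."""
    nums = []
    # Assumed starting values; the transcript does not state them.
    current, nxt = 0, 1
    while current < limit:
        # Tuple unpacking: the right-hand side is evaluated first,
        # so no temporary variable is needed for the swap.
        current, nxt = nxt, current + nxt
        nums.append(current)
    return nums


print(fibonacci(100))
# [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144]
```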
1:38 Now we still have the factorial stuff above but don't worry about it, just pay attention to the last line. So here are our Fibonacci numbers computed
1:45 up to, and sort of including, the last number there, so let's actually print this out in a slightly different way
1:52 where we have a little more control to go through, so let's say for n in Fibonacci like so,
1:58 and we'll just print(n) and we'll set end= to just a little comma thing so we can have them on the same line, let's run that, perfect,
2:05 it looks basically the same but here is the key thing, let's put a break point here and actually step through,
2:12 we'll actually step through this function call, now there is nothing fancy here, there is nothing that should be surprising to you here,
2:20 we are just going to step into this code and step through it. So I want to come down here and say while this is not the case
2:26 we are just going to step along, you can see on the right the numbers are changing, 3 next should be 5, then next should be 8 and so on,
2:33 and we are building up this list of numbers here you can see the list is highlighted but here you can see this list is building up,
2:40 and all of the computation is happening in this method, and then, in the very end, we decide we've had enough, few more and we'll be there,
2:51 then we return it and then we loop over like so. Ok, that is how traditional methods work, but what I am about to show you is not a traditional method,
3:03 it's something called a coroutine and it's extremely powerful. So let's take the same method and let's define a second variation of it.
3:11 So let's call this Fibonacci co and down here we'll call Fibonacci co. Now we are going to take two steps to go through this and understand it,
3:23 first we are going to just change this to work exactly the same and then we can see we could even do better.
3:31 So instead of letting all the work happen in the Fibonacci method and then processing the results what would be better
3:37 when we have these large sort of pipelines of data processing would be to pull one back and then inspect it,
3:44 do a little work with it then throw it away. And then pull the next one, inspect it, do some work with it, throw it away.
3:51 That way, even if we are processing a billion items, we only pull one into memory to work with at a time.
3:57 And you can see that we'll do the same thing here, so instead of putting this into a list we can use this keyword called yield
4:06 so I come over and I can just simply say yield current, and the way this works is when I start using the yield keyword this becomes a coroutine,
4:14 and when I say yield some item I am basically declaring to the Python interpreter
4:18 what I want you to do is create a sequence that can be computed one item at a time. And, here is one of the items, every time I say yield something,
4:28 I say here is one of the items, I never have to return all the items at the end I just say here is an item, here is an item,
4:35 and when I stop saying here are more items, then that's the end of the sequence.
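The yield-based variation described above might look like the sketch below. As before, the name `fibonacci_co`, the variable `nxt`, and the starting values `0, 1` are assumptions rather than the on-screen code. Python's documentation calls a function that contains `yield` a generator function; calling it returns a generator object, and none of the function body runs until the first item is requested.

```python
def fibonacci_co(limit):
    """Yield Fibonacci numbers one at a time instead of building a list."""
    # Assumed starting values; the transcript does not state them.
    current, nxt = 0, 1
    while current < limit:
        current, nxt = nxt, current + nxt
        # Hand one item to the consumer and pause here until the next is requested.
        yield current


gen = fibonacci_co(100)
print(next(gen))  # computes only the first item: 1
print(next(gen))  # resumes after the yield and computes the second item: 1

print("via yield:", end=" ")
for n in fibonacci_co(100):
    print(n, end=", ")
print()
```

Falling off the end of the function (when the `while` condition fails) is what ends the sequence for the `for` loop.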
4:38 Let me just use some quick formatting here so that things appear correctly, so we'll say via list and via yield, now it's running.
4:44 We can see via list we get those numbers and via yield look at that, we get exactly the same thing, but have we really gained anything,
4:52 ok we have this cool yield keyword, we don't have to have the list, that got a little shorter, but what you'll see is
4:57 we've actually gained something tremendous, so let me put a break point here and debug it again.
5:02 So we'll step in, now this should look kind of similar, we'll step along here, and we are computing the first round,
5:09 and where before we were adding it to our list, watch what happens now if I step into this, well, unpack here I only computed a single item right,
5:17 notice that n is 1 and now if I go around again and I step in again watch, we should go straight to line 49 or maybe 47.
5:29 Straight back to 47, we are not rerunning this method, we are resuming this method, right here and if I step,
5:37 step let me step down here now n is 1 again because that happens twice, then if I step in again here we are n is 3 and so on,
5:46 so as we get them back we get the first item and we've only had to do enough work to compute the first item, moreover,
5:52 we don't have to put this into a list to gather up all the results so then return them, you know we are never holding on to all the results
6:01 at once, we are just giving an item at a time. So to make that point kind of extreme, let's suppose we wanted all the Fibonacci numbers,
6:10 the infinite sequence of numbers. We could remove this limit and we could just say I want to do this forever,
6:17 now obviously doing this forever is going to sort of be a problem, it won't crash, it will just keep going and getting slower and slower, and so on,
6:25 but it does let us down here as a consumer decide when we've had enough so we could say if n > 1000 then break,
6:35 but we have access to the entire infinite series. Now, if I did this with the list, this would just run for a long time,
6:42 run out of memory and then crash, but that's not what happens here, we just get one, compute the next, compute the next,
6:48 and any step along that path is super cheap, basically a couple of additions and a return value, so let's run this again.
6:56 Boom, look at that, with yield we had access to the infinite series of numbers and it only took microseconds to compute the ones we asked for.
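A sketch of the unlimited version, with the consumer deciding when to stop, could look like this. It assumes the limit parameter is simply dropped and the loop becomes `while True`; the names and starting values are again assumptions, and the exact placement of the `break` relative to the print in the demo is not stated in the transcript.

```python
def fibonacci_co():
    """Yield Fibonacci numbers forever; the caller decides when to stop."""
    # Assumed starting values; the transcript does not state them.
    current, nxt = 0, 1
    while True:
        current, nxt = nxt, current + nxt
        yield current


seen = []
for n in fibonacci_co():
    if n > 1000:
        # The consumer has had enough; the generator is simply abandoned.
        break
    seen.append(n)

print(seen)
# [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987]
```

Only the items actually requested are ever computed, so the loop ends as soon as the first value above 1000 is produced.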
7:04 All right, and you can take these generator methods and combine them with other generator methods and create a pipeline of processing
7:13 and that's exactly what we are going to go do in our file searcher app.
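To illustrate what combining generator methods into a pipeline can mean, here is a small sketch that is not taken from the course. The helper names `even_only` and `below` are hypothetical, as are the Fibonacci starting values. Each stage pulls one item at a time from the stage before it, so nothing is ever collected into an intermediate list.

```python
def fibonacci_co():
    """Yield Fibonacci numbers forever (assumed starting values 0 and 1)."""
    current, nxt = 0, 1
    while True:
        current, nxt = nxt, current + nxt
        yield current


def even_only(numbers):
    """Hypothetical filter stage: pass through only the even values."""
    for n in numbers:
        if n % 2 == 0:
            yield n


def below(numbers, limit):
    """Hypothetical cutoff stage: stop the pipeline at the first value >= limit."""
    for n in numbers:
        if n >= limit:
            return
        yield n


# Stages are chained by passing one generator into the next.
result = list(below(even_only(fibonacci_co()), 1000))
print(result)
# [2, 8, 34, 144, 610]
```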