Python Memory Management and Tips Transcripts
Chapter: Memory and functions
Lecture: Implementing the pipeline functions
0:00 Okay, so let's implement these,
0:01 and we're going to start out,
0:03 as you can imagine, using some random data as we have been.
0:06 So we're gonna "import random" and we're going to seed it
0:10 so you always get exactly the same result.
0:13 Pick some arbitrary number out of thin air and that'll fix it so it doesn't generate
0:19 potentially different sizes of data, or different numbers which result in
0:22 different sizes say in the filter step.
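A minimal sketch of the seeding step described above. The seed value 42 is an assumption chosen for illustration; the lecture does not state which number was used. It shows that re-seeding with the same value reproduces the same sequence of random integers.

```python
import random

# Assumption: 42 is an arbitrary illustrative seed, not the value from the lecture.
random.seed(42)
first = [random.randint(1000, 10_000) for _ in range(5)]

# Re-seeding with the same value restarts the generator at the same point.
random.seed(42)
second = [random.randint(1000, 10_000) for _ in range(5)]

print(first == second)
```

Because the sequence is reproducible, later steps that depend on the values (such as filtering) produce the same amount of data on every run.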
0:25 Alright, so let's go and do "load_data".
0:30 This is going to return a List of int, which needs that type imported from typing. In the
0:35 new version of Python, 3.9 I believe, you'll be able to say "list" like that
0:39 and you don't have to import a thing,
0:40 but currently in 3.8 and below you got to do this.
0:42 Let's go write some code to generate one million numbers between 1000 and 10,000.
0:47 So we can do that in a cool little list
0:50 comprehension like this. We'll say we want to get
0:53 random.randint between 1000 and 10,000, and I'll use the little
1:02 digit grouping thing you can do here, for nothing in range of 1 to 1,000,000.
1:08 Okay, so that should return us our one million items in that range.
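Here is a sketch of what "load_data" might look like based on the description above. It follows the spoken range of 1 to 1,000,000 literally; note that range(1, 1_000_000) yields 999,999 values, so range(1_000_000) would be the form to use for exactly one million.

```python
import random
from typing import List  # on Python 3.9+, the built-in list[int] works without this import


def load_data() -> List[int]:
    # randint includes both endpoints, so values fall in 1000..10000.
    # Underscores in numeric literals are digit grouping only; 10_000 == 10000.
    # The loop variable is unused, hence the conventional name "_".
    return [random.randint(1000, 10_000) for _ in range(1, 1_000_000)]
```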
1:17 That's pretty easy, right? And then the next one, what we gotta do is we gotta
1:19 filter_data. This one is going to take some data,
1:23 which is a list of int,
1:26 and it's going to return a list of int as well.
1:30 So this one could be another cool list comprehension.
1:32 These don't all have to be list comprehensions in order to achieve what we're going to
1:36 go for, like the technique we're showing,
1:38 but they just happen to be nice. So we'll say "n for n in data
1:44 if n is not divisible by 5".
1:48 Okay, so give us all the numbers that are not divisible by five.
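A sketch of "filter_data" as described. Spelling "not divisible by 5" as a non-zero remainder from the modulo operator is an assumption about how the condition was typed in the lecture.

```python
from typing import List


def filter_data(data: List[int]) -> List[int]:
    # Keep only the values that leave a remainder when divided by 5.
    return [n for n in data if n % 5 != 0]


print(filter_data([1000, 1001, 1005, 1007]))
```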
1:52 Final one is gonna be to scale
1:54 data, and this is going to take data, which is a list of int,
2:00 and return a list of float.
2:04 And we'll say something much like this.
2:09 So this one takes the data and the factor,
2:11 which is a float, and so this is gonna be n times factor for
2:15 n in data. And that's it!
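A sketch of "scale_data" following the signature described above: a list of int plus a float factor in, a list of float out.

```python
from typing import List


def scale_data(data: List[int], factor: float) -> List[float]:
    # int * float produces a float, so the result is a list of floats.
    return [n * factor for n in data]


print(scale_data([2, 4], 0.5))
```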
2:18 So we've implemented these three things and let's just,
2:20 you know, print out some of the scaled numbers.
2:24 Let's print out the first 20, see what we get. Just to make sure things are
2:28 hanging together. It takes a moment to run because we generated a million things and
2:33 then did a bunch of processing on it,
2:34 but those look like numbers that, you know, started out between 1000 and 10,000,
2:41 and then got multiplied by 2.8, don't they?
2:43 Perfect. I guess the other thing we could do is just print
2:48 the length of scaled so we know how many we actually got back, about 800,000.
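Putting the three steps together, the whole script might look roughly like this. The seed value, the "main" function, and the intermediate variable names "data" and "filtered" are assumptions for illustration; the factor of 2.8 and the name "scaled" come from the lecture.

```python
import random
from typing import List

# Assumption: arbitrary seed so every run produces the same data.
random.seed(42)


def load_data() -> List[int]:
    return [random.randint(1000, 10_000) for _ in range(1, 1_000_000)]


def filter_data(data: List[int]) -> List[int]:
    return [n for n in data if n % 5 != 0]


def scale_data(data: List[int], factor: float) -> List[float]:
    return [n * factor for n in data]


def main() -> List[float]:
    # Hypothetical driver: each step feeds its result into the next.
    data = load_data()
    filtered = filter_data(data)
    scaled = scale_data(filtered, 2.8)

    print(scaled[:20])   # sanity check on the first 20 values
    print(len(scaled))   # how many values survived the filter
    return scaled


if __name__ == "__main__":
    main()
```

Roughly one in five integers in the range is divisible by 5, which is why about 80% of the generated values, around 800,000, remain after filtering.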