Python Jumpstart by Building 10 Apps Transcripts
Chapter: App 9: Real Estate Analysis App
Lecture: CSV Processing with the CSV module
0:00 Ok, so that was kind of a cool start, but let's just save this here and call this load_file() basic or something like that,
0:08 I'll leave it commented out, so you can have that if you like, but now what we are really going to do is something a little bit better,
0:15 we are going to go over here and use a module called csv, so there are a couple of formats that have built-in support in Python,
0:22 one of them is comma separated values, another one is JSON, JSON is actually my favorite text-based format, and we also have XML.
0:31 So we can actually import this module up here at the top, I just wrote import csv at the top as you know,
0:38 and then I can come over here and there is a reader, the most natural thing would be to say create me a reader,
0:43 and what do we need to do, we need to pass an iterable, well, what is that iterable, here is one, fin, so we want to get the header out, like so,
0:55 and we'll capture the reader, like so and then we can say for row in reader: I can just print row, let's see what happens here. Perfect, look at that,
1:08 it's already done the parsing, and the reader has a lot of support for things like the delimiter, so we could say the delimiter is this,
1:15 and we could say what the escape character is, or the quote character that wraps things that might have a comma in them, like names and so on,
1:24 so now you can see if we use this delimiter nothing gets separated, but if we use commas, boom, now it's separated again, right,
1:31 these are lists of individual bits of data, split on that, just kind of like we did in our own code.
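
As a minimal sketch of the steps just described, the snippet below builds a csv.reader, pulls the header off with next(), and loops over the rows. The sample text and the read_rows() helper are hypothetical stand-ins for the course's data file and open file handle fin; only csv.reader and its delimiter argument come from the lecture.

```python
import csv
import io

# Hypothetical sample data standing in for the course's real estate file.
SAMPLE = (
    "street,city,zip,state,beds\n"
    '"12 Oak St, Unit 4",SACRAMENTO,95823,CA,3\n'
    "51 Pine Way,SACRAMENTO,95815,CA,4\n"
)


def read_rows(text, delimiter=","):
    """Return (header, rows) parsed from CSV text (hypothetical helper)."""
    fin = io.StringIO(text)  # plays the role of the open file, fin
    reader = csv.reader(fin, delimiter=delimiter)
    header = next(reader)  # the first row holds the column names
    return header, list(reader)


header, rows = read_rows(SAMPLE)
print(header)
for row in rows:
    print(type(row), row)  # each row is a plain list of strings

# With a delimiter that never appears in the data, nothing gets separated.
_, unsplit = read_rows(SAMPLE, delimiter="|")
print(unsplit[1])
```

Note that the quoted street value keeps its embedded comma as a single field, which is the kind of case the quote character handling exists for.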
1:37 However, what we get back, if I just print out the type here, with the comma, how about that, is a list, and that still means
1:46 if I want to work with the beds I have to say something like row[4], and I need to know that the item at index 4, the fifth column, is beds,
1:58 or whatever, right, 0, 1, 2, 3, 4, but if somebody changes this over time
2:03 and they add another column like country or something up front, all those indexes break, our code is super fragile and we have to know like, oh yeah,
2:12 well, 4 means beds, of course it does, right, that means nothing to humans, so we can come over here and actually do better than this,
2:19 we can come over and say reader = csv.DictReader(). Now, DictReader doesn't return a list for each row, it returns a dictionary for each row,
2:32 so instead of using numbers to find the data, we use the names. What names? The column names. So we come over here and give it the same iterable
2:41 and then let's loop over these again, now notice we have a dictionary and the type is residential, the zip is this,
2:50 the latitude is that, this is so much easier, let me show you how to use it really quick,
2:55 then we'll take a moment and look at the concept of dictionaries in more depth. So instead of using numbers, we can use names
3:01 so if I just wanted to print out bed count, that would be super easy now, and I don't care about the order or how many columns
3:08 or how it evolves over time, I just say row['beds'], bed count 3, bed count 4, beautiful, and you can see over here in our data there are those.
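
Here is a small sketch of that DictReader loop, assuming a column named beds as in the lecture. The sample rows are made up for illustration and are not taken from the course's data file.

```python
import csv
import io

# Hypothetical sample data; the column names mirror those mentioned in the lecture.
SAMPLE = (
    "street,city,zip,beds,type\n"
    "12 Oak St,SACRAMENTO,95823,3,Residential\n"
    "51 Pine Way,SACRAMENTO,95815,4,Residential\n"
)

fin = io.StringIO(SAMPLE)  # stands in for the open file handle
reader = csv.DictReader(fin)  # the header row is consumed automatically

bed_counts = []
for row in reader:
    # Look the value up by column name instead of by position.
    print("Bed count:", row["beds"])
    # Values come back as strings, so convert before doing arithmetic.
    bed_counts.append(int(row["beds"]))

print("Total beds:", sum(bed_counts))
```

DictReader takes its keys from the first line of the file by default, which is why no separate header handling appears here.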
3:20 So this DictReader is really the way I recommend you work with comma separated value files.
3:24 It gives you a lot more durability, you don't have to know that, like, the fifth element is actually the bed count, which is fragile if that ever changes,
3:32 all that sort of stuff, it just detects it and then builds up these dictionaries for you.
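
To make the durability point concrete, here is a hedged sketch comparing index-based and name-based access when a country column is added up front. Both data strings and both helper functions are hypothetical, written only for this comparison.

```python
import csv
import io

# Hypothetical data: the same record before and after a column is added up front.
ORIGINAL = "street,city,zip,state,beds\n51 Pine Way,SACRAMENTO,95815,CA,4\n"
CHANGED = "country,street,city,zip,state,beds\nUSA,51 Pine Way,SACRAMENTO,95815,CA,4\n"


def beds_by_index(text):
    """Read beds by position; assumes beds is always at index 4."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    return [row[4] for row in reader]


def beds_by_name(text):
    """Read beds by column name, wherever the column happens to be."""
    return [row["beds"] for row in csv.DictReader(io.StringIO(text))]


print(beds_by_index(ORIGINAL), beds_by_index(CHANGED))  # second result is the state, not beds
print(beds_by_name(ORIGINAL), beds_by_name(CHANGED))    # both results are the bed count
```

The index-based version does not raise an error on the changed file, it silently returns the wrong column, which is arguably worse than a crash.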
3:36 So what are dictionaries, let's take a moment and look at this core concept.