MongoDB with Async Python Transcripts
Chapter: Setting Up Your Computer
Lecture: Importing PyPI Data

0:00 As I mentioned at the beginning of the course, we're going to be using data from PyPI.
0:06 Now the data we're using is actual real live PyPI data I've gotten from one of the APIs where you can query and export information about the packages.
0:17 So we're not going to be working with the website, just the data side of PyPI.
0:23 In order for you to do that, over here I've got a data folder in the course repository. And there's a readme talking about how to do all these things.
0:31 So it says, follow the steps we've already discussed in the video. So make sure you have MongoDB and the management tools installed
0:40 and that MongoDB is running and the management tools' binaries are in your path, okay? Then we need to download the data.
0:49 Now I put this online in a MongoDB format for you right here. So let's go ahead and download that data. See, we got it in the downloads folder there.
1:00 And it's in this BSON format, binary JSON, that MongoDB knows and understands. Okay, so we're going to need to work in that folder.
1:09 Then it says you just need to run this command here, mongorestore. That's one of the tools that came with, well, the database tools you installed.
1:18 And it says --drop. Be very careful: that says if there is a database called PyPI, we're going to wipe it clean from your system.
1:27 And then we're going to import everything here as the complete representation of the PyPI database and the dot slash means this folder.
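As a rough sketch of the command described above, here is one way to assemble and run it from Python. The exact command lives in the course readme, so treat the details here as assumptions: the database name "pypi", the use of the --db flag, and the helper names build_restore_command and run_restore are illustrative only.

```python
import subprocess
from typing import List


def build_restore_command(data_dir: str = "./", db_name: str = "pypi") -> List[str]:
    """Build the argument list for a mongorestore call.

    Assumptions: the database name "pypi" and the --db flag are guesses at
    what the readme specifies. --drop removes existing collections in the
    target database before restoring, so use it with care.
    """
    return ["mongorestore", "--drop", "--db", db_name, data_dir]


def run_restore(data_dir: str = "./") -> None:
    """Run mongorestore against the BSON files in data_dir.

    Raises CalledProcessError if mongorestore exits with a non-zero status,
    and FileNotFoundError if mongorestore is not on the path.
    """
    subprocess.run(build_restore_command(data_dir), check=True)


# Example (run from inside the folder holding the downloaded BSON files):
# run_restore("./")
```

Passing the arguments as a list rather than one shell string avoids quoting problems if the folder path contains spaces.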
1:35 So you got to do it in the right folder. Over here, I will say, open up a new iTerm window. You can say new terminal if you don't have iTerm.
1:43 Here we have our files. So I'm just gonna run this command. First of all, we can ask which mongorestore just to make sure that it's in the path.
1:52 On Windows, you can't say which, you say where, basically the same thing. But make sure this comes up somewhere. Then we're gonna run this command.
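If you would rather do the same path check in a way that works on macOS, Linux, and Windows alike, Python's standard library has shutil.which. This is a minimal sketch, and the helper name find_tool is made up for illustration.

```python
import shutil
from typing import Optional


def find_tool(name: str = "mongorestore") -> Optional[str]:
    """Return the full path to an executable on the path, or None if missing."""
    return shutil.which(name)


location = find_tool("mongorestore")
if location is None:
    print("mongorestore is not on your path; check the database tools install.")
else:
    print("Found mongorestore at", location)
```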
2:00 You can see it did a bunch of work, reading the metadata for packages, release analytics, and users, installing those, and notice it says no indexes.
2:11 I made sure that we started out with kind of a naive database here that doesn't have any of the performance tuning,
2:18 extra indexes and things along those lines, set up yet. We're saving that for another chapter down the road.
2:24 But we've got 9,188 documents, zero failed to restore. That sounds good to me. So it looks like we probably have it imported into MongoDB successfully.
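If you capture the output of the restore, a small script can check that summary for you. This is only a sketch: the regular expression assumes the summary line reads "N document(s) restored successfully. M document(s) failed to restore.", and that wording may differ between versions of the database tools. The name parse_restore_summary is hypothetical.

```python
import re
from typing import Optional, Tuple

# Assumed wording of the final mongorestore summary line; adjust if your
# version of the database tools prints something different.
SUMMARY_PATTERN = re.compile(
    r"(\d+) document\(s\) restored successfully\. "
    r"(\d+) document\(s\) failed to restore\."
)


def parse_restore_summary(output: str) -> Optional[Tuple[int, int]]:
    """Return (restored, failed) counts from the output, or None if not found."""
    match = SUMMARY_PATTERN.search(output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


sample = "9188 document(s) restored successfully. 0 document(s) failed to restore."
print(parse_restore_summary(sample))
```

A failed count of zero, as in the video, is the sign that the import went through cleanly.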