Adding a CMS to Your Pyramid Web App Transcripts
Chapter: Appendix: Using SQLAchemy
Lecture: Inserting the actual data
Login or
purchase this course
to watch this video and the rest of the course contents.
0:00
You saw how to insert data, now let's go through the nitty gritty details of actually inserting all the data from real sources.
0:08
So there is an API for PyPI I've used that API to get most of the top 100 packages. So for example, here's click. Let's look at that real quick.
0:20
And I've downloaded this and it's from Armin Ronacher, creator of Flask. Here you can see the licenses and the languages in this funky format.
0:29
The author info embedded in there. You see the license, it's not set here but it's set up here, so it's kind of a kind of confusing and
0:40
tricky to make this work, but that's fine. So here's a home page The maintainers right now, there's no other maintainers set.
0:47
The full summary and then here are a bunch of releases for it, right? Each release has comments and so on.
0:54
Now, this data is not super useful to us in this format. There's not much description but the other ones...
1:02
Let me just throw one in here so you can see. It's sterling, park, parse or an alternative there. Here you can see the description has a lot more
1:10
a lot more detail, it goes over to column 1,614 so that's pretty long. Right, so what we're going to do is we're going to import this
1:19
data so you have actual packages, releases, users, etc. to work with, how are you're going to do that?
1:25
Well, I'm not going to leave that up to you 'cause that's pretty messy but what I am going to do is I'm giving you
1:30
these PyPI top 100 JSON files right here. They're going to be just at the top of your repository right there.
1:41
Now, over here we're writing basically the same code. Initialize the data base, get a session and what it's going to
1:47
do is it's going to go through and check if and only if there are no users, it's going to load those files and it goes
1:55
through and parses out all the users and then it actually saves them to the data base and then it goes
2:01
through and finds all the packages, saves them to the data base, along with their releases and languages and
2:06
licenses and so on. And it prints out a little summary. So right now there's no data and I actually deleted those other two releases I put in there.
2:14
So let's run this and I'm actually going to run it outside 'Cause it has a cool little progress bar.
2:27
So, let's go over here and say I'm going to activate our virtual environment 'cause we need the same packages and everything there.
2:35
And then we're going to say "Python," copy that path and we're going to run that right here. Let's see how we do. Beautiful!
2:47
It's got some extra junk that it printed out apparently but you could take that out. Here's the final numbers it found
2:52
84 users, 96 packages- from those packages we have 5,400 releases, 10 maintainers, 25 languages and 30 licenses.
3:03
Cool, now let's just go over here and do a quick query to make sure that's in there. So if we run that again, now you can see there's tons of releases.
3:11
This is all AMPQ, they go in order basically AsyncIO view packages there's all the packages. Right, here's Tornado and so on.
3:21
These are the popular ones. Really nice, so now we have actual data to work with. You're welcome to look through this load data.
3:29
It's pretty interesting, how to take these actual real world data sets and import them using SQLAlchemy.
3:36
So, we've got all this JSON data now we're putting it in our data base, it's a lot of yucky details about making it
3:42
work so I'm not going to go through it. But, if it's useful for you feel free to grab it we now have data in our database.
3:49
And all the inserts and everything was done just the way you saw in the previous video.