Building Data-Driven Web Apps with Pyramid and SQLAlchemy Transcripts
Chapter: Using SQLAchemy
Lecture: Inserting data
0:00 You can see we've defined our schema and modeled those in our data classes here and everything seemed to be working just fine.
0:09 However let's see what kind of data we are having over here and we can jump to the console, do a little query okay get all the packages, 0 packages.
0:23 Not so interesting is it. Well I guess we're going to have to insert some data. Now the reality of inserting data is
0:30 we're going to need some web forms. We're going to need people posting data to the service
0:36 and things like that. We're going to do that when we get to say the user section. But for the moment we need to bootstrap this whole project
0:43 and get some data in here so we can show it, things like that. Along those lines, we're going to do two things.
0:50 First of all we're going to write a little bit of basic code. And notice I've made a bin folder over here.
0:57 Create a new little Python file called basic_inserts. Now in this basic_inserts what we're going to do is just insert a couple of packages
1:07 and maybe some releases just to see how it works and then we'll go and actually use this little data to load up all the real PyPI data
1:16 using exactly the structure that we have here. Great, so let's get started. Now notice we have two new requirements Python dateutil and progressbar2.
1:26 I'll go ahead and install those. PyCharm is noticing that I've updated down here but I've updated the setup
1:36 so we're going to need that for this load data thing but we're not using it yet. One thing we will need
1:42 is we're going to need to initialize the database. So what we're going to do is we're going to come first import os and our DBSession.
1:53 And we need to import pypi as well. So we're just going to go figure out the path to that file right there.
2:00 Then we're going to call DBSession.global_init. We did that previously nothing new here but you have to do this at the beginning
2:06 of every process is going to talk to the database. Great, so let's define the main method that we're actually going to use.
2:18 I'll do the sort of main convention that Python uses for its processes. So that's going to initialize the database.
2:26 And then let's just go over here and do a method I will call it insert_some_data. So we'll just say insert a package. So what do we want to do here?
2:38 Well let's figure out what we have to set in order to insert a package. So what we going to do is we're going to insert a package
2:44 and we're going to give it a one or two releases. So we'll just come over here and grab some of this stuff so we can see what we need.
2:54 And let's just get started. So we're going to say p = Package(). We're going to allocate one of those like that. That's the one we defined
3:03 we don't need to set the ID. Actually, sorry we do is not auto-incrementing like the others. So let's say id is going to be input package name
3:13 and then the created_date. We don't need to set the summary. Summary is input. Package summary. We won't set the description or the home page
3:25 of the docs for now. We'll leave that alone. Say p.author and also set the email here. Okay that's going to cover that
3:35 that sequence set the license that's easy. Now we're going to create two releases. So what do we need for a release. Let's drop that stuff right there.
3:50 In the release, the id is going to be set automatically. We need those. We have a created_date they'll be set automatically
3:56 and I will just copy this over. So let's say r1 = Release(). So here let's set the version and the created_date will be automatic. comment, url, size
4:09 I guess we'll leave those all alone for now. We'll set the size to a 100k. How's that. And we can do the same for release two.
4:21 Technically don't need to call it release. Just keep the same variable name but just so it all crazy. Got a little bit bigger.
4:28 Okay so we've created these in memory but that doesn't put them in the database. How do we put them in the database?
4:34 What we have to use is what's called a session. So here's this unit of work design pattern as opposed to active record that SQLAlchemy uses.
4:42 So we come over here and we say DBSession go to that factory and that creates our session. And then later we're going to say session.commit().
4:53 Assuming we want to save it. And it's somewhere in between we'll say session.add(p) for the package. We could also go over here and say okay
5:03 r.package_id = p.id. And that would associate them in the database. But we could do something more interesting
5:13 now we could say p.releases.append([r1, r2]) And just put them together in memory the way we would in code.
5:25 And then this will navigate that relationship. All right. Well let's give that a shot. I'll insert a couple of these.
5:32 Let's go and run our basic inserts let's see if we got things working right. So far let's go with a request. requests HTTP for humans, connect, writes
5:46 oops misspelled it but that's alright MIT I don't know, we'll just guess to be 1, 0, 5, 6, whatever, 2, 0, 0. Now let's do one more of these.
5:57 Let's go with SQLAlchemy the ORM for Python, Mike Bayer. I'll go with BSD. I've no idea of that's what it is but I'm just going to make this up.
6:10 1, 1, 1 and 2, 2, 5. Then lets just bail out go back over to our database. I want to run a select of packages. I think I got it open again.
6:29 Look at that, here we have it. It's all inserted we have requests and SQLAlchemy both inserted today. We didn't set a lot of the details
6:37 but we did set many of them. Got our licenses and those now the other interesting question is what about this? What if we go over here
6:45 to over here and say releases and run the same query Look at that. We have the various releases and most importantly the package ID was already set.
6:56 So that's really beautiful. Let us fully leverage the object oriented nature of this. We create the package packages contain releases.
7:06 So we went to the releases and we put the various releases in there release 1 and 2 and we just called add
7:12 for the package and it followed the entire object graph to create these relationships for us. So you get to skip out a little bit
7:20 on thinking of all the ways your data is split up in SQLAlchemy and just think of how you wanted to focus and work in memory.
7:28 Now these relationships can have performance issues and we'll talk about that when I get to querying on how to fix it because SQLAlchemy does have ways
7:35 around the inherent performance problems you might get here. but it's a really really nice way of working and this is how we do inserts for data.
7:45 Over here we create a session. We've allocated the objects. We just add one or more of them and then call commit.
7:50 All happens successfully or with a failure right there. If we wanted to make an update we could do a query.
7:58 Get one of these back from one of the sessions. Talk about query, do a query make a change to it and then call commit and that'd push the changes back.
8:06 Okay so this is basically how we work inserting data. And this is the real simple one and that's how it works.
8:13 Inserts in SQLAlchemy, create the objects add them to the session, called commit. Boom, Inserted.