Building data-driven web apps with Pyramid and SQLAlchemy Transcripts
Chapter: Using SQLAchemy
Lecture: Inserting data
0:00 You can see we've defined our schema
0:02 and modeled those in our data classes here
0:05 and everything seemed to be working just fine.
0:08 However let's see what kind of data we are having over here
0:10 and we can jump to the console, do a little query
0:19 okay get all the packages, 0 packages.
0:22 Not so interesting is it.
0:24 Well I guess we're going to have to insert some data.
0:27 Now the reality of inserting data is
0:29 we're going to need some web forms.
0:31 We're going to need people posting data to the service
0:35 and things like that. We're going to do that when we get to say the user section.
0:38 But for the moment we need to bootstrap this whole project
0:42 and get some data in here so we can show it, things like that.
0:46 Along those lines, we're going to do two things.
0:49 First of all we're going to write a little bit of basic code.
0:52 And notice I've made a bin folder over here.
0:56 Create a new little Python file called basic_inserts.
1:02 Now in this basic_inserts
1:03 what we're going to do is just insert a couple of packages
1:06 and maybe some releases just to see how it works
1:09 and then we'll go and actually use this little data
1:13 to load up all the real PyPI data
1:15 using exactly the structure that we have here.
1:18 Great, so let's get started.
1:21 Now notice we have two new requirements
1:23 Python dateutil and progressbar2.
1:25 I'll go ahead and install those.
1:26 PyCharm is noticing that I've updated down here
1:33 but I've updated the setup
1:35 so we're going to need that for this load data thing
1:39 but we're not using it yet.
1:40 One thing we will need
1:41 is we're going to need to initialize the database.
1:44 So what we're going to do is we're going to come first
1:48 import os and our DBSession.
1:52 And we need to import pypi as well.
1:54 So we're just going to go figure out the path
1:56 to that file right there.
1:59 Then we're going to call DBSession.global_init.
2:01 We did that previously
2:03 nothing new here but you have to do this at the beginning
2:05 of every process is going to talk to the database.
2:08 Great, so let's define the main method
2:10 that we're actually going to use.
2:17 I'll do the sort of main convention
2:19 that Python uses for its processes.
2:22 So that's going to initialize the database.
2:25 And then let's just go over here and do a method
2:28 I will call it insert_some_data.
2:32 So we'll just say insert a package.
2:36 So what do we want to do here?
2:37 Well let's figure out what we have to set
2:39 in order to insert a package.
2:41 So what we going to do is we're going to insert a package
2:43 and we're going to give it a one or two releases.
2:45 So we'll just come over here
2:48 and grab some of this stuff so we can see what we need.
2:53 And let's just get started.
2:54 So we're going to say p = Package().
2:57 We're going to allocate one of those like that.
3:01 That's the one we defined
3:02 we don't need to set the ID.
3:04 Actually, sorry we do is not auto-incrementing like the others.
3:07 So let's say id is going to be input package name
3:12 and then the created_date.
3:14 We don't need to set the summary.
3:16 Summary is input.
3:19 Package summary.
3:21 We won't set the description or the home page
3:24 of the docs for now.
3:25 We'll leave that alone.
3:27 Say p.author
3:30 and also set the email here.
3:32 Okay that's going to cover that
3:34 that sequence set the license that's easy.
3:40 Now we're going to create two releases.
3:43 So what do we need for a release.
3:45 Let's drop that stuff right there.
3:49 In the release, the id is going to be set automatically.
3:51 We need those.
3:53 We have a created_date they'll be set automatically
3:55 and I will just copy this over.
3:59 So let's say r1 = Release().
4:03 So here let's set the version
4:04 and the created_date will be automatic.
4:06 comment, url, size
4:08 I guess we'll leave those all alone for now.
4:15 We'll set the size to a 100k.
4:16 How's that.
4:17 And we can do the same for release two.
4:20 Technically don't need to call it release.
4:22 Just keep the same variable name but just so it all crazy.
4:26 Got a little bit bigger.
4:27 Okay so we've created these in memory
4:29 but that doesn't put them in the database.
4:31 How do we put them in the database?
4:33 What we have to use is what's called a session.
4:35 So here's this unit of work design pattern
4:38 as opposed to active record that SQLAlchemy uses.
4:41 So we come over here and we say DBSession
4:45 go to that factory and that creates our session.
4:49 And then later we're going to say session.commit().
4:52 Assuming we want to save it.
4:54 And it's somewhere in between
4:55 we'll say session.add(p) for the package.
4:59 We could also go over here and say okay
5:02 r.package_id = p.id.
5:08 And that would associate them in the database.
5:10 But we could do something more interesting
5:12 now we could say p.releases.append([r1, r2])
5:19 And just put them together in memory
5:22 the way we would in code.
5:24 And then this will navigate that relationship.
5:27 All right. Well let's give that a shot.
5:29 I'll insert a couple of these.
5:31 Let's go and run our basic inserts
5:33 let's see if we got things working right.
5:35 So far let's go with a request.
5:38 requests HTTP for humans, connect, writes
5:45 oops misspelled it but that's alright
5:46 MIT I don't know, we'll just guess to be
5:49 1, 0, 5, 6, whatever, 2, 0, 0.
5:55 Now let's do one more of these.
5:56 Let's go with SQLAlchemy
5:59 the ORM for Python, Mike Bayer.
6:05 I'll go with BSD.
6:07 I've no idea of that's what it is
6:08 but I'm just going to make this up.
6:09 1, 1, 1 and 2, 2, 5.
6:14 Then lets just bail out
6:17 go back over to our database.
6:19 I want to run a select of packages.
6:24 I think I got it open again.
6:28 Look at that, here we have it.
6:30 It's all inserted we have requests
6:32 and SQLAlchemy both inserted today.
6:35 We didn't set a lot of the details
6:36 but we did set many of them.
6:38 Got our licenses and those
6:40 now the other interesting question is what about this?
6:42 What if we go over here
6:44 to over here and say releases and run the same query
6:49 Look at that.
6:50 We have the various releases and most importantly
6:53 the package ID was already set.
6:55 So that's really beautiful.
6:57 Let us fully leverage the object oriented nature of this.
7:01 We create the package
7:02 packages contain releases.
7:05 So we went to the releases
7:07 and we put the various releases in there
7:09 release 1 and 2 and we just called add
7:11 for the package and it followed the entire object graph
7:14 to create these relationships for us.
7:16 So you get to skip out a little bit
7:19 on thinking of all the ways your data is split up
7:22 in SQLAlchemy and just think of how you wanted
7:24 to focus and work in memory.
7:27 Now these relationships can have performance issues
7:29 and we'll talk about that when I get to querying
7:32 on how to fix it because SQLAlchemy does have ways
7:34 around the inherent performance problems you might get here.
7:38 but it's a really really nice way of working
7:40 and this is how we do inserts for data.
7:44 Over here we create a session.
7:45 We've allocated the objects.
7:46 We just add one or more of them and then call commit.
7:49 All happens successfully or with a failure right there.
7:55 If we wanted to make an update
7:56 we could do a query.
7:57 Get one of these back from one of the sessions.
8:00 Talk about query, do a query
8:02 make a change to it and then call commit
8:03 and that'd push the changes back.
8:05 Okay so this is basically how we work inserting data.
8:09 And this is the real simple one
8:11 and that's how it works.
8:12 Inserts in SQLAlchemy, create the objects
8:14 add them to the session, called commit.
8:16 Boom, Inserted.