Building Data-Driven Web Apps with Pyramid and SQLAlchemy Transcripts
Chapter: Using SQLAchemy
Lecture: Inserting data
Login or
purchase this course
to watch this video and the rest of the course contents.
0:00
You can see we've defined our schema and modeled those in our data classes here and everything seemed to be working just fine.
0:09
However let's see what kind of data we are having over here and we can jump to the console, do a little query okay get all the packages, 0 packages.
0:23
Not so interesting is it. Well I guess we're going to have to insert some data. Now the reality of inserting data is
0:30
we're going to need some web forms. We're going to need people posting data to the service
0:36
and things like that. We're going to do that when we get to say the user section. But for the moment we need to bootstrap this whole project
0:43
and get some data in here so we can show it, things like that. Along those lines, we're going to do two things.
0:50
First of all we're going to write a little bit of basic code. And notice I've made a bin folder over here.
0:57
Create a new little Python file called basic_inserts. Now in this basic_inserts what we're going to do is just insert a couple of packages
1:07
and maybe some releases just to see how it works and then we'll go and actually use this little data to load up all the real PyPI data
1:16
using exactly the structure that we have here. Great, so let's get started. Now notice we have two new requirements Python dateutil and progressbar2.
1:26
I'll go ahead and install those. PyCharm is noticing that I've updated down here but I've updated the setup
1:36
so we're going to need that for this load data thing but we're not using it yet. One thing we will need
1:42
is we're going to need to initialize the database. So what we're going to do is we're going to come first import os and our DBSession.
1:53
And we need to import pypi as well. So we're just going to go figure out the path to that file right there.
2:00
Then we're going to call DBSession.global_init. We did that previously nothing new here but you have to do this at the beginning
2:06
of every process is going to talk to the database. Great, so let's define the main method that we're actually going to use.
2:18
I'll do the sort of main convention that Python uses for its processes. So that's going to initialize the database.
2:26
And then let's just go over here and do a method I will call it insert_some_data. So we'll just say insert a package. So what do we want to do here?
2:38
Well let's figure out what we have to set in order to insert a package. So what we going to do is we're going to insert a package
2:44
and we're going to give it a one or two releases. So we'll just come over here and grab some of this stuff so we can see what we need.
2:54
And let's just get started. So we're going to say p = Package(). We're going to allocate one of those like that. That's the one we defined
3:03
we don't need to set the ID. Actually, sorry we do is not auto-incrementing like the others. So let's say id is going to be input package name
3:13
and then the created_date. We don't need to set the summary. Summary is input. Package summary. We won't set the description or the home page
3:25
of the docs for now. We'll leave that alone. Say p.author and also set the email here. Okay that's going to cover that
3:35
that sequence set the license that's easy. Now we're going to create two releases. So what do we need for a release. Let's drop that stuff right there.
3:50
In the release, the id is going to be set automatically. We need those. We have a created_date they'll be set automatically
3:56
and I will just copy this over. So let's say r1 = Release(). So here let's set the version and the created_date will be automatic. comment, url, size
4:09
I guess we'll leave those all alone for now. We'll set the size to a 100k. How's that. And we can do the same for release two.
4:21
Technically don't need to call it release. Just keep the same variable name but just so it all crazy. Got a little bit bigger.
4:28
Okay so we've created these in memory but that doesn't put them in the database. How do we put them in the database?
4:34
What we have to use is what's called a session. So here's this unit of work design pattern as opposed to active record that SQLAlchemy uses.
4:42
So we come over here and we say DBSession go to that factory and that creates our session. And then later we're going to say session.commit().
4:53
Assuming we want to save it. And it's somewhere in between we'll say session.add(p) for the package. We could also go over here and say okay
5:03
r.package_id = p.id. And that would associate them in the database. But we could do something more interesting
5:13
now we could say p.releases.append([r1, r2]) And just put them together in memory the way we would in code.
5:25
And then this will navigate that relationship. All right. Well let's give that a shot. I'll insert a couple of these.
5:32
Let's go and run our basic inserts let's see if we got things working right. So far let's go with a request. requests HTTP for humans, connect, writes
5:46
oops misspelled it but that's alright MIT I don't know, we'll just guess to be 1, 0, 5, 6, whatever, 2, 0, 0. Now let's do one more of these.
5:57
Let's go with SQLAlchemy the ORM for Python, Mike Bayer. I'll go with BSD. I've no idea of that's what it is but I'm just going to make this up.
6:10
1, 1, 1 and 2, 2, 5. Then lets just bail out go back over to our database. I want to run a select of packages. I think I got it open again.
6:29
Look at that, here we have it. It's all inserted we have requests and SQLAlchemy both inserted today. We didn't set a lot of the details
6:37
but we did set many of them. Got our licenses and those now the other interesting question is what about this? What if we go over here
6:45
to over here and say releases and run the same query Look at that. We have the various releases and most importantly the package ID was already set.
6:56
So that's really beautiful. Let us fully leverage the object oriented nature of this. We create the package packages contain releases.
7:06
So we went to the releases and we put the various releases in there release 1 and 2 and we just called add
7:12
for the package and it followed the entire object graph to create these relationships for us. So you get to skip out a little bit
7:20
on thinking of all the ways your data is split up in SQLAlchemy and just think of how you wanted to focus and work in memory.
7:28
Now these relationships can have performance issues and we'll talk about that when I get to querying on how to fix it because SQLAlchemy does have ways
7:35
around the inherent performance problems you might get here. but it's a really really nice way of working and this is how we do inserts for data.
7:45
Over here we create a session. We've allocated the objects. We just add one or more of them and then call commit.
7:50
All happens successfully or with a failure right there. If we wanted to make an update we could do a query.
7:58
Get one of these back from one of the sessions. Talk about query, do a query make a change to it and then call commit and that'd push the changes back.
8:06
Okay so this is basically how we work inserting data. And this is the real simple one and that's how it works.
8:13
Inserts in SQLAlchemy, create the objects add them to the session, called commit. Boom, Inserted.