Building Data-Driven Web Apps with Pyramid and SQLAlchemy Transcripts
Chapter: Database migrations with Alembic
Lecture: Introducing database migrations
0:00 Relational databases are very powerful, but one of the problems that you have when working with relational databases, and this is exacerbated
0:09 by when you're working with ORMs, is this ability to have say SQLAlchemy create all the tables and the structures and the relationships automatically
0:19 is awesome except when you need to change things over time. Once it's already created those tables we saw, it will not change them.
0:27 Your database is somewhat calcified and it becomes hard to change, so we're going to talk about a subject and a set of tools
0:34 and techniques to keep your database evolving as your code evolves and this, I think is one of the biggest challenges
0:42 of running a relational database and is why some of the NoSQL databases with their more flexible schema has some advantages in this situation
0:52 but you'll see the tooling for SQLAlchemy to actually evolve your database is pretty sweet so let's get started. What is the problem?
1:01 Here's what happens. We have a wonderful running site. Here's our fake PyPI and it's doing its PyPI thing running working with this database.
1:10 Then we decide, oh you know, things would be nicer if we could track something else on the package if we had a last updated field
1:18 so we could do simpler queries and maybe make this page a little bit faster. Who knows, something like that. We make that change in SQLAlchemy.
1:26 Then we want to run our code, well what happens? Something that will send shivers down your spine if you see this in production.
1:32 Operational error, no such column, packet, keywords in this case we added. The database schema doesn't exactly match the schema of SQLAlchemy.
1:45 At least if the SQLAlchemy schema is not a subset of that and will contain all the required fields you're going to get something like this
1:53 and it's going to be super bad. So here's the thing, you make some changes. You push your code out to production.
1:59 Instead of your site becoming more awesome it just goes 100% offline. This is any page that touches anything to do with packages will never, ever run.
2:11 That's a problem. How do we fix it? We keep our database exactly in sync with our SQLAlchemy code. At least two ways to do this.
2:19 One way is to manually create an update script. Every time we need to make a change to the database, we'll do that update script
2:26 and we'll apply it to production or staging or some other developer's database, right? It's not terribly hard when using SQLite like we are
2:33 because that's actually just a file. If you have live data going into it you have to stay in sync with that and that can be a challenge
2:40 but the real problem is if you have separate databases like Postgres or something and different people are connecting
2:48 to different databases that are out of sync right, or production versus development things like that. So this is a big problem.
2:54 So the one way we could do it is manual. Another one is we can use something called Alembic. Alembic is from Mike Bayer
3:02 the same guy who created SQLAlchemy and maintains it himself. So these are quite closely tied together and Alembic is a database migration tool
3:12 that will track the versions of databases. So maybe your production database is the latest. Maybe your staging database is some intermediate thing
3:21 and maybe production is the most behind and long as you continue to evolve with Alembic you can have Alembic catch it up to whatever version
3:31 of code that you ship with. It's really, really nice. It doesn't solve every single problem but it solves most of the problems
3:38 and it gives you some plugins for sort of adding the final bits and we're going to use Alembic
3:43 to help evolve our database over time as we evolve our code.