Python Web Apps that Fly with CDNs Transcripts
Chapter: Avoiding Stale Caches
Lecture: The caching fix

Login or purchase this course to watch this video and the rest of the course contents.
0:00 So you may be thinking there's some complex or non-obvious infrastructure way to solve this problem.
0:09 Maybe we tell the CDN some command so that it always knows to come back and look for a new thing.
0:16 And then there actually is an API you could say remove all the stale files instantly. But we don't really need to do that.
0:23 It's really, really simple but not necessarily obvious. So if we have a link like this, cdn.site.com/static/image/car.jpg, this thing is cached.
0:36 And if we change it, how is the web browser and the CDN supposed to have any idea that it's changed?
0:43 It won't, because we said for performance reasons, please don't even ask us just, if you see car.jpg, it's good for three months.
0:49 Let's roll with that, okay? But what we can do is we can change this URL in a very small way. What we can do is we can somehow stick a version onto it.
0:59 And the way we can do that is through the query string. Especially for static images, query strings are meaningless.
1:06 They don't tell you anything about the particular file you're getting. There's like, here's the file, and I guess there's some query string information
1:13 in the URL, but I don't know what to do with that. Here's your static file. Thank you very much.
1:18 So what we can do is we can change this link a little bit and put something on the end.
1:24 end. I'm calling it a cache ID, it doesn't matter. What matters is that this query string
1:30 key and value change any time that the file changes. Now we could do versioning of this
1:39 and if this was coming out of a database it'd be super easy to say every time you update
1:43 it increment the version. Here's version 2, here's version 3, so it could be question and mark V equals two, V equals three.
1:50 That would be great if there was a centralized management place of it like that. But if it's just a file on disk,
1:58 well then you got to keep renaming it and updating it, no. So what we're going to do is we're going to base that version
2:04 or what I'm calling a cache ID as a fragment, kind of like a SHA in a Git check-in on the actual file contents. So you don't have to think about,
2:16 Well, is it a new version? Is it the same version? Does it matter if it updates? You don't worry about it. If the file content changes,
2:23 then that link on the right hand side, this green part changes, which tells the browsers, which tells the CDN, this is not the same file.
2:32 Because while it doesn't actually matter to static files, they don't know that. This could be some server-side generated thing
2:40 that's actually looking at this to generate the file. So it's gonna go back and make another request. And that just means if the file content changes,
2:48 the URL changes. And so yes, you'll have two cached versions of the old car.jpg and the new car.jpg, but the web browser and the HTML
2:57 will only be using the new one instantly, automatically. So that's how we're gonna fix things in this chapter. It turns out to be pretty simple.
3:05 I've got some code to accomplish this for you, in both an easy and automatic as well as high performance way. And we're gonna put that to place.


Talk Python's Mastodon Michael Kennedy's Mastodon