Python Web Apps that Fly with CDNs Transcripts
Chapter: Avoiding Stale Caches
Lecture: The caching fix
0:00 So you may be thinking there's some complex or non-obvious infrastructure way to solve this problem.
0:09 Maybe we tell the CDN some command so that it always knows to come back and look for a new thing.
0:16 And there actually is an API where you could say, remove all the stale files instantly. But we don't really need to do that.
0:23 It's really, really simple but not necessarily obvious. So if we have a link like this, cdn.site.com/static/image/car.jpg, this thing is cached.
0:36 And if we change it, how are the web browser and the CDN supposed to have any idea that it's changed?
0:43 They won't, because we said, for performance reasons, please don't even ask us; if you see car.jpg, it's good for three months.
0:49 Let's roll with that, okay? But what we can do is we can change this URL in a very small way. What we can do is we can somehow stick a version onto it.
0:59 And the way we can do that is through the query string. Especially for static images, query strings are meaningless.
1:06 They don't tell you anything about the particular file you're getting. The server's like, here's the file, and I guess there's some query string information
1:13 in the URL, but I don't know what to do with that. Here's your static file. Thank you very much.
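To make that concrete, here is a minimal sketch using Python's standard urllib.parse. The https scheme and the anything=123 query string are made up for illustration, and it assumes the browser and CDN key their caches on the full URL, which is a common default but can be configured differently.

```python
from urllib.parse import urlsplit

# Hypothetical URLs: the same static file, with and without a query string.
plain = urlsplit("https://cdn.site.com/static/image/car.jpg")
tagged = urlsplit("https://cdn.site.com/static/image/car.jpg?anything=123")

# A static file server resolves only the path, so both requests map to the same file.
same_file = plain.path == tagged.path

# A cache keyed on the full URL sees two different entries.
same_cache_key = plain.geturl() == tagged.geturl()

print(same_file, same_cache_key)
```

The path is identical, so the server hands back the same car.jpg, but the full URLs differ, so a cache keyed on the full URL stores them separately.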
1:18 So what we can do is we can change this link a little bit and put something on the end.
1:24 I'm calling it a cache ID, but the name doesn't matter. What matters is that this query string
1:30 key and value change any time that the file changes. Now we could do versioning of this
1:39 and if this was coming out of a database it'd be super easy to say every time you update
1:43 it, increment the version. Here's version 2, here's version 3, so it could be ?v=2, ?v=3.
1:50 That would be great if there were a centralized place to manage it like that. But if it's just a file on disk,
1:58 well, then you'd have to keep renaming it and updating it. No. So what we're going to do is we're going to base that version
2:04 or what I'm calling a cache ID, on the actual file contents, as a short fragment, kind of like a SHA in a Git check-in. So you don't have to think about,
2:16 Well, is it a new version? Is it the same version? Does it matter if it updates? You don't worry about it. If the file content changes,
2:23 then that link on the right hand side, this green part changes, which tells the browsers, which tells the CDN, this is not the same file.
2:32 Because while it doesn't actually matter to static files, they don't know that. This could be some server-side generated thing
2:40 that's actually looking at this to generate the file. So it's gonna go back and make another request. And that just means if the file content changes,
2:48 the URL changes. And so yes, you'll have two cached versions, the old car.jpg and the new car.jpg, but the web browser and the HTML
2:57 will only be using the new one instantly, automatically. So that's how we're gonna fix things in this chapter. It turns out to be pretty simple.
3:05 I've got some code to accomplish this for you, in a way that's easy and automatic as well as high performance. And we're gonna put that in place.
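As a rough preview of the idea, and not the course's actual code, a content-based cache ID could be sketched like this. The function names, the cache_id query key, the choice of SHA-256, and the 8-character length are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def build_cache_id(file_path: Path, length: int = 8) -> str:
    """Return a short fragment of a hash of the file's contents (hypothetical helper)."""
    digest = hashlib.sha256(file_path.read_bytes()).hexdigest()
    return digest[:length]


def static_url(base_url: str, file_path: Path) -> str:
    """Append the cache ID as a query string, so the URL changes whenever the file changes."""
    return f"{base_url}?cache_id={build_cache_id(file_path)}"
```

Calling static_url("https://cdn.site.com/static/image/car.jpg", Path("static/image/car.jpg")) returns the same URL for as long as the file's bytes stay the same, and a different one as soon as they change. This sketch re-reads and re-hashes the file on every call, so a real version would likely want to avoid repeating that work.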