Python Web Apps that Fly with CDNs Transcripts
Chapter: Avoiding Stale Caches
Lecture: The caching fix
Login or
purchase this course
to watch this video and the rest of the course contents.
0:00
So you may be thinking there's some complex or non-obvious infrastructure way to solve this problem.
0:09
Maybe we tell the CDN some command so that it always knows to come back and look for a new thing.
0:16
And then there actually is an API you could say remove all the stale files instantly. But we don't really need to do that.
0:23
It's really, really simple but not necessarily obvious. So if we have a link like this, cdn.site.com/static/image/car.jpg, this thing is cached.
0:36
And if we change it, how is the web browser and the CDN supposed to have any idea that it's changed?
0:43
It won't, because we said for performance reasons, please don't even ask us just, if you see car.jpg, it's good for three months.
0:49
Let's roll with that, okay? But what we can do is we can change this URL in a very small way. What we can do is we can somehow stick a version onto it.
0:59
And the way we can do that is through the query string. Especially for static images, query strings are meaningless.
1:06
They don't tell you anything about the particular file you're getting. There's like, here's the file, and I guess there's some query string information
1:13
in the URL, but I don't know what to do with that. Here's your static file. Thank you very much.
1:18
So what we can do is we can change this link a little bit and put something on the end.
1:24
end. I'm calling it a cache ID, it doesn't matter. What matters is that this query string
1:30
key and value change any time that the file changes. Now we could do versioning of this
1:39
and if this was coming out of a database it'd be super easy to say every time you update
1:43
it increment the version. Here's version 2, here's version 3, so it could be question and mark V equals two, V equals three.
1:50
That would be great if there was a centralized management place of it like that. But if it's just a file on disk,
1:58
well then you got to keep renaming it and updating it, no. So what we're going to do is we're going to base that version
2:04
or what I'm calling a cache ID as a fragment, kind of like a SHA in a Git check-in on the actual file contents. So you don't have to think about,
2:16
Well, is it a new version? Is it the same version? Does it matter if it updates? You don't worry about it. If the file content changes,
2:23
then that link on the right hand side, this green part changes, which tells the browsers, which tells the CDN, this is not the same file.
2:32
Because while it doesn't actually matter to static files, they don't know that. This could be some server-side generated thing
2:40
that's actually looking at this to generate the file. So it's gonna go back and make another request. And that just means if the file content changes,
2:48
the URL changes. And so yes, you'll have two cached versions of the old car.jpg and the new car.jpg, but the web browser and the HTML
2:57
will only be using the new one instantly, automatically. So that's how we're gonna fix things in this chapter. It turns out to be pretty simple.
3:05
I've got some code to accomplish this for you, in both an easy and automatic as well as high performance way. And we're gonna put that to place.