Django: Getting Started Transcripts
Chapter: Receiving Uploads in Django
Lecture: The trouble with user uploads

0:00 I have yada-yada'd a whole bunch of stuff in this chapter. I've tried to concentrate on how uploads work rather than all the trouble you can get
0:09 into when dealing with user files. I mentioned that the first problem you have to deal with is that the user has named their file.
0:17 You need to make sure that your code doesn't allow users to overwrite each other's file contents.
0:23 You should also be careful with what characters are allowed. The user's operating system might not be the same as your server's, and even without
0:30 malicious intent, the valid list of characters for their OS might not match that of your server.
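
One way to restrict the characters in a name can be sketched as below. This is a minimal example, not Django's own implementation: `sanitize_filename` is a hypothetical helper, and the allowed character set is an assumption chosen to be conservative. It also drops any directory part of the name, which covers relative paths such as `../../etc/passwd`.

```python
import re
from pathlib import PurePosixPath


def sanitize_filename(filename: str) -> str:
    """Return a conservative, server-safe version of a user-supplied name."""
    # Treat both separator styles as directories and keep only the last component.
    name = PurePosixPath(filename.replace("\\", "/")).name
    # Replace anything outside a small, portable character set with an underscore.
    name = re.sub(r"[^A-Za-z0-9._-]", "_", name)
    # Strip leading dots so the result is never "..", ".", or a hidden file.
    name = name.lstrip(".")
    # Fall back to a placeholder if nothing usable is left.
    return name or "unnamed"


print(sanitize_filename("../../etc/passwd"))                 # passwd
print(sanitize_filename("C:\\Users\\me\\my photo?.png"))     # my_photo_.png
print(sanitize_filename(".."))                               # unnamed
```

Django ships a helper with a similar purpose, `django.utils.text.get_valid_filename`, so in a real project you may not need to write this yourself.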
0:37 One possible mechanism here is to just name the file something numeric. That can add its own challenges.
0:43 You might not want to just completely rename it. You might want to keep the file extension so that when you're maintaining the server,
0:49 you can tell what kind of file it is. That of course means having to do some manipulation based on either detecting the file type
0:57 or trusting what the user used as their file extension. I'm not big on trusting the user for anything.
1:03 Django does go so far as to prevent the use of non-image files in an ImageField, though, so some protection is there if you're just doing pictures.
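
The renaming idea above can be sketched as follows. This is a minimal example under stated assumptions: `safe_upload_name` is a hypothetical function with the same signature Django expects from an `upload_to` callable, and the allow-list of extensions and the `uploads/` prefix are made up for illustration. It uses a random identifier instead of something numeric, and it keeps the extension only when it is on the allow-list.

```python
import uuid
from pathlib import PurePosixPath

# Hypothetical allow-list; adjust to the file types your site accepts.
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif"}


def safe_upload_name(instance, filename):
    """Return a server-side path that ignores the user's chosen name.

    The arguments mirror an ``upload_to`` callable: the model instance
    (unused here) and the original file name.
    """
    # Normalize Windows separators, then take only the final path component.
    base = PurePosixPath(filename.replace("\\", "/")).name
    ext = PurePosixPath(base).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        ext = ""  # Unknown type: drop the extension rather than trust it.
    # A random UUID avoids collisions between users and is hard to guess.
    return f"uploads/{uuid.uuid4().hex}{ext}"
```

Note that this still takes the user's word for the extension. It only limits the damage by refusing anything unexpected; it does not inspect the file's contents.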
1:13 A common pattern is to have two fields for each file: one that is the file field and another that is a text field containing
1:20 the name the user gave the file. This lets you call the file something safe without losing what the user called it.
1:27 This may or may not be important. In the case of an image for an author, the user doesn't need to be aware of what it's named on the server.
1:34 In the case of a series of files in something like Dropbox, you're going to want to allow the user to name the file what they want.
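
The two-field pattern can be sketched in plain Python. This is an illustration only: `StoredUpload` and `record_upload` are hypothetical names, and in a Django model the two attributes would typically be a file field and a text field rather than dataclass attributes.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredUpload:
    storage_name: str   # what the server calls the file
    original_name: str  # what the user called it, kept for display only


def record_upload(original_name: str) -> StoredUpload:
    """Pair a generated server-side name with the user's original name."""
    # The server-side name is generated, so two users uploading "cv.pdf"
    # never overwrite each other.
    return StoredUpload(storage_name=uuid.uuid4().hex, original_name=original_name)
```

The original name is only ever shown back to the user; it is never used to build a path on the server.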
1:44 As if all that isn't complicated enough, all media files are public. If you know the URL, you can see the file.
1:51 If you can see one file like the one you uploaded, you can probably take a guess at what the names of other files are.
1:59 Depending on your web server's configuration, you might even show a listing of the files that are uploaded.
2:03 Yet another reason to let your web server manage this stuff. You don't want to be writing all that in your Django code.
2:11 There are ways around this but they're messy. I'll come back to that in a second.
2:16 Like static files, media files shouldn't be served by Django directly in a production environment.
2:21 Your media file directory can be mounted by your web server. Be careful here: it needs to be writable by your Django instance.
2:29 Both Apache and nginx have pass-through mechanisms for serving private files. This would be that messy thing I mentioned before.
2:37 If you want to require login for accessing media files, you can provide a view that is login required, does whatever checks are necessary,
2:45 and then calls down into the web server by setting an HTTP header. If done correctly, the web server will then serve the file out of a private area,
2:55 but only if you've got that header set. The details of this are beyond this course. Google "Django protect media files" and your server name for details.
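
As a rough sketch of the header hand-off, here is the decision a login-required view would make, written as a plain function so it runs without Django. It assumes nginx, whose header for this is `X-Accel-Redirect` (Apache's mod_xsendfile uses `X-Sendfile` instead). The function name `private_file_headers` and the `/protected/` location are hypothetical, and the nginx side would need a matching location marked `internal`.

```python
from typing import Dict, Optional


def private_file_headers(is_authenticated: bool, stored_name: str) -> Optional[Dict[str, str]]:
    """Return the headers that ask nginx to serve a private file.

    Returns None when the request should be refused instead.
    """
    if not is_authenticated:
        return None
    # nginx sees this header, discards the (empty) response body, and serves
    # the file from its internal /protected/ location. That location is an
    # assumption here and must be configured in nginx as "internal".
    return {"X-Accel-Redirect": f"/protected/{stored_name}"}
```

In a real view you would copy these headers onto an empty response and return a 403 or a redirect to the login page when the function returns None.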
3:06 Whenever dealing with user data, you must assume the user is out to get you and will attempt to do horrible things to your server.
3:12 This goes doubly for media uploads. I've already spoken about problematic file names and how to get around them.
3:19 These don't tend to be too much of a problem if your web server is configured correctly,
3:23 but bad characters or relative pathing in the name can cause problems. The other issue, of course, is problematic content.
3:32 If you ran a site like this book site, you'd likely eventually get adult content as the author's photos.
3:39 You have to consider who sees what content in comparison to who owns it and what to do in cases where people are being jerks.
3:47 I'm told the internet's full of them.

