|
|
|
8:32 |
|
|
transcript
|
3:20 |
Welcome to Python Web Security, the OWASP top 10 in agentic AI. I don't know about you, but whenever I publish something on the web, whether that's a web app, an API, I'm never truly at ease, at least not for a while. Who knows what subtle security problems lurk in that code. So we're going to focus on taking the OWASP top 10. These are well-regarded, agreed-upon, and common security vulnerability categories. So when you hear of top 10, it's not really there's 10 vulnerabilities. There's 10 categories of vulnerabilities, each of which has a variety of subtle variations. So we're going to take that and apply it to Python web applications. And we're going to work agentic AI into these reviews, basically use things like Claude Code and agentic AI to review our code in the lens or from the lens of this OWASP top 10. I guarantee you after this course and what you learned in here, if you apply these to your web apps, you will sleep better at night. You can never guarantee anything 100%, but you will be way, way safer and know that by far the most common set of vulnerabilities that both can exist and people look for are likely covered. So this course is going to appear in three acts, a bit of a play, if you will. So act number one is we're going to go through the OWASP top 10 categories one at a time. Number one, number two, number three. And then we're going to look at what is that category and what defines it. And then we're going to see multiple examples in frameworks that you know, like Flask, Django, FastAPI, and so on, where there's a bad version, what's wrong with it, and a good version. So act one is to go through and understand OWASP top 10 through the lens of Python web apps. Act number two is we're going to create an agentic AI personality that knows about these OWASP top 10s and is trained in both them and Python security research in particular. And we'll be able to use that to review our web applications using tools like Claude Code and others. 
In act three, we are going to do something a little bit crazy. I'm not really sure how this is going to turn out. What we're going to do is we're going to go and find popular open source self-hosted SaaS applications written in Python. We're going to download those because they're open source. We can. And we're going to turn our AI agent, our security inspector, I'm going to call it the security lead. We're going to turn the security lead loose to do an audit and a review of those different open source libraries. Why is this crazy? Well, because I have not picked those apps yet. I have not applied any of these techniques to it. And I don't know what's going to come out. But I can guarantee you, we will get some OWASP considerations that we must look into. It's just the way it is. That's why I'm telling you, after you run the ideas of this course on your own apps, you're going to be better than almost everything else out there. All right. I hope you're super excited to learn about applying the OWASP top 10 to your Python web applications and automating that with AI.
|
|
|
transcript
|
1:15 |
So I mentioned OWASP Top 10 in the introduction. Let's talk a little bit more about it. The OWASP Top 10 is a standard awareness document, or really a set of documents, for developers and web application security specialists. It represents a broad consensus about the most critical security risks to web applications, regardless of technology. You can find it at the link below on OWASP.org. And it's also actually open source. So when you go there and click on the Top 10, you'll notice there's actually a GitHub link, and in there you can download all the docs and stuff. So you have them offline, which is especially helpful for our agentic AI perspective. So we're going to download those, basically the Markdown files that document everything there. And we're going to put that right next to our agent so it can use the exact details to apply them to its research and its auditing. It doesn't have to do web searches and other unreliable things. All right, so the OWASP Top 10 has been around for many, many years. And every three to five years, they refresh it with new ones. We're going to focus on the 2025 edition, which just came out at the end of 2025. So a couple of months ago. The edition before that goes all the way back to 2021.
|
|
|
transcript
|
2:38 |
Now, agentic AI programming or AI in programming in general elicits a wide range of responses and emotions from different people. Some people are like, heck yeah, let's go. I can't believe what this does for me. Other people are like, that thing hallucinates. It's junk. I can write code way faster than it. We're not talking about dropping stuff into ChatGPT or some text chat area. We're applying agentic coding with Claude Code to our application, which gives it way, way better context to work with. And it's much more reliable. But you still may be thinking, Michael, AI, isn't that how bugs get introduced into our code from vibe coding, not how we solve security problems and other bugs? Perhaps, perhaps. It depends, right? Humans introduce bugs. AI introduces bugs. There's actually a wide range of engineering practices that can lead to good or bad results with Claude. But you'll see that Claude Code with the right direction, our security lead that we've discussed, is going to be incredibly good at finding, diagnosing, and even fixing these problems. Don't necessarily take my word for it. Ask Mozilla. They make this thing called Firefox. You may have heard of it. So Mozilla recently, as in literally today, the day I'm recording this, said they've used Claude Code to uncover over 100 bugs in Firefox's JavaScript runtime. So not all of Firefox. If they applied it to all of Firefox, it'd be off the hook, I'm sure. But just the JavaScript runtime, including 14 high-severity flaws that would become CVEs if people knew about them, right? How insane is that?
So they're like, you know what, let's just ask Claude and see what happens. Now, this is actually a specialized security version of Claude that I don't think we all get access to. I think it's an Anthropic research project. But nonetheless, we have most of the power that they're working with to work with here. OWASP plus Claude Code plus our security lead is going to be really something powerful. So yes, AI may create bugs. People also create bugs. But you'll see. Just suspend your disbelief if you're not following along. If you don't agree with me on this, suspend your disbelief until we get to the section where we pull down those applications and we ask Claude Code to review them. It's going to be crazy. So just one interesting data point that came up today. This wasn't actually even in the course materials, but I saw it right as I was preparing for this. I thought, wow, this needs to go in. This is interesting.
|
|
|
transcript
|
1:19 |
Before we wrap up this introduction, let me just really briefly introduce myself. Hey, I'm Michael Kennedy. Nice to meet you. Thank you so much for taking this course and spending your time with me. If you want to read my writing, much of which is about AI and how it applies to code, it's over at mkennedy.codes. That's my website and my blog and more than that as well. So there's a lot of interesting things, many of them related to this course, that you can check out over there. I'm the founder, host, and creator of Talk Python To Me, which is one of the most popular developer podcasts in the world, as well as the number one Python podcast. I also co-host the Python Bytes podcast with Brian Okken, which is like a weekly news show. So check that out if you want to stay on top of things like this. And I've created Talk Python Training, presumably where you're taking this course, where we have a whole bunch of courses. I'm one of the main authors there, but not the only one. I'm joined by maybe 10 or so other people writing courses for the platform. And finally, I'm a Python Software Foundation fellow. Been there for five years maybe, I can't remember exactly when. But that's a super cool honor that the community bestows upon people who they feel made a contribution to the Python space. So here we are. That's me. If you want to get in touch with me personally, just go to mkennedy.codes. There's a bunch of social media links over there as well. Welcome to the course.
|
|
|
|
7:34 |
|
|
transcript
|
4:25 |
This super short chapter is just going to talk quickly about what tools we're going to use and, if you're going to follow along, which I strongly recommend that you do, what tools you need to set up on your system. Well, given my conversation at the introduction that we're going to use agentic AI and that we're going to use Claude, would it terribly surprise you? We're going to use Claude Code. Now, if you want, you can just use Claude Code alone. I am not a huge fan of Claude Code in the terminal. That's a discussion for another day. I think it makes you more disjointed from your code. That's the TLDR. So I like to install it into something like Visual Studio Code. But however you're going to use it, you need to install the terminal version nonetheless. So you can see the curl command here. And if you view this on Windows, I'm sure you see a PowerShell one as well. So make sure that you sign up by clicking Try Claude Code and then install it with the curl or PowerShell command, whatever it presents to you here. As I said, I like to use an editor. Even if it weren't for the AI, I still want to have an editor so we can look at and write code, which we're going to do. But I especially like having Claude Code in my editor as well, because I find that just the terminal version, with stuff flying by, really encourages me to just kind of let it stream by and then trust it. And I'm not a huge fan of that. I like to actually see the code it's writing and maybe stop it and redirect it mid-flight as it's working on it and so on. So we're going to use Visual Studio Code, which is completely free, and we're going to install the Claude Code extension into it. So install Visual Studio Code from code.visualstudio.com, and then install Claude Code from that page I just showed you, claude.com/product/claude-code. Install the extension for Claude Code if you want to follow along exactly.
So once you get Visual Studio Code open, just go to the extensions and type Claude, and you'll see. There are many, many rando things that have to do with Claude, some of them from Anthropic, most of them not. So make sure that it's Claude Code for VS Code from Anthropic. Make sure you get the right one. So you're going to want to install that. And then you need to create a Claude Pro account. Now, you don't have to do this, but that costs $20 for one month, or if you pay more long-term, it's $17 a month. I'm sure some of you are like, I don't really want to pay. Maybe I'll just use a local model or some free Codex thing or something. You're welcome to do that, but you will get way, way less good results. One of the truly important things about working with agentic AI, security or otherwise, is that you need to use an extremely top-tier model. Yes, they're slower. Yes, they cost money. But the outcome is stunning. For $20 a month, you can get something that basically is like an intern that is a savant, a genius programmer, but not very good at knowing what to do. So we're going to use Claude Opus. I think probably by the time we record this, it'll still be 4.6. Who knows what happens? Probably Claude Opus 4.6. But choosing the top-tier models is absolutely critical. And Claude Code only comes in the Pro account at the time of this recording, I believe. I don't think it comes with a free one. So basically, you don't have to follow along for this course. But if you do, you need a Claude Pro account, okay? If you're like, I don't really need this subscription, I don't really want to get it, then get it for one month and cancel it after one day. Once you cancel, it'll say, all right, great. At your renewal period in 29 days from now, it's going to cancel. That way you don't keep it running if you really don't want it. But somehow you're going to need a very powerful AI. I'm going to be using Claude Code because I think it is head and shoulders above the rest. But it's up to you.
This is what you need to follow exactly along in the course. But like I said, maybe you just want to watch, take in the ideas, and then apply them to your situation however you do. Maybe you have an Enterprise Gemini account with the Gemini CLI. You could do the same thing there. If you want to apply it that way, it'll work. But again, to follow along step by step: Claude Code, Pro account, one month.
|
|
|
transcript
|
1:49 |
In this course, I'm trying something new. I've got two presentation tools that I think will make a big difference for people watching. So you'll have to give me some feedback. At the end of each chapter, it asks, you know, how are things going? Send me some feedback on the tools, okay? So a lot of times it can be a little bit hard to see the mouse in the screen recordings and so on, or you can kind of lose track, and it can be a bit disorientating. So I'm using a little tool that will let us see what my mouse is up to. And it actually like changes as I click. But if I click, it will move the screen along. So I have this mouse highlighter thing that comes along. It'll hide after a moment. But that's one. And then the other is anytime I press hotkeys, some sort of hotkeys. I'm a hotkey fanatic. I'm a little bit of a weirdo in that regard. I love GUI applications. I'm not a huge fan of things like Emacs or Vim or Tmux or whatever. If you are, that's fine. I'm not knocking it. It's not for me, okay? I've done it. I don't necessarily think it's the best for me. But I'm all about the hotkeys, and I even set custom hotkeys on my development tool so I can do things without touching them. So anyway, it's a bit of a contradiction there. So sometimes I'll forget, and I won't say what I'm doing. I'll just hit something like maybe I was gonna hit Control Alt S and I forgot to say it. You'll see that as it does pop up there at the top. So I'm using these two different tools, one to show any of the hotkeys and one to show the mouse movement and so on. Hopefully these are helpful. Again, let me know if you like them and how the class is going and so on at the end of the chapters when the website asks you. But now you know to be on the lookout for them or why the heck these keys are popping up at the top of the screen.
|
|
|
transcript
|
1:20 |
Now, this course is actually a little bit different than many of the courses that I work on. A lot of the courses start from an empty folder, and we just start typing. But the goal here is not to type out secure code necessarily. It's more to look at examples and situations where the code seems reasonable, but it's not actually secure. And then how do we change that? So it's important that you get the source code for this course. And you can see it's at the URL here at the bottom. Also in your video player, there's a little GitHub icon. If you click that, it'll take you right to this page as well. So in here right now, it's empty because we haven't started the course, but it'll have the OWASP Top 10 documents. It'll have our security lead. It'll have all the code examples with the secure and vulnerable versions, with a little bit of background information on each one of them, as well as, somehow, those three applications. I'm not entirely sure how I'm going to store and present them. I'll probably zip them up and stash them somewhere. I want to do that so you can get back to them exactly as you see them in this video, even though they are all open source. So you can go grab them however you like. You can check them out by GitHub SHA or Git SHA or whatever. Okay, so make sure before we move on that you pause the video, wiggle the mouse, click the GitHub icon at the top, and star this at a minimum. Consider forking it so you've got a copy of it for yourself.
|
|
|
|
26:24 |
|
|
transcript
|
1:42 |
This is the first of the top 10 topics we're going to talk about. So I want to take a moment and have a little meta discussion with you about how we're going to present each one of these OWASP Top 10 categories, if you will. So here's how it's going to work. I'm going to show you the actual definition, which lists what is the problem? What are some of the types of things that could go wrong? Then we're going to look at some varying number of examples. In this first chapter, we're going to look at three distinct, unrelated examples. And what we're going to do is we're going to set up a scenario, like maybe you're a hospital with this type of application with an API that does X, Y, and Z. And then we're going to look at the vulnerable code, what might be in this theoretical application, like a web endpoint or a database query or something like that, and then what is wrong with that? Why is it wrong? And then finally, for each one of these examples, we're going to look at the secure version, the fix, and how it was fixed. So each one of these will be very specific to the technology, but we'll look at a variety of them. Like I said, in this case, three, sometimes two, sometimes three, depends, but a variety of different situations for each of the OWASP Top 10. So just so you know how it's going to work. We're going to do that for the first one, the second one, and so on. And we'll kind of work our way through. I'll try to go pretty quick so this doesn't become too long, but I also don't want to skip through so quickly that you don't get the information that you're looking for. All right. So that's how we're going to do our presentations for each of the OWASP Top 10.
|
|
|
transcript
|
3:53 |
So we're going to talk about OWASP number one, broken access control. Now, really quickly, the way that the OWASP Top 10 are organized is number one is the worst. Number two is the second worst, and so on. So in the data that they looked at to come up with this list, this category had the most data, the most problems, and it also had the second-highest number of CVEs, which are the critical reports about security vulnerabilities in applications. So it kind of goes from worst to least bad. I mean, there's never a good one in the OWASP Top 10, but least bad at the end. So they don't really like suspense, right? So they put the worst thing that you need to pay attention to right up front. Just so you know, that's how it's going to go. For each of these, I've condensed them down a little bit from the official documentation, which you can see at the bottom. So it's a little bit shorter for me to read. But I do want to go ahead and just read it verbatim to you for this section, just so it really communicates exactly what the OWASP team intended. Broken access control. Access control enforces policy such that users cannot act outside their intended permissions. Failures typically lead to, you know, unauthorized information disclosure, modification and destruction of all data, or performing a business function outside of a user's limit, like transferring money or deleting another user's account or something like that. So again, each one of these is not a vulnerability. It's a category of problems. So one of these might be a violation of the principle of least privilege. So unless you restrict the user, they can do whatever they want. Probably not the way you want it to go. Bypassing access controls. So maybe you've got a URL and the security, if you call it that, would be that in your account, in your menu, you can only click around to the places that you're supposed to go. But what if you log in and you just type slash admin on the back of the URL?
If you get through, that's bad, and it would be a number one violation. You could be viewing someone else's account. So this is also really, really scary. If you're on a SaaS with multiple users, aka pretty much any web app that has more than one user, they each have their own set of data. And if you don't do the right checks, maybe one user could see the data of another, or even they get intermingled. Elevation of privilege: acting as a user without being logged in, or gaining privileges like becoming admin when you're just a regular user. Metadata manipulation, such as replaying or tampering with JWT tokens or access tokens or a cookie or something like that. CORS, cross-origin resource sharing, misconfiguration. These are rules that limit JavaScript and other browser-type things from accessing resources they shouldn't. This can be really bad because sometimes you do want to share resources. Like you want the JavaScript of your app to be able to interact with the CDN of your app. So you're going to specifically allow that. But sharing is a funny word because sometimes you don't want to share. Maybe you're an only child. I don't know. Like, for example, you're logged into your bank, but somebody's hacked another website. And you go there, and the JavaScript on the bad website actually uses your logged-in session to make requests over to your bank. You don't want to share your bank account and session with that other site, right? So CORS controls that. And if you get that configuration wrong, it's not good. And finally, forced browsing, or just guessing URLs. Like I said, putting slash admin on the back. All right, so you could possibly get in that way. This is like a variation of elevation of privilege. So these are the types of things that fall under OWASP number one, the worst: broken access control.
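To make the CORS part concrete, here is a minimal, framework-free sketch of the allowlist decision a server makes when it sees a request's Origin header. The origins and the function name `cors_headers_for` are hypothetical and only for illustration; real frameworks ship their own CORS middleware, and this is not that middleware's API.

```python
from typing import Dict, Optional

# Hypothetical allowlist: only origins we explicitly trust.
ALLOWED_ORIGINS = frozenset({
    "https://app.example.com",
    "https://cdn.example.com",
})


def cors_headers_for(origin: Optional[str]) -> Dict[str, str]:
    """Return the CORS response headers for a request's Origin header.

    Unknown origins get no CORS headers at all, so the browser refuses
    to hand the response to the calling page's JavaScript.
    """
    if origin is None or origin not in ALLOWED_ORIGINS:
        return {}
    return {
        # Echo back the exact trusted origin rather than a wildcard.
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Credentials": "true",
        # Tell caches that the response varies per Origin.
        "Vary": "Origin",
    }
```

The misconfiguration to watch for is the opposite of this: reflecting whatever origin arrives, with credentials allowed, which is effectively sharing the logged-in session with every site.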
|
|
|
transcript
|
2:06 |
Here we are with our first of three examples of different types of problems that we can run into with broken access control. One specific and common variant is called IDOR, or Insecure Direct Object Reference. Okay, so let's set the scenario. You can see we've got our little doctor's chart here, so you never want bad stuff going on with your health information, do you? So imagine a healthcare patient portal. We probably all have these. I have one. It's like one of the worst pieces of software that I work with. But it has my personal data. I've got to go to it. So each patient in this theoretical patient portal has medical records stored with a patient ID. And the front end is, who knows, like a React front end sort of thing or something. And the API that drives this front end exposes endpoints like GET /api/patients/42/records that will list a summary version of them. Or if I want to make a specific change to one, I could send some sort of body data to /record/7 to update user 42's seventh record by ID. The vulnerable version of this is the one that got written before the review. In the vulnerable version, any authenticated user can change the user ID that their JavaScript sends. Like 42 is our user ID because of course it is. Maybe we could change it to 43. There's got to be a 43rd user, right? Or 41. That's even more likely, isn't it? So you could change it to any other user ID and start messing with their records. You could do a GET against their records. You could do a POST to change them. So a curious employee, a disgruntled patient, or an attacker who compromised a single account can now access the patient data of the entire database, one record at a time. That's bad. Because probably the app is doing something like, let's make sure they're logged in. Yes, they are. OK, let's make sure the record exists. Great. Here's your record, sir. Hmm. We'll see.
|
|
|
transcript
|
1:45 |
Why are insecure direct object references bad? Well, let's see. Why is this vulnerability dangerous? Well, one, you don't have to be a super hacker, managing to sneak around randomized memory addresses, to figure it out. No, no special tools are needed. You just change the number in the URL, boom. And then you can write an automated script to just do a GET on the records for every single number until it starts telling you there's no more patients. That's bad. It's basically a data breach. It's a way of indirectly just reading all the data out of the database. So medical records are among the most sensitive data categories. In the U.S., it's a HIPAA violation with fines up to $1.5 million for incidents. Yikes. In the EU, it's a GDPR breach with fines up to 4% of global revenue. Neither of those would I want anything to do with. It affects reads and writes. So if the read path lacks authorization, the write path almost certainly does too. So an attacker could presume, if I'm able to read this data, what else can I do? Not only view these records, but also alter diagnoses and prescriptions. That's bad. It's invisible to the victim. Unlike a login attack that locks out accounts, or where you get a notice about a new login from some place that probably shouldn't be logging in, IDOR access looks like normal, authentic API calls. There's nothing unusual to detect unless you're monitoring for cross-user access patterns. I mean, if you're monitoring for those things, you would just fix this, right? So here's a bunch of reasons this is a problem.
|
|
|
transcript
|
2:16 |
The first lucky web framework that we are working with here is FastAPI. And I want to be clear, this is not a vulnerability in FastAPI. This is somebody writing code on top of FastAPI, and they're doing it wrong. So no shade on FastAPI, but the problems and the solutions will be a little bit FastAPI-specific. Again, we'll vary the different technologies we choose. So here we have that API endpoint, app.get, and it goes to this URL and it passes the patient ID. So cool, it comes in as a patient ID integer. And then we're using dependency injection. Here it depends on this function called getDB. So it automatically creates a database session. Let's imagine this is something like SQLAlchemy or something. And hey, we want to get them the records. Should we do raw database queries right in the view? No, but it's easier to show you instead of trying to pull files together. So let's imagine we're doing a query for this theoretical medical record class here. And then we're going to do a filter where the patient ID is the passed-in patient ID. It looks fine, right? The user coming in, they've said they want this. This is the user. We've already put this into the React front end. What is the problem, Michael? The problem is they can just edit that to be whatever number they want, right? Or just use a curl command or something, and it doesn't even go through any JavaScript validation. So here the problem is there's no ownership check. We just simply let it rip, and we get back all the records that are related to that patient. Perfect. That's what we asked for, but we forgot the step where the user making the request needs to be the same one that they're asking for, or it needs to be some kind of super admin like a doctor or something. So the bug is the endpoint authenticates the user but never checks whether the authenticated user is authorized to see that particular patient's records. So that wouldn't be great.
Any logged in user, patient, nurse, janitor, whatever, can view any patient's medical records just by changing the patient ID in the URL. So first one at the bottom here, my records, patient one, perfect. Patient two, I'm patient one. Why am I getting away with this, right? So that's the vulnerable version.
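Here is a minimal sketch of that bug in plain Python, with the framework stripped away so the missing step is easy to see. This is not the course's FastAPI code; `User`, `MEDICAL_RECORDS`, and `get_patient_records_vulnerable` are hypothetical stand-ins, and the in-memory dict plays the role of the database query.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class User:
    """Hypothetical logged-in user tied to one patient record set."""
    patient_id: int
    role: str = "patient"


# Hypothetical in-memory stand-in for the medical records table.
MEDICAL_RECORDS: Dict[int, List[str]] = {
    1: ["annual checkup"],
    2: ["x-ray", "prescription refill"],
}


def get_patient_records_vulnerable(
    current_user: Optional[User], patient_id: int
) -> List[str]:
    # Authentication: is anyone logged in at all?
    if current_user is None:
        raise PermissionError("401: not logged in")
    # BUG: there is no authorization step here. Nothing compares
    # current_user.patient_id with the requested patient_id.
    return MEDICAL_RECORDS.get(patient_id, [])
```

Calling `get_patient_records_vulnerable(User(patient_id=1), 2)` hands patient one the records of patient two, which is exactly the "why am I getting away with this" moment.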
|
|
|
transcript
|
2:35 |
Now for the secure version. Oh, we can all breathe a little bit better. How do we know it's secure? It has more code. No, not really. That's not how we know. Notice we just had the patient ID and the DB before, but now we're also injecting this current user. Now this is only one way that you could do it. There are many ways to accomplish this check here, but it's a particular FastAPI style. So it uses this dependency injection that calls authorize patient access. And it's going to do one of two things. First of all, it takes the patient ID and it gets the logged-in user. So what it does is this is going to call a function which takes the requested patient ID. This is the one from the JavaScript. And it looks at the login session and it gets the current user. And it says, one, is there a current user? Two, is that current user's ID the same as the patient ID? If not, it will check the roles of the user. So is the user like a doctor or some kind of admin, something like that? Then it will say, sure, you can access that. Otherwise, if they're not allowed, it'll throw an HTTP exception that results in a 403, permission denied. So basically, this function, per the way that FastAPI works with this dependency injection, will not even run. Like even our little docstring doesn't come to life unless the authorization goes through. And the person is either accessing their own record, patient ID equals user dot patient ID, or there's some sort of super admin that has access to look at other people's records. And then we go on and we do this query here, which works fine. Again, that's only one possible way.
Like maybe right before here we call a function that says get logged in user, and we could do the filter. Or we could just do an if statement, and if it's not right, just return a 403 right there. There are many ways in which we can do this. This is kind of a unique way, so I thought I'd go ahead and throw it in there so you can see. But I also want to point out that it doesn't have to depend on FastAPI's dependency injection or anything like that. You just somehow need to get the user, make sure there is one, and make sure that their ID equals the patient ID or that they have permission to look at the one they are asking for. In the source code that I'm handing out, check out this authorize patient access function. It's pretty wild. So you can check that out and see how it works.
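As a rough illustration of that check without any framework, here is a small sketch. It is an assumption about the shape of the logic, not the course's actual function: the name `authorize_patient_access` echoes the one mentioned above, while `User`, `AccessDenied`, and `PRIVILEGED_ROLES` are hypothetical. In a FastAPI app the exception would typically be an HTTP exception with the same status codes.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical roles allowed to view other patients' records.
PRIVILEGED_ROLES = frozenset({"doctor", "admin"})


@dataclass(frozen=True)
class User:
    """Hypothetical logged-in user tied to one patient record set."""
    patient_id: int
    role: str = "patient"


class AccessDenied(Exception):
    """Stand-in for a framework's HTTP error response."""

    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(detail)
        self.status_code = status_code


def authorize_patient_access(current_user: Optional[User], patient_id: int) -> User:
    # 1. There must be a logged-in user at all.
    if current_user is None:
        raise AccessDenied(401, "not authenticated")
    # 2. Patients may always access their own records.
    if current_user.patient_id == patient_id:
        return current_user
    # 3. Otherwise only privileged roles may look at someone else's.
    if current_user.role in PRIVILEGED_ROLES:
        return current_user
    raise AccessDenied(403, "not allowed to access this patient's records")
```

Run this before any query touches the records, so a denied request never reaches the database at all.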
|
|
|
transcript
|
1:04 |
Onto our second example from this broken access control area. In this example, we're talking about file management, uploading and downloading files. Maybe you want to be able to upload a PDF or an image or something like that for your users. Well, well, well, you better be careful. This one's tricky. So here's the scenario. A document download endpoint joins user-supplied file name information to a directory, and they forget validation. Then an attacker can send something like ../../etc/passwd to, you know, read the users and things like that. Not great. And so they can put in whatever kind of relative file path, and you'll even see absolute ones, that can give them access to arbitrary files on the server. How about they just read all the SSL certificates or SSH keys? That can't be bad, right? So the fix involves resolving the full path and verifying that it's still within the right area for serving the files. Then you let them have access to their files.
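The "resolve, then verify" fix can be sketched with the standard library alone. This is a minimal sketch under stated assumptions, not the course's Flask code: the function name `resolve_inside` is hypothetical, and `Path.is_relative_to` requires Python 3.9 or newer.

```python
from pathlib import Path


def resolve_inside(base_dir: Path, user_supplied: str) -> Path:
    """Resolve a user-supplied name and insist it stays under base_dir."""
    base = base_dir.resolve()
    # resolve() collapses ".." segments and follows symlinks. Joining an
    # absolute user path discards the base, which the check below catches.
    candidate = (base / user_supplied).resolve()
    if not candidate.is_relative_to(base):
        raise PermissionError(f"path escapes upload directory: {user_supplied!r}")
    return candidate
```

Only after this check passes would the endpoint test that the file exists and send it. Comparing resolved paths, rather than scanning the input for "..", also covers absolute paths and symlinks that point outside the directory.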
|
|
|
transcript
|
1:23 |
So why is this dangerous? It reads files, not just data. So if you have a SQL injection attack, which is really bad, you can read all of the data out of the database. But this is not just that. It's all of the files: credentials, keys, configurations, source code. Maybe you look at a .env file, then you read the database password. Then you just read all that. There's all kinds of bad things going on here. It's often overlooked. So developers think about SQL injection and Little Bobby Tables all day long, but they can forget that file paths can be very tricky too. It can chain with other vulnerabilities. So reading the source code reveals hard-coded secrets and database schemas and so on. And then you can even read from /proc/self/environ on Linux, which actually gives you the environment variables for that process. So maybe shared SSH keys. If you get one of them, that's the lateral movement, right? It's not good once you get access to the file system. And finally, deleting data is also very devastating. If the same file path is used for a delete or write operation, then the attacker could remove critical system files, overwrite configurations, or change them. If you have an upload option, a delete could look like replacing a file with bad information. So a lot of reasons why this is dangerous.
|
|
|
transcript
|
2:50 |
Now let's look at the vulnerable version. This time we have our friend Flask involved. Hey, Flask. So here's a route that allows us to download a document. It looks fine. We get the user who's here. We get the file name. Then we build the path: presumably an upload directory, then maybe a folder for each user ID, and then the file name. We make sure that the file exists. If it doesn't exist, we're going to raise an error and say 404, file not found. And then if it does, hey, let's just send it. How could this go wrong? Let us count the ways. They could ask for ../../../etc/passwd. Or they could think, oh, the server might be looking for that, so let's use %2F, which gets decoded back into a forward slash by Flask. Or they could do this double dot expression here. All of these are really bad, right? So for example, /var/app/uploads/../../../etc/passwd resolves to /etc/passwd. So the attacker can read basically any file as long as they can just figure out what the path is. And they can brute force this, they can just start asking for variations. Like maybe it's three levels of dot dot or four to get back to the root and then down, right? They can just keep trying until they find something. That's not great. Let me show you one thing that's even a little more scary for Python people. Let's say uv run python. Okay. Watch this, this will blow your mind. So we set base_path, maybe that's our uploads directory, and we import os. For the user path, let's just say they ask straight for /etc/passwd, something like that. Then we call os.path.join on the two, and what comes back is just /etc/passwd. Notice there's not even the dot dot. How insane is that? That is so scary, folks. I feel like this should be an error or an exception, or at a minimum, it should probably look like base path plus user path. Like, I think that's what should happen.
But no, if they just put a forward slash at the front, they can navigate the file system regardless. So beware that os.path.join is not going to save you. This is really, really scary. Anyway, this is why we have this vulnerability, and we really need to make sure that these kinds of things cannot happen. So how do we do that? We're going to figure that out in the safe version.
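Here is a small sketch of that REPL experiment, for anyone who wants to try it. It uses posixpath (the POSIX flavor of os.path) so the result is the same on any operating system. The base path and file names are illustrative values, not taken from the course code.

```python
import posixpath  # same functions as os.path on Linux/macOS, but OS-independent

base_path = "/var/app/uploads"  # illustrative upload directory

# A plain relative file name is appended to the base, as you would expect.
normal = posixpath.join(base_path, "report.pdf")
print(normal)      # /var/app/uploads/report.pdf

# An absolute component silently discards everything that came before it.
absolute = posixpath.join(base_path, "/etc/passwd")
print(absolute)    # /etc/passwd

# Classic traversal: join keeps the ".." segments, normpath collapses them.
traversal = posixpath.normpath(posixpath.join(base_path, "../../../etc/passwd"))
print(traversal)   # /etc/passwd
```

No exception is raised in either bad case, which is why the join has to be followed by an explicit containment check.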
|
|
|
transcript
|
2:00 |
So how do we fix this? We saw that even things like os.path.join just don't seem to care. Like /etc/passwd is just, hey, great. It doesn't matter what you're trying to join that with. That is really, really scary. So what do we do? We're going to write this safe_resolve function here that's given a base directory, a user directory, and a file name. And it's not going to try to detect the sneakiness of what's happening. It's just going to resolve the file to a complete absolute path after everything gets put together. Then it makes sure that it's still within the tree of allowed areas, right? So within the user uploads for that user. And then we can put interesting stuff in there potentially. Okay. So we do this resolve here, right? Take the base directory, which is our uploads directory, add the user directory to get to their section, and then the file name. And then this is the key part. Resolve is going to get rid of all the relative stuff. We should probably do expanduser as well, those kinds of things. Leaving it out isn't a vulnerability, it would just be maybe more useful, I guess. I don't know. Anyway, it resolves into a target absolute path. And we figure out what the allowed path is from stuff that we control, the base directory plus the user directory. And we say if where it's going does not start with that allowed absolute path, basically if it's not within that tree, then we're going to raise a 403 permission denied exception. Otherwise, we're going to give it back, and then we're going to use this function within that endpoint. Very simple. Instead of doing os.path.join, we simply go here and resolve this into a path object. Notice this is a path up here. We resolve this into a path object, and then we ask if it exists. If it doesn't, you can't download it. Otherwise we'll give it to you, because it's guaranteed to be in the area that we've assigned to that user.
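A minimal sketch of a function along these lines, assuming Python 3.9 or later. This is not the course's exact code: it raises PermissionError where a Flask view would abort with a 403, and it swaps the "starts with" string comparison for Path.is_relative_to, so that a sibling folder such as uploads/420 cannot pass as being inside uploads/42. The parameter names are illustrative.

```python
from pathlib import Path


def safe_resolve(base_dir: Path, user_dir: str, file_name: str) -> Path:
    """Return the absolute path for file_name, or raise if it escapes the user's folder."""
    # The allowed root is built only from values the server controls.
    allowed_root = (base_dir / user_dir).resolve()

    # resolve() collapses ".." segments and follows symlinks; an absolute
    # file_name replaces the left-hand side entirely, just like os.path.join.
    target = (allowed_root / file_name).resolve()

    # Do not try to detect sneaky input; only check where the path ended up.
    if not target.is_relative_to(allowed_root):
        raise PermissionError(f"{file_name!r} resolves outside the allowed directory")
    return target
```

In a view, the returned path would then be checked with exists() before the file is sent, and the PermissionError would be translated into a 403 response.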
|
|
|
transcript
|
0:48 |
Third and final example from broken access control, missing function level access control. Okay, like the little sibling, here we go. So here's the scenario. An admin endpoint, like list users, delete accounts, export data, it's protected for authentication, but not authorization. So we've got it protected by login, but it forgets to check whether they're authorized. Are they admin or are they just a regular user? So any logged in user who discovers admin URLs, which are often called slash admin, a little scary, can perform admin actions. So we're going to add a reusable admin required decorator that's going to solve this. Sort of layer these on onion style via decorators.
|
|
|
transcript
|
0:54 |
So why is this dangerous? If regular users can do admin things just by guessing or forced browsing, as they call it sometimes, URLs, that's not ideal. Privilege escalation, clearly, because you're not being validated. Hidden endpoints aren't that hidden. Often they're super obvious, right? They can just guess it's slash admin, or you can look in the JavaScript and see maybe all of the URLs are in the JavaScript, but just the non-admin actions are put into the UI if you're not logged in, so you can look there. These are hard to detect because they look like they're authorized. You've got your login required and everything looks good. So they look like regular stuff in your logs and so on. So that's why it's dangerous. Giving regular users access to your admin panel or options, not good.
|
|
|
transcript
|
1:22 |
This time we're doing some Django. That's fun. Hey Django. So here's the vulnerable version. You can see we have our login_required decorators, in pink. And what they're going to do is make sure that whoever is accessing this has a logged-in session as a user, or it's not even going to let these functions run. That is great. And you can see the top one, my projects, is actually perfectly fine. That's a regular route. The one below is a problem because login required is not the same as admin required. Here we can see we're doing a request in this admin one to list all users. Fine for an admin, not so fine for a non-admin. So it's a pretty easy mistake to make. You can say, oh yeah, these have all got the decorator, they're all good. And we already talked about some of the ways to discover them. JavaScript source maps, maybe API documentation left open. Oh no. You're like, ah, I love to have my OpenAPI documentation available for everyone. Maybe not in production. Maybe not in production. Error messages that leak URLs, all sorts of things can make these visible. And a lot of times they just have really predictable names. You can just try a hundred or a thousand variations until some stop 404ing. Maybe now they 403 and then you can figure out how to get an account and so on. Not good.
|
|
|
transcript
|
1:46 |
So what are we going to do? Well, we're going to create an admin_required decorator. So if you haven't created a decorator, they're a little bit weird. They have a function that is passed to them, the view_func right here. And then we have our code that runs, and then we call the wrapped function. So this check happens before the function gets called. And you can see if there's some kind of problem, we're going to return 404, or 403 rather. Otherwise, we'll return the result of calling the view function, which is in effect executing it. Okay. One thing you might consider here is that this admin_required does not also check that they're logged in. So that would be a possible variation. I guess we'd also need to check that there is a user. So you've got to be a little bit careful. I mean, I guess it would crash. It's not going to be that big of a deal. Anyway, here's a simplified admin_required decorator that we can write. Probably logging is important as well. Once this exists, we can say login_required and admin_required. If we put them in that order, the user is guaranteed to exist. But if you use the other order, admin then login, then you've got to check that the user exists. So this way, we're guaranteed that the user is_staff, which is Django speak for is admin, basically. So this won't even let this function run if they're not authorized to have access to it. That is, if they're not is_staff. Problem solved. I really like these decorator approaches because you don't get lost in trying to read the implementation of the code. Maybe there's a big docstring, and the check is down below there somewhere. You're like, do we always do this everywhere? With decorators you can just really quickly collapse all the functions and see that they have admin_required, login_required, or both.
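As a rough, framework-free sketch of the decorator shape described here: the request and user objects below are SimpleNamespace stand-ins rather than Django's real classes, and the views return a (status, body) tuple where a real Django view would return something like HttpResponseForbidden. This version also covers the variation of checking that a user exists at all.

```python
from functools import wraps
from types import SimpleNamespace


def admin_required(view_func):
    """Run view_func only when request.user exists and has is_staff set."""
    @wraps(view_func)  # keep the wrapped view's name and docstring
    def wrapper(request, *args, **kwargs):
        user = getattr(request, "user", None)
        # This runs before the view: no user, or a non-staff user, gets a 403.
        if user is None or not getattr(user, "is_staff", False):
            return 403, "Forbidden"
        return view_func(request, *args, **kwargs)
    return wrapper


@admin_required
def list_all_users(request):
    # Hypothetical admin-only view used for illustration.
    return 200, ["alice", "bob"]


regular = SimpleNamespace(user=SimpleNamespace(is_staff=False))
admin = SimpleNamespace(user=SimpleNamespace(is_staff=True))
print(list_all_users(regular))  # (403, 'Forbidden')
print(list_all_users(admin))    # (200, ['alice', 'bob'])
```

Because the check tolerates a missing user, this sketch stays safe even if it ends up stacked in the "wrong" order relative to a login check.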
|
|
|
|
23:04 |
|
|
transcript
|
2:12 |
We're on to the second most popular or important level of badness, and that's security misconfiguration. I'm sure you can see where this is going. Security misconfiguration is when a system, app, or cloud service is set up incorrectly from a security perspective. This includes leaving debug features enabled, that might be the one you're thinking of, but also using default credentials, missing security headers, allowing dangerous XML features, probably like XSLT and so on, or leaving cloud storage publicly accessible. How many people have been hacked by having their data read from a public S3 bucket? That's a low form of hacking, but it is gathering data you're not supposed to have, I suppose. This security misconfiguration category moved up to number two in 2025 with a staggering 100% of tested applications showing some form of misconfiguration. 100%. Woo, okay. What kind of, how does this manifest? What kind of problems do we see here? Well, it could be we're missing appropriate security hardening. It could be that unnecessary features are enabled. Why is the debug toolbar there in production? Not a good idea. Convenient for the developer. Not a good idea. default accounts and passwords. Latest security features are disabled. Maybe you have an older system and got upgraded and they were initially turned off during the upgrade because you don't want to break existing systems. Well, they weren't turned on and migrated up. So even though the new version is safer, it's not implied. Similarly, excessive backward compatibility. I know we added the security feature, but the older apps need to use the insecure API, and if we turn on security, it'll break them. OK, security settings are not set to secure values, things like that. And the server does not send secure headers or directives. This one is really important and often overlooked in web apps. So there it is, security misconfiguration. We're going to look at a couple of really fun examples across technologies again.
|
|
|
transcript
|
3:13 |
Do you work with Docker, Docker Compose, that kind of thing? You might be doing that because you're told running inside of Docker is safer than not, because if your app gets compromised, at least it's contained within that container, more or less, and it won't get over to another part, right? So they can't completely take over your server. That's generally true. I guess there are some ways to break out of containers that have been found. Primarily, I think these are more on dev machines. Anyway, you're told you're using Docker because it keeps you safe. And not only are you using it, you have these really handy Docker Compose files given to you by some open source self-hostable project potentially. It's already set up. Good to go. Well, let's see how good to go it is. Okay, the scenario. A small company self-hosts an open source support ticket system, like an open source variant of Zendesk or something like that, for $20 a month on a virtual server, so customers can submit issues and track the issues they've submitted. A developer finds a self-hosted project, follows the quick start guidelines, copies the example docker-compose.yaml file, changes the app's domain name, and runs docker compose up. They're excited to see the app's already running. They log in, set it up. Customers are filing tickets. Everyone's really thrilled. Pats on the back are given. Six months later, the company discovers that every support ticket, including customers' names, email addresses, file attachments, and all that, has been dumped to a pastebin. The Postgres database has been directly reachable from the internet the entire time on port 5432. This is way more likely than you think. And it was still using postgres/postgres for the credentials, which came from the quick start. Holy moly. An automated bot found it, logged in, and exfiltrated everything. The developer was not that careless.
They said, look, we've got to make sure things are locked down. So they even set up UFW, the Uncomplicated Firewall on Linux, and said, we're only allowing port 80 and port 443 and port 22 onto this server. So things like our database server don't get exposed. Here's the part that will shock you. Docker bypasses that firewall entirely, on purpose. This is insane. This blew my mind when I saw this. There's a couple of configurations that will save us from this problem, but ironically, a software-based firewall on the host is not going to do it. You need a cloud firewall or something like that. So I actually linked to the documentation here on Docker about this. So here's the scenario. We're self-hosting a thing. Hey, it's open source. We're embracing open source. We don't need to build it ourselves. Let's go. And there's not even necessarily a problem with the software. It's the configuration, plus the use of default credentials, that made it doubly or exponentially bad.
|
|
|
transcript
|
1:32 |
Why is this dangerous? Well, do I really need to say? It's invisible to the user or to the developer setting it up. Nothing in docker compose up warns that the ports are internet-facing, or moreover, that they actually work around the software firewall. The service starts, the app works, everything looks fine, but the database is open. Not good. Firewalls give this false sense of confidence. Teams that carefully configure UFW or firewalld assume those ports are being blocked, but Docker manipulates iptables directly, bypassing these rules. You can run ufw status. It'll show that it's denied, but it doesn't matter. It's not actually denied. Compose files get copied and shared. Many people running these things don't understand them. They're like, Docker, Docker, they said run this, copy this here, change these, docker compose up. Yes, I'm a self-hoster. I use Docker. Yeah, but a lot of times those files are written for local environments or dev setups or whatever. And they often come with default credentials so that they'll just work. Nuff said. And it compounds. Each problem alone is bad. Together, they're catastrophic. Problem number one is that maybe the Postgres port is accessible from the internet. Problem number two is that it uses postgres/postgres as the username and password. Double bad. So now you've got something on the internet, with your private data, that is easy to log into. Okay, so that's just some of the reasons it's dangerous.
|
|
|
transcript
|
1:36 |
Here we go. If you're not familiar with Docker, don't worry. There's just a few things we're going to look at. But this is our docker-compose.yml file. And we're not even talking about whether this is Flask, Django, whatever. It could be anything. So when you say 8000:8000, what that means is listen on port 8000 on all of the host's IP addresses. And what Docker does is say, even if there's a software firewall, we're managing our own networking, so we're going to route around those tables and let that in. That's not ideal. That is not ideal. So things like directly accessing the website, skipping maybe your front-end server like Nginx or Caddy, it's not great. It probably doesn't have SSL. It's not good, but it's not the end of the world. It's still just the website. If we come down farther, though, this is where it starts to get bad. The database, 5432:5432. This is internet-facing for our database. And they just gave us this default username and password. And even if an operator had skimmed past them, if you know that this is OpenDeskZen or whatever the heck they call this app, and you could tell because you could access it on port 8000, you could go and look at the self-hosting instructions, see what the default username and password were, and then try that on the open port, right? So that's not ideal. We also have Redis, and it doesn't even have a password. So also not good.
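The port-mapping rule described here can be sketched as a small check. This is an illustrative helper, not part of Docker or the course code: the function name is made up, it only understands Compose's short string syntax for IPv4, and it treats any mapping without an explicit host IP as published on every interface.

```python
def exposed_to_all_interfaces(mapping: str) -> bool:
    """True when a Compose short-syntax port mapping publishes on every host interface."""
    spec = mapping.split("/", 1)[0]   # drop an optional "/tcp" or "/udp" suffix
    parts = spec.split(":")
    if len(parts) == 3:
        # "HOST_IP:HOST_PORT:CONTAINER_PORT": only an explicit wildcard is exposed.
        return parts[0] in ("", "0.0.0.0")
    # "HOST_PORT:CONTAINER_PORT" or a bare "CONTAINER_PORT" binds all interfaces.
    return True


for mapping in ["8000:8000", "5432:5432", "127.0.0.1:5432:5432"]:
    status = "EXPOSED" if exposed_to_all_interfaces(mapping) else "local only"
    print(f"{mapping:>22}  {status}")
```

Running something like this over every ports entry in a copied Compose file is a quick way to spot services that would be reachable from outside the host.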
|
|
|
transcript
|
2:27 |
How do we make it right? Thankfully, these are all incredibly easy to do. We can come over here and say, hey, just listen on localhost. That way, if we've got some local dev tools or something like that, or very often a reverse proxy like Nginx or Caddy, we can just tell Caddy, hey, when a request comes in, you terminate the SSL, you do a proxy pass request to port 8000 locally, and you serve out whatever comes back. But from the outside of the server, no one can access it, right? It's literally just localhost. So this little bit here, big, big difference. It's an even bigger difference here, because our database is now only locally accessible. We could go farther, actually. Right now this makes it available to local development tools on the server. So I can do terminal stuff if I'm on the server, or I could even do a remote connect with Visual Studio Code over its remote SSH tooling, and then I could use maybe a Postgres plugin to talk to the server, something like that. Or you could leave this entirely off and then just refer to it as db, like we do right there, inside the little Docker Compose network that Docker Compose creates. So potentially we could actually remove this line entirely. It's up to you. We're setting the credentials from a .env file that we put next to our Docker Compose file, and they're not the defaults, right? They're not the defaults. And we've also done the same thing for Redis. We've done what I described here, where we are only making Redis accessible through the local Docker Compose network, and we're requiring a password, whatever it is that we set from our environment. Incredibly easy to do. It's just often not the default. You go to these open source things. You see, oh, great. There's a Docker Compose file and they give me quick instructions on how to set it up. They do say change these things, like change the username. A lot of times they don't tell you to change the ports, but they do tell you to change the username and password.
They tell you to, but it's so easy to forget because you're like, yeah, let's just see if we can get it running and if it's worth keeping. And then you kind of forget and you're in trouble. There it is. A great way to make our Docker Compose setups and our Docker apps more secure: not misconfiguring, but properly configuring the security settings.
|
|
|
transcript
|
1:30 |
Next example in this misconfiguration section is active debug code. You know where we're going. So here's a scenario. A freelance developer built an e-commerce site for a local retailer using Django. During development, DEBUG = True makes their life easy. Every typo in a template or missed database migration produces a helpful error message. Lines of code, local variables, full traceback, all that kind of stuff. They launched the site on Friday afternoon. The developer plans to harden things for production next week. On Monday, a customer miskeys the product URL and hits a 404. The Django debug page cheerfully displays every URL route in the application, including admin slash API slash webhook slash stripe, internal slash export customers, and so on. The page also shows the full Django secrets, like the secret key, Stripe API key, database connection string, username, password, SMTP credentials. A web crawler archives the page. And within days, the Stripe key is used for making fraudulent refunds and all sorts of bad things. Or something to get the product, cancel it. So the developer really never got another keystroke to write on that project. They were quickly fired. They never got the chance to harden that code. So people were upset. It wasn't good. And it's all because things weren't locked down properly. They had this active debug code running in production.
|
|
|
transcript
|
1:31 |
Well, why is this dangerous? Errors are trivial to trigger. If you just get a 404 or something like that, obviously everything's going to be showing up on the screen. It's really not good. One error page leaks everything at once. Again, you find a single 500 and you get the secret key, database credentials, API keys, passwords, everything. It's not just a leak. It's the entire vault. The secret key unlocks the kingdom. Django uses the secret key to sign session cookies, CSRF tokens, and password reset links. With it, an attacker could forge any user session, including admins, without ever needing a password. That's not good. It's cached and crawled. Debug error pages get indexed by search engines, cached by browser extensions, and archived by web crawlers. Even if you flip debug back to false, the leaked secrets may still be out there. And finally, Django also changes its behavior with debug on. Beyond error pages, DEBUG = True causes Django to accumulate every SQL query in memory, which is potentially a denial-of-service sort of experience, serve static files directly, bypassing the CDN and caching, and relax ALLOWED_HOSTS checking. So all those things are not good. This is not specific to Django, but Django's extra helpful, so it's a little extra bad in Django. I think around Django, a lot of people know about this debug setting, but you know, Flask has a debug mode, Pyramid has debug modes. A lot of these things are pervasive.
|
|
|
transcript
|
1:09 |
So you might be thinking, Michael, we know not to set DEBUG = True in production. We know it. Well, let me show you a vulnerable settings file and then the fixed version. It's way more than just DEBUG = True. So it's going to span a couple of pages here. So here we have DEBUG = True. Obviously not good, left on in production. We talked about full stack traces, Python paths, SQL queries, and so on. But what about the secret key? Is this a default or weak secret key, or did you set a good one? ALLOWED_HOSTS accepts requests from any host name. Not necessarily a good thing. We've got our database with our password in here. Also, maybe we don't want to leave that in our settings file. I want to make that secure and put it somewhere else. We've got the debug toolbar installed as well. So this can expose all sorts of information, because you can just pull up the debug toolbar and check it out. And we've got it active here in the middleware. And finally, our internal IPs are set too broad, so the debug toolbar can be visible from all machines, not just localhost.
|
|
|
transcript
|
2:03 |
What can we do to make this better? Well, first of all, turn debug off. You can default it to false. And then if in the environment you set Django debug equals true, like you might do on your dev machine, then it shows up as turned on. Otherwise, it's off. So this is a really nice way to have debug mode off by default. It's not in source code, so you won't forget to take it out when you ship it to production. Similarly, for the secret key, you can set this in a .env file or set it in your environment, something like that. And this one crashes if it's missing, because we're accessing the dictionary by key, not through get like we do above. That might sound bad, but it means the app won't run if you don't set the Django secret key in the environment explicitly, which is kind of a good check, really. Again, allowed hosts, we can get those and set them to either nothing or the ones you set in the environment. Down here, we have the username and password for the database user set from the environment. Another one is we have our default production installed apps. And only if debug is set to true do we even add the app and middleware for the debug toolbar. And in that case, we set only localhost as the internal IP that's allowed to see it, not the whole internet. And last but not least, we can add in more security middleware. And if debug is not set, so in production, we can say we want to redirect HTTP requests to HTTPS. We can set the HSTS settings, which mean, hey, browsers, if you try to request this site, just start with HTTPS. Don't do an HTTP request and then wait for a redirect upgrade. Preload as well. Secure cookies. All these production hardening settings can be set, driven by our debug setting at the top of the settings file. So here we're in a much better place, and it goes far beyond just saying debug equals false.
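A condensed sketch of that environment-driven pattern. In a real settings.py these would be module-level assignments reading os.environ; here they are wrapped in a function that takes a mapping so the behavior is easy to try out. The environment variable names (DJANGO_DEBUG, DJANGO_SECRET_KEY, DJANGO_ALLOWED_HOSTS), the one-year HSTS value, and the short app and middleware lists are assumptions for illustration; the setting names themselves are standard Django and django-debug-toolbar settings.

```python
from typing import Mapping


def build_settings(env: Mapping[str, str]) -> dict:
    """Build a dict of Django-style settings from an environment mapping."""
    # Debug is off unless the environment explicitly turns it on.
    debug = env.get("DJANGO_DEBUG", "false").lower() == "true"

    hosts = env.get("DJANGO_ALLOWED_HOSTS", "")
    settings = {
        "DEBUG": debug,
        # Indexing (not .get) raises KeyError, so the app refuses to start without a key.
        "SECRET_KEY": env["DJANGO_SECRET_KEY"],
        "ALLOWED_HOSTS": [h.strip() for h in hosts.split(",") if h.strip()],
        "INSTALLED_APPS": ["django.contrib.auth", "django.contrib.sessions"],
        "MIDDLEWARE": ["django.middleware.security.SecurityMiddleware"],
    }

    if debug:
        # Development only: the toolbar is never even installed in production.
        settings["INSTALLED_APPS"].append("debug_toolbar")
        settings["MIDDLEWARE"].append("debug_toolbar.middleware.DebugToolbarMiddleware")
        settings["INTERNAL_IPS"] = ["127.0.0.1"]
    else:
        # Production hardening, driven by the same flag.
        settings.update(
            SECURE_SSL_REDIRECT=True,
            SECURE_HSTS_SECONDS=31536000,  # one year; an illustrative value
            SECURE_HSTS_INCLUDE_SUBDOMAINS=True,
            SECURE_HSTS_PRELOAD=True,
            SESSION_COOKIE_SECURE=True,
            CSRF_COOKIE_SECURE=True,
        )
    return settings
```

Called with only a secret key in the mapping, this yields the hardened production configuration; debug features appear only when DJANGO_DEBUG is set to true.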
|
|
|
transcript
|
1:10 |
Let's close out this section with missing security headers. These sound minor, but they're actually a pretty big deal. Let me give you the scenario. A freelance developer builds a custom portal for a regional insurance company using FastAPI. The portal lets policyholders view claims, upload documents, and make payments. The developer focuses on getting the business logic right: authentication, authorization, database queries. They deploy behind HTTPS and get a clean Lighthouse score, Lighthouse being a website assessment tool from Google. A few months later, a phishing email tricks an employee into clicking a link that loads the portal inside of a hidden iframe on a malicious page. Because the portal never sends X-Frame-Options or Content-Security-Policy headers, the browser renders it, and the attacker overlays invisible buttons and actions on top of the approve claim action. So the employee unknowingly approves fraudulent claims, netting this hacker $47,000. The portal had zero code-level bugs. The browser just did what it was told, and it was never told to be cautious.
|
|
|
transcript
|
1:28 |
So omitting security headers, why is this dangerous? Well, the absence is invisible. Missing headers produce no errors, no warnings, no logs. Everything looks like it's working perfectly. It turns the browser into the attack surface. Without these headers, the browser defaults to permissive behaviors. It allows iframes, MIME sniffing, protocol downgrades, all of which happen on the client side, outside of your server's control or view. One missing header can completely undermine security investments. HTTPS is meaningless without Strict-Transport-Security, because someone can intercept that first plain HTTP request before the upgrade happens and grab a cookie or something like that. Attackers chain these gaps. A missing CSP header alone might seem like no big deal, but join that with a cross-site scripting flaw and it becomes something worse. Automated scanners flag it immediately. Security auditors and bug bounty hunters run header checks first. A missing header might signal lax security, more issues that they should be looking into. It's like, oh, this app doesn't do all it should, and it's super easy to tell that it's not doing it. So they'll dive in, or they'll even send you a message: hey, I'm a security researcher. I found a bug in your app, and if you've got a bug bounty program and you'll give me some money, I'll tell you. And then they'll just say, hey, there's some missing headers. You're like, great. So there's a whole spectrum. That's honestly the least of your problems. You don't want somebody truly malicious in there. But these are some of the reasons missing security headers can be dangerous.
|
|
|
transcript
|
1:14 |
Back to FastAPI, our friend. Now look at this vulnerable version here. It's insanely simple. And I cannot point to anything wrong with this code. An app.get on just the root of the server, and it just returns, welcome to secure bank, your trusted financial partner. And it's probably got some badge of approved security, whatever. So this is actually a problem. It's not sent with any security headers, but you don't see it here because this isn't where you do it. You can do it in middleware. You can do it in your front-end server like Nginx. So it's pretty hard to see, actually. Some of the problems: the browser has no guidance on whether to enforce HTTPS, or whether to allow iframe wrapping, cross-site scripting, MIME type sniffing, information leaks via referrer headers, or browser features. You can get things like clickjacking, cross-site scripting amplification, protocol downgrade, and data leakage to third parties. Oh, we're getting this referrer from /admin/whatever. Oh, that's interesting. Let's go explore that. OK, so this is the bad version. How do we fix it?
|
|
|
transcript
|
1:59 |
Depending on your framework, there's different ways to do this. You would do this differently in Django than in Flask and FastAPI and others. Here's how we do it in FastAPI. A lot of the different frameworks have something similar, just not exactly the same code. In FastAPI, what you do is you register middleware. Middleware is something that gets a request before your code sees it and can make changes either before or after, or both, on every single request that the web server processes. So here you can see we have this call_next here. This will actually run the request through your code. And then after it gets a response, it's going to say, all right, let's set some headers here. So we're going to set Strict-Transport-Security. We're going to set X-Frame-Options: no, you cannot embed us into an iframe. Content-Security-Policy: what type of JavaScript or whatever is allowed to call back into this system. Prevent MIME type sniffing: give X-Content-Type-Options the nosniff value. Referrer-Policy: strict-origin-when-cross-origin. So it might tell another site that the request came from your domain, but not from your admin page. Permissions-Policy: no camera, no microphone, no geolocation, and so on. And finally, Cache-Control: no, no cache. Stop caching. Don't cache anything. And then it sends that response out. So you don't even have to change any view code. In fact, I'll show you the same code that I showed you before. And now the comment, or the docstring, has changed to say it's now served with full security headers. Weird. That's exactly what we started with. And it was vulnerable, right? But it's because we installed this middleware here for HTTP requests that runs every time. And after our content is produced, after our view endpoint is called, we're going to add all these security headers every single time, automatically.
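To make the header list concrete, here is a framework-neutral sketch that applies the same set of headers to any mutable header mapping. The specific values (one-year max-age, the CSP string, no-store) are illustrative assumptions rather than the course's exact choices, and the function name is made up. In FastAPI, something like this could be called on response.headers inside an HTTP middleware function, after awaiting call_next.

```python
# Header names are standard; the values are illustrative and should be tuned per app.
SECURITY_HEADERS = {
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains",
    "X-Frame-Options": "DENY",
    "Content-Security-Policy": "default-src 'self'; frame-ancestors 'none'",
    "X-Content-Type-Options": "nosniff",
    "Referrer-Policy": "strict-origin-when-cross-origin",
    "Permissions-Policy": "camera=(), microphone=(), geolocation=()",
    "Cache-Control": "no-store",
}


def apply_security_headers(headers):
    """Set every security header on a mutable mapping, overwriting existing values."""
    for name, value in SECURITY_HEADERS.items():
        headers[name] = value
    return headers


# Stand-in for a response's headers after the view has run.
response_headers = {"content-type": "text/html"}
apply_security_headers(response_headers)
for name, value in response_headers.items():
    print(f"{name}: {value}")
```

Keeping the header set in one place means the view code stays untouched, which matches the point that the "fixed" endpoint looks identical to the vulnerable one.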
|
|
|
|
14:39 |
|
|
transcript
|
2:03 |
On to OWASP number three, software supply chain failures. This is a timely one. It's definitely been on the rise. So what's the deal? Software supply chain failures are breakdowns or other compromises in the process of building, distributing, and updating software. They're often caused by vulnerabilities or malicious changes in third-party code, tools, or other dependencies that systems rely upon. How might you be vulnerable to this? Well, you do not carefully track all the versions of the components that you use, both on the front end and especially on the back end. Software is vulnerable, unsupported, or, I've highlighted, out of date. You have old software running. You haven't deployed it in a year. You haven't touched it. Did a CVE come up since then? We don't know. You don't scan for vulnerabilities regularly. Well, there's a reason why you might not know. Components come from untrusted sources across any part of the tech stack. And what's really insidious here is that this actually might be not direct, but transitive dependencies that you're using, right? I depend upon Flask, Flask depends upon something else. That something might have an issue. You do not fix or upgrade the underlying platform, frameworks, or dependencies in a risk-based or timely fashion. This is getting notified that there's a problem and then taking the time to actually update it. Software developers don't test the compatibility of updates or patches. Your CI system, or whatever builds your code, or wherever it lives, is less secure than your actual code. So somebody could change the way in which your code is built and sneak something in there. I'm looking at you, SolarWinds. Software supply chain failures. Obviously, there's a wide range of issues here. We're going to talk about two really cool examples. And fixes, we have fixes.
|
|
|
transcript
|
1:16 |
Let's dive into example one. This one should be pretty familiar to Python people. Unpinned dependencies. So here's the scenario. Greenfield Analytics is a data science startup with a requirements.txt that reads like a wishlist. Pandas, scikit-learn, FastAPI, SQLAlchemy. No version numbers, no hashes, no lock file. "We always want the latest," the lead developer explains. Deploys work fine for months because the team deploys frequently and the ecosystem is stable. Then one Friday afternoon, a widely used serialization library that Pandas depends upon pushes a new minor release with a subtle change in how it handles pickle deserialization. Greenfield's weekend batch job, which deserializes cached model outputs, silently starts accepting untrusted pickle payloads that the previous version would have rejected. No tests fail because the test suite doesn't cover deserialization edge cases. On Monday, the security team at a Greenfield client discovers that their analytics API is now vulnerable to remote code execution via crafted pickle payloads. Greenfield can't even determine when the vulnerable version entered their stack because they have no record of what was installed in each deployment.
|
|
|
transcript
|
1:35 |
Well, why is this dangerous? I mean, remote code execution, that sounds bad, but there's a wider range of issues here. Every deploy is a roll of the dice. Without pinned versions, pip install or uv pip install on Monday can produce a fundamentally different environment than uv pip install on Tuesday. You can't reproduce bugs, audit security, or even roll back reliably. Supply chain attacks target the latest version. So if some package on PyPI gets taken over and a malicious release gets uploaded, it usually gets found within a few days, but there's that window from when it's brand new until it gets found. So if an attacker compromises a package maintainer's account, the malicious code goes into the newest release, exactly the one unpinned installs will grab. Transitive dependencies are invisible. So if we say we want five packages for our product, or 10 packages in the requirements for it to run, maybe we actually have 80 installed. Without a lock file, we have no visibility into which those are, what versions are running, and so on. That's not great. And incident response requires knowing what changed. When something breaks or a CVE is announced, the first question is, what version are we running? Without pinning, the answer is, well, whatever was newest at the time of the latest deploy. Sure, you could log into a Docker container and do a uv pip list or something like that, but that's just the latest deploy. What about the ones before? When did it come in? And so on. You need to check each environment individually, and there's no real history, just whatever it is at the current state.
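To illustrate the "what version are we running?" question, here is a small standard-library sketch that records everything installed in the current environment, transitive dependencies included. The function name `snapshot_installed` is hypothetical, and saving its output alongside each deploy is an assumption about workflow, not something the course prescribes.

```python
from importlib import metadata


def snapshot_installed():
    """Return sorted 'name==version' lines for every installed distribution."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip broken installs that are missing metadata
            lines.add(f"{name}=={dist.version}")
    return sorted(lines, key=str.lower)


# Possible usage: write the snapshot next to each deploy so there is a history.
# print("\n".join(snapshot_installed()))
```

A snapshot like this only tells you what is installed right now. A pinned, generated requirements file checked into source control gives you the same information with history.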
|
|
|
transcript
|
0:36 |
Here's the vulnerable version. This actually might be the simplest sample code that we have in our entire course. So we've got four dependencies: Flask, Requests, SQLAlchemy, and cryptography. However, these are only the top-level dependencies. We don't know what versions they are, nor do we know what Flask, Requests, SQLAlchemy, and so on depend upon without having prior knowledge of that. That's it. There's no history. When we say uv pip install, we just get the latest, whatever that is for all of these at that particular time. All the things that could go wrong, we've already talked about.
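As a rough illustration of how you might spot this in a review, here is a hedged sketch that flags requirement lines without an exact == pin. The helper name `find_unpinned` and the simplified matching rule are assumptions; real requirement syntax has more cases (URLs, environment markers, editable installs) than this handles.

```python
import re

# Simplified: a package name (optionally with extras) followed by an exact pin.
PINNED = re.compile(r"^[A-Za-z0-9_.\-\[\],]+\s*==\s*[^\s;]+")


def find_unpinned(requirements_text):
    """Return requirement lines that do not pin an exact version."""
    unpinned = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and options like --hash
            continue
        if not PINNED.match(line):
            unpinned.append(line)
    return unpinned


# Hypothetical file contents mirroring the vulnerable example.
print(find_unpinned("flask\nrequests\nsqlalchemy\ncryptography\n"))
```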
|
|
|
transcript
|
2:16 |
Now here's the secure version. You might be thinking, Michael, just because you made the text green doesn't make it secure. Okay, I agree. But look at the file name, requirements.piptools. There are many ways to manage this. We could be using uv projects or some other input file. This is what I like to use, requirements.piptools. This is not what we install. This is where we declare our top-level requirements. Then we use a really clever command with uv: uv pip compile. Take this as an input, generate our requirements.txt file. And while you're at it, give me the latest. You don't have to do that, but typically this is the workflow. You already have the requirements.txt, so you run this to refresh your requirements when you're ready. And most importantly, look at this part here at the end: exclude newer, one week. This is some magic right here. It's a super simple thing, but it just says, hey, I don't want to install anything that is newer than one week old. So when you say, give me the latest Flask, you mean the latest as of last week, as of seven days ago. And this means that at least for the popular things, there's a very good chance that if something bad happens to a release, it's going to get taken down before you ever see it. Because it's been out there for a week, 100,000 people have installed it. Somebody said, oh gosh, look what happened to Flask. Did you know? You're like, no, but glad they found that in two days. It's not bulletproof, but it's certainly a nice little defense-in-depth sort of thing. So this runs. And what do we get? We get a requirements.txt with all of the transitive dependencies and so on. So here you can see we have Flask 3.1.3 because it was specified in our requirements.piptools. We have cryptography, that version, because it's the latest one as of one week ago, as required in our requirements.piptools. But look, certifi is actually installed via requests, and it's that version. Blinker is installed because of Flask, and it's that version, right?
So we have our transitive closure, all up to date within the cool-down period that we specified there. Then we just uv pip install -r requirements.txt, or however you install that. We can do similar things with uv sync and so on.
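The compile step with a cool-down can be sketched as a small helper that builds the command line. This is an assumption-laden sketch, not the course's exact command: the function name `build_compile_command` is hypothetical, the file names follow the ones mentioned above, and the cutoff is passed to --exclude-newer as an explicit RFC 3339 timestamp computed in Python.

```python
from datetime import datetime, timedelta, timezone


def build_compile_command(now=None, cooldown_days=7,
                          source="requirements.piptools",
                          output="requirements.txt"):
    """Build a `uv pip compile` command that ignores releases newer than the cutoff."""
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(days=cooldown_days)).strftime("%Y-%m-%dT%H:%M:%SZ")
    return [
        "uv", "pip", "compile", source,
        "--upgrade",                 # refresh to the latest allowed versions
        "--output-file", output,     # write the fully pinned requirements file
        "--exclude-newer", cutoff,   # skip anything published after the cutoff
    ]


# Possible usage (requires uv on PATH):
# import subprocess
# subprocess.run(build_compile_command(), check=True)
```

Computing the cutoff explicitly also gives you a value you can log, so you know exactly which point in time a given requirements.txt was resolved against.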
|
|
|
transcript
|
1:26 |
On to our second topic in supply chain issues, and that's known vulnerabilities. Known. We know that they are a problem and somehow we don't know. It's like one of those unknown knowns or known unknowns. I don't know. We'll see. HealthTrack is a small healthcare startup that built its patient portal two years ago on Django 3.2 with a handful of well-known and trusted packages: Pillow for medical image thumbnails, cryptography for encrypting records at rest, python-jose for JWT-based auth. The app works great, so the three-person dev team moves on to new features and nobody touches the requirements.txt. 18 months later, a critical CVE drops for the version of cryptography that they're running. A padding oracle attack that lets an attacker decrypt data without the key. Not good. The CVE has a public proof-of-concept exploit on GitHub within 48 hours. And yet, HealthTrack's team doesn't subscribe to any security advisory feeds and has no dependency scanning in CI, so they have no awareness that this is even a problem. Luckily, they were not hacked. However, there were consequences. A penetration tester hired for their SOC 2 compliance audit finds the vulnerability in minutes using a tool I'm going to recommend, pip-audit. The resulting remediation delays their compliance certification by two months, and it costs them a major hospital contract.
|
|
|
transcript
|
1:20 |
Well, why is this dangerous? Exploits are often public before you get a chance to patch. So the gap between the CVE disclosure and the weaponized exploit code is shrinking, sometimes to hours. Basically, once the CVE is released, it's game on. The race is on. Attackers have automated pipelines that scan the Internet for vulnerable versions the day the CVE drops. Transitive dependencies hide the risk. In our very first requirements file, we just saw, oh, look, there's a handful of things we depend upon. Let's check those. But that doesn't cover the dependency of the dependency of the dependency, right? So you might pin Flask to 3.1.0, but Flask pulls in Werkzeug and MarkupSafe and Jinja2 and others. A vulnerability three levels deep in your dependency tree is just as exploitable as one in your own code. "It still works" is not a security posture. Teams skip upgrades because the app functions just fine, but a working app with a known CVE is an open invitation, inviting people you do not want. Functionality and security are independent axes. Compliance frameworks now require it. SOC 2, for example, that we talked about, as well as HIPAA and PCI DSS, all expect documented dependency management. A known-vulnerable dependency isn't just a technical risk, it's an audit finding.
|
|
|
transcript
|
4:07 |
Well, here's a requirements file, and it's following the pinning of our dependencies, just like we said it should. Unfortunately, it's pinned where our dependencies have issues. Our Flask has CVE-2023-30861, where a permanent session cookie can be cached by a proxy and sent to other clients because of a missing Vary: Cookie header. That's not ideal. Werkzeug has high resource usage parsing multipart form data, which could result in a denial of service, and cookie injection via crafted cookie names. Jinja2 has cross-site scripting via the xmlattr filter with crafted keys. Cryptography has a null pointer dereference on malformed PKCS7 data and a use after free, never good, in PKCS12 serialization. Finally, Pillow, our image library, has uncontrolled resource consumption in text length, so you can run the server out of memory, and arbitrary code execution, actually really terrible, via PIL.ImageMath.eval. So what do we do? Well, this fix is kind of super silly. Just update to the versions after the vulnerability. Duh, now we're fixed. However, how we go about that is the big thing. Staying on top of security news is certainly important. GitHub has security features you can turn on, and it will scan your pinned requirements and let you know if there's a vulnerability. uv and pip will list out errors like, hey, a CVE was found for this thing that you are installing. And most importantly, we've talked about the pip-audit tool. That one is the most proactive. I actually want to point you to an article that I wrote a while ago, last year, but maybe four months ago, something like that: Python supply chain security made easy. So this article is a two-part article. This is part one. And it talks about using pip-audit, but pip-audit is only as good as your remembering to use it. So if you have pip-audit, you just run pip-audit, it looks at your pinned dependencies and it says, are there any vulnerabilities for these? Well, if you don't run it, what good is it? It's kind of slow the first time you run it every couple of hours, but then it caches.
So then it speeds up. So it is a tiny bit of a hassle. I mean, by slow, I mean 10 seconds, right? Not instant, but it still might be something you don't necessarily want to put in a git commit hook or something along those lines. So what can you do? You can build it into your continuous integration. You can add the uv pip compile exclude newer option, which automatically skips a lot of these issues that we've already discussed. But you can also put pip-audit in as part of your unit tests. So when your continuous integration runs your unit tests, your unit tests will run pip-audit. And if pip-audit finds a problem, it will fail the test, which will fail your CI. Part two of this actually talks about how you do this more in production. So it uses Docker. I ship all of my server-side apps as Docker containers, build them and then ship them. And you can set up pip-audit as a build step in your Docker build so that when you release new code, pip-audit runs after your updated dependencies come in. It will run pip-audit on your Docker container before it goes up, and it will fail the Docker build if pip-audit finds a vulnerability. So that's super cool. That means, at least as far as awareness goes from pip-audit, it will not allow you to ship a Docker container that has a known CVE in one of its dependencies. I think that's excellent. So I encourage you to go and check out this two-part post series to really dive into how to solve these problems, because it's great to say, well, Michael, you just take the bad version and replace it with a good version. Technically, that fixes the known vulnerability problems, but how? How do you make it easy? How do you make it automatic? This series of articles is how.
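Here is one hedged way the "pip-audit as a unit test" idea could look. It is a sketch, not the code from the article: the helper name `dependencies_are_clean` and the injectable `run` parameter are assumptions added so the logic can be exercised without pip-audit installed. It relies on pip-audit exiting with a non-zero status when it finds vulnerabilities.

```python
import subprocess


def dependencies_are_clean(requirements="requirements.txt", run=subprocess.run):
    """Return True when pip-audit reports no known vulnerabilities.

    `run` is injectable (an assumption of this sketch) so the check can be
    tested with a fake runner.
    """
    result = run(
        ["pip-audit", "-r", requirements],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


# Possible pytest usage (requires pip-audit to be installed in CI):
#
# def test_no_known_vulnerabilities():
#     assert dependencies_are_clean(), "pip-audit found a vulnerable dependency"
```

Wiring the check into the test suite means a newly published CVE shows up as a red build rather than as something you have to remember to look for.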
|
|
|
|
12:58 |
|
|
transcript
|
2:00 |
On to OWASP number four, cryptographic failures. So what are these? Well, cryptographic failures occur when sensitive data is not properly protected through encryption, hashing, random number generation, those kinds of things. This includes storing passwords in plain text, using broken algorithms, hard-coding encryption keys, using insecure cipher modes, and generating security tokens with predictable pseudo-random number generators. So I'm sure you can think of some reasons why they're bad. Here's what OWASP says. Are there any old or weak cryptographic algorithms or protocols used? Are you using default crypto keys? How about checking them into source code? We're going to close out this chapter with a really fun example on checking them into source code. That's not ideal. Is encryption there, but not enforced? For example, are we not using the right headers to tell the browser to automatically use SSL? Are passwords being used as cryptographic keys in the absence of a password-based key derivation function? Is randomness not random enough? Are you using deprecated hashing functions? I'm sure this is true in old apps and web apps and APIs and so on. MD5 or SHA-1 or even non-cryptographic hashes, as we will see. Are your error messages leaking out details about your encryption? Like, that's not a valid SHA-256 key or whatever, something like that. And can connections be downgraded? This happens with TLS, SSL for web browsers. Often web servers support a range of different levels of TLS: TLS 1.0, 1.1, 1.2, 1.3, and so on. And the idea is, well, we'll try to use 1.3, but if we can't, we're going to fall back to something for older clients. So one thing that you might want to do is go disable the older ones. Look, any browser made in the last 10 years can do a reasonably modern level, so there's no reason to leave the stuff below that enabled.
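Disabling older protocol versions usually happens in the web server or load balancer configuration, but the idea can be sketched with Python's standard ssl module. The function name `build_server_context` is hypothetical, and choosing TLS 1.2 as the floor is an assumption of this sketch, since the level to cut off at is left open above.

```python
import ssl


def build_server_context(min_version=ssl.TLSVersion.TLSv1_2):
    """Create a server-side TLS context that refuses anything below min_version."""
    # Purpose.CLIENT_AUTH yields a context configured for the server side.
    context = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    context.minimum_version = min_version  # no fallback to older TLS/SSL levels
    # A real server would also call context.load_cert_chain(certfile, keyfile).
    return context
```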
|
|
|
transcript
|
1:05 |
Here's our first scenario for this section. Weak hashing algorithms. But not the one you were thinking of to start with. Here's the scenario. An e-commerce platform with 2 million registered users stores passwords as SHA-256 hashes. The security team is confident. They're not using MD5. They're not storing plain text. And SHA-256 is used by Bitcoin. "It's the gold standard," the VP of engineering assures the board. A SQL injection vulnerability later on exposes the user table. The attacker downloads all 2 million hashed passwords and feeds them into Hashcat on a rented cloud GPU. At 22 billion SHA-256 hashes per second, every password under eight characters falls within minutes. Dictionary attacks with common substitutions crack about 70% of the remaining passwords overnight. The total cost to the attacker was $47 of GPU fees. The team confused "strong hash function" with "good password hash." SHA-256 is excellent for file integrity and catastrophically wrong for passwords.
|
|
|
transcript
|
1:29 |
What's wrong? Why is this dangerous? Well, speed is the enemy. SHA-256 was designed specifically to be fast. That is a feature for checksums and file hashes and so on. It is catastrophically bad for passwords. Attackers exploit the speed asymmetry. Verifying one password is instant, but so is verifying billions. Unsalted hashes enable bulk attacks. A salt is a bit of random text, unique per user, that you add to the password before you hash it, so that what lands in the database doesn't look like the plain hash of the password. Without that, identical passwords produce identical hashes. So an attacker can crack one hash and instantly compromise every account that shares that password, or use some sort of rainbow table that says, well, the SHA-256 hash of everything in the dictionary is this. GPU economics favor the attacker. Cloud GPU instances make brute force attacks accessible to anyone with a credit card. The defender pays to hash one password. The attacker rents hardware to try billions. And finally, rehashing requires user action. Because these hashes are one way, you can't silently upgrade from SHA-256 to Argon2 or some other algorithm without the original plain text password being provided to you. Basically, the user either changes their password or they log in. Mitigation requires the user to log in, meaning abandoned accounts stay vulnerable forever. Not good.
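The speed argument can be checked with a little arithmetic. This sketch uses the 22 billion hashes per second figure from the scenario and assumes passwords drawn from the 95 printable ASCII characters; the alphabet size and the helper names are assumptions for illustration.

```python
HASHES_PER_SECOND = 22e9  # rate quoted in the scenario
ALPHABET = 95             # printable ASCII characters (assumption)


def keyspace(max_length, alphabet=ALPHABET):
    """Count every possible password of length 1 through max_length."""
    return sum(alphabet ** n for n in range(1, max_length + 1))


def minutes_to_exhaust(max_length, rate=HASHES_PER_SECOND):
    """Minutes needed to try every password up to max_length at the given rate."""
    return keyspace(max_length) / rate / 60


# Every password shorter than eight characters, brute forced exhaustively.
print(f"{minutes_to_exhaust(7):.0f} minutes")
```

Under these assumptions the whole space of passwords under eight characters is exhausted in less than an hour on a single rented GPU, which is why a fast hash offers so little protection.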
|
|
|
transcript
|
0:53 |
All right, let's look at the vulnerable version of a user account here, a vulnerable user. In our scenario, we used SHA-256, but even worse would be to use the weaker, older MD5. It's super fast, 10 billion hashes per second. Collision attacks exist. An unsalted hash results in identical output for identical passwords, so rainbow tables and that sort of thing apply. So what does this look like? It just says, well, we import hashlib, we get the data, and we get the password from it, and we just say MD5 of the bytes of that password, and we call hexdigest to turn the bytes back into a string to store in the database. Done. The only problem is, even though that's a one-way trip and you can't really decrypt MD5, computationally, it's pretty easy to figure out what the password is.
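A minimal sketch of the vulnerable pattern described above, for reference. The function name `insecure_hash` is hypothetical, and this is deliberately the wrong way to store passwords.

```python
import hashlib


def insecure_hash(password):
    """VULNERABLE: fast, unsalted MD5. Shown only to illustrate the problem."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()


# Two users with the same password end up with the same stored value,
# so cracking one hash cracks every account that shares the password.
alice_hash = insecure_hash("password")
bob_hash = insecure_hash("password")
print(alice_hash == bob_hash)
```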
|
|
|
transcript
|
1:34 |
Now, what does this look like in the secure version? It looks like choosing an algorithm that does not play well with brute force attacks, Argon2. So Argon2 is really interesting because many of the other hashing algorithms are only computationally hard, and each hash is independent of the others. So they're really easy to fan out on a GPU, which has a lot of parallel compute. But Argon2 is memory hard, not just compute hard. It uses a lot of memory per guess. Here you can see we're setting up the Argon2 password hasher to use 64 megs of memory just to hash this thing one time. So anyone who wants to take a guess has to use 64 megs of memory per guess, which makes it very hard for GPUs to scale. Plus, the time cost we're setting gives it about 300 milliseconds of compute. Those two things together make this a much stronger one-way hashing algorithm. And there's one more thing, as we'll see. So now we set up this very similar hasher, but with a much better algorithm. And down here, we just call ph, the password hasher, dot hash, give it the password like before, and we get the password hash. And we store it in the database. But this time, way, way safer. If an attacker tries to guess, well, let's say an attacker tries a billion passwords, they would need 10 years on a modern GPU. That's way better than 22 billion a second, don't you think?
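As a runnable illustration of salted, memory-hard password hashing, here is a sketch that swaps in scrypt from Python's standard library instead of Argon2, purely so it runs without third-party packages. It is not the course's code: the function names, the storage format, and the cost parameters are all assumptions. With the argon2-cffi package the equivalent calls would be along the lines of `PasswordHasher(memory_cost=65536).hash(password)` and `.verify(stored, password)`.

```python
import hashlib
import hmac
import os


def hash_password(password, *, n=2**14, r=8, p=1):
    """Hash a password with scrypt (memory-hard) and a unique random salt."""
    salt = os.urandom(16)
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt,
                            n=n, r=r, p=p, dklen=32)
    # Store the parameters and salt next to the digest so we can verify later.
    return "$".join(["scrypt", str(n), str(r), str(p), salt.hex(), digest.hex()])


def verify_password(stored, candidate):
    """Recompute the hash with the stored salt and compare in constant time."""
    _, n, r, p, salt_hex, digest_hex = stored.split("$")
    digest = hashlib.scrypt(candidate.encode("utf-8"),
                            salt=bytes.fromhex(salt_hex),
                            n=int(n), r=int(r), p=int(p), dklen=32)
    return hmac.compare_digest(digest, bytes.fromhex(digest_hex))
```

Because every call draws a fresh salt, two users with the same password get different stored values, which removes the bulk-cracking and rainbow-table shortcuts discussed earlier.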
|
|
|
transcript
|
1:08 |
Number two, hard-coded encryption keys. This could also be API keys. This could be database passwords. It applies to many kinds of secrets, but this one is especially bad because it allows reversing the encryption of data, maybe data that's in the database. All right, so here's the scenario. A fintech company builds a payment processing microservice. The lead developer generates a strong AES-256 key and happens to save it in a config file to encrypt the stored credit card numbers. This is great encryption. The service passes PCI DSS scans because the card data is in fact highly encrypted at rest, as it should be. However, 18 months later, a disgruntled contractor who was fired six months ago still has a clone of the repository on his personal laptop. He opens up config.py, copies the key, and decrypts the entire card vault from the database backup he kept, you know, for debugging. The company discovers the breach only when Visa flags a pattern of fraud across 12,000 credit cards. Whew, that's not good. It's not good. We don't want to hard-code our encryption keys.
|
|
|
transcript
|
1:43 |
So what goes wrong if we store our encryption keys in the code? Well, let us count the ways. Every clone of the repository is a copy of the key. The encryption key lives on every developer's laptop, every CI runner, every fork, every backup. You're not managing one secret. You're managing hundreds of copies that you have to keep track of, and any one of them decrypts the data. Fired employees can keep the key forever. Revoking someone's Git access doesn't delete their local clone. Anyone who ever had access to the repository retains the key. Git history is permanent. This is rough. We'll see this at the end of this chapter. Even if you remove the key from the current code, git log -p reveals it in the commit history. You could rewrite the history, but that requires access to every single clone. So unless it's a very small number of clones, maybe a couple of people on a team that you really, really trust, it's practically impossible. So the key is available to anyone with access to the repo. Not good. One key compromises everything. Without key rotation or per-purpose keys, a single leaked key decrypts the entire database: every credit card, every social security number, every record ever encrypted with it. Finally, you can't rotate the keys without re-encrypting. Now, unlike password hashing, this is possible. You could do it. But changing the hard-coded key means decrypting all the existing data with the old key and re-encrypting it with the new one. For large data sets, this could be expensive and risky. Who wants to potentially lose access to all the data? So those kinds of things can get deferred indefinitely, keeping the problem outstanding.
|
|
|
transcript
|
0:41 |
What does this look like? Well, here's an example. We just have a constant, the encryption key. Here are the bytes. You know, this is a nice, strong encryption key, but here's the raw key. And with it, all you have to do is create a Fernet object from the cryptography library, give it the key, and it can decrypt everything. So again, it's on every laptop. It's in the Git history, CI, backups, and so on. Then anytime you want to encrypt or decrypt, that's totally fine. You say fernet.encrypt or fernet.decrypt, and you just give it the bytes. Stored, it seems safe. However, anything that gets a hold of that key and the database can now decrypt anything in the database. That's not great.
|
|
|
transcript
|
2:25 |
Luckily, this problem is pretty easy to solve. How do we do it? We don't put the encryption key in the code and check it into source control. So instead, what we're going to do is put it somewhere in the environment. This could be a .env file that's brought in through some kind of Docker build, or it could be somewhere you actually just set an environment variable. However it gets in there, it's in the environment and our app can use os.environ to pull it out. Also, we're using multiple encryption keys so that the key that decrypts credit cards is not the same one that decrypts social security numbers. By doing that, at least we have a chance that if somebody gets a hold of one, they don't necessarily get a hold of the other data. So we create two separate encryptors. And suppose we do want to go and change our encryption key. We've got to rotate it. Maybe this is just a thing that we typically do, or we want to periodically change it for whatever reason. We can use something called MultiFernet. And when you do that, you just give it a list of keys. So you can see at the bottom, we're saying Fernet of k.encode for k in keys, which is awesome. Because if we just store the keys comma separated and split on the comma, we can use any of those keys to decrypt, but only the newest key is used to encrypt. So we can add a new key, and anything new that gets encrypted will be saved with the new one, but that doesn't invalidate the old saved data. So this might be a good migration path to rotate those keys in and then take the old ones out. For example, you don't have to stop the web app, change the encryption key, read all the data, write all the data with the new encryption, and hope it was fine. Here you can just, in the background, read and rewrite this data, or at least the new data will be encrypted with the new key. Pretty neat. And then to use it, super, super easy. For our card encryption, we say get rotatable Fernet.
We give it the environment variable that it's going to pull from, the card encryption keys, and it just creates that MultiFernet encryptor, which is cool. And then the encryption and decryption work exactly the same. So we say card encryptor dot encrypt, give it the data, and we save that in the database. Perfect. And you can see with the comment here: encrypt with the externally managed, rotatable keys. It encrypts with the new ones and can still decrypt with the old ones.
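A hedged sketch of the key-loading side of this pattern follows. The function names and the `CARD_ENCRYPTION_KEYS` variable name are assumptions, not the course's exact code. The cryptography package is imported lazily inside `get_rotatable_fernet` so that the key-parsing helper can be exercised on its own. Note that MultiFernet encrypts with the first key in the list, so in this sketch the newest key is assumed to come first.

```python
import os


def load_keys(env_var, environ=os.environ):
    """Read a comma-separated list of keys from the environment, newest first."""
    raw = environ.get(env_var, "")
    keys = [k.strip() for k in raw.split(",") if k.strip()]
    if not keys:
        raise RuntimeError(f"{env_var} is not set; refusing to start without keys")
    return keys


def get_rotatable_fernet(env_var, environ=os.environ):
    """Build a MultiFernet: encrypts with the first key, decrypts with any of them."""
    # Requires the third-party `cryptography` package.
    from cryptography.fernet import Fernet, MultiFernet

    return MultiFernet([Fernet(k.encode()) for k in load_keys(env_var, environ)])


# Possible usage (hypothetical variable name and data):
# card_encryptor = get_rotatable_fernet("CARD_ENCRYPTION_KEYS")
# token = card_encryptor.encrypt(b"4111 1111 1111 1111")
```

Failing loudly when the variable is missing is a deliberate choice in this sketch: a silent fallback to a default key would quietly reintroduce the hard-coded key problem.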
|
|
|
|
12:58 |
|
|
transcript
|
1:29 |
You know we had to talk about it first. SQL injection. Search for little Bobby Tables XKCD. If you haven't seen this cartoon, it's incredible and it captures the essence. But let's talk about a more practical scenario. A regional e-commerce company built their product search in Flask using raw SQL for performance. The search endpoint takes a username parameter to look up customer profiles, and the developer used Python's f-strings to build the query. It was faster to write than figuring out how SQLAlchemy's query builder worked. That's true. SQLAlchemy is way complicated for what it does. The feature sat in production for two years without an incident. Then the company launched an affiliate program, giving external partners API access to the same search endpoint. An affiliate discovers that single quotes in the search field produce a 500 error with a SQL stack trace. That's interesting. Within an hour, the affiliate uses UNION SELECT to pull the entire users table: email addresses, bcrypt password hashes, physical addresses, and order history for 200,000 customers. They then pivot to the payment methods table, where the last four digits of the credit card are stored alongside billing addresses. The breach notification costs more than the engineering team's annual budget. That is not great. Even if they don't sell the data, they could use the 200,000 customers to stand up a similar business instead of being an affiliate of this one. Not good. SQL injection, not good.
|
|
|
transcript
|
1:13 |
I almost feel like we don't even have to cover why this is dangerous. It's so well known, but it's so bad. And you'll see there are two other examples I'll be showing you that are not SQL injection. The database trusts everything the application sends. There's no firewall between query structure and query data when you use string concatenation. The database has no way to know which parts came from the developer and which parts came from the user, I mean attacker. Automated tools make exploitation trivial. Tools like sqlmap can detect and fully exploit SQL injection in minutes, extracting the entire database without any manual SQL knowledge from the attacker. Script kiddies. The impact extends far beyond the target table. Through UNION SELECTs, subqueries, and database-specific features, LOAD_FILE in MySQL or COPY in Postgres, a single injection point can access every table, read files from disk, and sometimes even execute OS commands, which is not good. While ORMs do protect somewhat, they don't protect you fully. Raw SQL fragments in SQLAlchemy's text() or Django's extra() or Rails' find_by_sql are all vulnerable, and developers reach for them whenever the ORM feels too restrictive. So be sure to double check those.
|
|
|
transcript
|
2:08 |
All right, here we go, little Bobby. He's in there somewhere. So check this out. Here, we're going to skip over some stuff for a second. We get the username from the request, the search box or whatever, and we get the database. And we're going to execute a query against the database. Using our friend the f-string, we say select id, name, email from users where username equals, and then Michael, M. Kennedy, Sarah J, whatever goes in here. We execute it and we get all of that information back, right? We get specifically just that user. So here are the users that we searched for, or something along those lines. However, a name is not always what people submit. They might submit something like ' OR '1'='1. The leading quote closes the string that the query opened, and the last opening quote is the one that gets closed by the quote the query already had at the end. So you can say username equals nothing, or a tautology, an always-true statement, and that's going to return every user. Not good. It gets more dangerous. We could submit ' UNION SELECT password FROM users --, and the resulting SQL is select star from users where username is nothing, union select password from users, which returns all the passwords. Or something destructive, like here's little Bobby Tables: '; DROP TABLE users; --. So the select that comes in is select star from users where username is nothing. End that command, start a new command, drop table users. End that command, and comment out anything that comes after it to make sure it's still well-formed. Again, same technique. This is the resulting SQL there. Ooh, deletes the entire users table. SQL injection, not good. But it happens anytime people are concatenating user input into database query strings. It seems at first sight, well, what are we supposed to do? How can we not do this? We have to query for the username. All right, we'll see how to fix that.
|
|
|
transcript
|
0:47 |
So how do we do it right? Well, first of all, we can use what are called parameterized queries. Remember I said that the query part of the thing and the data part of the thing are indistinguishable to the database? That's true when it's just a string sent over, but databases have a feature for this. It's not just for safety. It's also a performance enhancement to use parameterized queries. You can say, what we're going to do is run a query: give me this from users where the username is some variable. And then when we run the query, we can say, and here's the variable. Now the database knows that whatever is in username is literally just inert text. And even if you put in something like ' OR '1'='1, it just goes, well, that's a weird username. We don't have one named that. So no results. Really good.
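The difference between string-built and parameterized queries can be sketched with the standard library's sqlite3 module. This is an illustration, not the course's Flask code: the table layout, the sample rows, and the function names are assumptions.

```python
import sqlite3


def make_db():
    """Create an in-memory database with two hypothetical users."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT, email TEXT)")
    db.executemany(
        "INSERT INTO users (username, email) VALUES (?, ?)",
        [("alice", "alice@example.com"), ("bob", "bob@example.com")],
    )
    return db


def search_vulnerable(db, username):
    # VULNERABLE: user input becomes part of the SQL text itself.
    query = f"SELECT id, username, email FROM users WHERE username = '{username}'"
    return db.execute(query).fetchall()


def search_safe(db, username):
    # Parameterized: the driver sends the value separately from the query text.
    query = "SELECT id, username, email FROM users WHERE username = ?"
    return db.execute(query, (username,)).fetchall()


db = make_db()
payload = "' OR '1'='1"
print(len(search_vulnerable(db, payload)))  # every row comes back
print(len(search_safe(db, payload)))        # no user has that literal name
```

The placeholder syntax varies by driver (sqlite3 uses ?, others use %s or named parameters), but the principle is the same: the value never gets parsed as SQL.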
|
|
|
transcript
|
1:25 |
Now, just so you don't think, well, it's SQL that's the problem and other databases are fine, like NoSQL, like MongoDB and so on. Not true. Now, it is harder, quite a bit harder, to do injection attacks against Mongo than it is against something like a SQL database because Mongo expects structured data. Like, here's a dictionary that goes into that section. So the attacker has to get their input into dictionary form rather than inert text. But nonetheless, it does exist. So I threw in here an example. We're not going to go over it, but here's our SQL injection version we just saw in the repository. And there's also a NoSQL version in here that uses PyMongo. So you can see over here, we say find one where the password is this. Again, the thing is, for this to be exploitable, you have to somehow turn that from a string into a dictionary, via JSON, maybe from an API or something. It's hard to do from a form, but if you can do it from an API, like a nested JSON thing that's supposed to be the password but really looks like a query operator, that's not so good. So it's still possible outside of regular SQL. All right, we're not going to go into any more detail. It's pretty similar to SQL injection, but there's the secure and the vulnerable version for MongoDB as well.
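One common defense against this kind of operator injection is to insist that fields from a JSON body are plain strings before they go anywhere near a query. The sketch below is an assumption about how that might look, not the repository's secure version: the helper name `require_plain_string` and the payload shape are hypothetical, and no database call is made.

```python
def require_plain_string(payload, field):
    """Return payload[field] only if it is a plain string.

    A nested JSON object such as {"$ne": ""} would otherwise be passed to
    the database as a query operator instead of a literal value.
    """
    value = payload.get(field)
    if not isinstance(value, str):
        raise ValueError(f"{field} must be a string")
    return value


# A normal request body versus a crafted one (both hypothetical).
normal = {"username": "alice", "password": "s3cret"}
crafted = {"username": "alice", "password": {"$ne": ""}}

print(require_plain_string(normal, "password"))
try:
    require_plain_string(crafted, "password")
except ValueError as exc:
    print("rejected:", exc)
```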
|
|
|
transcript
|
1:38 |
All right, another form that doesn't happen in the database. This one happens in the front end, in the HTML and JavaScript delivered to the user. And I don't know if this is worse, probably not worse, but it's certainly quite bad. So this attacks an individual user instead of attacking all the data in the database. But here's the idea, the setup, let's say. A SaaS project management platform built in Flask lets users comment on tasks. What might they say that's bad? I don't like your comment. No, that's not what they're going to say. They're going to say, you know, angle bracket, script. That's what they're going to say. So it lets users comment on tasks. To support rich text, the developer used the |safe Jinja2 filter on comment bodies. This is so that users can maybe write in Markdown but have it rendered as HTML. And they planned to add a sanitization library later, but they kind of never got around to it. Months pass. A disgruntled user on a shared workspace posts a comment containing a script tag that silently forwards every viewer's session cookie to an external server. Because comments are stored in the database and displayed to everyone on the project, every team member who opens up the task page is compromised. Their sessions are hijacked, their accounts taken over. The attacker uses a stolen admin session to export the workspace: entire history, client contracts, internal discussions. By the time the team notices the unusual API activity, the data has already been downloaded. A single missing sanitization step turned a collaboration tool into a data exfiltration platform.
|
|
|
transcript
|
1:39 |
Why is it dangerous? Well, let us count the ways. It hijacks the user, not the server. Cross-site scripting runs in the victim's browser with their full session context, so the attacker can do anything the user can do, and the server sees all of it as legitimate activity, as if the user is just going about what users do. Stored cross-site scripting is self-propagating. Once a malicious script gets saved in the database, it executes for every user, like in our example. No phishing link required. They just go to the regular site and they're compromised. High-traffic pages can compromise hundreds of accounts passively. Modern SPAs increase the blast radius. Single-page applications often hold auth tokens in JavaScript-accessible storage and make API calls from the client, so cross-site scripting in an SPA can silently call every endpoint the user has access to. Auto-escaping creates false confidence. Frameworks like Jinja2, which is used in Flask and others, auto-escape by default. This is good, but it leads to teams skipping cross-site scripting testing. Yet every use of |safe, of the Markup class instead of a plain string, or of {% autoescape false %} creates an unescaped hole, and these tend to accumulate over time. Finally, CSP headers are rarely deployed correctly. Content Security Policy could mitigate cross-site scripting by not allowing injected JavaScript to run in the page. But most applications either don't set it, or use unsafe-inline, which kind of cancels out Content Security Policy, or they break functionality when they try to tighten it, which is why they go back to using unsafe-inline.
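On that last point, here is a rough sketch of a Content Security Policy that avoids unsafe-inline by using a per-response nonce. The function name and the exact set of directives are assumptions for illustration, not something from the course code, and a real policy has to be tuned to what your pages actually load.

```python
import secrets


def build_csp_header(nonce):
    # Only scripts carrying this response's nonce (or loaded from our own
    # origin) may run; injected inline scripts have no nonce and are blocked.
    return (
        "default-src 'self'; "
        f"script-src 'self' 'nonce-{nonce}'; "
        "object-src 'none'; "
        "base-uri 'self'"
    )


# Generate a fresh, unguessable nonce for every response.
nonce = secrets.token_urlsafe(16)
csp_value = build_csp_header(nonce)

# The response would carry the header
#   Content-Security-Policy: <csp_value>
# and each legitimate inline script would be rendered with nonce="<nonce>".
```

The nonce has to change on every response. If it is reused, an attacker who has seen one page can attach it to their own script.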
|
|
|
transcript
|
0:47 |
While this is very dangerous, it's actually pretty easy to fix. So notice here, this is our HTML template. And then imagine over on the view side (I'm sticking it all together so you can see it a little better than jumping around files) we're going to render this template right here. This is the name of it. And we're going to provide some data, like this username right there. And maybe we want to make sure that we're rendering rich data, so we send a Markup object instead of passing the username string directly. Well, what happens? If we pass the name Alice, it shows Alice. But if we pass the name script document.location equals evil such and such, well, the output is the header with the script in it, and you don't even see it. It's like, huh, weird, there's no title on this page. And that page took a moment to load. Oh, well.
|
|
|
transcript
|
1:00 |
Let's talk about injection. And yes, this does include little Bobby Tables. Injection vulnerabilities occur when untrusted user input is sent to an interpreter, such as a database, a shell, a template engine, or a browser, in a way that causes it to execute parts of that input as commands instead of treating it as inert text. Injection remains one of the most tested categories, with 100% of the applications tested for some form of injection and over 62,000 related CVEs. Not ideal. How does this look? Well, you might have user-supplied data that's not validated, filtered, or sanitized. You might have dynamic queries or non-parameterized calls without context-aware escaping. You might have unsanitized data passed to ORMs, which are usually immune to this kind of stuff, but maybe search parameters or something similar get passed through as native query syntax. Or potentially hostile data is directly used or concatenated. All these things, not so good.
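As a small illustration of concatenated input versus a parameterized call, here is a sketch using the standard library's sqlite3 module with an in-memory database. The table and column names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, password_hash TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'hash-a'), ('bob', 'hash-b')")

hostile = "nobody' OR '1'='1"

# Vulnerable: the input is concatenated and becomes part of the SQL itself.
vulnerable_sql = f"SELECT username FROM users WHERE username = '{hostile}'"
leaked = conn.execute(vulnerable_sql).fetchall()

# Safer: the ? placeholder sends the input as data, never as SQL.
safe = conn.execute(
    "SELECT username FROM users WHERE username = ?", (hostile,)
).fetchall()
```

With the concatenated query, the OR clause matches every row. With the placeholder, the database looks for a user literally named nobody' OR '1'='1 and finds none.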
|
|
|
transcript
|
0:52 |
Luckily, the fix is not too bad. So in this case, we have exactly the same template. And when we render it, instead of using Markup, we're just passing plain strings. Jinja looks at that and says, well, we're going to make sure that if there's any HTML in here, we escape it. So if it comes in as Alice, it goes out as Alice. But if it comes in as script such and such, it actually renders as escaped HTML. So the angle brackets become &lt; and &gt;. And this just looks kind of like view source on the page, which is not good, but at least it doesn't cause actions, right? It just messes up what gets displayed on the page. So very good. Try to use |safe, Markup, autoescape false, and those kinds of things very, very judiciously.
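Here is a rough sketch of the difference. To keep it free of dependencies, it uses the standard library's html.escape as a stand-in for Jinja's autoescaping, and a plain format string as a stand-in for the template, so treat it as an approximation of what Jinja does rather than the course's actual code.

```python
import html

TEMPLATE = "<h1>Welcome, {name}</h1>"


def render_unescaped(name):
    # Roughly what happens with |safe or Markup: the input is trusted as HTML.
    return TEMPLATE.format(name=name)


def render_escaped(name):
    # Roughly what autoescaping does: special characters become entities.
    return TEMPLATE.format(name=html.escape(name))


attack = "<script>document.location='https://evil.example/'</script>"

unsafe_page = render_unescaped(attack)  # contains a live script tag
safe_page = render_escaped(attack)      # contains &lt;script&gt; as text
```

An ordinary name like Alice comes out the same either way. Only the markup characters are changed.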
|
|
|
|
13:41 |
|
|
transcript
|
1:47 |
On to OWASP number six, insecure design. What is this one about? Well, insecure design represents missing or ineffective security controls at the architectural level. I'm sure you've run into some of these. I've chosen some really common, high-likelihood scenarios for you. So this category involves design flaws that can't be fixed by a perfect implementation, because the security controls were never designed in the first place. This includes missing rate limiting, business logic flaws, unrestricted file uploads, and relying on client-side enforcement for server-side security. These are not good. As I said, we've got two really cool examples we're going to dive into. And I'm sure that if you have not seen them, you almost ran into them, had to catch them, something like that. So they're going to be good examples. Let me just throw out one more just to give you a sense of, wait, what? Long passwords. We want long passwords, right? These are good things. So a long password makes our site secure, right? Maybe we said you're going to have to have a 12-character password. You can't use monkey123 anymore. I know you used it all your life, but you're done with monkey123. If you want to use our system, you've got to have a better password than that one. Well, what if the password's really long, like a million or five million characters? And then you feed that into a memory-hard, Argon2 type of algorithm for hashing, to check: well, they gave me this password, is that really a user? I've got to hash it and find out. Turns out that can be a bit of a denial of service type of thing. So even simple things like passwords that are too long, where you forgot to check the length, that's an insecure design.
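A minimal sketch of that length check, assuming a cap of 128 characters (the number is my assumption, pick whatever suits your hasher). Here verify_hash is a hypothetical callable standing in for the real Argon2 verification, and fake_verify just records whether it was called.

```python
MAX_PASSWORD_LENGTH = 128  # assumed cap, not a value from the course


def verify_login(password, verify_hash):
    # Cheap check first: refuse absurdly long input before any
    # expensive, memory-hard hashing happens.
    if len(password) > MAX_PASSWORD_LENGTH:
        return False
    return verify_hash(password)


# A stand-in for the expensive hash check that records each call.
calls = []


def fake_verify(password):
    calls.append(len(password))
    return password == "correct horse battery staple"


huge_result = verify_login("x" * 5_000_000, fake_verify)
normal_result = verify_login("correct horse battery staple", fake_verify)
```

The point of the ordering is that the five-million-character input never reaches the hasher at all.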
|
|
|
transcript
|
2:15 |
Let's look at example one, no rate limiting. This one I think is very, very common. It's not usually discovered. It's not a true data leak, but it certainly causes problems. And it allows people to break into systems by brute forcing different things. So it is a serious problem. So here's a scenario for it. TripNest is a travel booking platform with 2 million registered users. Their FastAPI login endpoint is clean, well-structured, and returns proper 401 responses for invalid credentials. What it doesn't do is count how many times someone has tried to log in. An attacker buys a credential dump from a recent breach, 10 million email and password pairs for $200 on a dark web forum. They spin up a small cluster of cloud VMs and begin submitting login attempts against TripNest at 500 requests per second. Since many people reuse passwords across sites, roughly 1.5% of the credentials work. Within six hours, the attacker has valid sessions for 8,000 TripNest accounts, each loaded with saved credit cards, loyalty points, and upcoming reservations. TripNest's monitoring dashboard shows a traffic spike, but since every request is a well-formed POST to a legitimate endpoint, no alarms were fired. The breach isn't discovered until users start reporting unauthorized booking charges on their accounts. Now, I will point out that these 10 million emails that this person bought on the dark web have to be related to the users of TripNest. I'm not saying it's a TripNest dump, but maybe there's TripTogether or some other site that did have a breach. And there's a very high likelihood that if you have a TripTogether account, you also have a TripNest account. So you've got to think about, okay, well, there's the 10 million, and how many of those accounts actually exist on TripNest, and then what percentage of those are reused passwords?
But nonetheless, if you have a breach, you know where it came from, you could start applying that to sister or similar websites or web apps and start checking. Well, if they got an account there, there's a good chance we got them here. And I think that's kind of what's going on here.
|
|
|
transcript
|
1:41 |
Why is it dangerous? We've touched on it a little bit. Credential stuffing, which is what this is called, means taking some known credentials and just jamming them against a different site. They're not known to come from that site. They're from somewhere else. But because passwords and emails are reused, there's a chance they'll work in a different place. Credential stuffing is cheap and effective. Billions of leaked credentials are freely available. Attackers don't need to guess. They just need to replay known passwords at scale, with reuse rates of 1 to 3 percent. The least prepared of us get suckered every time. It defeats strong password policies, because even if you have a really long password, if you reuse it, attackers can try many, many times. Now, how hackable is this? It very much depends not just on rate limiting by explicitly putting rate limits in, but also on rate limiting by how expensive your hash is. You want that to be fast enough for a quick login, but slow enough that brute force attempts do slow down. So number two is a little bit up in the air, but this is what OWASP says. Rate limiting is a design decision, not an afterthought. Bolting on rate limiting later means choosing between breaking legitimate high-traffic patterns or leaving gaps. So definitely think about it from the beginning. 2FA becomes somewhat bypassable if you don't rate limit your 2FA attempts. Now, again, the 2FA code changes every time. You've got to have a login before you get to the 2FA. There are a lot of ifs that have to line up here, but yes, theoretically, you could brute force it. Spend a long time: log in, hammer on it for 30 seconds, log in, hammer on it for 30 seconds. You'll get through eventually if there's zero rate limiting.
|
|
|
transcript
|
0:56 |
Well, let's look at the bad version, the vulnerable version. It looks so simple. If I took this big, somewhat scary-looking green comment out, it's five lines of code. We get a login request, which is a Pydantic model that has a username and password, presumably. And then what we're going to do is say, well, let's get the user by their username, make sure that user exists, and verify the password using the passed-in plain text password and the stored hashed password. You know, Argon2 and those things have a way to say, here's a plain text input; given this algorithm, does it match this hash that we've got saved in the database? If yes, create a token, otherwise return a 401 Unauthorized. Well, this is actually well-written code, except that you can request it as fast as the machine can possibly process it. You can get a botnet and hit this thing hard and possibly get some free accounts. Not good.
|
|
|
transcript
|
2:40 |
The secure version is a bit more interesting. It has a FastAPI-specific fix that also has a Flask parallel, but in principle, you could do this with your own code super easily. So notice up here, we have this limiter. This limiter is going to be a decorator that we can put onto our code. And it comes from a project called SlowAPI, which uses Redis, memcached, or in-process memory to track how often a given endpoint is requested from a given IP address. We're also going to simulate a database that stores failed attempts, okay? And we've got some settings. So first we want to check for account lockout. We're going to track how many failed attempts there have been. It's going to be stored as a list. So we'll do a list comprehension here to filter this down to the recent ones. And then we put that back. And if there are more than the threshold, in this case more than five, we're going to raise an exception. Your account is locked. Try again later. So it's kind of a self-maintaining in-memory dictionary. I would probably put it somewhere else if you have any real web app, because you're going to fan this out into a web garden with multiple processes. So something like Redis, Valkey, or DiskCache would be a better place than memory for this. But, you know, it fits on a slide, right? We also want to record a failed attempt. This username tried to log in, and it was at this time. We just append it to the list. And then here's our new login method. It doesn't look that different. But notice we have a rate limiter up here that allows 10 attempts per minute per IP address. So that's one level. This is sort of a two-level protection. You technically could probably get away with just the inner one, but it's always good to have more. And then we have our check lockout. Remember, if the account is locked, we actually raise an exception here, a 429 too many requests sort of thing. So we don't get past this if their account is locked, right?
It's not like we forgot to do an if. And then we basically write the same code as before. Get a user, verify the password. If they've logged in correctly, we reset their failed attempts and give them access as before. But if they do not succeed in logging in, then we record a failed attempt. The next request will run the check account lockout, and so on. So here's one option that you can use, one way that you can do it. There are a lot of ways that you can make this work, but make sure you put something there to slow people down.
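The account lockout half of that can be sketched with nothing but the standard library. This is my approximation of the idea, not the course's code: the names, the 15-minute window, and locking at exactly five failures are assumptions, the per-IP limiter from SlowAPI is left out, and check_password is a hypothetical callable standing in for the user lookup and hash verification.

```python
import time
from collections import defaultdict

MAX_FAILED_ATTEMPTS = 5
LOCKOUT_WINDOW_SECONDS = 15 * 60  # assumed window

# username -> timestamps of recent failed logins (in-memory, for the sketch)
failed_attempts = defaultdict(list)


def is_locked_out(username, now=None):
    now = time.time() if now is None else now
    # Keep only the failures inside the window, then store the trimmed list.
    recent = [t for t in failed_attempts[username] if now - t < LOCKOUT_WINDOW_SECONDS]
    failed_attempts[username] = recent
    return len(recent) >= MAX_FAILED_ATTEMPTS


def record_failed_attempt(username, now=None):
    failed_attempts[username].append(time.time() if now is None else now)


def reset_failed_attempts(username):
    failed_attempts.pop(username, None)


def login(username, password, check_password, now=None):
    # The lockout check runs first, before any password work is done.
    if is_locked_out(username, now):
        return "429 account locked, try again later"
    if check_password(username, password):
        reset_failed_attempts(username)
        return "200 token issued"
    record_failed_attempt(username, now)
    return "401 invalid credentials"
```

The now parameter exists only so the behavior can be exercised without waiting. As noted above, a multi-process deployment would keep these counters in something shared like Redis rather than in a module-level dictionary.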
|
|
|
transcript
|
1:55 |
Example two, situation two, is client-side-only enforcement. You've got a React or Angular or Vue web app, and it feels like almost all of the logic of your code is in the front end, and you just have this dumb API endpoint that exchanges JSON on the back end. So you put all your smart logic in the view, not realizing that people don't have to go through the view to get to your web app. Things can go right around it, and it's a problem. I've had debates about this with people online. I said, well, you have to do this on the server. And they say, no, no, no, we just have required, or we have some setting in HTML5 for an input. I'm like, you have no idea what you're talking about. People don't have to use your HTML to submit to your website. They can do all sorts of things. So that's what this is about. Here's the scenario. A FinTech startup, VaultPay, builds a sleek React dashboard for a business expense management app. The front end enforces a $5,000 per transaction transfer limit and disables the send button if the amount goes above it. See, don't worry. You can't even submit it. The send button is gray and it's disabled. It does nothing. It also grays out on restricted accounts. The API behind it is a nice FastAPI endpoint that accepts the amount and recipient ID and processes the transfer. A disgruntled employee who happens to know a little bit about the web opens the browser dev tools, watches the outgoing POST request, and then converts it to a curl command in the terminal, setting the amount to $95,000 and the recipient to their personal account. The server processes it without hesitation because it never checked the limit. The front end was the only gatekeeper, and the attacker simply walked around it. By the time the anomaly shows up in weekly reconciliation reports, three transfers totaling $280,000 have left the company's operating account. Not great.
|
|
|
transcript
|
0:58 |
Why is this front-end-only validation bad? The bypass requires zero sophistication. You've got your browser dev tools, you can right-click, you can set up proxies to watch the traffic, you can give it to AI and say, look, it did this, help me automate it. It creates a false sense of security. The development team sees validation code, QA tests the happy path through the UI, and everyone believes the constraint is enforced. The gap between perceived and actual security is the most dangerous kind. Every API endpoint becomes an attack surface. If one endpoint trusts the client, the pattern is likely systemic. Attackers who find one unprotected route will methodically test every other one for similar flaws. Compliance frameworks won't save you. PCI DSS, SOC 2, and other similar audits check for server-side controls. Client-only enforcement is a compliance failure that results in fines on top of direct financial loss.
|
|
|
transcript
|
1:29 |
Now let's look at the secure version. This comes in a couple of steps. Remember I pointed out the Pydantic model, the discount request, that's already doing some of the type checking and stuff. Here you can see we've got the product ID as an integer, and we've got the discount percent as a float. That used to be all there was there. But now in this new one, we've added this field validator for the discount percent. And it says, hey, if it's less than zero or greater than 20%, not allowed. That's our business rule. Now, these people have never heard of Black Friday. 20% might be too small, but that's the rule that they put in here. It can never be discounted more than 20%, and here it's enforcing that right at the FastAPI level. So here you can see this is the model that comes into FastAPI. Before your apply discount code even runs, FastAPI is going to have Pydantic parse the inbound JSON. Pydantic will look at it, throw an exception if it's wrong, and we're done. It never gets to your code. So it stops right there. That's where the validation happens. So now after that, we can assume that all the Pydantic validations have run. So we get our product, and we check: if there's no product, we raise a 404. That should have been the first check, but that's fine. And we also have our discounted price, which has exactly the same math as before. But remember, Pydantic has validated the data before it even populates the model. So we know it's within the bounds of what we set. Pretty cool, right?
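To show the shape of that rule without requiring Pydantic, here is a sketch that swaps in a standard library dataclass whose __post_init__ plays the role of the field validator. That substitution is mine. The class name and fields follow the description above, and the rest is assumed.

```python
from dataclasses import dataclass

MAX_DISCOUNT_PERCENT = 20.0  # the business rule from the example


@dataclass(frozen=True)
class DiscountRequest:
    product_id: int
    discount_percent: float

    def __post_init__(self):
        # Runs on construction, so an out-of-range request can never exist.
        if not 0 <= self.discount_percent <= MAX_DISCOUNT_PERCENT:
            raise ValueError("discount_percent must be between 0 and 20")


def apply_discount(price, request):
    # By the time we get here, the request has already been validated.
    return round(price * (1 - request.discount_percent / 100), 2)


ok_price = apply_discount(100.0, DiscountRequest(product_id=1, discount_percent=15))

try:
    DiscountRequest(product_id=1, discount_percent=95)
    rejected = False
except ValueError:
    rejected = True
```

The design point is the same as with Pydantic: the check lives on the server, on the data model itself, so no endpoint can receive an invalid request object no matter what the client's UI allowed.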
|
|
|
|
16:06 |
|
|
transcript
|
1:56 |
Time for OWASP number seven, authentication failures. This one is a prime target for issues that could show up, and for things that you can pretty easily fix, which is nice. So authentication failures covers weaknesses in how applications verify user identity and manage sessions. This category includes 36 weaknesses ranging from credential stuffing and weak passwords to session fixation and hard-coded credentials. Any flaw that lets an attacker impersonate a legitimate user, those are not good. How does this manifest? Well, the application could permit automated attacks such as credential stuffing, that is, taking known passwords from some breach unrelated to your site and trying them over and over on your site to see if that user also exists there and has reused their password. Or brute force or other automated script attacks. It could permit weak passwords such as password or monkey123, things like that. It could allow users to create an account with already known leaked passwords. So credential stuffing is where somebody gets leaked passwords and tries them. This problem is where you know that this account with this email and this password exists in some breach somewhere already, but you still let users create an account with it. How are you going to know? Track all the dark webs? We'll see. It could allow weak or ineffective credential recovery or forgot password type things, such as your mother's maiden name. You can't reset your mother's maiden name, so if that gets leaked, it's a problem. Missing or ineffective multi-factor auth. Or weak fallbacks like, oh, you forgot your 2FA? What's your mother's maiden name? Or even user sessions that aren't properly invalidated, including in single sign-on scenarios.
|
|
|
transcript
|
1:16 |
Our first area we're going to focus on is weak password policies, and in a multifaceted way. It's going to be interesting. So here's the scenario. A university deploys a Django-based research collaboration portal where faculty share unpublished papers, grant proposals, and experimental data. To reduce support tickets, the IT team removes all password validation. Let users choose what they want. It's just a simple internal thing, right? Registration only requires a valid .edu email. A security audit six months later reveals that 23% of accounts use passwords from the RockYou breach list. The top five passwords are actually password, 123456, research, the university's name, and letmein. An undergraduate computer science student curious about upcoming grant decisions runs a dictionary attack against the portal's login endpoint and cracks the department chair's password, science1, no caps, in under 30 seconds. The student accesses three unfunded grant proposals, a tenure review letter, and salary data for the entire department. The breach triggers a mandatory disclosure under the state's data protection laws, and the university's research partnerships are frozen pending a full security review.
|
|
|
transcript
|
0:54 |
So what's wrong here? Well, instant compromise. Passwords like password123 and QWERTY and so on are the first entries in a brute force dictionary attack. An account with these passwords opens up straight away. Credential stuffing amplifier. Weak password policies mean that users pick passwords that are already used elsewhere, like in those breaches, or that are incredibly simple. So credential stuffing works more easily. Regulatory exposure. You don't want to get on the wrong side of regulation. NIST 800-63B and many other compliance frameworks such as HIPAA, SOC 2, and so on require checking passwords against known breach lists. We talked about this: you let a user create an account with an email and a password that are known in a breach list. So that's not good. It's not good because people can break it, but it's also not good because it's a regulatory violation.
|
|
|
transcript
|
1:26 |
All right, as usual, let's look at the bad version. So here we're doing Django, as we said in the scenario. And this is part one of our view method that handles registration. So you can see it's a POST and so on. So we come in and we get the body. We pull out the username, password, and email. We make sure, hey, validation, no problem. We make sure the username, password, and email are all set. And we check and make sure there's no user with this username already. We should also check email. We don't want users to be creating an account for one that's already taken, right? That's not great. But there are other issues as well. At the end, let's just cut to the chase: we're creating an account with only that set of validation in place. So what could go wrong here? There's no password strength validation at all. Just the letter A, my favorite password. No minimum length requirements. Again, my favorite password, the letter A, is allowed. We're not checking against known password breaches and dumps and so on. We're not checking that the username is not the same as the password, like taking whatever your email address is and just using that again so you don't have to remember it. Django has built-in auth validators, but if you look in the settings, they're just empty. So that's not great. And then we just create the user. And this literally just accepts pretty much any input we want. Not ideal, right?
|
|
|
transcript
|
3:27 |
All right, so how do we make this right? I actually want to show you kind of two variations. There's going to be a little bit of redundancy in this example, but I want to show you the Django way to address some of these issues and provide you tools so that even if you're using Flask, FastAPI, whatever, you can also use some of the tools that are in the source code. All right, so here are the Django-specific settings. We can set up our auth password validators. So we can reject passwords that are too similar to the username, email, or first and last name. Using the minimum length validator, we can require at least 12 characters. Set it to 12. People are going to be like, oh, I've got to remember this. I can't use monkey123. But you know what? They should be using a password manager. And Django checks against its built-in list of 20,000 common passwords. Monkey123 surely is among them. All-numeric passwords, not allowed, not enough entropy. And a breached password validator against Have I Been Pwned. This one actually is not implemented by Django. It doesn't come with Django. This actually comes from the source code that comes with this course. So we've got this breached password validator. It uses the Have I Been Pwned API. Part two. So we set up the Django app that way. This is a little more general. So again, we get our password information in here, and we validate that all of these things are supplied. Excellent. And then we validate length. So Django is already saying it's 12, but if you're not using Django, you're definitely going to want to check explicitly what the length is. You can return a bad request. Also in the source code, I've included an is-top-10,000 password checker in this common passwords class, and you can check that. Again, this is probably unnecessary in this Django example, but in any other web app, you're going to need to do this like so. So you need to check for HIPAA and other things.
Has this username and password been exposed in a known password breach? Well, you don't want to submit the actual password, but there's this really clever API from Have I Been Pwned that uses hashes, and just part of the hash, and so on. So this common passwords module in the source code that you get with this course has an is breached password function that uses the free Have I Been Pwned API to check here. So that's how we're checking. Again, we've already registered this in Django, but if you're doing it somewhere else, you're going to need to call it explicitly. And then finally, we'll check if the username exists. We still probably should check email as well. We're going to create a user in memory so that the Django validators can run, and then we're going to save it to the database. You don't want to save it to the database and then see if the validators are good, right? You want to check first. So we're going to check and return that error here. And after all that, the user is created. There's a lot of stuff. This is the multifaceted thing that I talked about. Is the password long enough? That's one form of weakness. Is it a common one? Another form. Is it a breached one? That's another form. And Django had those auth validators for similarity. So lots of really cool things we can do, but it's a multifaceted approach. It's not enough to just say, well, it's an eight-character password, so we must be good.
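The part-of-the-hash trick can be sketched like this. It is not the course's common passwords module, the function names are mine, and it assumes the Pwned Passwords range endpoint works as I understand it: you send the first five hex characters of the SHA-1 hash and get back lines of SUFFIX:COUNT to compare locally. The fetch parameter is there so the logic can be exercised without a network call.

```python
import hashlib
import urllib.request

HIBP_RANGE_URL = "https://api.pwnedpasswords.com/range/"


def split_sha1(password):
    # Only the first 5 hex characters ever leave this machine.
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def breach_count(suffix, range_body):
    # The response is expected to have one "SUFFIX:COUNT" pair per line.
    for line in range_body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate == suffix:
            return int(count)
    return 0


def is_breached_password(password, fetch=None):
    prefix, suffix = split_sha1(password)
    if fetch is None:
        # Default: ask the real service for every suffix sharing our prefix.
        def fetch(p):
            with urllib.request.urlopen(HIBP_RANGE_URL + p, timeout=5) as resp:
                return resp.read().decode("utf-8")
    return breach_count(suffix, fetch(prefix)) > 0
```

Because the service only ever sees a five-character prefix shared by many different passwords, it cannot tell which password you were checking. In a registration view, you would call is_breached_password(password) and reject the request if it returns True.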
|
|
|
transcript
|
1:17 |
Example two, password reset, in particular, insecure password reset. So the scenario: a SaaS project management platform serves about 200 small businesses, and a disgruntled ex-employee of one of those businesses wants to access the former company's workspace. They know their old manager's email address, and they know there's a weakness in the password reset, so they use that email address to request a reset. The manager probably goes, well, I didn't ask for this. And, you know, those emails typically say, oh, you can safely ignore this if you didn't ask for it. Maybe. So in this case, the reset endpoint returns a token that's just an MD5 hash of a Unix timestamp. Not good. The ex-employee notes that the reset was requested at such and such time. And they quickly create a script to hash every second in a five-minute window around that time. On the 87th guess, they have a valid token. They reset the manager's password, log in, and download six months of confidential project files, client contacts, and salary spreadsheets before the manager even checks their email. Because the token has no expiration, the attacker bookmarks the reset URL pattern. Even if the manager resets their password again, the original token will still work. The platform never invalidated it.
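To see why that token scheme is so weak, here is a small sketch of the arithmetic. The timestamps are made up, and the exact token format (MD5 of the timestamp's decimal string) is an assumption about what MD5 of a Unix timestamp means in the scenario.

```python
import hashlib


def weak_reset_token(unix_seconds):
    # The flawed scheme from the scenario: MD5 of a Unix timestamp.
    return hashlib.md5(str(unix_seconds).encode()).hexdigest()


def candidate_tokens(approx_time, window_seconds=300):
    # Every possible token in a window centered on the approximate time.
    start = approx_time - window_seconds // 2
    return {weak_reset_token(t) for t in range(start, start + window_seconds + 1)}


issued_at = 1_700_000_087  # when the server really made the token (made up)
real_token = weak_reset_token(issued_at)

# Knowing only the rough time is enough to enumerate every candidate.
guesses = candidate_tokens(1_700_000_000)
```

A five-minute window is only about three hundred candidates, which is why the scenario's attacker succeeds within a few dozen guesses. The hash function adds nothing here, because the input has almost no entropy.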
|
|
|
transcript
|
1:18 |
Oh, what could be wrong here? Well, let's see. Predictable tokens. Timestamp-based tokens have very low entropy. An attacker who triggers the reset knows the approximate time and can use that as a starting point for guessing. There's no expiration, so tokens found in old logs, email archives, and browser history remain valid forever. So two years later, someone breaks into your Gmail, they look at all the reset emails and see if they still work. Token reuse. Without single-use enforcement, tokens could be reused over and over again, right? Once you reset your password, does it use up or mark that reset as used and make it permanently invalid? Account enumeration. If, when you ask for a reset, it says different things, like, sorry, that user doesn't exist, you can't reset a password with an email that doesn't exist, versus, an email has been sent, that tells people which emails are there. So you just try a reset for every email you want to guess. Token leakage. Returning the token in the API response, or maybe saving it into a log, something like that, so people can run across it. Or you're trying to skip waiting for the email in a development scenario, and then you forget and ship that to production.
|
|
|
transcript
|
1:55 |
For this one, we're going to look at an invalid password reset or a weak password reset scenario using FastAPI. So here we've got a request. We've got a nice Pydantic model that maps the request data. So we get our email pulled in from however it was submitted, a JSON body or something like that. And we're going to check to see if there's a user. And we're going to say, sorry, user not found, as opposed to something like, reset email sent, check your email. So this allows that enumeration of users. You don't know what their passwords are, but as you send a bunch of emails here, you'll be able to go, oh, this one actually has an account and that one does not. So that's the first problem. We also have a bug where the token generation is predictable, like we talked about, using something like time as your randomness. If you actually use sequential tokens, oh my goodness, like reset one, reset two. No, that's really bad. Also, there's no expiration here. Saving to the database, in this sort of pseudocode way, we just save: here's a reset with this token as the key, and the email is whatever the email is, but no date for when it was created, so we can't figure out if it's still valid and so on. Also, here we have this print statement standing in for some form of logging. It says, hey, we sent the reset link to this email address. This is great to put in your logs. You should do that. But if the link itself contains the token, then you're saving the token. And then of course, if the token doesn't expire, people find it two years later in some kind of breach, and then they can go reset all the stuff. It's a cascading problem. Then finally, we're returning email sent. Oh yeah, and here's the debug token, so you don't have to worry about going to that pesky email and waiting for it to arrive in dev mode, right? But it accidentally gets left in production.
|
|
|
transcript
|
2:37 |
Let's make it right, folks. Here's the secure version. So first thing is we're going to set up a generic message that always looks identical no matter the scenario. We're going to put it at the top of the function and reuse it as a variable so there's no chance that there's any variation. So here, if we try to get a user and there's no user, we log that a password reset was requested for a nonexistent email. And then we're going to return: if an account with that email exists, a reset link has been sent. So either way, they're going to get that message. They won't be able to enumerate the users. Also, we're using the secrets library to create a much safer token with 256 bits of entropy. So that is much less guessable. And then, we didn't talk about this yet: we're going to store a hash of the token in the database, kind of like passwords. So if somebody were to find some kind of SQL injection vulnerability, or they just get access to the database, they wouldn't be able to read all those reset tokens and then reset and take over all those accounts, right? So this way, much like password hashing, we're going to store a hash of this. It's not super, super intense. We could do more, right? We could do an Argon2 hash or whatever, but storing some form of hash makes it a little bit safer if somebody gets hold of our database. Continuing on, we can set a short expiration. So here you can see we've now added this created at and whether it has been used. That solves kind of two problems in one there. And OWASP recommends no more than 15 to 30 minutes. After that expiry time, it's done. Also, this is a little bit annoying for the users, but it's good: if you request multiple resets, the subsequent reset invalidates the previous one. So here we're going to go through and find all the reset tokens associated with a particular email address, and we're going to delete them out of the database.
Once you request a new reset request, then you go through and delete all the old reset tokens that you could use. Finally, when you log, the password reset has been granted or generated for this user. We don't say, and here's the link with the token. We just say, this user did a reset. We don't actually save what that is. If we really need to know, we could go back in the database, do a query by email, sort by created date, created at, and then boom, that's the one. Okay, so not too hard, but a lot of subtlety here that you want to make sure you take into account for your resets.
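To make those pieces concrete, here is a minimal, framework-free sketch of the reset flow described above. It is not the course's FastAPI code: the in-memory `USERS` and `RESET_TOKENS` stores, the function names, and the 15 minute TTL constant are all assumptions standing in for a real database and endpoint.

```python
import hashlib
import secrets
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

# One message for every outcome, so responses cannot be used to enumerate users.
GENERIC_MESSAGE = "If an account with that email exists, a reset link has been sent."
TOKEN_TTL = timedelta(minutes=15)  # assumed value, at the low end of the 15 to 30 minute range

# Hypothetical in-memory stand-ins for the users and reset-token tables.
USERS = {"dana@example.com"}
RESET_TOKENS = {}  # token hash -> {"email", "created_at", "used"}


def _hash_token(token: str) -> str:
    # Only the hash is stored, so a leaked table does not reveal usable tokens.
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def request_reset(email: str, now: Optional[datetime] = None) -> Tuple[str, Optional[str]]:
    """Return (message, token). The token belongs in the email only, never in the response or logs."""
    now = now or datetime.now(timezone.utc)
    if email not in USERS:
        return GENERIC_MESSAGE, None
    # A new request invalidates every earlier token for this email.
    for key in [k for k, v in RESET_TOKENS.items() if v["email"] == email]:
        del RESET_TOKENS[key]
    token = secrets.token_urlsafe(32)  # 32 random bytes = 256 bits
    RESET_TOKENS[_hash_token(token)] = {"email": email, "created_at": now, "used": False}
    return GENERIC_MESSAGE, token


def redeem_reset(token: str, now: Optional[datetime] = None) -> Optional[str]:
    """Return the email if the token is known, unused, and unexpired; otherwise None."""
    now = now or datetime.now(timezone.utc)
    record = RESET_TOKENS.get(_hash_token(token))
    if record is None or record["used"] or now - record["created_at"] > TOKEN_TTL:
        return None
    record["used"] = True  # single use
    return record["email"]
```

In a real endpoint only the message would go back to the caller; the token is returned here purely so the caller can hand it to the mailer.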
|
|
|
|
13:09 |
|
|
transcript
|
0:33 |
Next up, number eight, software or data integrity failures. This is a good one. So software or data integrity failures occur when code and infrastructure fail to protect against untrusted data, unsigned updates, or unverified external resources being treated as trusted. This category covers insecure deserialization, missing signature verification, mass assignment (this is going to be one we'll cover), as well as CDN trust without integrity checks, or CDN trust pretty much at all, and client-side data tampering.
|
|
|
transcript
|
2:04 |
The first one up is mass assignment. Now this abuses a feature that I absolutely love. You'll find it in Django, you'll find it in FastAPI, you'll find it in ASP.NET. Lots of cool web frameworks have it, but it can be a problem. Here's the scenario. An online marketplace built with Django REST framework lets independent sellers list products. Okay, so far so good. The product model includes fields sellers should control, like title, description, and price, alongside fields only administrators should manage but users need to see, like is approved, is featured, cost price, seller, and so on. Now the Django serializer has fields set to dunder all because it's faster than listing all the fields individually, and it's also future-proof. You add a new field, it automatically starts getting serialized. Good, right? Maybe. The seller discovers that the PATCH endpoint accepts any model field by inspecting the API's browsable interface. They send a request with is approved set to true and is featured set to true, because why wouldn't you want those as the seller? And their product immediately appears on the homepage, bypassing a two-week admin review. Other sellers catch on, and within days, the featured section is flooded with unreviewed products, including several counterfeit listings. Add on to that, one particularly savvy seller realizes that they can also set seller to 42, where their own seller ID is not 42. So what they can do is reassign a listing of their own to a competitor's account. And the one they're going to reassign is some counterfeit good that they've uploaded. And now they're moving it over. So they can make it appear like this counterfeit good is being sold by one of their competitors. The marketplace gets hit with a fraud complaint and the mess takes weeks to untangle because the audit logs only show product updated with no record of which fields were changed. 
So mass assignment, while very, very handy to say this class models the inbound data, you've got to be careful because you could assign everything potentially.
|
|
|
transcript
|
0:41 |
So why is this dangerous? Four reasons. Privilege escalation. A seller can set is approved, which bypasses the review. Data manipulation. A seller can change cost price or internal notes meant only for internal use. Ownership hijacking. By changing the seller ID, an attacker can reassign a product to a different seller's account. And editorial bypass. They can skip the placement process that normally requires some kind of editorial review, like setting a product to be featured, maybe some low quality product you don't want featured. So there's a lot of ways that people can manipulate how the web app works and take advantage of it.
|
|
|
transcript
|
1:13 |
Now there are three different pieces in play here. So we're going to click them together with these different little squares. So at the top, we've got the class that represents what is exchanged. This is the thing being mass assigned to. So what we can say is we can say our API takes a product and we've specified all the elements of it. It's got seller, title, description, and so on. And this is totally fine. There's nothing wrong with it inherently. But then we've set the serializer to use dunder all, which exposes every field, both for reading and writing. Maybe sometimes we don't want to send those, some kind of internal element across. And we also maybe don't want to let people change things like is approved, but it's easier to say dunder all for the fields. So we're off to the races. Now, an attacker could construct some kind of HTTP requests like we have down here at the bottom. They've got a proper URL, proper content type, it's their token, but they could set things like one cent for the price, reassign the seller, make it featured, make it approved, all of these kinds of things. So this is the kind of stuff that can go wrong with mass assignment.
|
|
|
transcript
|
0:52 |
Let's fix this pesky mass assignment problem. So our product is unchanged, same as before, but we're going to change the serializer. So what we're going to do is change the Meta class and not just say dunder all for the fields. We set the model to product, and for the fields we're going to list out each and every one of them. So ID, for example, and is approved. We do include those, but we're going to set them to read only in this particular case. So we're going to say ID, is approved, the seller, all of these things are set to read only. That way, when somebody makes a GET request to display their page, all the data they need to show approved or not yet approved is there, but they can't mass assign it back. A request can only assign the fields we've listed, minus the ones we've set as read only right there.
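The same allow-list idea can be sketched without any framework. This is not the Django REST framework serializer from the slides; the field sets and the `apply_seller_patch` helper are hypothetical names, and the product is a plain dict standing in for the model. Returning the rejected field names also gives the audit log something better than just "product updated".

```python
# Fields a seller may write, and fields they may only read (assumed split, per the scenario).
WRITABLE_FIELDS = frozenset({"title", "description", "price"})
READ_ONLY_FIELDS = frozenset({"id", "seller", "is_approved", "is_featured", "cost_price"})


def apply_seller_patch(product: dict, payload: dict) -> list:
    """Copy only allow-listed fields from payload onto product.

    Returns the sorted names of fields that were sent but not applied,
    so the caller can log or reject the attempt.
    """
    rejected = sorted(name for name in payload if name not in WRITABLE_FIELDS)
    for name, value in payload.items():
        if name in WRITABLE_FIELDS:
            product[name] = value
    return rejected


product = {
    "id": 7, "seller": 12, "title": "Old title", "price": "19.99",
    "is_approved": False, "is_featured": False,
}
attack = {
    "title": "New title", "price": "0.01",
    "seller": 42, "is_approved": True, "is_featured": True,
}
ignored = apply_seller_patch(product, attack)
```

After the call, `title` and `price` have changed, while `seller`, `is_approved`, and `is_featured` keep their original values and show up in `ignored`.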
|
|
|
transcript
|
1:46 |
On to untrusted resources. In this case, we're going to talk about untrusted CDNs. So here's the scenario. A financial services company. When you hear the word financial, it should probably make you double down on security, right? A financial services company builds an internal reporting dashboard with Flask. The front end uses Vue.js, Chart.js, and Tailwind, all loaded from a public CDN to keep the Docker image small and leverage browser caching. All good things. The script tags point to versioned URLs on jsDelivr and cdnjs, but no one adds integrity attributes. The thinking is, if you pin the version, the file can't change, so why do we need to set the integrity? The integrity attribute is basically a hash that goes on the script or CSS tag in the HTML that says, if you get this from the CDN and it doesn't match this integrity hash, don't load it. One morning, a CDN provider's deploy pipeline is compromised through stolen maintainer credentials. You know, one of the maintainers was phished or something like that. The attacker modifies the served copy of a popular JavaScript library to include a keylogger that captures input and sends it to an external endpoint. Because the attacker modifies the file at the CDN layer, the version number is not changed, but the file is. The dashboard's users, financial analysts with access to revenue data, client portfolios, and trading positions, start their day by logging in as they always do. The keylogger captures the credentials for 60 internal accounts before the CDN compromise is detected and reverted four hours later. The company only learns about the breach when stolen credentials appear on dark web monitoring alerts two weeks later.
|
|
|
transcript
|
1:04 |
What's the harm? Well, full page access. Injected JavaScript can read the DOM, steal cookies (potentially, depending on how the cookies are set), capture keystrokes, all sorts of bad things. Session hijacking. Malicious scripts can steal session tokens as well and send them to attacker-controlled servers, which can then use those sessions onward. There's a massive blast radius. A single compromised CDN file affects every site that loads it. This is not theoretical either. In 2018, the Browsealoud incident injected a crypto miner into thousands of government websites simultaneously. Not thousands of users, users of thousands of websites. That's really bad. CSS exfiltration. Even tampered CSS can steal data with certain attributes. So you definitely don't want to do this. You're basically saying that site over there can run arbitrary code for every user of my site if they want. That's a high level of trust, actually.
|
|
|
transcript
|
1:18 |
Now let's look at some HTML. Yes, just straight HTML. Now this is probably in a Jinja template, this being Flask, but it could be in plain HTML really. So this template loads multiple JavaScript libraries from external CDNs with no integrity attribute on the script or link tags. So the browser trusts whatever the CDN is going to give back and it's going to run on the user's computer in their browser. So we've got, you know, cdnjs, jsDelivr, and so on. And here we're getting Vue, and again, we're pinning the version, so that's good. And we're getting the minified version, so that's good. No integrity check. Same thing here. Same thing with our Tailwind JS and CSS and all these kinds of things. Not good, folks. Not good. So on top of just the security problem, there's a potential reliability problem. So if this CDN goes down, then your website stops working, especially if you're using a heavy front end framework like Vue where the page doesn't even really do anything without it. But I guess even if only the CSS is gone, it's pretty whacked. So you have a security problem potentially, but also a reliability problem even if the security is fine.
|
|
|
transcript
|
3:38 |
So how do we fix it? Let's look at a better version. There's a couple things you can do. You can put integrity hashes on those links. Interestingly, I actually just checked on jsDelivr and said, I want the link for Tailwind. And it gave me the link that I showed you without integrity. So it kind of encourages you to do it that way, which is not ideal. So what do I recommend you do? I recommend that you take ownership of these external libraries as much as you can. So you can just simply install these locally. You could either use npm install and copy them over from node modules, or just download them. Then I have either a vendor folder or an external folder where each one of these gets copied into. So the Vue folder would be in there, Chart.js, Tailwind, and so on. Because sometimes something like Tailwind might depend on other things that it imports, you might want a little more structure. But if you take these and download them and serve them out of your site, yes, you'll have a little bit more traffic, maybe, but you're guaranteed that they're going to be there. And that solves both the problems that I highlighted before. The reliability, you're not depending on any other external systems. So there's no way that your site can stop working because the CSS or JavaScript stopped loading. Yes, your site could go down, but if your site's going to go down, it doesn't matter if people can get your CSS, your site's down anyway. So it kind of isolates that reliability issue, which is great. And it's your own code. It only changes if you go and copy a new version in here. So the chances that something gets compromised are way, way lower. Now you might say, well, we've got a lot of traffic. We can't possibly serve it all. This is what we do over at Talk Python Training. You're watching this course right now. You've got code exactly like this. And what I've done is I've just set up an external CDN. 
And I actually really like bunny.net, but you can use Cloudflare or whatever you want and just specifically load all of our static assets, images, videos, CSS, JavaScript, and so on. All of those load out of a separate CDN with 120 global points of replication, POPs, points of presence. But they are all based on the origin of our data that we control. So basically the fix I'm suggesting is vendoring it in, and if you have tons of traffic and you need a CDN, make your own CDN. It's incredibly cheap. It's incredibly easy to set up over at bunny.net or anywhere else. Now, once you vendor it in, you probably want to set long cache times on it so that not every request fetches that file. You want it to be cached for an hour, a day, or even a year, so that when people come back and visit, they only load these things once. It makes a really good user experience. But then you're going to need some way to say this file has changed. Otherwise you get stale JavaScript and so on. And that's its own problem. So you can do some kind of hash here. Like you could put it in the file name, which is really popular with static site builders. I put it as a query string on the end. It doesn't really matter, but you just need to somehow tweak the URL based on the file content. That way, if you ship a new version, the URL is actually different, and eventually the older unused version just fades from the browser cache. So this is my recommendation to solve the problem of depending on a CDN that might get hacked, taken over, go down, all those things. If you don't want to do this and you still want to just use those CDNs, make sure you put an integrity hash on the links.
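Both ideas here, the integrity hash and the content-based URL tweak, come down to hashing the file's bytes. The sketch below shows one way to compute them with the standard library; the helper names and the `?v=` query parameter are assumptions for illustration, not the code used on the course site.

```python
import base64
import hashlib


def sri_hash(content: bytes) -> str:
    """Build a Subresource Integrity value: algorithm prefix plus base64 of the raw digest."""
    digest = hashlib.sha384(content).digest()
    return "sha384-" + base64.b64encode(digest).decode("ascii")


def cache_busted_url(path: str, content: bytes) -> str:
    """Append a short content hash so the URL changes whenever the file changes."""
    version = hashlib.sha256(content).hexdigest()[:12]
    return f"{path}?v={version}"


def script_tag(url: str, content: bytes) -> str:
    """Render a script tag that the browser will refuse to run if the bytes differ."""
    return (
        f'<script src="{url}" integrity="{sri_hash(content)}" '
        f'crossorigin="anonymous"></script>'
    )


# Stand-in bytes; in practice you would read the vendored or CDN file from disk.
library_bytes = b"console.log('hello');"
tag = script_tag("https://cdn.example.com/lib.min.js", library_bytes)
local_url = cache_busted_url("/static/vendor/lib.min.js", library_bytes)
```

With a long cache lifetime on `/static/vendor/`, shipping a new file produces a new `?v=` value, so browsers fetch it once and the stale copy simply stops being requested.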
|
|
|
|
12:37 |
|
|
transcript
|
0:36 |
We're on to number nine of the OWASP top 10, logging and alerting failures. I really like this one. Logging is so easy to add and it even helps you understand your application better, but it's super easy to forget these things. So what is it? Without logging and monitoring, attacks and breaches can go undetected. And without alerting, it's very difficult to respond quickly and effectively during a security incident. This category is about insufficient logging, continuous monitoring, detection, and alerting to initiate active responses.
|
|
|
transcript
|
1:17 |
Example 1. No authentication logging. Here's the deal. A boutique law firm uses a Flask-based client portal where attorneys upload privileged documents and clients download them. The portal has reasonable security. Bcrypt hashed passwords, HTTPS everywhere, session timeouts, but the developer who built it never adds authentication logging. Logins succeed or fail silently and the application simply redirects the user and moves on. A disgruntled former paralegal terminated two weeks ago discovers that their account was never deactivated. They log in from their home IP at 2 a.m., download case files from three high-profile clients, and log out. They repeat this nightly for a month, accessing increasingly sensitive materials related to an ongoing merger negotiation. Because no authentication events are logged, there is no record of these sessions ever occurring. The breach surfaces when confidential merger terms appear in a competitor's counteroffer. The firm's IT consultant is asked to determine when the leak happened and through which account. They find nothing. No login records, no session history, no access timestamps. The firm cannot even prove to their malpractice insurer that the breach happened through a specific vector. Their security liability claim is denied for insufficient security controls.
|
|
|
transcript
|
1:00 |
Why is this dangerous? Brute force attacks become invisible. Without failed login records, an attacker can attempt thousands of credential combinations with no evidence accumulating. And rate limiting based on the logs is impossible. And of course, if you combine no rate limiting with no logging, well, that's not good. Account compromise has no detectable signal. Even if you suspect an account is compromised, you can't confirm it by looking at unusual IP addresses, times, and so on. Incident response always starts from zero. You can't go back and look at the logs. You have to start by adding logging and then go from there. That's no good. Insurance and legal liability increase dramatically. Cyber liability policies and regulatory frameworks expect authentication logging as a baseline control. And as our scenario laid out, deactivation gaps go undetected. Orphaned accounts from former employees are a top attack vector. And login monitoring is a primary way for organizations to discover that offboarding was incomplete.
|
|
|
transcript
|
1:40 |
Let's look at a deficient login. It's surprisingly simple because it doesn't have to do anything like logging or checking and so on. So here's a Flask-based login. You post to /api/login, it grabs the JSON data, and goes from there. So the bug is that there's just no logging present. And you see that if an attacker tries to brute force credentials from a breached database, we're just guessing. We could get thousands of failed attempts from a single IP and we wouldn't know. Successful logins from unusual locations are not detected. Login attempts with emails from known breach lists are also not detected. And, you know, credential stuffing, many users, few passwords, and so on. Okay, so here, how does it work? We go get the JSON body from the request. That gets our data. From that, we pass that to our database and get the user back. And if there's no user, hey, there's no user by that username, so we're not going to let them log in. And hey, look, we're checking our password against a hash, maybe bcrypt, maybe Argon2, lots of good things. And we just say, nope, invalid, 401. We don't do things like log that this user from this IP address at this time did this thing. We're also not adding any rate limiting, but that's not the point of this example, right? But they should be combined. Also, when it does succeed, if they do have a username that matches and a password that matches, we still don't log it. This is our scenario where we talked about this paralegal who got fired, but their account never got shut off. This is the path they would take. They log in just like they did three weeks ago before they were fired.
|
|
|
transcript
|
2:27 |
So how do we fix it? Maybe we should call a guru, a security guru. No, a logging guru, Loguru. We can fix this with any logging framework, but I really like Loguru. I like the way that it works. And so I'm going to show you an example with Loguru. This is what we use at Talk Python for many of our things. So what we're going to do is import and configure our logger. This is a multi-step fix here. This goes somewhere at the top of the app, at the beginning of the application. I would say you want logging as soon as possible so you can even log application startup stuff correctly. We're going to create a security audit log specifically. You could put this in your main log and that would be fine, but having a dedicated security log lets people focus on that stuff if they want. And then we're going to set up a filter. With Loguru, when you write a log entry, you can attach extra information that says what this log message is about. So in this case, we're going to say it's about security auth. So only when we see that extra in here does the output go to the security auth log. The message looks like this. We're going to pass in a rich JSON thing that we could read back later, potentially with lots of details about what's happening. And then we rotate every 100 megabytes. And then we just bind that here to create our logger object, our security logger. Okay, so this is the setup. And then we're going to create a function called log auth event. So this way, we just call this function whenever we want to log an auth event. And we generate this JSON data here. And then we just call security logger dot info, warning, critical, and so on, and dump that out as a string. Finally, we come back to our login that was deficient before. And everything from here down to there is exactly the same, except now if something goes wrong, we log the auth event, type is login, outcome is failure, pass in the username, and so on. And then we return the details. 
Same thing when a successful login happens, we just log the auth event, login, success, user ID, username, and so on. If we jump back to our function, you can see here we're automatically pulling out of the request, the IP address, user agent, and so on. These things are actually really helpful for looking at issues like this in the logs.
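Here is a rough sketch of the log auth event idea. Note the swap: it uses the standard library `logging` module rather than Loguru so that it runs with no extra installs, and it takes the IP address and user agent as arguments instead of pulling them from the Flask request. The logger name and the event field names are assumptions.

```python
import json
import logging
from datetime import datetime, timezone
from typing import Optional

# Dedicated logger for auth events. In production you would attach a handler,
# for example logging.handlers.RotatingFileHandler, pointed at a security log file.
security_logger = logging.getLogger("security.auth")
security_logger.setLevel(logging.INFO)


def log_auth_event(
    event_type: str,
    outcome: str,
    username: str,
    ip_address: str,
    user_agent: str,
    user_id: Optional[int] = None,
) -> dict:
    """Write one JSON line per authentication event and return the event dict."""
    event = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "event_type": event_type,
        "outcome": outcome,
        "username": username,
        "user_id": user_id,
        "ip_address": ip_address,
        "user_agent": user_agent,
    }
    # Failures are logged at WARNING so they stand out; successes at INFO.
    level = logging.INFO if outcome == "success" else logging.WARNING
    security_logger.log(level, json.dumps(event))
    return event
```

In the login view, the failure branch would call something like `log_auth_event("login", "failure", username, ip, agent)` just before returning the 401, and the success branch would do the same with `"success"` and the user's ID. The password itself is never part of the event.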
|
|
|
transcript
|
1:27 |
Our previous example with logging was great, but how often do you watch the logs? So the next scenario talks about alerting on security events. A government contractor runs a Flask-based document management portal that handles controlled, unclassified information, CUI, for defense subcontracts. Their compliance officer checked the logging-enabled box during their CMMC assessment because the application does, in fact, write detailed security events to var logs portal security.log. The log captures failed logins, permission changes, document downloads with timestamps, and IP addresses. As I alluded to, the problem is that nobody reads the logs. There are no alerts, no dashboards, no log aggregation service. The security.log file rotates weekly, and old files are compressed into an archive directory that has grown to a whopping 47 gigabytes over three years. The ops team checks the logs only when a user reports the problem, which means they check for application errors, not security events. An attacker compromises an employee's VPN credentials through a phishing campaign and begins downloading the CUI documents at a rate of 200 per night for six weeks. The log file faithfully records every download. The breach is discovered only when the partner agency notices the documents circulating on the dark web forum and traces them back to the contractor. By then, 8,400 documents have been exfiltrated and every single download was logged.
|
|
|
transcript
|
0:44 |
What's wrong if we don't have alerting? Here are some things that can go bad. Logging without alerting creates a false sense of security. Teams believe they have it covered because events are being captured. Attacker dwell time grows unchecked. The longer an attacker operates undetected, the more data they can exfiltrate. Compliance frameworks require monitoring, not just logging. CMMC, SOC 2, PCI DSS, and HIPAA all distinguish between recording events and actively responding to them. High value patterns hide in volume. A single suspicious download is invisible in a log file with 50,000 entries per day. But a simple threshold alert, more than 50 documents downloaded per hour from one account, would set off alarm bells.
|
|
|
transcript
|
1:07 |
This time, this login has logging, but if something goes crazy and a whole bunch of logins start failing or happening from one place and so on, there's still a very, very small chance someone's going to notice that. So they're logged, but nobody monitors the file and there's no mechanism for alerting anyone. If an attacker runs a credential stuffing attack generating 50,000 failed logins during an hour, the log entries exist, but sit there silently on disk. Again, we get our data from the form post or API post or whatever. Then we're going to authenticate it correctly. We look up our user. And if that's a failed login, we don't get our user back. It checks the password hash as well. We say, hey, look, we're logging it. Failed login from this person at this address. And then we just tell them, nope, 401. And we log if it succeeds. And we just send them on their way. Here's your logged in status. Great. But how do you know, right? Very few people are going to be checking these logs and watching them go by.
|
|
|
transcript
|
2:19 |
Let's fix this. Let's add alerting in addition to just logging. So we're going to create a class called security monitor, and it tracks certain thresholds. And once you exceed one of those thresholds, some kind of alert will fire. In production, we'd use a dedicated service like Datadog, Sentry, or PagerDuty, rather than in-process tracking, right? So we would have an alerting webhook for Slack, potentially your PagerDuty API key, and so on. Then we're going to set some thresholds. How many failed logins are allowed per IP address every five minutes? That still seems pretty high, but you don't want to wake people up in the night just because somebody forgot their password, right? Then how many total failed logins across all IP addresses in the window, which is five minutes expressed in seconds here. And if somebody does an export, how many records do they have to export before we set off this trigger, right? 10,000. So here we have this track failed login, and it's going to get the time. It's going to do a little list comprehension to filter it down to just this five minute window that we're interested in. And then we're going to add it to our in-memory failed logins. Again, this should probably be some kind of database or something. Usually you fan these out across multiple processes, so you need to be a little bit careful. But it's just an example, right? Save it somewhere that we can get back to next time it runs. So count how many there are. If it's greater than the threshold, send this alert over to where? Over to Slack, PagerDuty, whatever it is, right? Credential stuffing attack detected, how many attempts there were, the last username that was tried, and so on. So with this security monitor in place, we go back to our login in Flask. This is how we're going to create this monitor. We're going to do the same thing. Check, check, check. But here's our fix. 
If the user does not log in correctly, we're going to track failed logins from this address for this user and so on. If this happens enough times, then all of a sudden we're going to set off alarm bells and say, hey, credential stuffing attack, too many failed logins from this IP, whatever.
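A stripped-down version of that monitor might look like the sketch below. It keeps only the per-IP check, stores attempts in memory (so it is only meaningful within one process), and takes the alert sender as a callable; the threshold default and the alert text are hypothetical, and a real deployment would post to Slack, PagerDuty, or similar instead of calling `print`.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Optional


class SecurityMonitor:
    """Sliding-window counter of failed logins per IP address (in-memory sketch)."""

    def __init__(
        self,
        per_ip_threshold: int = 10,        # assumed value, tune for your traffic
        window_seconds: int = 300,         # five minutes
        send_alert: Callable[[str], None] = print,
    ) -> None:
        self.per_ip_threshold = per_ip_threshold
        self.window_seconds = window_seconds
        self.send_alert = send_alert
        self._failed = defaultdict(deque)  # ip -> timestamps of recent failures

    def track_failed_login(self, ip: str, username: str, now: Optional[float] = None) -> bool:
        """Record one failure; return True if it pushed this IP over the threshold."""
        now = time.time() if now is None else now
        attempts = self._failed[ip]
        attempts.append(now)
        # Drop attempts that have aged out of the window.
        while attempts and attempts[0] <= now - self.window_seconds:
            attempts.popleft()
        if len(attempts) > self.per_ip_threshold:
            self.send_alert(
                f"Possible credential stuffing: {len(attempts)} failed logins "
                f"from {ip} in {self.window_seconds}s, last username tried: {username}"
            )
            return True
        return False
```

The login view would call `monitor.track_failed_login(ip, username)` in the same branch that logs the failure. As written, every failure past the threshold sends another alert; a real system would likely add a cooldown.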
|
|
|
|
13:30 |
|
|
transcript
|
0:47 |
We've come to the end. The final OWASP vulnerability category, number 10. Mishandling of exceptional conditions. Mishandling of exceptional conditions occurs when programs fail to prevent, detect, or respond to unusual and unpredictable situations, leading to crashes, unexpected behavior, information leakage, data corruption, and security vulnerabilities. This can involve one or more of three failings. The application doesn't prevent an unusual situation from happening. It doesn't identify that the situation is happening. And/or it responds poorly or not at all to the situation afterwards. Okay, this is going to be a fun one. We're going to dive into two final examples here.
|
|
|
transcript
|
1:50 |
Scenario and example number one, verbose error messages leaking sensitive information. Verbose error messages are okay, but leaking sensitive information, not so much. So here's the scenario. A mid-sized e-commerce company runs its storefront on Django, serving around 50,000 customers. A security researcher, or maybe in this case, a less well-intentioned one, notices the site returns an unusually detailed error message. They start probing. A request to /api/user/999999 comes back with a JSON response containing a full Python traceback, the Django ORM query that failed, and the database engine. Postgres 14.2 on db-prod-east-internal on the default port 5432. That's already way more information than an outsider should ever see. But they start playing around. So next, they try the product search endpoint, passing in a Little Bobby Tables style injection as the search term. The application luckily does not execute the injection. Parameterized queries, as we've discussed, prevent that, but the error response is a goldmine anyway. The traceback reveals the raw SQL template, the table names, products, inventory ledger, and so on, the column names, wholesale cost, supplier ID, and the internal host name of a read replica. Armed with this map of the database schema, the attacker shifts tactics. They find an old admin endpoint that doesn't use parameterized queries, and they craft a precise injection using the exact table and column names from the error messages handed to them by the other endpoints. Within an hour, they've exfiltrated the customers table, names, emails, hashed passwords, and shipping addresses of all 50,000 users.
|
|
|
transcript
|
0:59 |
So why is this dangerous? Four reasons. One, it's free reconnaissance. Good reconnaissance. Attackers normally spend a significant amount of time fingerprinting a target's technology stack. Verbose errors hand them this information for free, accelerating the attack timeline. Two, it enables chained attacks. A verbose database error might reveal table names and column types, which directly inform SQL injection attacks, as we saw in the scenario. Three, it leaks credentials. Database connection strings, the ones starting with postgres://, often have hostnames, ports, and sometimes username:password before the server. That is not something you want to share. And finally, compliance and regulation. PCI DSS, HIPAA, and GDPR all require that sensitive system information not be exposed to end users. So sending that kind of information out also violates them.
|
|
|
transcript
|
1:06 |
Now let's look at a search API that is insecure because it gives away too much information when there's a bug. So even helpful error messages can leak too much. This handler tries to give a useful error message for bad queries, but exposes the raw SQL and the database error message. So we're using try except, and that's really good. We try to run this query and return the results. But when it fails, we give the full traceback of the error and the full string information of the error. And then the query attempted, what the search input was, and a hint to check if the search term is valid. And it gives the actual type of the exception as well. So not good. And then we return a 500 because the system crashed. Right. A couple of things are bad here. Obviously, the pink part is the part we're focusing on, but also what about logging? What about alerting? There's multiple levels of things here that are not great.
|
|
|
transcript
|
1:46 |
Now let's see the better version, the nice version here. This time we're using our ORM, which under the covers uses parameterized queries to prevent Little Bobby Tables SQL injection attacks. So here we're going to use our product model's objects dot filter, with name case-insensitive contains the query, and give us the first 50 back, then generate those results. Looks great. And if there's an exception, this is really, really clever. I love how this works. So what we're going to do is use UUID, universally unique identifier, and we're going to generate this eight character random string. And then when we log the exception with all of the details, we're going to log that along with the reference, the IP address, and all the things that went wrong potentially. So the log includes that ID, but we're also going to return an error message to the user, which is general, not specific. Not like, we got this exception type, this kind of thing, and so on. It says, sorry, search is temporarily unavailable. Try again. Here's your error reference. So if somebody sends you a message or a developer is testing this and they see the problem, they can go to the logs on the server and get the detailed information using that reference ID. You don't actually have to give all those details back to the user to have them accessible to you for debugging later. Of course, if you use something like Sentry or another service, you can just have the Sentry SDK capture the exception and you get those there. But if you're not logging it to a third-party exception service, this is really good. It's also still good to put it into the logs. So you can go look in Sentry and you can look here for all those things.
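The error reference pattern is small enough to sketch without Django. In this version the actual search is passed in as a function so the example stays self-contained; `safe_search`, the logger name, and the response shape are assumptions, not the course's view code.

```python
import logging
import uuid
from typing import Callable, List, Tuple

logger = logging.getLogger("search")


def safe_search(
    query: str,
    search_fn: Callable[[str], List[str]],  # stand-in for the real ORM query
    client_ip: str,
) -> Tuple[int, dict]:
    """Run a search; on failure, log full details server-side and return only a reference."""
    try:
        return 200, {"results": search_fn(query)}
    except Exception:
        # Short random ID that ties the user's error message to the log entry.
        error_ref = uuid.uuid4().hex[:8]
        # logger.exception records the message plus the full traceback.
        logger.exception(
            "search failed ref=%s ip=%s query=%r", error_ref, client_ip, query
        )
        return 500, {
            "error": "Search is temporarily unavailable. Please try again.",
            "reference": error_ref,
        }
```

The exception text, SQL, and traceback stay in the server log; the client only ever sees the generic message and the eight character reference, which support can use to find the matching log entry.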
|
|
|
transcript
|
1:57 |
Our final example of our OWASP issues is missing transaction rollbacks, or missing transactions in general. So here's the scenario. A regional credit union with 12,000 members launches a new online banking portal built on Flask. Transfers between accounts work fine in testing, debit the sender, credit the recipient, log the transaction. But each step is its own DB commit, its own action in the database. A member named Dana tries to transfer $1,500 to her landlord's account on rent day. The debit goes through, but the credit fails. The landlord's account was flagged and frozen by compliance that morning. Dana's balance drops by $1,500, but the landlord never receives it. The transaction log also never commits, so there's no record of the transfer ever being attempted. Dana calls support, who can see her lower balance but can't find a matching transfer. It takes three days of manual reconciliation to locate the orphaned debit and reverse it. Meanwhile, a fraud analyst at the same credit union notices something worse. Someone has been exploiting the same flaw deliberately. They open two accounts. They set the recipient's account to trigger a constraint violation on credit, manipulating the account status through a separate endpoint, and then initiate dozens of transfers in rapid succession. Because there's no row-level locking, the concurrent transfers all read the sender's balance before the debit lands. Five $500 transfers all see a $500 balance and succeed. The attacker drains $2,500 from a $500 account, and because the credits all fail against a frozen recipient, the money simply vanishes into the system's books. By the end of the month, the credit union discovers $40,000 in unreconciled discrepancies across the platform, some legitimate failures like Dana's, some from deliberate exploitation. The root cause is the same. Each step commits independently, so a failure halfway through leaves the system in an inconsistent state that should never have existed.
|
|
|
transcript
|
0:30 |
Why is this dangerous? Well, it's pretty straightforward, but let's go through the reasons real quick anyway. Financial loss: money disappears from the system. In the transfer case, the sender loses funds that the recipient never gets. It's difficult to detect and recover from. Without a transaction log entry, which also failed because it was never committed, there may be no record of what happened. Race conditions compound the problem. Without row-level locking, concurrent transfers all run against the same original state.
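To make the race condition concrete, here is a minimal sketch of the stale-read problem from the scenario. It is a deterministic simulation, not real concurrent code: the in-memory `balances` dict and the function names are hypothetical stand-ins, and the "concurrency" is modeled by having all five requests read the balance before any of them writes.

```python
# Hypothetical in-memory ledger, for illustration only.
balances = {"attacker": 500, "frozen_recipient": 0}


def read_balance(account: str) -> int:
    """Each request reads the balance up front, with no row lock."""
    return balances[account]


def debit_if_sufficient(account: str, amount: int, observed_balance: int) -> bool:
    """Bug: the check uses the balance observed earlier, not the current one."""
    if observed_balance >= amount:
        balances[account] -= amount
        return True
    return False


# Five "concurrent" transfers all read the balance before any debit lands.
observed = [read_balance("attacker") for _ in range(5)]
results = [debit_if_sufficient("attacker", 500, seen) for seen in observed]

total_debited = 500 * sum(results)
print(results, total_debited, balances["attacker"])
```

Every request sees the original $500, so every check passes. With a row lock held for the whole read-check-write sequence, the second request would see a balance of zero and be refused.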
|
|
|
transcript
|
1:28 |
Let's look at a vulnerable, or not well configured, payroll endpoint here. So it's called process payroll. And the real problem is, as we described, we do a bunch of steps with the database. One, we're not locking to make sure that only one of them runs at a time, and two, we're committing or saving this information at each step, or just running insert or update queries without actually using a transaction at all, depending on what programming model you're using. This one is kind of a long one, so I put it across two pages. This sets the stage. We're getting the database. We're summing up the total for all the payments we need to make so that we can debit the company's account, and then we're going to make a bunch of individual payments for all the employees. So over here, we're going to update. I'm going to say set the balance to the decremented amount where the account is the company's, whatever it is. And we're doing this nicely. We're using a parameterized query. So that's all good. And we commit this work here to make sure this gets saved. Hmm, not great. Then for each payment in payments, we go around, update that employee's balance, commit it. If there's an exception, we log that and just keep going. That employee is going to be unhappy that day. And then we return the amount. Okay, so there are a couple of cool Python techniques we can use to put this right.
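The slide itself is not reproduced in this transcript, so what follows is only a rough sketch of the pattern being described, not the course's actual code. It uses the standard-library `sqlite3` module with an in-memory database; the table layout, account names, and `process_payroll_vulnerable` are all assumptions made for illustration.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Build a throwaway in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("company", 1000), ("alice", 0)]
    )
    conn.commit()
    return conn


def process_payroll_vulnerable(conn, payments):
    total = sum(amount for _, amount in payments)

    # Step 1: debit the company and commit immediately. This is now permanent.
    conn.execute(
        "UPDATE accounts SET balance = balance - ? WHERE id = 'company'", (total,)
    )
    conn.commit()

    failed = []
    for employee, amount in payments:
        try:
            cur = conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, employee),
            )
            if cur.rowcount == 0:
                raise LookupError(f"no account for {employee}")
            conn.commit()  # Step 2..n: each credit is its own commit.
        except LookupError:
            failed.append(employee)  # Logged and skipped; the debit stays.
    return failed


conn = make_db()
failed = process_payroll_vulnerable(conn, [("alice", 300), ("bob", 700)])
balances = dict(conn.execute("SELECT id, balance FROM accounts"))
print(failed, balances)
```

The company was debited the full 1,000, but only 300 arrived anywhere. The 700 meant for the missing account is gone from the books, which is the same inconsistent state as in the credit union scenario.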
|
|
|
transcript
|
3:07 |
So here's the secure version. In this case, what we're going to do is leverage SQLAlchemy. That's how we're doing our work here. And most importantly, we're going to use a really cool way to work with transactions in SQLAlchemy through a context manager. You may know this as a with block. So here we have a with on session local. This is just a session created by SQLAlchemy. We say begin, so begin a transaction, as session. I've colored that blue to highlight it for you throughout the way. So we're going to come through, and we don't need to call commit at all, because the way the context manager works is perfect. So we're going to do some kind of query here where we're going to select the balance, and this is going to lock this row so another request can't read it again in the concurrency race condition. And then if it's not going to succeed, like there's no company or they don't have enough money, then we just return a response. Hey, there's no money here or there's no account, no company, so we can't do this transaction. That will leave the context manager, leave the with block, and just nothing happened, right? It'll end the transaction, which means this row becomes unlocked. Then, if there is enough money to do all the work, we're going to debit the company, which we saw before. And we're going to go through each one of these and set the new value for each of the employees, right? So we're going to go through all the accounts. We're going to go through and get the employee. If they don't exist, we're going to raise an exception, which will kick us down to here. But that, more importantly, will leave the with block, which is defined right along that bit. When you exit the with block with an exception, SQLAlchemy will roll it back automatically. If you fall through normally, or actually in that early return, where I implied roll back, it actually commits. So when you fall through normally, it will commit the transaction.
So exit the with block without an error, commit. Exit the with block with any kind of exception, even maybe that line right there caused an exception, roll back automatically. So this with block up here, basically this is the guard. And just as the indented code suggests, that's what happens in the transaction. Outside of it, it's either committed on success or rolled back on error. Super, super cool. So it automatically handles all of that stuff. Then we just log the result and return it. If we get to this line right there, everything is great. It all happens in that one commit. If you get down to here, the way the with block works, the exception automatically rolls everything back. So you can record it in your logger, return it in the response, but it's already rolled back. Very nice. You don't have to think about when to do the commit, at what times, under what conditions, and so on. It's all just automatic.
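The commit-on-success, rollback-on-exception behavior described above can be sketched without any third-party packages. This example swaps in the standard-library `sqlite3` module, whose connection object is also a context manager with the same commit or rollback semantics, in place of the SQLAlchemy session used in the course. The schema and `process_payroll_atomic` are hypothetical, and row-level locking is not shown because SQLite has no `SELECT ... FOR UPDATE`; in SQLAlchemy that would typically be done with `with_for_update()`.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Build a throwaway in-memory database (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (id TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("company", 1000), ("alice", 0)]
    )
    conn.commit()
    return conn


def process_payroll_atomic(conn, payments):
    total = sum(amount for _, amount in payments)
    try:
        # `with conn:` commits on a normal exit and rolls back on an exception.
        with conn:
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE id = 'company'"
            ).fetchone()
            if balance < total:
                # Early return: nothing was changed, so there is nothing to undo.
                return {"ok": False, "error": "insufficient funds"}

            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = 'company'",
                (total,),
            )
            for employee, amount in payments:
                cur = conn.execute(
                    "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                    (amount, employee),
                )
                if cur.rowcount == 0:
                    # Leaving the with block via an exception undoes the debit too.
                    raise LookupError(f"no account for {employee}")
    except LookupError as exc:
        # By the time we get here, the rollback has already happened.
        return {"ok": False, "error": str(exc)}
    return {"ok": True, "paid": total}


conn = make_db()
bad = process_payroll_atomic(conn, [("alice", 300), ("bob", 700)])
after_bad = dict(conn.execute("SELECT id, balance FROM accounts"))
good = process_payroll_atomic(conn, [("alice", 300)])
after_good = dict(conn.execute("SELECT id, balance FROM accounts"))
print(bad, after_bad, good, after_good)
```

The failed run leaves both balances exactly where they started, and the successful run lands the debit and the credit together. There is no explicit call to commit or rollback anywhere in the function.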
|
|
|
|
14:17 |
|
|
transcript
|
2:14 |
Act one is finished, the OWASP top 10. We've had our intermission, and now it's time to come into act two. We're going to talk about building a specific agent and using that agent ultimately to find problems in open source code, and when you're done with this course, you can take that agent and, most importantly, apply it to your code. Now, I want to talk about this in the large before we get into the specifics, because this concept of, hey, I have a security specialist, or I've got a designer, or I've got an accountant agent or whatever, is so overblown. Everybody and their friend is making an agent and they've got an agent framework and they've got all these things going on. So I get it, this is kind of like, OK, we've had enough of this, people. What are we doing? And on the other hand, these concepts are extremely powerful. I pulled up this article here that is literally bewildering to me. The headline is Wall Street just lost $285,000,000,000 because of 13 markdown files. Excuse me, what? What? It was called the SaaSpocalypse. And the idea was, well, oh my gosh, Anthropic released a legal plugin tool, which consisted of 13 markdown files for different ways to guide Claude. It proved to be so incredibly useful that a lot of these large consulting companies that did legal and accounting were just getting wrecked by Wall Street because of 13 markdown files. Will we make a $285 billion dent with what we're going to do in this course? No, we're only making one markdown file, but I bring this up to point out just how incredibly powerful and influential these agents actually can be when done right, done with a lot of attention, a lot of knowledge about the problems you're trying to solve and so on. And we're going to do that for security in this chapter.
|
|
|
transcript
|
1:21 |
Introducing the security lead. This is going to be a specialized agent that you can give to Claude Code, you can give to Cursor, or many other AIs really. Anything that you can point at a markdown file and say, act like that thing. Now, this is a very specialized AI tool that specifically focuses on the OWASP top 10 plus generalized SaaS multi-tenant application security in Python. So I've spent probably four or five hours tuning, building, refining, adding to this security lead. So it focuses exactly on what it should, uses authoritative sources, and does not mess around with stuff that is not relevant to us, in particular for people doing security on Python code. If that's not what you're doing and you're just taking this course anyway, well, you can see how I did it here and you can adapt it for whatever it is that you are actually doing. So in this video, what I want to do is walk you through the high level ideas and the core concepts of this security lead agent, which is a single markdown file we're going to provide to Claude Code. And then we'll spend a little bit of time diving into exactly what that means in specific details. Of course, the agent, the security lead, is included as part of the code for this course.
|
|
|
transcript
|
5:24 |
So what are the core components, the core essence of this security lead? There's quite a bit going on here. So I'm going to walk you through the high level, like I said, and then in the next video, we'll go into actually look at it. So what is the identity? Now, if you haven't done this before, what you're doing is you're putting Claude Code into a specific mindset, which focuses it on certain problems and tells it to disregard others. Claude Code could do Shakespeare if it wanted to. We're telling it like you're not interested in Shakespeare. you're not even interested in Java. You're interested in Python and security and SaaS in particular. So here's how we're doing that. We tell it that it's identity and role. It's a vigilant Python SaaS security specialist that acts as a guardian and vulnerability hunter. Proactive, pragmatic, and educational rather than gatekeeping. This is important because if you find a problem, it helps you understand in the context of OWASP and the greater security and reliability space what's going on. The security leads primary responsibilities are review the code for vulnerability before they're exploitable. Audit user data isolation, cross tenant ID, access authentication authorization patterns, identify injection and data exposure risks, verify security logging and audit trails, guide secure development with actionable fixes. Basically, it is based on built from Python plus the OWASP top 10. So a lot of this should sound familiar, right? And here we are, the governing framework is the OWASP Top 10 2025 edition. Every review is explicitly mapped to an OWASP Top 10 categories, so coverage is systematic rather than ad hoc. The agent fetches the raw markdown files as the definitive source of truth for what OWASP says, and then not just using it as training, but it goes and reads the open source documentation that is the OWASP Top 10, and it uses that as the canonical text. 
Now, I also told it you don't have to map everything that is an issue to an OWASP top 10 category, because there might be something that's a problem, but it's not exactly categorized in the OWASP top 10. Like, for example, if you're creating the database client wrong, maybe you're not using connection pooling for the database, making every request a little bit slower and exhausting connections to the server. So things like that it also will identify, but it doesn't map them to the OWASP top 10. The security lead has three core workflows. First, the security review: it will identify the attack surface, apply the OWASP lens, trace data flow, threat model, verify mitigations, and then document. And then, when we ask it to, it will remediate them. Second, user data isolation. Now, this is not exactly an OWASP thing. This is the SaaS side. This is something I'm very concerned about with having an application that has multiple users where each user gets a subset of the data. Just think of Talk Python Training, where you're taking this course. You have a certain number of courses that you own, and you have activity, like I've watched these videos and so on, or certificates you've generated. I want to make sure that all of that is restricted and private to you, and the same for other users and their data. This multi-tenancy issue is, I think, something very scary for people creating SaaS and multi-user types of applications. So I put in an entire section, this user data isolation audit section. Map user data boundaries, audit query patterns for user ID scoping, test for IDOR, and review common patterns such as purchase, progress, enrollment, and so on. Third of the three, security logging review. Make sure that you do logging. This is another one I put in there very explicitly. I want it to pay attention to logging, logging not just the success case, but error conditions: starting to do a thing, failed to do a thing, or succeeded, and so on. So there's a whole section on that as well. Finally, standard deliverables.
When the security lead does a report, what does it deliver to you? It's like you've hired your own pen testing security analysis company, and they're going to give you a report. This thing will give you a report in the end. So here we have standard deliverables. Structured markdown reports for each workflow: security review, data isolation, security logging, and a security implementation plan. Like, what are we going to do to fix these things? OWASP references and concrete remediation. And finally, it should map all the findings to an OWASP category when it can. Be specific about user impact, not vague. Celebrate secure design patterns. One of the things that I want these things to do is show me what we're doing right and what we're doing wrong, not just wrong, wrong, wrong. You know, it's nice to hear how things are working well in addition to what's not working well. So that's part of it. And provide secure fixes, not just a warning, and educate the team so that next time they just build these habits. All really good things. I think this education side is actually underused in AI programming. It's so powerful that you can just say, why do you suggest this? Why could that be wrong? How does this other change make this more secure? And if you take a moment and slow down, you can totally use these AIs to learn the problems they're trying to solve and the fixes they're applying to solve them. You can learn them, not just have it grind through and say, well, I guess it's good now. So this is the security lead at the high level.
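The user ID scoping and IDOR checks mentioned in the data isolation workflow come down to one question: does every query for user-owned data also filter on the current user? Here is a minimal sketch of the difference, using the standard-library `sqlite3` module. The `certificates` table, the user IDs, and both function names are hypothetical and are not taken from the security lead file.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE certificates ("
    "id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL, course TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO certificates (id, user_id, course) VALUES (?, ?, ?)",
    [(1, 10, "Python Web Security"), (2, 20, "Async Python")],
)
conn.commit()


def get_certificate_unscoped(conn, certificate_id):
    """IDOR risk: any logged-in user can fetch any certificate by guessing IDs."""
    row = conn.execute(
        "SELECT course FROM certificates WHERE id = ?", (certificate_id,)
    ).fetchone()
    return row[0] if row else None


def get_certificate_scoped(conn, current_user_id, certificate_id):
    """Scoped: the row must also belong to the user making the request."""
    row = conn.execute(
        "SELECT course FROM certificates WHERE id = ? AND user_id = ?",
        (certificate_id, current_user_id),
    ).fetchone()
    return row[0] if row else None


# User 10 asks for certificate 2, which belongs to user 20.
leaked = get_certificate_unscoped(conn, 2)
blocked = get_certificate_scoped(conn, 10, 2)
own = get_certificate_scoped(conn, 10, 1)
print(leaked, blocked, own)
```

The unscoped lookup hands user 10 another user's certificate, while the scoped version returns nothing for it and still serves the user's own record. In a real app, `current_user_id` should come from the authenticated session, never from the request body or URL.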
|
|
|
transcript
|
5:18 |
All right, that is the security lead concept, but let's look at it in particular. So here I've got our source code open for our project. And we notice there's a .claude folder. Claude Code has just generated that. But I've created this non-hidden folder just so it's more obvious, so you can see it there. On Windows, it wouldn't matter, but on macOS and Linux, the dot means hidden. So I've created commands and an agent. The command is really simple. It just says, read the security lead and then apply it. Okay. And it takes arguments. There's not much to that. This just lets us access it more easily from chat. The agent is where the interesting thing is. So over here we have, how long is this? 553 lines of definition that defines what our security lead is. Now, sure, you can read it like this. To me, it kind of scrambles my eyes just a little bit. There's a lot going on here. So what I'm going to do is, in Visual Studio Code or other Visual Studio Code variants, in the markdown content, if you hold down Alt or Option, you get this reader view. So there we go. Better, better, better. So let's look at this. A lot of it's the same, but I'm just going to skim a few interesting sections so you get a sense, and you can go back and read. So this defines the security lead. It says, a vigilant security specialist focused on SaaS security, Python web application vulnerabilities and user data isolation, ensuring robust protection across the entire application stack. It has a color. This doesn't apply in the extension in Visual Studio Code and others. It only applies in the terminal, I believe, but it can get different output colors. Here's this personality. We talked about that. Its identity and memory. So you remember common vulnerability patterns, secure coding practices, and so on. You celebrate secure patterns, and you flag bad ones. There's critical, warning, review, and so on. These are just examples of what could be going on. Be specific.
Explain how this could be a problem, what an attacker could do. Don't just say, this is insecure. The critical rules are the OWASP top 10 framework, and here they are. And we're going to actually give it more detail in just a minute. But I also told it, look, you don't have to fit every single problem with this application into a specific OWASP category It doesn't fit like creating this database client or open too many file handles or whatever. That's not necessarily an OWASP problem. That's just something that should be addressed. So those go into the general hardening category. Talks about how it addresses, you know, here's an example of good stuff. Here's an example of bad stuff for each category. It's workflow. We already talked about a lot of that. Here's its deliverables and a more concrete example. You're going to see this come through when we run it. And this is really important. Common vulnerability patterns for Python, Quart Flask. Here's some of the things that could come along. If you're a Django person, hey, add a Django one or whatever. Mongo or the Beanie ODM or SQL and Postgres specific. These are all the issues that are very common for those technology stacks. So if you're using Mongo, look at these. Don't worry about SQL injection because we don't have it, right? We don't have SQL. Similarly for our SaaS multi-tenancy isolation and so on, communication style, educate, don't lecture, et cetera, et cetera, you're doing well when security issues are caught in review, not production. Developers understand and can apply your recommendations without your help. If you double-click, it opens it up. No cross-user, right? Gives it some sense of what's working, and then it gives a little more about its review, and then this is really important. It knows about all the OWASP things, and this is great. and even knows the changes. But if we go down here, this is the good part, the authoritative part. 
So you're not just based on hallucinations that it might make about what an OWASP thing is. No. Are you working on a security misconfiguration? Go over here, read the documentation from OWASP, from their GitHub, in Markdown. This is way more efficient and faster to give it just the raw Markdown. Instead of here, go read the website and figure out what part of the navigation and sidebar and stuff doesn't apply and all that kind of junk. So this way it's much quicker. You could even go so far as to download all these, put them in your project and say, just reference the local file. That's true. That'd be really good. Okay, so that is our security lead. Now, this is a really short section, but you're going to see this plays a pretty significant role in what we're going to find for our application. So in the next act, We're going to take the security lead. We're going to choose a subset or just a handful of open source, Python-based web applications that are not brand new. They have multiple, maybe many contributors. They've been run in production for years. We're going to turn the security lead loose on them and just see what happens. So it's going to be a lot of fun. I think you all are going to be really surprised. I've done a little experimenting with this and it's kind of blown my mind. So I hope you're excited. This is gonna be awesome.
|
|
|
|
8:14 |
|
|
transcript
|
5:41 |
Here we are, we're ready to pick our three applications and we're going to go to Awesome Self-Hosted, which is over here on GitHub, Awesome Self-Hosted dash Awesome Self-Hosted or right here as well, a list of free software network services and web applications that can be hosted on your own server. I'm a big fan of self-hosting. I love to take some of these applications, run them in a Docker container, nice and isolated and instead of, say, sending all of my users' traffic to some tech giant that also happens to be an advertising company, I can save it locally, both for my benefit and for my users' benefit and privacy. Win, win, and win. I will tell you, however, I've applied this technique we're about to apply here to one other application off of this list. I'm not using the same one because I want to do it from scratch. It did take my enthusiasm down a notch after seeing how many issues this project actually had. I'm like, well, do I really want to run that? I don't know. So not running that one, not running that one. And we're also not testing it here. So I've picked out three already because I just want to have a little bit of structure and not spend too much time going through this list. So let's begin by holding page down and observe the scroll bar. Look at how many apps there are here. I don't know if it says actually how many there are. I should say how many there are, but there are many, many of them here. So what we're going to do is we're going to go through and we're going to just search for Python basically, and we're just going to look for things that look kind of interesting here. Some of them will be kind of small scale. Some of them will be large scale. Like I said, big, large companies, many, many contributors, lots of money behind it. And what comes out, I think, is going to be pretty interesting here. So we come down here. We have stuff for analytics, booking and scheduling. 
We've got file transfers, genealogy, low-code platforms, remote access, URL shorteners, et cetera, et cetera. So let's just pick automation. And down here it says, okay, here's automation. We've got Activepieces, a no-code business automation tool like Zapier or Tray. And this one just says it's Docker, which doesn't tell you very much because Docker is not really a technology. You can't implement something in Docker. So I guess we could go find this on GitHub or wherever. Oh, here we go. This one is implemented in TypeScript and it has 402 contributors. Trustworthy? I don't know. That's a lot of contributors. It might actually be pretty trustworthy. Automatisch, which is business automation. German. Again, Docker is not enough for a language, folks, but okay. So you can just go through and, you know, find interesting things. So which ones did I choose? Kibitzer, personal web assistant. So stay on the edge of events, your way. So basically the idea is this will poll authenticated and public web pages, stuff on the web, and then it's going to come back and notify you over messenger or email. So you've got accounts, you've got outbound email, you've got outbound messenger, you've got polling, you've got inbound data from what you're polling, which could be messed with, and so on. I think there's interesting stuff. Look, we've got everything from raw SSH to complex browser scenarios, integration with Slack and Mailgun. I think there's enough here that this will be interesting. And this is the least popular, least used of them all, 711 stars, and it's 10 plus years old, so that's good. It does have 22 contributors, and it's been running for 10 years, but it's not mega big tech scale, right? So I think this will be an interesting example. So number one, Kibitzer. Number two, Apache Superset. I mean, come on. It's from Apache.org. It's one of those projects. And it's an open source, modern data exploration and visualization platform. Maybe this one comes up roses.
Everything's perfect. I don't know, but you can see 17,000 people have forked it. 72,000 people have starred it. There are 1,400 contributors. This is a mega, mega funded project. Will it have issues? I don't know. We're going to find out. By the way, a significant section, probably the back end, is almost entirely Python. The front end is Jupyter and TypeScript. So it should have some good Python things going on. And then we come to Paperless. It's a community-supported open-source document management system. This one actually scares me the most. Like, yeah, we need to manage our documents. So let's put those really important contracts up there. Oh, yeah, my W9 that I submit to the government for my taxes, we'll store that over there. These are my documents. That makes me nervous. Okay? So it's got all these cool features, right? A lot of features. So potentially a lot of stuff that could go wrong. It's extremely popular. Not quite Superset, but close. 38,000 stars, 2,000 forks, 389 contributors, and it's mostly Python, back-end Python. Looks like a cool app. I'm not here to knock on any of these apps. I chose them because I thought they were cool. We're going to find out what the security story around them turns out to be.
|
|
|
transcript
|
2:33 |
Now, let's get these all onto my machine and set up with the agent. Just so you can follow along, you can do the same thing, and then I'll take them one step at a time. But let's get each one of these projects set up all in one go here. So I'm going to come over here, copy the URL. Let's fire up Warp. And just for now, I'm not going to leave them here, but just for now, I'm going to put them on my desktop so we can quickly fiddle with them. Say git clone, go over to Apache Superset, git clone Superset, which is a little bigger. Git clone Paperless. Okay, then I want to go back and actually copy this Claude folder. So reveal that in Finder. Copy. Close. I guess we need to be there. So paste it here. And rename it to .claude. Okay. We're going to rename it in a way that will be allowed. Oh, you know what I think I can do? I can say show hidden. And then it will allow me to rename it to .claude. Yes. There we go. Sweet. On macOS, you show hidden files with Shift-Command-dot. Yeah, I know what I'm doing. It's the only way it works here. Oh, this one already has a .claude folder. Well, well, well, guess what it's getting. I don't care if we mess up there. I want to say merge. Beautiful. So up here we've got our security lead and we've got our ask security lead command. Beautiful. So we have these three open. And then we can just say code. I think we're ready to get going. We're not going to commit our changes. We're just going to play around and see what we can come up with. The stage is set. We have our three projects. We've copied our agents and our commands into them. There is a way to register this globally on your machine, but then it's not stored in source control. I kind of like to keep it associated with the project. So regardless of how the machine is set up, like maybe my laptop versus my main desktop versus my streaming desktop, you know, the one I'm on now, that kind of stuff, it's stored in source control. But it's up to you. Anyway, we're ready to go for these three projects.
|
|
|
|
55:10 |
|
|
transcript
|
5:04 |
All right, we're ready to work on Kibitzer. That's the first one. I want to go from least popular to most popular, right? This is possibly going to contain the most issues that we can address and so on. So let's just make sure your Visual Studio Code or whatever is set up correctly. If you're going to follow along, you want to go over here to the extensions, type Claude, and make sure Claude Code from Anthropic is there. There's a lot of downloads and it's not some rando thing. So I've already got that installed. Perfect, you can see it says disabled. And so then you might think you come over here, and it can use Claude Code. I really like to use it in its own tab. So instead, I'm going to hit Command-Shift-P and type Claude, tab, open Claude Code in a new tab. Cool, cool. So let's just start by saying, hey, Claude, ready to work on this project. You might think, well, that's very wasteful. Yes, I wouldn't really do that. But on this brand new machine, I would like to ensure that Claude Code is installed, that Claude Code is sufficiently up to date, that the extension is set up, that I'm logged in, all of those things. There is one other thing I would like to do. I want to verify that I got the agent in here. Say, "Tell me about the security lead." See, good thing I asked. It was kind of going a little bit crazy. Let's start over, and I updated this because I had to add the .claude here. So I can say, "/ask the security lead." And let's see what that does for us. Perfect. So the command lives where we put it. When you invoke it, it tells me to read the security lead and fully adopt that persona. And stop scrolling, please. It substitutes whatever you pass in as arguments. So the usage is ask security lead, then your question or code or features to review. And what the persona does is say, I become the security lead. We don't need to rehash that. You know what it is. Okay. This is tuned for Python, Quart, and Beanie. This is a web scraping tool, not a multi-tenant SaaS. That's fine.
So I think everything is set up for our source code to go here. We're ready to ask Claude how this works. We make sure that it does know the security lead. Because if we just say, you know, do a security review, it may or may not use this security lead we've created, right? So one more thing, I do not see a claude.md file here. So let's do one more thing. I'm going to start over so it's not worried about any of this stuff that we've done before, which is slash clear. And the other thing we want to do is give it full thinking and make sure it's using Opus. Now, I know some people get frustrated, like, oh, Opus is slow. If I choose Haiku, it's fast. It's like, yes, it'll be fast, but you want this thing to be as right and thorough as it can be. And that just takes time. Pick the best model. Because even if the agent gets done fast, if the model does it wrong, it takes you way longer to go figure out what it did wrong and fix it, or figure out what it skipped. So choose the most thinking, the most you can get out of it. So before we are ready to do our security review, I just want it to analyze this project and create a Claude Code file, a claude.md file, because that helps it get started. We can put in rules like, hey, when I ask you about security, always use the security lead, or ignore the stuff about MongoDB because that's not what this app is about. All right, we put it on maximum thinking. We said, please create this claude.md file. I'll let it do its work. Hey, look at that. We have it. So it says it's at the repo root, and what it covers. And now it's thinking, what is it? Its architecture, top level, looks like orchestration, extension points, conventions, and so on. A few things it deliberately did not include: dependencies, per notification and transfer of behavior, anything about your role or preferences. OK. Perfect. I think this project is ready. We've got a place where we can put our rules for what Claude Code should do, in this claude.md.
It's got an initial understanding, so it can go back and reference this, so we don't have to have it think longer, waste more time, and so on. So I think our project, our Kibitzer project, is ready to go and start analyzing for security issues.
|
|
|
transcript
|
13:13 |
Okay, project is ready. We're about ready to start talking to it and having the security agent do its security review. Then we're going to talk through all the problems and potentially fix them. One thing to keep in mind is this conversation we've had so far about setting things up, while it did reference the security lead, is really not germane to what we're about to do. So what I'm going to do is start over. Now, managing your context, like how much stuff is going on here, right? This context is grayed out right now. But managing how much stuff is in the thinking memory, the working memory of your AI, really matters. Claude Code has gotten tremendously better at this. It used to always go working, working, oh, we've got to summarize or compact the conversation, and then we'll continue. And the way it's gotten around that is it started using more sub-agents. So instead of saying, I'm going to read all the test files, what it will do is fire off a sub-agent with its own context that will read and analyze all the test files. And then that sub-agent will return a summary to the main agent, and then its context is freed and reset. So I've had work where it's kicked off five or six sub-agents in parallel and had them all work for 15 minutes, and then it circled back and pulled all the details back together. So we don't need to worry too much about context like we used to. However, given that the earlier conversation doesn't matter here, we could either click this to start a new conversation or we could just type slash clear, because we don't really need to keep that. It was just, like, create a claude.md, right? That's not super relevant. And you can always get back to your history right there as well. We're going to do a security review. Now you can just say ask security lead and hit enter, but then it starts going straight away, and I would like to give it a little bit more detail.
So we're going to say, with probably more than it needs to know: please have the security lead give the code base a deep analysis. I'm considering using it for a critical project and we need to be triple sure the security is top notch. I'll be happy to work with you to solve the security issues we find together. Let's go. Let's see if it finds the security lead agent, actually, since I didn't explicitly say ask the security lead. So it found the skill. It's going to run that. Now it's reading the actual agent, and it's going to be the agent. Here we go. All right. It's adopted the security lead. You can see it's talking about compiling the OWASP-aligned findings report and all that. So we're just going to let it keep going. Notice how it finished three or four of those all at once. It had review, review, review for all of those, and then the next thing it did was click, click, click. That's probably a bunch of sub-agents processing each section there. All right, here we go. We got our security review of 8.0.1. Let's find out what we got. Overall risk level: high for a critical project without significant operational hardening. This is concerning. Let's find out. Medium if Kibitzer is run as an isolated single-tenant operator. So medium is probably how it's intended to be run, with tightly controlled config and no untrusted users on the host. So the attack surfaces, the YAML... you know what? I don't think it's fair to say attackers have access to the source code, right? So once it's done talking, I'm going to tell it to assume that it's running in isolation and that the only access malicious actors have is from the outside. Thank you for the report. Please reanalyze assuming that we're running this on a production server with no unauthorized access to the source or configuration files. Analyze it only from the perspective of someone trying to break into or influence it externally. It's still going.
So I do think that some of this may be an issue, but possibly the sandboxing is pulling code in and then running it. So let's find out. I could stop it, but I'm going to let it finish and then I'll tell it to adjust. Okay, it looks like it got it. I want to make one more change before we continue. I'm going to go over here and make a folder called plans. This is a little bit misnamed, because right now it holds reports, but it's really reports plus plans. What we actually end up getting out of this report is a recommendation of fixes, and that's like a plan for fixes. So anyway, what I'm going to tell it is: thank you for the report. Please reanalyze, assuming that we are running this on a production server with no unauthorized access to the source or configuration files. Analyze it only from the perspective of someone trying to break into it or influence it externally. I don't think it's fair to say, well, you could read the settings file and mess with it. That's not fair, because for a lot of projects, if you can read the source files and mess with them, you're already in trouble. I'm also going to tell it to save the result into the plans folder. I'll say @plans/. The auto-complete doesn't think that's a thing, but it's going to work anyway. So have it redo the analysis and save it to a file. Then we can iterate on that file, review it, make changes, have it update how each issue was fixed, and so on. It's really great to persist these things and not just leave them in the memory of the chat, the conversation, but to create concrete files and say: we've created a plan, here's what you wrote, here's what we've done, here's how it's evolved. It's a really good way to get very little drift. It said these are the issues, it remembers what they are, and as you address them, it keeps a history of that. So it's much more valuable than just chatting, if you will, here in this interactive window.
So again, it's going to take a minute. I'll let it grind and then we'll review the plan for real. Note that even though the extension didn't auto-complete it, it figured out the directory. All right, it saved the security review into this file. It took a while, I would say at least five minutes. But again, five minutes. How long would such a report take if you hired an external pen testing team? I guarantee more than five minutes. Okay, so it's given us just a summary of the file. We'll open up the file in a second, but let's just see what it said. The headlines shift under the new threat model. Remember, I said don't assume that you can get into the code and mess with it. That's kind of silly. What we're going to assume is that it's just running as an app and people can do what they do to things running on the internet, externally. With a hardened host and trusted operator config, most of the previous critical findings drop out. Yes. What remains is concentrated in two areas: what Kibitzer does with bytes from a hostile target page, and the network confidentiality of credentials sent to the notifiers. That's actually pretty low in terms of issues. Okay, so it has five critical findings. First, SSRF via redirect chasing and DNS rebinding. A hostile monitored URL pivots the fetcher into IMDS, the cloud metadata service, or via DNS rebinding into loopback addresses. The fetched bytes are then exfiltrated through whatever notifier the check uses. Okay, we'll look into that. Second, STARTTLS. If that fails, it just says, okay, fine, we'll do that without encryption. That's a problem. Third, Selenium drives a full Firefox against attacker pages with a persistent profile. Oh, interesting. Okay. It carries cookies from other sites that you could be logged into. Yeah, that could be interesting. That's trouble. Fourth, denial of service via unbounded body, a recursion limit set pretty high, and so on. And fifth, Telegram chat ID auto-imprint. When the chat is omitted, Kibitzer promotes whatever stranger DMed the bot first to the official notification channel.
Okay, that sounds like something you'd want to fix, maybe. We have eight warnings and then how we can be protected. Okay, so that's good. This is our analysis. In the next video we're going to actually review the full plan that we got. But let's peek ahead real quick. There it is. We've got our plan. And really quick, just how big is this plan? Let's scroll down: 348 lines. Okay, not huge, but it's certainly not small. I would also like to point out that this did take some time. I think the whole process, creating the CLAUDE.md file to understand the project, then the first pass, which we did with kind of the wrong assumptions, then another pass with just the external attacker perspective, and creating the plan and writing it down, probably took 15 minutes. And I, of course, condensed that for you because I had to wait. There's no reason you should have to wait during the course, right? In real life, you'll wait, but that's for a project you really care about. So how much of my AI credits did I use? How much did this cost me? I mean, it didn't cost anything extra. Claude Code just kind of charges you a flat fee. But in a sense, how much did this use? So we can come over here and we can see. It's kind of reset, so I'm going to put it back. I can go to account usage, and you can see I have the Claude Max plan, but this is not the max Max. This is not the $200 a month one. This is the $100 a month one. So there's Pro, which is $20 a month, and this is five times that much usage. The next Max, the super Max one, is another four times more than what I have here. Okay, so it's not that much. $100 a month is a lot of money, but for what we get, it's not too much. I've not done any AI things today. I've been in meetings, writing email, working on videos. So this is 100% the usage for this analysis of this project. We've used up 14% of what we get in a five-hour window. Hmm. That is nothing.
I did a bunch of AI work yesterday. Actually, this number is for the whole week, five days of using it, and it's only used up 5% of my credits, which is absolutely insane because I used it a lot on a couple of days. So this is very little work, right? Very little cost. It will cost more to apply the fixes, but I would imagine we could actually fix most of the issues, or at least make notes on what we could do. For some of them, we might have to make breaking changes to Kibitzer to actually fix them. But we could probably fix them with just the credits in my five-hour window. Easy. Compare that to what you would get if you were hiring an external security company or bringing in one or two security consultants. It would take days. It would probably cost $5,000 to $25,000 to do this work. This might be like a dollar, folks, and it took 10 or 15 minutes. This is stunningly powerful, quick, and effective. We haven't read the review yet, so maybe it's bunk. I don't think it's going to be, but this amount of spend for what we got out of it is absolutely off the hook, a great deal. All right, next video, we'll dive into it.
|
|
|
transcript
|
13:29 |
All right, let's review the plan. Remember, we had Claude Code generate a plan over here. I opened it up and put it in the reader view by going here, holding down Alt or Option, and clicking that, which pulls this up. Okay, so let's go through and just see what it said. I've not read this. We're reading it together for the first time. So this report assumes hardening at the host layer. I think that's fair. Y'all can agree, disagree, or rerun your analysis otherwise if you want. So the Kibitzer binary, source tree, Python interpreter, and installed packages are trusted and tamper-proof. True. The YAML is owned by a maintainer. So these are the assumptions, right? The process and working directory are not reachable, right? Slash pages, I think that's the database. I think that's fair. Okay. Under that posture, the entire config-is-code, pickle, CWD, and similar findings fall away. They drop out because you have to already be running on the machine to do that. I feel like if you're running on the machine, it's game over. All right, what's left across the network? So here we go, the external attacker risk summary. Overall risk... while we're working, I'm going to ask Claude here, and the security lead: please give us a security grade. Let's see what it says. Oh, C plus. I think C plus is where it's at, right? For the external attacker mode: why it earns a passing grade, what drags it down. The full threat model, again, we don't care about. Would you run this on your data on your server? Honestly, not as it ships today, but it's close enough that I'd happily run it after a short hardening pass. What I'd actually do: would I run it on a personal VPS to monitor a few public pages and ping me on Telegram? Yes, probably. Would I run it on a server that holds data I actually care about, with credentials that matter? Not without fixing these five things. All right, well, let's go to the report. Here they are, the highest-impact findings. You can see they're probably mirrored.
A monitored URL can pivot the fetcher into the cloud metadata service and so on. A failed TLS upgrade for email falls back to plain text. You could be logged into your bank and then have Firefox navigate around. Again, is that real if you're actually running in, say, a Docker container or something like that? I'm not sure that's really relevant. But if you're running it just on your computer, then it becomes relevant. Unbounded response body, Telegram chat auto-imprint, and HTTP cache poisoning. Okay. First of all, we can jump right into the code. I think that's great, right? Right out of the report, we can jump straight to that line. And it says this fetch has retries. I don't know, this doesn't scream super experienced user. I would do something like this: I would use a tenacity or stamina decorator with retries equals three. You don't need that other stuff. Anyway, I'll put it back. This is what we've got to work with. So I want to go through and just see what it says about this retry area. It says the operator monitors a particular page. It could be any third-party site whose owner turns hostile: a CDN-hosted blob, a documentation site, a competitor's pricing page. So one thing that might be worth knowing is, do the fetch requests signal that it's the Kibitzer system, or is it hiding behind a common user agent? Spelling aside, that should be good enough. Okay, so it does say it's Kibitzer, which I appreciate, but it could mean that somebody could look at that user agent and take some malicious action based on it. Honestly, I think this risk is really low, because you're picking what you're monitoring. It's not monitoring random things, right? The chance that you just put in some hacker's IP address? Possible, but not likely. So let's go in here and ask it. Let me show you something really cool as well about working with the agent here. So if we get it... I guess all I had to do is just double-click there.
So if I go and highlight this, notice it says down here, two selected lines. So I can just say this, and it knows I'm talking about that section, which I think is pretty cool. So let me just ask: what is the proposed fix for this? How hard is this to fix? Because my first impression is that this is not worth fixing. It's so unlikely that you monitor something and then, out of the blue, it turns evil, right? It's talking about things like your own TeamCity server and so on. But let's see what it says. It has three layers to fix it. Layer one: disable redirect following by default. Fair, but that could also result in less reliability if something changes about the thing being monitored. Possible, though. Layer two: egress IP denial enforced at the socket level. A custom HTTP adapter resolves the host name before the request connects and validates the resolved IP address against loopback and so on. Okay. Layer three: validate every redirect hop when redirects are opted in. So it's really proposing that we do a socket-level deny list, which catches what layer one misses. There are two breaking changes for legitimate use cases. Checks that monitor a URL whose authoritative server is some internal admin page won't be allowed, and followed redirects must be added explicitly. I'm going to tell it that we don't care about this, and I'm going to mark it as not being addressed. You may choose differently, but for me, I'm going to say: this issue seems too unlikely to justify a breaking change. Please update the plan to indicate that it's an acceptable risk and skip it. You might be saying, Michael, you could just go write skip or whatever yourself, right? Sure I could. But this is probably referenced in multiple locations. There's a review at the end of this document. There's the fixes section. Having it comprehensively update these things could be good. So here, I can show you. Let's put this into staged changes and see what comes out when it's done. What were the staged changes? All right, let's keep going and reading while this is working.
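To make the second layer concrete, here is a minimal sketch of the "resolve the host, then validate the resolved IP" idea in plain Python. It is not Kibitzer's code and not the fix Claude proposed verbatim; the function names `resolved_addresses`, `is_forbidden_ip`, and `assert_host_allowed` are made up for illustration. It also only shows the classification step. To actually defeat DNS rebinding, the check has to be applied to the address the socket really connects to, which is why the report talks about a custom adapter.

```python
import ipaddress
import socket


def resolved_addresses(host: str) -> list[str]:
    """Resolve a host name to the IP addresses a connection could use."""
    infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


def is_forbidden_ip(address: str) -> bool:
    """Return True for addresses an external page monitor should never reach."""
    ip = ipaddress.ip_address(address)
    return (
        ip.is_loopback       # 127.0.0.0/8 and ::1
        or ip.is_private     # 10/8, 172.16/12, 192.168/16, fc00::/7, ...
        or ip.is_link_local  # 169.254.0.0/16, where cloud metadata (169.254.169.254) lives
        or ip.is_multicast
        or ip.is_reserved
        or ip.is_unspecified
    )


def assert_host_allowed(host: str) -> None:
    """Raise ValueError if any address the host resolves to is on the deny list."""
    for address in resolved_addresses(host):
        if is_forbidden_ip(address):
            raise ValueError(f"{host} resolves to forbidden address {address}")
```

This also shows where the breaking change comes from: a check that legitimately monitors an internal admin page would now be rejected unless there is some opt-out.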
Okay, I'm going to say number one is probably okay. This one, this is not okay. I think we're going to need to fix this one. We're going to come back and fix it. It just jumped around. What happened up here? Acceptable risk, not fixing. And it's still working on the other locations. So yeah, we're going to fix our mail server issue. It says we don't want to do plain text, so we'll have it fix that. Executing hostile JavaScript in a long-lived persistent profile: I also think that this is probably an acceptable risk. I'm going to ask it what the fix is and how hard it is to fix, because if it's super easy, that's fine. But people also might be using features of this. Maybe somebody's logged into an account in TeamCity and then has it pull the page using their existing session. Yeah, it could be an issue. But again, you're the one reaching out. So I'm going to call it maybe acceptable. If I were running this, it would probably be in a Docker container where there's no UI. I'm not running Firefox. So it also could be a cross-host thing. Okay. A hostile target can out-of-memory kill or freeze the daemon. Okay. That's a thing we can look at. This one seemed like it could be an issue: somebody could trick the Telegram setup. Then we have some warnings. HTTP cache poisoning leads to silent monitoring blindness. All right, fair. There are also some with timeouts. What else have we got here? Verbose error messages and tracebacks are routinely exfiltrated to notifiers. Yeah, we talked about that. We don't want that. Notifier responses are logged verbatim at debug level, so log injection or secret leakage from hostile webhook responses. Yeah, we probably want to look at that. Cache-Control. The requests retry backoff path is dead code on Python 3.10. Okay, we're going to look into this. Wow. All right, so here's the breakdown by OWASP category. It's a little scatterbrained, you know. Like I said, I'm reading this along with you. But let's see what it does for our analysis here.
So, broken access control. Everything's fine. No inbound surfaces, so that's good. Security misconfiguration: the TLS email fall-through is not great. We're not monitoring for anything in continuous integration, so we probably want to add pip-audit. I think that's absolutely fair. Cryptographic failures we talked about. I think that's definitely a big one. Low on injection issues. For insecure design, the thing we're talking to turning evil, we said that's not a problem. Authentication failures: we're going to address that. The running-on-the-machine issues, number eight, are out of scope, as I told it. Security logging: cache hits not logged, error tracebacks exfiltrated, no alerting on hooks. Yeah, we need to fix this. Okay. Mishandling of exceptional conditions: unbounded body, that raised recursion limit, which could be an issue, the dead retry path, the swallowed TLS failure. There's a lot to do there. And finally, under general hardening, the cache is unbounded, and I bet the cache is in memory. Checks are serial, with no watchdog. Okay. Finally, the recommended action plan. Remember when I said there are other places it's going to change this? Like right here, it struck that out and said, don't worry about this one. All right, so let's just look at tier one, must do before production. I said that this one is acceptable. This TLS fall-through is an issue. Fetch size cap, recursion, and timeouts all seem reasonable. They might not necessarily be security issues. They could be a kind of denial of service, but they're also just general hardening. This one doesn't look great. Even if it's not a malicious attacker, say you put the bot into a channel so everybody in that channel can get notified by Telegram. Well, then maybe all of a sudden something goes weird and somebody just responds, and now they become sort of the primary chat it talks to. There's some kind of typo here for fetcher resilience.
All right, so I'm going to fix that. Hardening: Firefox isolation, a scratch profile. Yeah. Cache-Control, Markdown injection, error redaction, and optional hygiene. What do you think? You know, I've been talking a lot, kind of just rambling about what the security agent found. Honestly, it doesn't look super terrible. This application looks like it's in pretty good shape. There are a few things that we're going to address when we get to the fixing step in the next couple of videos, so there are issues to be addressed for sure, but it doesn't look terrible, actually. It looks pretty interesting, I think. It does not have, at least as far as the security lead knows, any one-shot remote code execution vulnerabilities that an attacker can fire. That's good news. The bad news is that its core function, fetching attacker-controlled bytes and shipping them to the operator via visible channels, is the kind of pipeline that needs very careful control. For example, unescaped content going straight to these channels is bad. All right, so it wants to help you work on this, and sure enough, it will. So that's what we're going to do. I'm going to pick a couple of these out, and we're going to go apply the fixes to the code base here. One of the things I really like about this is, you know, I talked about how quick it was, how inexpensive it was, and so on. Not only is it quick, but once it's come up with this report, we can use Claude Code and our security lead to fix these issues in really decent ways. So it's not that our job now is to go figure out what matters and how to fix it on our own. We'll continue to work with our AI to actually solve these problems.
|
|
|
transcript
|
8:09 |
All right, we're ready to work on this. I did remember that while this was making changes, I told it that for issue one, let's just skip it. And I pointed out that, yeah, I could go and write skip somewhere or delete it, but it's woven throughout the report. So I actually had it keep track of the changes. You can see here, here, here, here, here, here, here, here, here, here, here, and here. All of those are places where it's gone and made a change. So while you might ask, why didn't you just delete that line or go and edit the one file, well, I would have edited it right there, but all these bits are woven through the document. So even though it's a little bit slower and can sometimes seem wasteful, it's helpful to have the AI revise the report based on what you're saying, because the issue could be subtly mixed throughout different locations. Okay, that said, let's go and address one of these items in the recommended action plan. We said we don't care about the case where the web page we chose used to be safe and turns evil. That seems like such an unlikely scenario, even though, yes, it does use a custom user agent that you could detect and then go loose on it. Still seems unlikely. For this one, if we come over here, this is where we're just creating the default context and then catching the failure. Instead, the idea is that we let that STARTTLS failure become a failure to start the application rather than falling back to plain text, which I think is reasonable. So I'm going to come over here, and we can just highlight this, remember, and it says one line. It looks like multiple lines, but see, it's got soft wrap so it doesn't shoot off the page. I'm tempted to clear the context, but I'm also tempted to keep it the same. I'm just going to leave it. The new Claude these days is quite good at managing this context. So I'm just going to leave it, and it'll probably start a sub-agent to solve this. So this is one we definitely need to fix.
Give it some more room. Make sure my head's not blocking it. There we go. This one we definitely need to fix. Let's mitigate it. Failure seems like the right action to me too. And this is important: please keep the plan updated as we make fixes. So it will say we fixed this here, and so on. Instead of just having this big list where you've got to remember what we have and haven't done, it's good to have Claude come back and use this as a living document for our progress. Well, let's see how it does. Give it some room to breathe. All right. It's got a plan. It's rewritten the notifier. It's updating the unit tests for the new kwargs and TLS coverage. It's going to try to run flake8, so let's quickly set up a virtual environment. I have this shortcut that just creates a virtual environment with uv. And we've got uv pip install flake8. And pytest. What else does this thing need? The setup.py, so it knows how to install itself. All right, now it has at least a chance when I say go. It looks like it's going looking for some random other thing. Now it's trying to use pip, but pip doesn't exist because this is the world of uv. Come on, Claude, catch up. It never uses that. It uses uv pip install. There we go. Let's see how that works. This is the kind of stuff you have to set up just once, you know: you've got to create the virtual environment and all those kinds of things. Once you have your project and you're working on it, it's just going to be set up and these things work much more seamlessly. All right. It looks like it installed something. That's good. Now it wants to run flake8. Fine. I would prefer Ruff, and we could tell it that, but we're just going to let it run pytest and see how things go. All eight tests pass. Let's see what it says up here. You know what? It's screaming by. Four updated, four new. Now run the full test suite, so that's what it's up to.
And update the changelog, which is beautiful, and mark finding two as fixed. How about that? What does it say for the changelog? Let's give that a quick read. Unreleased: the SMTP notifier no longer silently downgrades to plain text when STARTTLS is unavailable. TLS is now mandatory whenever credentials are configured, and a STARTTLS failure aborts sending. Now, another thing worth looking at is the actual code, you know, the thing we told it to create. So let's go over here. I'm using staged changes for the old work and pending changes for the new. So let's go and see what it wrote here. Let's first see what it wrote for tests. It can just set those to true or false here for inbound parameters. We might want to tell it to use defaults. Let's see if it has them. Well, let's jump over to the actual file. You can see the changes right there. So use SSL gets its value from the inbound credentials. Require TLS if and only if it is either explicitly set or there's a username and password. If there's no username and password, just let it talk to us; it's probably some local email server or something like that. Okay, great. And if we go to send mail, it looks like this is new. Yeah, it's got defaults here. So that means it's not a breaking change to add those two flags. Perfect. This looks like pretty good code to me. The tests are passing. So scroll back down and let's see what we've got. Finding two is fixed and the plan is updated. How was the plan updated? I'll just read it inline. There's a lot of stuff going on on the screen here. It's added a status, oops, a status of remediated, and it's noted the resolution here. It's pretty awesome. It was rewritten to use ssl.create_default_context for certificate verification and to treat STARTTLS failure as fatal whenever credentials are configured. Okay, that looks pretty good to me. You can see all the different places in the plan. For the finding two update, it's updating the table it created, right?
For example, right here, in this part of the table where it had an X, it now has a not too bad, and this part's fixed. Okay, so we've got a nice little summary of what's changed. Full rewrite of the connection TLS path. The unit tests still pass, that's great. It's even talking about it. Checks are clean. The unit tests pass, and in particular, the ones that do SMTP testing all pass. What's left? So there's finding four, finding five, which is Telegram, and finding 12, a typo. That's pretty amazing. So I think maybe our biggest issue, I don't know, maybe the Telegram one is the biggest issue. Either falling back on a failed connection and sending plain-text credentials over SMTP, or this Telegram one; those seem like the two biggest ones. So call this one done, and then we'll come back and fix the next one.
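As a rough illustration of the policy described for finding two, here is a small sketch using the standard library's `smtplib` and `ssl`. It is not the code Claude wrote for Kibitzer; the helper names `tls_required` and `secure_session` and their parameters are hypothetical. The assumed rule is the one read out above: TLS is mandatory when it is explicitly requested or when a username and password are configured, and a STARTTLS problem is fatal instead of a silent downgrade.

```python
import smtplib
import ssl


def tls_required(user, password, require_tls=None):
    """Decide whether plain text is acceptable (hypothetical policy helper)."""
    if require_tls is not None:
        return require_tls
    # With credentials configured, never fall back to an unencrypted session.
    return bool(user and password)


def secure_session(smtp, user=None, password=None, require_tls=None):
    """Upgrade an smtplib.SMTP-like connection, failing instead of downgrading."""
    smtp.ehlo()
    if smtp.has_extn("starttls"):
        # The default context verifies the certificate chain and the host name.
        # Any exception raised here propagates: there is no plain-text retry.
        smtp.starttls(context=ssl.create_default_context())
        smtp.ehlo()
    elif tls_required(user, password, require_tls):
        raise smtplib.SMTPNotSupportedError("STARTTLS required but not offered")
    if user and password:
        smtp.login(user, password)
    return smtp
```

In real use, `smtp` would be something like `smtplib.SMTP(host, 587, timeout=30)`. Because both new arguments have defaults, adding them to an existing send function would not break current callers, which matches the observation about the defaults above.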
|
|
|
transcript
|
15:15 |
All right, let's keep going. We're back here, and I think finding four, this set recursion limit one, is the thing that we're going to work on. I'm going to drop these changes into staged so we can basically use our pending changes as our diff window for the latest work. Let's go and open up our plan here and find this section, finding four, so basically this line. Now I can come over here and say: amazing work, Claude, let's do this next. Notice it's got the line selected, whatever it is, finding four, fetch size cap and so on. So that gives it all the context that it needs. Give it a little something playful: are you ready to crash some hacker's dreams? Let's see what it says. Oh, I'm ready. Let's go ruin someone's afternoon. This finding is actually three separate fixes bundled together. Let me lay them out before I touch the code so we can agree on scope. I'm here for it, Claude. Let's do it. The plan: fix A, a bounded fetch body. Switch from response.text to a streaming read with a hard byte cap. Sure, that seems reasonable. I guess it matters what it's pulling in. I'll go through fix A and give it a little bit of comment. Agreed, but 5 megs seems pretty small. Remember, we're trying to limit out-of-memory issues. Great, agreed, but we're just trying to guard against serious issues. Let's set the limit to 50 megabytes. Okay, what's it saying about this one? Fix B. We can give our thoughts on it here. There is an interview mode, but it doesn't always pull up the interview mode. It can inside of the extension, just like it can inside of the terminal. But it didn't this time, and that's fine. I mean, we're just kind of going back and forth. This is a more free-form way that I prefer most of the time anyway. Remove the deep recursion context manager. The HTML5 parser with Python's default thousand-frame limit handles real-world HTML fine. The only thing raising the limit bought was tolerance for pathological nesting, which is exactly the hack we don't want to tolerate.
Okay, fine, fix that. I mean, 50,000 nested divs? All right. Delete 10 lines. You've got to love that. Okay. This one: we definitely should have some timeouts. Running forever? Not good. Connect five seconds, read 30. Maybe be a little more forgiving: 15 seconds to connect and two minutes to read. Maybe we're reading something really long, or, you know, whatever. All right, so that's my analysis. Let's see what it says. This is an opinionated choice. Real-world page monitoring use cases, RSS feeds and so on, are well under 5 MB, so five gives generous headroom. If you'd rather start with two or ten, I'm happy. Okay, but I just pushed back on that default. Fair. Failing on oversized bodies is intentional. The alternative, silently truncating at the cap and passing a partial body to the transforms, would let an attacker feed malformed but parsable garbage into Jinja or XPath. Yeah, fair. The recursion limit removal is technically a breaking change for operators currently relying on it to scrape real sites that legitimately nest HTML 1,000 levels deep. I don't know of such a site. I agree. The Telegram timeout is tricky because the python-telegram-bot SDK is doing the HTTP. Modern versions take a request parameter on the bot, where you pass the configuration with the timeouts. Okay. I'll handle it if the version we've pinned supports it. Otherwise, I'll note it as a follow-up. I think that's fair. It should be pinned already in the requirements. Finally, the single-threaded scheduler issue. Even with all of the above, one slow notifier can block the loop for 30 seconds. That's better than forever, but it's not ideal. Note it as a follow-up. You know, it wouldn't be terribly hard to give each notifier its own thread, or to create an async-based loop and just kick off a task for each notification all at once. It would also just make it more performant. But I think it's okay to leave. It's not a security issue. Let me go ahead with these fixes, do-da-da-da-da. No, what I want you to do is what I just said there.
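The async-based loop mentioned as a possible follow-up could look roughly like the sketch below. This is an illustration only, not something that was applied to Kibitzer in the video. It assumes each notifier is exposed as an async callable taking the report text, and the name `notify_all` is invented. Each notifier gets its own timeout, so one hung channel cannot stall the others.

```python
import asyncio


async def notify_all(notifiers, report, timeout=30.0):
    """Run every notifier concurrently; return a status string per notifier name."""

    async def run_one(name, send):
        try:
            # wait_for cancels the notifier if it exceeds its own time budget.
            await asyncio.wait_for(send(report), timeout=timeout)
            return name, "ok"
        except asyncio.TimeoutError:
            return name, "timeout"
        except Exception as exc:  # one failing channel should not stop the rest
            return name, f"error: {exc}"

    results = await asyncio.gather(
        *(run_one(name, send) for name, send in notifiers.items())
    )
    return dict(results)
```

With this shape, the total wait is bounded by the slowest single notifier's timeout rather than the sum of all of them, which is also where the performance benefit would come from.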
Let's see what it says. It will be asking us, can we do this thing, which might cause a problem? You know what? Most of the time, it's something really ridiculous. Can I, like, list the files in your project directory? Yes, you can list the files in my project directory, because guess what? We're working on this project. Oh, nice. Okay, so it does have the HTTPX request class available, so we can set a timeout on the Telegram notifier as well. That's great. All right, compress our changes. We don't need to see that. That doesn't really matter. That's just me telling Claude it can ls, or python import tests, and list files or read some content. Now we're getting to real changes. So let's go along and watch it as it goes. Click over here. You can see it's importing an abstract base class from collections.abc instead of just collections. It sets the default size cap to 50 megabytes and the stream chunk size, which is not a limit but just tells it to read 64K at a time. A response-too-large exception. Okay. And now it's going to do this do fetch, which is going to use this apparently new capped read, which I imagine says for chunk in iter_content on the response and then passes it through. If it gets too big, it stops reading and throws an exception rather than truncating, which it sounds like it was doing before. All right. Here's that ABC issue, the abstract base class issue it was talking about. If it is an instance of that, it's one of these. Perfect. Remember, it identified that this is not going to run, it believes, and I believe it. So we needed to update that. Otherwise, it would have skipped this conversion to retry and then sleeping a second. So we'll just use the default instead of its little loop there. Okay, great. And we've got a lot more changes. Okay. We've got the Gotify notifier, and now it's passing the timeout. We've got the Nifty notifier, which passes the timeout. And it's importing the webhook default timeout that's being used everywhere. The Telegram one is configuring the HTTPX request.
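For reference, a bounded streaming read of the kind described here can be sketched as follows. This is an approximation of the idea, not the code in the diff on screen: the names `ResponseTooLarge`, `MAX_BODY_BYTES`, `CHUNK_SIZE`, and `read_capped` are placeholders, and the only thing assumed about `response` is that it behaves like a `requests` response opened with `stream=True` (it has `iter_content` and `close`).

```python
class ResponseTooLarge(Exception):
    """Raised when a response body exceeds the configured byte cap."""


MAX_BODY_BYTES = 50 * 1024 * 1024  # the 50 MB cap chosen in the discussion
CHUNK_SIZE = 64 * 1024             # read granularity only, not a limit


def read_capped(response, max_bytes=MAX_BODY_BYTES, chunk_size=CHUNK_SIZE):
    """Read a streamed body, failing instead of truncating when it is too big."""
    chunks = []
    total = 0
    for chunk in response.iter_content(chunk_size=chunk_size):
        total += len(chunk)
        if total > max_bytes:
            # Stop reading and release the connection; never hand a partial
            # body to later transforms.
            response.close()
            raise ResponseTooLarge(f"body exceeds {max_bytes} bytes")
        chunks.append(chunk)
    return b"".join(chunks)
```

With `requests`, this would typically be paired with something like `requests.get(url, stream=True, timeout=(15, 120))`, where the tuple is the connect and read timeout in seconds, matching the more forgiving values suggested earlier.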
I don't like the fact that this is an inline import right here, but it's how it was before, right? You can even see it was disabled before as a warning. I guess it's still here. Anyway, I'll leave that be, no need to change it. But you can see that it's configuring the request with its read timeout, connection timeout, read timeout, all those kinds of things like we said. Perfect. And our unit tests are getting updated as well. So let's see, let's see what it said in our chat here. So it fixed the fetcher to be streamed, streaming and bounded. It fixed the deep recursion. It's making its way to set all the timeouts. You can see that it already did further down 'cause we checked in all the different APIs. What else we got down here? Down to just ran the tests, all the tests pass. You could see that right here's the flake eight, 79 past tests. Flake eight was clean and it even wrote, it looks like it wrote six new fetcher tests. What else? Oh, it's still thinking. I'm going to let it keep going. It looks like at this point, it's down to just going through the plan and updating it. Here you can see it's changed this mishandling of exceptional conditions to unbound a body, which we fixed, raised recursion limit, fixed, dead retry path, fixed. It's going back and reviewing all the different parts. This part is checkmark done. Awesome. All right, it took it a little while, but 79 and 79 passing tests, the linter's clean, hacker's afternoon ruined, bullseye. I love it. Look, you don't have to be playful, Claude. You could just treat it as a machine that you instructed to do things. I don't know that it'll necessarily make it behave any better, but doesn't that just make you laugh just a little bit? If you've got to talk to these things, these AIs, make them fun. Come on, that's my philosophy. All right, summary of what landed. We have eight files changed. 
The simple fetcher, full rewrite, streaming, iter content reads max bytes, and then checks for it being too large and raises exception. And then we also saw this collections.Callable to collections.abc.Callable. The bonus finding fix. We deleted deep recursion and set the recursion limit there. Soup now used beautiful soup directly. We set some default timeouts. Timeouts, timeouts, timeouts, update the changelog. Plan of verification has been marking finding four as fixed, which was really three things to be fixed. So we did that. What's left in tier one? Just finding five. Telegram chat. Before we call this section done, two really quick things. First, let's ask it. Just ask it, what does it think? What's its assessment? Claude, how do you feel about the fixes we've made so far? Has it made a significant dent in the security issues? Honestly, yeah, it's been a real dent. Now, not we polished a few rough edges. We closed the actual attack paths that mattered most to the threat model we're deploying into. Let me be specific about why I think that, because vague reassurances is exactly what you don't want from a security review. You're right, Claude and security lead. Well, we actually changed in attacker's terms. Before today, an attacker on the network had four credible plays against the Kibitzer deployment. Man in the middle attack the SMTP relay. Harvest the message body and some configurations from the credentials. Clear text on the wire. Real. Exploitable. Yeah, coffee shop problem. Serve a hostile body to a monitored URL. Memory exhausted the daemon. Feed it deeply nested DOM and crash the interpreter. No skill required. Ten lines of Python serve as the payload. Okay. Hang the connection of any of the nine notifiers. The freeze of the entire schedule indefinitely. taking out all monitoring. Equally trivial or just monitoring endpoints go down, right? You can't talk to them. SSRF via redirect chasing into the cloud metadata. 
I think we told it that we don't care about that. After this morning, plays 1, 2, and 3 are dead. Not hardened, not mitigated, closed at the source, with regression tests that fail CI if anyone reintroduces the holes. Play 4 is one that we explicitly accepted as a residual risk because of the compensating controls, reasonable, and the deployment posture, and the breaking change wasn't worth it. This is a decision, not an oversight. What's left of the external attack surface for a hardened production deployment is essentially the Selenium Firefox path, which is generally scary, but we're going to address, and the Telegram chat ID imprint. One file fix, super quick. Why I feel good about the quality of the fixes. A few good things stood out. I want to flag them because they're the difference between "this is a fix" and "this is the fix that will hold." Every fix has a test, no backwards compatibility hacks. We deleted code, not just added it. The drive-by collections.abc.Callable fix was a freebie and the changelog saved it. And there are still some issues. Grade re-rate. When I gave it a C+/D split a few days ago, or hours, the C+ assumed tier one fixes weren't done. Four of the five tier ones are now closed, one accepted, three fixed. I revised my external attack grade to a solid B+, maybe an A-. We're going to get an A by the end of this one, aren't we? The full threat model is unchanged because the local access is bad. Gosh, Claude went on for this. I didn't expect this much. How I actually feel. If you asked me three days ago, would you run this on a server you cared about? I'd say no, not without changes. If you ask me right now, after this PR lands, the honest answer is yes. With finding five done and the deployment hardening guidance in the report followed. All right. I don't know how you feel about this, but I think these were not major problems, honestly, to start with, but they were interesting. We're not done, but we've made a significant dent.
So I feel pretty happy about this. Timeout was a good one. The SMTP TLS issue was a good one. So yeah, we're making progress. We're not quite done. We've got a little bit more to go. We've got that one tier one finding left to do. Pretty interesting what we're able to do with our security lead, right? Oh, yeah, one more thing. I said the other thing I wanted to talk about was how much has this cost us? You can see there's a lot of content changed here. We've kept the documentation up to date. We've written new tests and so on. How much does it cost? Let's find out. I don't know, 11%, 15%, something like that. Here to there, a little tiny bit more. Not very much at all. Like, we've fixed most of the issues. We analyzed the app, came up with a report, and fixed most of the issues with just a little bit of what lives in my five-hour window at the intermediate level of Claude Code. So it's not breaking the bank or doing anything crazy in terms of AI usage. And remember, I've got it set on level four, max effort. And on the model, it says default, but it's weird, the default changes. Sometimes default's Sonnet, sometimes it's Opus, but it's on the peak model, right? So it took some time, but it took five or 10 minutes. It didn't take weeks, and it gave us really good results. So I'm very happy with how things are going so far.
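To make the fetcher fix from this section concrete, here is a minimal sketch of a streaming, bounded read. It is not the project's actual code: `read_bounded`, `ResponseTooLarge`, and the limit values are names and numbers assumed for illustration. The idea is the one described above: read in chunks, and if the body crosses the cap, raise instead of silently truncating.

```python
from typing import Iterable

# Hypothetical limits, chosen for illustration only.
MAX_BYTES = 5 * 1024 * 1024   # cap on the total body size
CHUNK_SIZE = 64 * 1024        # read granularity, not a limit
TIMEOUT = (15, 120)           # (connect, read) seconds, as discussed above


class ResponseTooLarge(Exception):
    """Raised when a body exceeds the configured cap."""


def read_bounded(chunks: Iterable[bytes], max_bytes: int = MAX_BYTES) -> bytes:
    """Accumulate chunks, failing loudly rather than truncating."""
    body = bytearray()
    for chunk in chunks:
        body.extend(chunk)
        if len(body) > max_bytes:
            # Stop reading immediately; a partial body is never returned.
            raise ResponseTooLarge(f"body exceeded {max_bytes} bytes")
    return bytes(body)
```

With the requests library, the chunks would typically come from `response.iter_content(chunk_size=CHUNK_SIZE)` on a response opened with `requests.get(url, stream=True, timeout=TIMEOUT)`, so the timeouts and the size cap work together.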
|
|
|
|
38:00 |
|
|
transcript
|
5:43 |
Time to move on to the next project. And boy, oh boy, is this a big one. We're going to see how it turns out. So this project, Apache Superset, is an open source, modern data exploration and visualization platform, which will have some interesting multi-tenancy components to it, I believe. It also has 73,000 GitHub stars and 1,400 plus contributors at this time. So it's had a lot of eyeballs on this project. We're going to put Claude's eyeballs on it as well. We're gonna see where it goes. So I've already downloaded this, cloned it from GitHub and set it up with our agents, our security lead and our ask the security lead commands here. So that's good. And this project actually already comes AI enabled, I guess you might call it. So if you scroll down past the dots, it has a.agents file, which I believe is used mostly by Codex and others. And it has a.md file, which I think is probably exactly the same content. So we don't have to go through that process of setting up in that regard. I've created a virtual environment for this, although I have not installed it yet. This is a really big project. So I'm not sure if we're gonna be able to run it or not. I think we're just gonna have a look and see what comes out of it. Notice also we have a Gemini and a GPT, which I imagine a Simlink over, like we could actually find out here real quick. We could scroll up. Yeah, here you can see that Gemini is redirecting over to agents, and GPT is as well. What about Claude? Yeah, so everything is just redirecting over to agents. Cool. So that's really just a setup thing that Superset has done, which is actually pretty cool. Next thing, we want to make sure that our AI, our Claude, is set up correctly. So suppress that for a minute. We don't need the Gemini to be considered. Although, you know what? I guess might as well, right? It doesn't really matter. At that point, let's go ahead and make it be Claude so it's as comfortable as possible. 
Since the last recording, something really cool has come out. If you look here and we go to models, notice we have different models, but we have Opus 4.7 with 1 million token context. I don't remember if that was in the last one, but this, I'm quite sure it was not. So down here we have ask before edits and we also had edit automatically in plan. But now we have auto mode. Auto mode is awesome. It is so much better than what came before it. You have to enable it. And the way that I was able to enable it is I have to go to the terminal version of Claude Code and hit shift tab until it goes over to select auto mode. And it says warning, auto mode is more independent. It has more agency than the other modes, even edit automatically. So what auto mode does is it actually has the LLM evaluate the commands and say, is this going to be dangerous? Is this like a drop table sort of thing? Or is it just listing a file or reading a file that's within your project? And it's much more automatic and it asks you much less often. You can also give it higher level of efforts. I think maybe that might be the effort on edit automatically. I'm not entirely sure. But like I said before, best model, maximum effort, that's what we're after. But this new auto mode is gonna be great. It means we can just start it off and let it go. All right, with all that set, with all that in place, we're gonna give it the exact same command again as we gave it before. Please have the security lead give this code base a deep analysis. I'm considering using it for a critical project and we need to be triple sure that the security is top notch. I'll be happy to work with you and solve these security issues that we find together. I'm going to start this off on auto mode. I'll let it go for a few minutes and then kind of skip around because I imagine this is going to take some time, but that's okay. It's going to be time well spent, I imagine. 
First of all, just notice really quickly that it is using our security lead here. That's important. You want to make sure that's set up, everything's working. I'm sure it will find security without the security lead, security issues, but with it, I think it's going to do better. So let's let it go from here. There it is, it's done checking the watch that took about 10 minutes. Now it's great that it wrote it all here and I should have probably just instructed it in the beginning. Hey, what I would like you to do is save this. There's not a super way for me to do that. You know what, let's try this because it has all this formatting, like for example, this inline code here, we got to paste it somewhere that'll preserve it and then can put it back. So let's try this. Let's say, put it over there and notice if I paste, it's just junk. But I think if I open Typora, paste it in there, go to the source, copy it. I can probably pull it off. Yes, here you can see it's perfect. It's preserving it. Go view source, copy, paste, save. Why did I go through that trouble? No, we don't care about that. Why did I go through that trouble? I could have just asked Opus to write it, but it's super slow when it comes to that kind of thing. So let's have a look. There we go. A nicer, much nicer version of this. Now I'm going to do something a little bit different than last time. Instead of trying to explore this with you, I'm going to study this and come back. And then we're going to talk about what I found. I want it to be a little more crisp for you. So hang tight and we'll come back and review the security audit.
|
|
|
transcript
|
2:39 |
All right, we're all done. We have got our security review. Now, it took about 10 minutes, so I'm not going to play back the agent working with the little subtle background music that I've been adding. I'm just going to give you the results that I've studied. And I found one insufficient level of exploration and went into it. We're going to talk about that. But first of all, before we do any of those things, how much effort did it take? Now, I wish I had actually pulled up my usage beforehand, but you can come down here. You can go to your account and usage and it pulls this up. And notice here we've used 9%. I think that's a, I think it was about 6% before I started this security exploration. And let's also, let's ask how big is this? So this is a lot of code. So let's just open that. Here we go. Polyman's a little open source tool I built. So is this big? Yes, it's almost 1 million lines of code. It's a lot of YAML and other things here. As you can see what's on the screen. But the almost million lines of code and the fact that we were able to analyze that pretty deeply and only use 3% to 4% of our five-hour window, wow, that is not bad. And I took multiple attempts, multiple passes at getting this information. because I found it didn't really do a good enough job. And maybe if this was really my app and I deeply, deeply cared, I might go for every single potential issue and have it do a deeper exploration for maybe every OWASP, top 10, all the 10, and so on. And who knows if I do that and that was 3%. Maybe that kicks it up by 20%. Who knows? But it still is a small portion. This is not a mega amount of usage that it took. That all said, we're ready to talk about the security review. Before we dive into it, I'll tell you, it's not too bad, but I think I legitimately found some issues that if this was my app that I was running, I would actually not feel comfortable with. And I think that that's honestly, that is very, very surprising. 
This is a full apache.org project, 1,400 contributors, so many eyes on it. It did really well, but it didn't do perfectly. So we're going to dive in and talk about the details next.
|
|
|
transcript
|
11:25 |
Let's go ahead and dive in. How did the review do? The security review of Apache Superset. So here's the summary. I'm going to give you just the high level. I'm not going to reread this to you. I'll include it. You can check it out and so on. The biggest security finding here, and this is probably just a documentation and user perspective, is that by default, if you run this without configuring it, without going through a hardening phase, you're going to have some problems, possibly a bad time. And Claude says, under that scenario, we have a medium risk for default configurations, but we're not super concerned about well-hardened deployments. The maintenance posture is strong, so they run daily Dependabot. They have a security team. They have a public record of CVEs, which is pretty cool. They have container scanning, and actually, in the opposite way that you would think, they actually took Trivy out because Trivy itself became a supply chain issue. So they're like, all right, we're just going to not use Trivy. You can use something else. They actually have a whole security statement there that people can go and look at and so on. There's a couple of issues here. One is sort of, could you possibly have cookie theft? Yes, maybe. And I think the biggest one here that jumped out on the first pass, not the second, but the first pass, is you have a default for the JWT token secret. Other parts of the app will refuse to run if you don't change the defaults, but this part will allow you to run it with the default. And then if it does, you can then forge JWT secret values and cookies and so on. That's bad. So we should probably fix that by adding the validation that says, if you're running in a mode that's gonna use the JWT tokens, and it's set to the default, and is in production, crash. Don't allow it to start and run in this insecure way. That's what I would recommend. Breaking change, so they might not accept it. I don't really plan on specifying that for them. 
It's not a big enough deal that I think it's like a deal breaker or like, oh my gosh, we better let them know before the world burns down. But it is something in terms of a review. If this was my app, That's a change I would make. Curious on. So the critical issue, and Claude also identified this as the biggest deal. It says, this is a security misconfiguration for cryptographic failures because this is the value. Please change me. Hey, hey, hey, I'm over here. The problem is if you use embedded dashboards, primary integration mode, and you forget to override this, anyone can forge a guest JWT token. Bypassing dashboard authorization, row-level security, all that kind of stuff. not ideal. So what do you do? You set it to some value, obviously. But here's the thing that I pointed out. Strongly recommended adding a startup check to parallel check secret key at this initialization step that refuses to start if the embedded dashboard feature is enabled and the security, the secret equals the placeholder, I would add. And we're in production mode, not development mode. We could patch that together. It's a 15 line fix. So maybe we'll do that just for fun, just like let Claude fix that. There's some others that it finds that are higher like, oh, you could do requests against arbitrary URLs as some of the import features. I'll let you read this to me. I feel like that's more of maybe possibly a trade-off of like flexibility or how this tool can be used. A webhook alert delivery. So we have HTTPS only is a good default for the webhooks, so that's good. But certain URLs like this to 10.0.0.1 or kubernetes.default.service will still run. And if you can configure an internal probe, you can potentially export, exfiltrate the report payload. Not great. You know, basically says don't allow local loopbacks to be part of this. Maybe. Here's one more that's interesting. This is actually a choice, but it's certainly worth knowing. 
And if you're running in a multi-tenant story, this is also a pretty big problem, something that you basically need to turn off. There's a thing called SQL lab. So it says row-level security is applied in here for data set-backed queries. However, for arbitrary queries using ad hoc SQL, it is not. So imagine you've got some table that has shared data because different tenants are using it and so on. Then you set constraints per row. So maybe this data context, data set context query comes back to the right answers, but the ad hoc queries come back with data for everyone. Okay, not a huge problem. It says, look, if this is part of your isolation story, you have to disallow SQL lab. I said it like this way. Or enforce it in some other way. Application layer row level security in superset is for chart time filtering, not defense in depth on a tenancy boundary. Good to know. So far, where are we? I think this one is the only one that really, I feel like, needs to be addressed so far. But you'll see there's another, saving the best for last. Here's another one that we could probably address. We don't have, I think, a strong enough policy for using HTTPS. So when we set cookies for the Talisman config here, it says out of the box, Superset will run on HTTP and set non-secure cookies. That's fine. But in HTTP misconfigured deployment, a network attacker reads the session ID. All right, so you're basically not using encryption and there's no HSTS header sent. That means always browser, always, always contact me directly with HTTPS, never try HTTP and then upgrade through a redirect. 
So potentially what could happen is somebody could have a logged-in session, do http:// to your server, send the cookie in the clear, and then the thing's going to redirect and it goes to SSL. So that can be an issue. There's also a hassle when you're doing development. So often in my apps I'll do something like: session cookie secure is "not development mode," enforce HTTPS is "not development mode," and this as well, turn that on only in production. So "not development" in my world is production, because if I'm running locally, trying to build it out, doing unit tests or whatever, I don't want those things. But if I'm in production, I want it to be absolutely set. All right, so this is number two here, session cookie secure false and so on. There are a few more things. API keys: there's no expiration date or scope. So once you create an API key, it's forever, ever, ever available. Also, there's no concept of a read-only key or a dataset-only key. Those are the two issues that you might want to upgrade, right? You might just somehow want to manage those kinds of things and maybe add a layer that says read-only for this particular token, right? Kind of a least-privilege type of thing. Here's some stuff about JavaScript running in the page. That could be for cross-site scripting. I'm not going to worry about that too much. This one also a little bit less. SQLAlchemy is forced to be less than 2.0, and for a long time, 2.0 has been the latest. This is a problem if a bug or CVE is fixed in the latest SQLAlchemy, but not backported to the older version. It does say it's a non-trivial migration, and boy, oh boy, will I tell you that's true. SQLAlchemy got way more complex when it went from one to two. They changed the query syntax. It's like a multi-step builder and then executor. And I don't know, it's a big change. So understandable, but worth knowing. It also gave very nice props to all the things that are working well, right? Secure patterns. This is genuinely good engineering.
I'm not going to read these all to you. You can look at them yourself. I'll include this file along with what we need. Last thing is it gives us a breakdown by the top 10. Broken access control. It talked about SQL Lab. We've already talked plenty about this. The session cookie secure setting is false in production. Possibly that's just a default. Also, that's, like I said, something to consider. It has really good supply chain stuff. Crypto failures are good. SQL injection is well mitigated. This one we discussed, SQL Lab and row-level security. Does it matter to you? Maybe, maybe not. Rate-limited logins and so on. One thing that is missing: it has no multi-factor authentication options. That's worth knowing here. This is the last thing that I actually want to dive into with you, and I'm going to do this in a separate video: logging. So it says, I didn't really deeply look into logging. Like, there's a million lines of code, Michael. I don't want to look into this. This is going to take forever. What I did is I went over here and I said, amazing, but please give this thing you didn't look into deeply a deeper look. We're going to come back and see what it said there. But to wrap up the initial security review: defense-in-depth suggestions for your critical project. I would set these from your secret manager, right? Never use the placeholder, especially look at this one. Set the secure cookie settings to true. Don't think that row-level security is going to save you. Add some guardrails. Manage the API keys. So issue them to dedicated service accounts with minimum role grants, rotate on a schedule, track every issuance. I think that's good advice. Do something to add multi-factor authentication in front of it. So if you do single sign-on, for example, you can make people sign in through single sign-on and then make that single sign-on provider require MFA. Honestly, adding MFA to a Python app is pretty easy. I've done it a few times. It should probably be an option.
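The "secure cookies and HTTPS everywhere except development" pattern mentioned above can be sketched like this. Treat it as an illustration under assumptions: the function name `cookie_settings` and the way the environment string is passed in are hypothetical, and the `TALISMAN_CONFIG` keys follow flask-talisman's keyword arguments, so check them against the config file of the version you actually deploy.

```python
def cookie_settings(environment: str) -> dict:
    """Build cookie/HTTPS settings; anything that is not development is production."""
    is_production = environment.strip().lower() != "development"
    return {
        # Standard Flask session cookie options.
        "SESSION_COOKIE_SECURE": is_production,
        "SESSION_COOKIE_HTTPONLY": True,
        "SESSION_COOKIE_SAMESITE": "Lax",
        # Assumed mapping onto flask-talisman keyword arguments.
        "TALISMAN_CONFIG": {
            "force_https": is_production,
            "strict_transport_security": is_production,
            "session_cookie_secure": is_production,
        },
    }
```

The design choice here is to default to the strict settings: a misspelled or missing environment name gets production behavior, and only an explicit "development" relaxes it.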
Then it talks about, well, remember I told it I want to fix these things together. It says, what I want to do next with you. So the only one I really want to address is this one. It says it's upstreamable. That's awesome. That means it's something I could do as a PR back to Superset, which might be fun. We'll see. But let's just go through that process and do it. And then there's another one in the logging, which we haven't talked about yet, because there's enough there that I think it deserves a dedicated separate section. So there are two actions I would take here. One, I would note that we should fix this check-secret-key startup thing for the guest JWT secret that we discussed. The other one is I think I'll have Claude build a hardening document. Like, these are the things we should do to set up this app correctly, with some checklist type of thing that we can work through to make sure that our app is ready to go.
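Here is a rough sketch of what that startup check could look like: refuse to boot in production when the embedded feature is on and the guest token secret is still the placeholder, and only warn in development. This is not Superset's code. The function name, the `InsecureConfigError` class, the `embedded_enabled` and `is_production` parameters, and the placeholder string are all assumptions for illustration; verify the real config key and default value in the version you are running.

```python
import logging

logger = logging.getLogger(__name__)

# Assumed placeholder value; confirm it against your version's default config.
PLACEHOLDER_SECRET = "test-guest-secret-change-me"


class InsecureConfigError(RuntimeError):
    """Raised at startup when a placeholder secret would be used in production."""


def check_guest_token_secret(
    config: dict, embedded_enabled: bool, is_production: bool
) -> None:
    """Fail to boot in production, warn in development, ignore if unused."""
    if not embedded_enabled:
        return  # guest tokens are not in play, nothing to validate
    if config.get("GUEST_TOKEN_JWT_SECRET") != PLACEHOLDER_SECRET:
        return  # the operator has overridden the default
    message = (
        "GUEST_TOKEN_JWT_SECRET is still the placeholder; "
        "anyone who knows it can forge guest tokens."
    )
    if is_production:
        raise InsecureConfigError(message)
    logger.warning(message)
```

Calling this once during app initialization, next to the existing secret key check, keeps the two secrets behaving the same way, which is the point of the recommendation.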
|
|
|
transcript
|
2:51 |
Let's do this a little bit backwards from the way I mentioned it there. So what I'm going to do is I'll go over here. No, no, no, no, we don't want that. We want this auto mode. I would like you to build a security hardening document. Our security review highlighted that this is the biggest gap for running this in a critical environment. In order to make sure that we have the most success, can you please create a hardening and setup document that we can go through with check boxes in Markdown so that we can use this to launch our app correctly on Superset? Okay, good. I could have typed that, but I kind of like talking to it. I find I'm a little more thorough. I don't get impatient with the typing. So one more thing. Save this to hardening setup.md, something like that. And we're just gonna let it go. Come on Elsa, let it go. All right, so Claude is finished. It's created our hardening setup, our hardening checklist here, and I put it up in the preview so we can see it nicely. Let's just kind of scan through it. I don't really want to go too much into depth about what it is. It's just the security review is so much focused on it's fine. If you harden it, it's fine if you change the defaults, but if you don't, it's going to be an issue. So I thought that, you know, maybe this is something we want to kind of have as an artifact of this review. I'm sure that Apache superset on their website has a getting started doc. So maybe start there if you were actually going to use it and then just double check here. So it says, you know, look through all the things we're talking about, so on, so on. So these are the blockers. All right. Like you have to set the secret key and you have to do it. Here's the command to do so, which is kind of handy. Set it as an environment variable here. Don't hard code it into this one. Confirm that superset refuses to boot with the default value. It should. Store your key somewhere and so on. Set it and then see that it works. 
Again, we need to set this. We're going to actually talk about getting a behavior similar to this as well. But it shows us how to set that and set up the database, SSL, and so on. Single sign-on. My goal is not to go through the hardening list with you, but just to generate the hardening list itself. So just to show you that that's something we can do. And I think this is, it looks pretty good to me at first glance.
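The "generate a key, put it in an environment variable, don't hard-code it" step from the checklist can be sketched in Python as below. The helper names are hypothetical, the 32-character minimum is an arbitrary illustrative threshold, and `SUPERSET_SECRET_KEY` as the variable name is an assumption to check against the checklist and the official docs.

```python
import os
import secrets
from typing import Mapping


def generate_secret_key(num_bytes: int = 42) -> str:
    """Return a URL-safe random string suitable for use as a secret key."""
    return secrets.token_urlsafe(num_bytes)


def load_secret_key(
    environ: Mapping[str, str] = os.environ,
    name: str = "SUPERSET_SECRET_KEY",  # assumed variable name
) -> str:
    """Read the key from the environment and refuse missing or short values."""
    value = environ.get(name, "")
    if len(value) < 32:
        raise RuntimeError(f"{name} is missing or too short; set it from your secret manager")
    return value
```

You would run `generate_secret_key()` once, store the result in your secret manager, and have the config file call `load_secret_key()` so the value never lands in source control.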
|
|
|
transcript
|
6:38 |
Now let's talk about the logging. This one is actually super interesting. I think we found legitimate issues here. When it comes to the guest JWT token secret stuff and whether or not the app should boot if it's unset and so on, that is kind of what could be a breaking change for deployments, right? It's certainly going to change behavior, though it's not the instructed behavior. I'm sure there are versions of Superset running in the world that use some of the charting, but also have a default secret. And that change would make those no longer start. That could be not worth the cost. But this one, the stuff that we find in this deeper dive into the security around logging, absolutely is something we should adopt if we were in charge of the Superset project. And it's even something that maybe is worth recommending, not necessarily in its entirety, but in a couple of choice pieces. So let's go through and see what we've got here. Here's the theme. Basically, it's adequate for routine operations, but it's insufficient for security-critical ones, or certain responses where you need to do a security investigation into what happened. For example, investigation readiness. Like, is it ready for a security investigation? You can answer who did what, but you cannot say what IP address, what user agent, and things like that they came from. To me, that's a problem. All right. So it talks a lot about how things are good and all that. So this is great. But there are going to be a couple of things from this critical section here, the gaps. So first of all, there's a log table that has a log of the history of actions in Superset. Those have no IP addresses, no user agents, no session ID. Oh, my. So you cannot answer "where did the attacker come from" in the audit trail. Oh, that seems problematic. So it shows us how to create a derived class for the database logger and then add things like IP, user agent, and session in addition to what's already there. So that's cool.
This one is more serious, right? So this is like, we could certainly add this stuff and record it in the database and it's not going to harm anybody. It's not going to be a breaking change to just have more data most of the time. Yes, it depends what you're doing, but in general, it seems safe enough. But this one, check this out. User login failed. First of all, its payload is impoverished with a poor payload. So here's the payload. Username and user ID. This is what gets saved when a login failure happens. What's not there? IP address, user agent, refer, any other thing that we could possibly record. Login failures are serious. If you're looking at some kind of brute force attempt or something, You want to have extra information, not impoverished information. So here, this is one thing I would absolutely recommend to Superset. But it's worse. It's worse than that. Check this out. This only fires. Now, it's possible. There's a million lines that go to it. There's some scenario or some sort of hook that's catching this. But according to Claude, this only fires when the user exists and the password fails. If you pull this up and look at it, it certainly looks like it's saying when there's a login failure, what we're going to log is user dot username and user, like literally the function call looks like that comma that. If there is no user associated with that, it's going to crash at a minimum and not log. Okay. So it, it looks, it looks like it's not getting saved. So it says this only fires when the user exists and the password fails. So if you're guessing through a brute force credential stuffing attack, like I've got a bunch of leaked credentials, username, password, try it. Next username, next password, try those. You wouldn't detect slow horizontal brute force. One of these yields no log rows per attempt when the user doesn't exist. Only when the user does exist and the password fails. 
So you might not detect credential stuffing attacks or other types of guessing. Someone's trying to guess, but they're guessing with the wrong username, but they're guessing a lot and so on. So that is, I think, these two things, right? There's a lot around this one that needs attention from what I can tell. And anything else? There's other ones. Nice things going on here. One more that's worth calling out. The rest of them, maybe. Certainly worth looking into. But this one is kind of interesting as well. You can export CSVs. And when you export a CSV from like a query or a table or some kind of report or something, you download the CSV data, right? Big surprise. So that's logged and it says data exported. And it's set to say there's a CSV file exported. But here's the issue. You know that a CSV was exported, but it doesn't log how much data or how many rows or how many bytes or whatever was exported. you might think well who cares right like if the user downloaded three csvs that's enough i don't need to know that there was like a thousand rows in one or 500 in the other but for a data exfiltration investigation how much data did they get is the first question and if all you know is they got a csv it may have had 10 rows or it may have been the database or a table's worth of data well those are different things knowing how much data came out might be worthwhile right so how do we fix it we We got a custom event logger that adds row or byte count post execution or instrument the response middleware layer. So there it is. I think that needs attention. This absolutely needs attention. So those are two things around logging that I think are certainly worth looking into. You can see it's got this nice little checklist like logging failures logged with this. It does log the username if the username exists, but no IP address. That's the login review. It's kind of a narrow slice of what goes on, but logging around security is really important. 
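To illustrate the login failure point, here is a minimal sketch of a richer failure record that is built whether or not the username matched a real account. None of this is Superset's API: `LoginFailure`, `build_login_failure`, and the field names are hypothetical, and `headers` is assumed to be a plain mapping of request header names to values.

```python
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Mapping, Optional


@dataclass(frozen=True)
class LoginFailure:
    attempted_username: str   # what was typed, even if no such user exists
    user_id: Optional[int]    # None when the username matched nobody
    ip_address: str
    user_agent: str
    referrer: str
    occurred_at: str


def build_login_failure(
    attempted_username: str,
    user: Optional[object],
    ip_address: str,
    headers: Mapping[str, str],
) -> dict:
    """Build an audit record without assuming that a user object exists."""
    record = LoginFailure(
        attempted_username=attempted_username,
        user_id=getattr(user, "id", None),  # safe when user is None
        ip_address=ip_address,
        user_agent=headers.get("User-Agent", ""),
        referrer=headers.get("Referer", ""),
        occurred_at=datetime.now(timezone.utc).isoformat(),
    )
    return asdict(record)
```

Because the record keys off the attempted username rather than the user object, a credential stuffing run against nonexistent accounts still produces one row per attempt, with the source IP attached.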
I think there's really genuinely fruitful things to improve Superset from looking at this analysis.
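To make the login failure point concrete, here is a minimal sketch of a richer audit record. It is not Superset's event logger. The function name `record_login_failure`, the logger name `security.audit`, and every field name are assumptions made up for illustration. The point it shows is the one from the review: the event is written whether or not the username matches a real account, and it carries the IP address, user agent, and referrer.

```python
import json
import logging
from datetime import datetime, timezone
from typing import Optional

# Hypothetical logger name; a real app would route this to its audit store.
audit_log = logging.getLogger("security.audit")


def record_login_failure(
    attempted_username: str,
    user_id: Optional[int],
    ip_address: str,
    user_agent: str,
    referrer: str = "",
) -> dict:
    """Log a failed login and return the payload that was logged.

    user_id is None when no account matches the username, so attempts
    against unknown users still leave a row behind. That is what makes
    slow credential stuffing visible.
    """
    event = {
        "event": "user_login_failed",
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "attempted_username": attempted_username,
        "user_id": user_id,
        "user_exists": user_id is not None,
        "ip_address": ip_address,
        "user_agent": user_agent,
        "referrer": referrer,
    }
    # One structured line per attempt, never the password itself.
    audit_log.warning(json.dumps(event))
    return event
```

With rows like these, counting failures per IP address across many different usernames becomes a simple query, which is exactly the horizontal brute force case that is invisible when only known users are logged.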
|
|
|
transcript
|
8:44 |
Now, before we call it at least stop, not done necessarily. It was such a big project. We're definitely not done. But before we move on from superset, I do want to have it fix one of these issues. Just kind of have it go through and say, what would it be like if we were to ask it to fix this problem? And the one that I'm going to look at is this guest JWT token issue. Now, I can't just come over here and type, please fix this thing and put the whole description. But I just want to drive home this idea of first of all, auto. Maybe we'll do plan, we'll do plan. But check this out. If I double click this, it'll open at this location here. And here's a really good reason to use the Visual Studio Code extension, not just the CLI Claude Code. Because notice as I have that file open or as I move around, it's changing what's here, but even better, if I highlight this, it knows what line it is. And I could even go a little farther. Let's just go highlight this whole section to here. And it says, look, six lines are selected. So I don't have to try to communicate that context or more broadly, which file it came from. I just tell it, we're going to fix this issue. Let's do that. All right. So I said, let's fix this issue. We're in plan mode. So build a plan to do the fix and we can review it together. I want the app to fail to boot if the guest token is default and we are in prod mode, but start with a warning if we are in dev mode. This mirrors the other secrets behaviors. Let's just let it go and see what we get. all right it's pretty much done it just needs a couple of questions answered which i love this interview mode and I think gated on embedded superset. You need this secret for these shared embeds. So we're going to do that and we'll add this. How do we want to add this? Let's go and add this environment variable. Here we go. All right, it's given us a plan and it says, do you want to auto accept it? 
No, I have no, I do not know a mechanism in which I can accept this plan and give it the right execution options. It's just, I'm going to just say, keep planning. What do I mean by that? It's going to go into this, I believe. I want it to go into this. There's no option to say, accept the plan and run in auto mode on potentially a different model or whatever, but we're still on the default. Anyway, so I'm gonna say no, then I'm gonna say, okay, great, now build a plan, but let's just scan this real quick. It talks about the problem that we've already discussed, design decisions, we're only gonna warn people that this is not set if it's in a situation such that it could be set. I love the add comment capability here. So that's really nice. And I'm just gonna assume that this is good. You can see what it's proposing. If this was in practice, I was really doing this, I would actually go and review the plan better. But just for an example, let's just go, I'll say, I love the plan, let's build it, and let's kick this off and see what we get. So you can see it's working its way through implementing the plan, And it's doing things like recording what the default value is, and then later importing it so that it can actually check. Probably expand that and see. And it's going to try to pull in this value here, which is this parentheses, it's just a way to do multi-line statements here. So we're gonna try to get from the environment. And if not, it's gonna fall back to the default. Okay, fair. Now we have this new check one and we've got some stuff with logging. We check the feature flag is enabled. If the feature flag is not, then again, we're not per plan. We're not going to warn them. They didn't set a thing that they're not using because who cares? So if we're in debug mode, then we're just going to give them a warning. of course, is when the set token is equal to the default token from the config. We're going to do an exit one, which is a startup with a failure. 
I would like, I prefer to see this something like some fixed number that you say, if you see exit code 27, that's what it means. So let's see, we could go back and potentially fix that right here. Right? So you know what, this is going to be 27. I made up a number. Let's stick with it here. Right, so that way, if you say it crashed with exit code 27, we immediately know, well, that means you didn't set the secret key correctly, right? I'd be a fine grain about that as well. So it's pretty much good to go and it's trying to run the unit test, but I don't think I have everything set up correctly. So it's, see, it's going along. I'm gonna make it stop there. But let's just go over here to the diff, look at the diff and see what we got. We have a new config. We got a new hardening guide. Okay, all good. This one, here we can see we've imported those secrets. This is a lot of big files here. That's just a warning. You can see now we've changed how we've got our secret from the config. We've got our different constants. We have a unit test to run. And here's the part that we were just reviewing is now we have these constants to check and down here or so, We can see this is where we're doing the check that we talked about. If the guest token is set, then we're going to potentially exit with our value. 27 or whatever it was that I wanted to exit with. I know it wasn't written there, but that's what I would want. Okay. So we're all done. We are all done adding that feature again. Not going to add it into the project. So I'll just add, we'll say added, add check to fail. boot when guest JWT token default value is used. You know, let me just see what happens if I push this little, not that one. If I push the little AI button right there, will it give us a better one? You know what? I think this might be better. Now we have this, but like, I don't really intend to include it. So let's actually do it like this. And we can take out the hardening setup and the security deep dive. 
like that, and I'll ask it one more time. Not sure if it looks at everything or the stage commands here. Feature, enforce that. Okay. Commit. We're not going to push. Can't push. Wow, this project is under serious, serious work. It's had two changes that I can download just since I started working on this video because I refreshed it when I began. We're not really interested in contributing back, although, like I said, a couple of things are pretty interesting. I think we're going to call it for Apache Superset. This was an interesting experiment. I honestly did not expect many issues to be found here because it's such a popular, well-funded, and has so many people working on it, right? So I thought it would be absolutely hardened. And still, I think we found a couple of things, like we should absolutely log the IP address when somebody tries to log in and it fails. That's pretty straightforward. And it's kind of obvious, right? If you think through the top 10, you're like, well, what do the top 10 say? Well, one of them is logging around security. Seems really straightforward. And yet it seems also like it's not there. I think Apache Superset is already in a good place, but I also think we found a couple of things that would be worth adding back to the project.
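The startup check we just walked through can be sketched in a few lines. This is not the code Claude wrote for Superset. The function name, the placeholder default value, and the return strings are assumptions for illustration. Only the behavior comes from the video: skip the check when the embedded feature is off, warn in debug mode, and exit with the made-up code 27 in production.

```python
import logging
import sys

logger = logging.getLogger(__name__)

# Hypothetical placeholder; the real default lives in the project's config.
DEFAULT_GUEST_TOKEN_SECRET = "CHANGE_ME_DEFAULT_GUEST_SECRET"

# A fixed, documented exit code so a crash on boot is easy to diagnose.
EXIT_INSECURE_GUEST_SECRET = 27


def check_guest_token_secret(secret: str, *, embedded_enabled: bool, debug: bool) -> str:
    """Return 'skipped', 'ok', or 'warned'; exit the process in production."""
    if not embedded_enabled:
        # Nobody is using guest tokens, so there is nothing to warn about.
        return "skipped"
    if secret != DEFAULT_GUEST_TOKEN_SECRET:
        return "ok"
    if debug:
        logger.warning("Guest token secret is still the default value.")
        return "warned"
    logger.error("Refusing to start: guest token secret is the default value.")
    sys.exit(EXIT_INSECURE_GUEST_SECRET)
```

Comparing against a named constant rather than a string literal repeated in two places is what lets the check and the config default stay in sync.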
|
|
|
|
25:10 |
|
|
transcript
|
5:28 |
Okay, so we are ready to work on our last project. Let's get our Claude Code tab open here. And press this, and that's going to be paperless. Ooh, there's quite a few. Jump out of the way. Look, there's quite a few changes since I checked this out at the start of the course. So we'll get those synced up until we have the latest, at least. Who knows? Maybe there's a security fix that we would have found that we're not going to find because it's gone now. Again, notice we have put our security lead agent and command in here. We're not going to push them back, but I'll just say, hold. There we go. Not going to sync that. I don't have rights to do or write permissions to do so anyway. Just like before, we're going to give it the command. And before we do that, let's go over here and just review. Once again, the status, 40,000 GitHub stars, 2.6,000 forks. That's more, no, I don't believe it's more than Superset, but 390 contributors. That's pretty awesome. And this is a project, Open Source Document Management System, that transforms your documents into a searchable online archive so you can keep, well, less paper. Great name, great product. I like the idea of it quite a bit. It has really sensitive things in it, so we're going to want to be careful. given the usage again this is not just a brand new one person project that's only been around a little bit I expect that it's going to pass pretty well in our security review but it's going to be our third and final example for this course it should be interesting no matter what now let's give it the exact same settings as I had before and again auto mode on default who knows what default is, like I said, it changes sometimes, but right now it's opus 4.7 with 1 million token context. So what I'm going to do, I'm going to kick this off and let it grind. Oh, and by the way, from just, I didn't really do a check-in with you on this at the end of the superset video, but let's go and do that real quick. 
This will both be a baseline going forward and a kind of check-in. So remember I showed you that after the initial evaluation it was at 9%? Let's see. Hopefully it hasn't reset. So check this out. We're only up to 10%. All that work we did, the extra review into the logging, the reviewing 900,000 lines of code for security issues, the fixing the feature, the generating the hardening guide, it took like 4% of my five hour window. That's nothing. I mean, of course it's a ton of compute that has to run, but in terms of, oh, we couldn't possibly run the most expensive model against it because it's just going to take too long and it's going to use up all our credits, we can't afford it? It's a fraction of a fraction. You know, it's 4 or 5% of the five hour bit, which is just a fraction of our seven day bit. And it took maybe 15 minutes of AI time to do all the work on Superset. All right. That said, we got 18 minutes to do some work here and we'll do a check-in as well. Right now we're at 10%. Let's go. All right, it's done. Let's see. Clock time, it took about 15 minutes. So that's no joke of an effort. Let's see what it shows up here in usage. 1%. Not bad. Not bad at all. OK, so let's open up the report. And we'll just take a quick initial look at it. Of course, I'm going to pause the video, give this a nice review, and then I'll just give you all the overview. But overall risk is low to medium for self-hosted, and medium if you expose the app directly. And somewhere else, maybe at the bottom, it said that for a critical project, defining critical as high-value trusted documents and a small team, it calls for hardening plus patching some of the other issues; without first patching those and upstreaming them or vendoring them in, it wouldn't do it. So pretty interesting. So what we're going to do is I'm going to read this and get back to you with what I think is germane to our conversation.
|
|
|
transcript
|
13:31 |
I've had a chance to review the security review for Paperless. And all in all, it looks pretty good. But I think we found some meaningful stuff that would make this project better. So let's sort of skip around here and hit the high level. So as before, it says the overall risk is low to medium for self-hosted behind a reverse proxy. Medium if you put it straight on the internet, run it multi-tenant, or operate without hardening the TLS terminator. Okay, fair. Let's scroll down and we'll just look through. First of all, good news, no critical issues were found. So no remote code execution or bypassing of auth or anything like that. Bingo. Now, there's some warnings for real hardening gaps. And in these gaps, some of them I feel like, yeah, okay, I guess technically that's true. And in other places, yeah, we could definitely improve things, as you'll see. So the first one is about basically controlling the headers that go out, some of them for the cookies, some of them for SSL, and so on. For example, the security block sets X_FRAME_OPTIONS to SAMEORIGIN and sets the SECURE_PROXY_SSL_HEADER, right? But it never sets any of the following, and it never reads them out of the .env file, which you would use, say, for the Docker Compose project: whether the session cookie is secure, the CSRF cookie security, the SameSite settings, HSTS (do you use SSL initially to talk to this project), SSL redirects, and the referrer policy. So it might make sense to add something like this where you, I don't know, read those values from the environment and fall back to a default. So for example here, false or whatever comes back from there, right? That kind of thing. So that's minor, but, you know, it would be nice, wouldn't it? This next one is not absolutely critical, but it's not great. So basically you can change your password, but you can change it without verifying your current password whatsoever.
So any form of any way to take over a session will allow you to change the password to whatever you want permanently, locking out the real user, taking over their account. Again, is it critically bad? No, but it says the password change without current password check, and here's the code to do it. So we're going to get the data and make sure it's good. We're going to get the user. If it has a user value, that's all good. And then we're going to get the new password coming in. And if there is a password, you know, for example, it's not an empty string or something like that. We just set the password and off to the races. Click over here and see what that looks like right here. Probably should check the other password, right? The old password. That's something we could look at. So I think this is a valid concern we found. Is it a mega big project problem? No, I don't think so. But here's the risk. session hijacking escalation to full account takeover with persistence, right? So basically, like I said, if you can somehow impersonate the session, stolen cookies or something like that, then you can immediately just change the password to whatever you want it to be. And then you have full access to that account. All right, next. Some hardening around which URLs are used. It doesn't seem critical to me. So I'm going to skip on over that one. Nothing is completely trivial here, but it's certainly worth checking out. This next one, this W4, is something that's very common with these self-hosted web apps. You can have multiple users, but how do you specify who the initial user is as an admin? How do you create the admin account? Often what happens is when the app first starts up, you go log into it, create an account, And that first account is the admin. And then all subsequent accounts, you can control how they get created and so on. So that's what's happening here. Basically, it says it's a race condition on a fresh install. 
So when you set up a new install, this app comes up. And whoever logs in first, that is the admin, period. Because it's the first account created, not one created within the first some number of minutes, this risk basically persists as long as the site is up and the first user doesn't exist. All right, so what are the recommendations here? Document loudly that you should run manage.py createsuperuser. This is a Django site, so that's how you would do it. Or maybe don't put it directly on the public internet. Log into it through some sort of background mechanism like an SSL tunnel or something, create your account, and then open it up. There's a couple of options. So that's one there. We're not going to change anything I've spoken about yet, but that's it. Here's another interesting one. There were originally issues about using pickling. So pickling is a serialization mechanism, like a binary serialization mechanism, in Python. I'm not a huge fan of it, but it's an easy way to say, I have this object or this data structure, I want to save it and get it back exactly like it was, and I don't want to think about its serialization. It's not just a security issue, although unpickling, deserializing it, can result in running arbitrary code, so that's an issue. It's also a versioning issue. Like, if you change from Python 3.11 to 3.15 and a bunch of stuff is in the cache, is it still binary compatible? Sometimes yes, sometimes no. It's a reliability thing too. So what I would do is convert stuff to JSON or MessagePack, store it in the cache, and then deserialize it in a more durable, non-binary way. Anyway, here's what Claude finds. The classifier model is good because it is HMAC signed. However, the per-document vectorized cache entries are written to Redis just straight like this and read back with pickle.loads. They're not signed, so they could be tampered with.
If they can be tampered with, then a single message put into Redis, as long as you can get it to be read, will create a remote code execution on the worker. That's not ideal. They actually mitigated this already on Celery. So the recommendation is just do what you did for Celery. Just fix it. Okay? Or switch it from pickle to, like I said, JSON or MessagePack, and completely sidestep it. It's already been solved for Celery, but it's not solved for Redis. So also solve it for Redis, because Redis is used as a cache in this app. Allowed hosts. Not ideal, but we're going to leave that. I'm going to skip over that one, not go into it. Logging. I tell you, logging is fruitful. We've got some more logging goodness here, so check this out. The code base enables Django audit log for model changes. That's a good thing. And it writes to paperless log and mail log, but there's no central way to record the actions for, say, logins and failures. allauth, which is the authentication layer they're using, has signals or events you can wire up to when this happens, but Claude says they're not wired up, so then you can't log those kinds of things. Similarly, token creation and regeneration for multi-factor. I think this is the multi-factor one. Turning MFA on and off, not documented, and you can actually see right here. Comes in, gets the user, gets the adapter, does the magic, and then just returns the results. You know, logging, nowhere to be found, right? And in the post over here, let's see. If valid, we're going to activate. And then signals, authenticator added, send. And generate the recovery codes, and so on. Let me just ask what this does here. Okay, so I had Claude double check that there's nothing wired up to this signal here that would cause it to log. I asked, does this cause the MFA change to be logged? And it says: no, not by itself. Authenticator added is just a dispatch declared here. It has no built-in receivers.
Nothing connects the receiver to authenticator added. No connect, no receiver. So send is a no op for logging today. Just nice to double check that here. So that's the situation. You know, this theoretically could trigger an event that somehow causes logging to happen, but it doesn't look like it, right? So you can go, it's really nice that our report has the exact code there. So what does it say? We said hook. What you should do is you should hook all the all auth signals, user logged in, user login failed, et cetera, et cetera, MFA, which is one we're just talking to, talking about, and write them to the audit log model with actor, target user, event type, et cetera, et cetera. IP address and user agent, don't forget those. Same for token create and delete and super user and staff bit changes, like making somebody a super user, making them no longer a super user. Okay, so that one certainly seems relevant, right? I think the one that requires the original password and the login here. But it does go through and sort of give some props to this project as well. It says, look, here's the secure patterns observed. We've got the signed celery messages, right, which is what we talked about with Redis could probably use this as well. Signed classifier model files for the, you know, as an LLM that does stuff on the documents. Doesn't start if the secret key is default and so on. You could look through all these and, you know, it has a little OWASP top 10 assessment as we've seen. And the security misconfiguration, cookie, HSTS hardening, not ENV bindable. Talked about that, right? There's no way to set that in a Docker ENV file or wherever else. Injection. This is only two raw SQL sites, so that looks fine. But this race condition for the super user, this is when you're setting it up, and you're going to find out. So it's not a huge deal. 
If somebody gets in there and messes it up, you just completely delete the installation, reinstall the Docker images, start it up again, but it is a bit of a hassle. This one, more serious: password change without the current password. You know, you can take over the account if you can somehow spoof the session. And the logging: logins, token rotation, and MFA changes aren't in a structured log. And finally, our pickling remote code execution. Probably the most dangerous, because it's remote code execution, but also the hardest to exploit, because you've got to get to the Redis server. And if you come over here and look, we can go to the Docker Compose files. Let's just pick that one. And you can see the way they have Redis running here. So it's called broker, but it's really Redis 8. And notice there are no ports that are shared here. So only stuff within the local Docker Compose network will even know that this exists. Other than that, it's completely blocked. So that's actually a really good setting. Same thing for the database. It's not accessible outside of the Docker network, at least for Postgres. There's lots of options here, but I imagine that holds for them too. This one is sus right here. This is kind of suspect. Remember? Binding to 127.0.0.1: would be a much safer bet. Even if they're using Uncomplicated Firewall (UFW) to keep that locked down, that's still public. And you don't want it public, right? This is the app. This is the web server, but not the SSL front-ended thing. This is sort of the worker one. Right, it talks to Redis, talks to Postgres, and so on. All right, that is my assessment of this project. You know, third one up, looking pretty good to me. I think it's really gotten a decent amount of attention. I didn't choose any, you know, one-user, one-developer, barely-out-of-the-gate projects. I chose real, used, popular projects because I wanted this to feel more realistic.
Like I imagine if you run this against your app, you might get more notes than, more notices than maybe some of these, but still pretty interesting. Let's just pick one of these and trying to decide, Do we do number nine and do some logging or do we somehow change our pickle object here? Well, we'll see. We'll fix one of these before we put paperless to bed and call it its final review done.
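Of the findings in this review, the password change one is the easiest to picture as a fix. Here is a minimal sketch, assuming a user object that exposes `check_password`, `set_password`, and `save` the way Django's user model does. The function `change_password`, the `PasswordUser` protocol, and the `PasswordChangeError` exception are names made up for this example, not code from Paperless.

```python
from typing import Protocol


class PasswordUser(Protocol):
    """The slice of a Django-style user that this sketch relies on."""

    def check_password(self, raw_password: str) -> bool: ...

    def set_password(self, raw_password: str) -> None: ...

    def save(self) -> None: ...


class PasswordChangeError(Exception):
    """Raised when a password change request must be rejected."""


def change_password(user: PasswordUser, current_password: str, new_password: str) -> None:
    """Change the password only if the caller proves they know the old one.

    A stolen session cookie alone is then not enough to lock the real
    user out of the account.
    """
    if not new_password:
        raise PasswordChangeError("New password must not be empty.")
    if not current_password or not user.check_password(current_password):
        raise PasswordChangeError("Current password is incorrect.")
    user.set_password(new_password)
    user.save()
```

In a real view you would also want to log both the rejected and the successful change, which ties this finding back to the logging gaps above.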
|
|
|
transcript
|
6:11 |
Let's work on the Redis issue here. I'm not a huge fan of pickling, as I said, although this is not my project. So, you know, it probably makes sense to just harden it, put a signature on the pickle. So let's try that first and see how that goes with Claude. If it doesn't go super well, then I would switch it to JSON or MessagePack, depending on whether MessagePack is already used. If yes, then I might use that. Otherwise, JSON. JSON's nice because it's somewhat legible. All right, so if we double click this, it'll take us over here. We can highlight that. And notice that is now selected down here. So let's give this some instructions. All right, so I'm going to tell it, please add this security hardening to this Redis pickling. I'd like to mirror what has already been done in Celery to keep this as in sync as possible with the rest of the code base. I've selected this file and selected the lines in the report that speak about this unsigned pickle in the Redis read cache. And I'll say, please interview me if you have questions. So maybe if it's unsure, it can ask, but I doubt that it'll need to. It has the Celery code as an example. I put it on the peak model, auto mode, max effort. Let's go. All right, all done. Let's see what we have here. So it says we've changed classifier.py. This is where the unsigned, tamperable pickles were being used. And it said, look, we've switched over to using the signed pickle from the Celery one here. This apparently, like I said, was already solved previously for exactly the same reason in Celery, but it just wasn't applied to Redis. So given my instructions, it said, all right, great. What we're going to do is we're going to just, not copy, but import and reuse that code. You'll see there's a bit of an issue here. Maybe it makes sense.
I think it does make sense to create like a dedicated utility library that doesn't create, you know, say circular imports and other issues that could possibly show up. But, you know, this is just an example. It's not true engineering. So I'm not going to go into those little nuanced details, right? The idea is that we got the existing code. We're going to use it to fix this problem. Now it says vectorize or underscore vectorize was rewritten to use signed pickle dump s, dump string, and signed pickle load s for load string. On failure, logs a warning, and I treat it as a cache miss. Okay, that's pretty sweet. I really like that aspect of it because I've run into this a lot with caches, not just for tampering, but like I said, this pickling stuff has durability and basically version inconsistencies. So if there's some kind of failure on the parsing, it can just say, well, that must have been an old value. We're going to have to recompute it because it's no longer binary compatible with our version of Python we're running in. So that's kind of a nice fallback, right? Just never reach it, just create a new one as if it's a cache miss. It is a cache after all, right? We also got our test and so on. So let's just look at this one here. Click on classifier and it says, actually, Let me zoom out a little here. It actually talks about vectorize, so that's really the important part. But we can see right here, those bits are changed. Paperless.celery, we imported the things as I described, the two functions, and then down here, it's rewritten. We can click on this and see a diff here. So this is the new version. And the old version, what it did was it would just say, if we got a result, good to go, remember? And it would just straight read it. I would get it from the cache and read it. Let's see, do we see that up here? Yeah, it just gets it right here, get the cache value. And then just pickle_dump_s, off it goes. 
We're setting it here, this is where we're setting it, right, for a certain amount of time. So let's get out of that. So now we get the cache value, and if there is one, then we do this signed pickle_load_s where we parse it out and make sure that the digital signature is good. If not, the HMAC mismatch creates this exception here, and we just set the value to none. We just update the value to keep it from falling out of the cache. And if it still doesn't exist, then we're going to recreate it, put it in the cache, and return it. So standard caching stuff. But the real nice value here is that we added signed pickle load s and signed pickle dump s into the cache here. That's what we got. I think it looks really good, and I probably wouldn't have caught this myself. Maybe I would. Maybe somebody would report it. But again, this is a pretty big project. Let's run tallyman and just get a sense. 147,000 lines of code. Yeah, that's pretty significant, right? That's not a simple little project that just has a few moving parts. There's a lot going on here. So yeah, it's very possible that this would get missed. And Claude Code and our security lead did a great job uncovering it. And then Claude Code did a nice job understanding the existing code base within the constraints I gave it and fixing it. So I think it's a win. Again, this to me is not a critical issue because the Redis server is so significantly locked down. I think it's not a big deal in practice, but defense in depth, you don't want to just guarantee that nobody can get to your Redis server until they can and then something bad happens, right? So I think this is great.
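If you want to see the idea without opening the Paperless code base, here is a minimal, standard-library-only sketch of HMAC-signed pickling with the cache-miss fallback. It is not the project's implementation. The names `signed_dumps`, `signed_loads`, and `load_cached` are made up for this example, and how the signing key is managed is left out entirely.

```python
import hashlib
import hmac
import logging
import pickle
from typing import Any, Optional

logger = logging.getLogger(__name__)

_DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def signed_dumps(obj: Any, key: bytes) -> bytes:
    """Pickle obj and prepend an HMAC-SHA256 signature of the payload."""
    payload = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    signature = hmac.new(key, payload, hashlib.sha256).digest()
    return signature + payload


def signed_loads(blob: bytes, key: bytes) -> Any:
    """Verify the signature first; only unpickle if it matches."""
    if len(blob) < _DIGEST_SIZE:
        raise ValueError("Blob is too short to contain a signature.")
    signature, payload = blob[:_DIGEST_SIZE], blob[_DIGEST_SIZE:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    # Constant-time comparison, so timing does not leak the signature.
    if not hmac.compare_digest(signature, expected):
        raise ValueError("HMAC mismatch.")
    return pickle.loads(payload)


def load_cached(blob: Optional[bytes], key: bytes) -> Optional[Any]:
    """Treat a missing, tampered, or unreadable entry as a cache miss."""
    if blob is None:
        return None
    try:
        return signed_loads(blob, key)
    except (ValueError, pickle.UnpicklingError, EOFError):
        logger.warning("Discarding cache entry that failed verification.")
        return None
```

The important ordering is verify, then unpickle. pickle.loads is never called on bytes that did not come from someone holding the key, and anything that fails is simply recomputed, which is fine because it is a cache after all.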
|
|
|
|
11:22 |
|
|
transcript
|
11:22 |
Well, here it is. You've made it to the end of the course. Congratulations. I think you must have learned a ton of skills. You probably understand the OWASP top 10 a lot better than you did. And most importantly, I hope you take away two things. First, that it's not about the tools you use. It's about how you use them. We saw that Django, Flask, FastAPI, and other code all had potential issues, and it wasn't because of the framework we chose. The fix might be tied to the framework, but not the problem. Often people think, oh, I shouldn't use this function, or I shouldn't call this API because it's insecure. Yes, maybe, but that's only the tip of the iceberg, as we've seen. We've worked through the entire OWASP top 10 for 2025, from broken access control to the brand new mishandling of exceptional conditions. Together, we reviewed vulnerable code, and then we either wrote or looked at the fix for that code. So as I said, Flask, Django, FastAPI, they all can surface the same types of issues, although how you fix them and how you spot them is slightly different. And critically, you've learned how to use AI to create an audit workflow that you can point at any project. Given enough time and enough encouragement, you can use the security lead and one of the top-tier coding models to uncover incredible issues, either in other people's code or, most likely, and this is the goal for me writing this course, in your own projects. So we've audited three real-world projects that are open source, so that we could look at them, of course: Paperless-ngx, Apache Superset, and Kibitzr. These are all real projects with a very high number of users, a very high number of contributors, and we found some issues in each and every one of them and mapped them back to the OWASP top 10. Speaking of which, we have broken access control. Recall, these are categories, not a single specific problem, and they're not tied to a single specific programming language or framework.
There's security misconfiguration. A new one for 2025: software supply chain failures. This is a very, very serious one that continues to grow even today. Cryptographic failures, injection, insecure design, authentication failures, software or data integrity failures, security logging and alerting failures, and one we already called out earlier, mishandling of exceptional conditions. Each one of these represents a deep category of issues that have different problems. We saw at least a couple examples from each one of these throughout the course. So we use Claude Code. Of course, this would work with Codex or others as well. But here's the workflow. We start out by setting up the project so that we can run it. We point Claude Code at the repo and maybe review the security statements. Like some of these projects had a SECURITY.md file or a CLAUDE.md and so on. And you want to prepare it for Claude Code to work on it. So if you don't already have it, ask Claude Code to create a CLAUDE.md file. And you want to copy over the security lead agent and command so that you can actually use that. And by the way, while you're at it, make sure you read through the security lead and maybe change it around a little bit. Certain parts of the security lead talk about specific technologies that I was using that maybe don't apply to your project. So tweak it a little bit to make it relevant to the specific project you're working on. Now, I didn't set up our projects to run. Like I said, the infrastructure for things like Superset or Paperless-ngx is pretty intense. So instead, we just worked on the code. But I'm assuming that in step one, this is actually your project. You can run it. You already have been working on it. So this virtual environment and infrastructure thing should be a lot easier in that case. Next, we invoke the security lead with the slash ask security lead command, and then give it some time, five to 10 minutes.
After that, it will walk through all the issues and categorize them by OWASP top 10, but it's definitely not limited to those. What you get back is a nice severity-assessed, CWE-linked report. So super, super cool. Remember, use the very, very best model that you can. Speed is not your friend. The cost is not that high. Remember, I used like 5% on all of Superset, including the fixes, in my five-hour window, not my one-week window, for Claude Code, for example. That little bit of extra time and extra effort is certainly time well spent to avoid false positives and other types of issues. Finally, once you have your security report, you can go through, assess it, figure out what makes sense for you, and then one by one have Claude Code plan out a fix and then apply the fix. We just asked for the fix directly, but you should really use a lot of careful engineering practices here: document it, maybe have it create a GitHub issue, explain what the problem is, then have it apply the fix and so on. And remember the way we defined the security lead: it's a vigilant Python and SaaS security specialist, proactive, pragmatic, and educational rather than gatekeeping. It has three core workflows. First, the security review, which is based on the OWASP top 10 lens and other vulnerabilities. Second, data isolation, which I had it focus on specifically. So if you're running a multi-tenant type of application, you know, multiple users can access their data, but not other people's data. That's super scary to me. So make sure that none of the data leaks and you're always keeping everything isolated. So hunt for cross-tenant leaks and IDOR risks. And then third, logging and alerting. So focus in on that and verify that if an attack happens, you'll be able to know what happened and that it actually did happen. It gives you a nice deliverable, a nice little markdown report with severity ratings, OWASP references, and concrete remediation code. So it's very easy to have it go and apply the fixes.
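To make the data isolation concern above concrete, here is a minimal sketch of an IDOR and its fix. It is not taken from any of the audited projects: the `Document` type, the in-memory `DOCUMENTS` store, and both lookup functions are hypothetical stand-ins for a real database query, assuming each record carries a tenant id.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    id: int
    tenant_id: int
    title: str


class NotFound(Exception):
    """Raised for missing documents and for documents owned by another tenant."""


# Hypothetical in-memory store standing in for a database table.
DOCUMENTS = {
    1: Document(id=1, tenant_id=10, title="Tenant 10 invoice"),
    2: Document(id=2, tenant_id=20, title="Tenant 20 contract"),
}


def get_document_insecure(doc_id):
    # IDOR: the lookup uses only the id, so any caller who guesses
    # an id gets the record, whoever owns it.
    doc = DOCUMENTS.get(doc_id)
    if doc is None:
        raise NotFound(doc_id)
    return doc


def get_document(doc_id, current_tenant_id):
    # Scope the lookup by tenant as well as id. The same error is raised
    # for "missing" and "not yours" so the response does not reveal
    # whether the id exists in another tenant.
    doc = DOCUMENTS.get(doc_id)
    if doc is None or doc.tenant_id != current_tenant_id:
        raise NotFound(doc_id)
    return doc
```

In a real app the tenant check would normally live in the query itself (filtering on both the id and the tenant column), so that unscoped access is hard to write by accident.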
So you just invoke it: slash ask the security lead. A lot of times I'll put in a little extra info. If you just type slash ask and then autocomplete on security lead, it just starts. So it's something like, let's ask the security lead, and then give it more detail about what I actually want it to do. It picks up the top 10 out of our agent definition and all the concerns that we put in .claude/agents/security-lead. We did three real-world projects. We checked out Kibitzr, which is a personal web assistant, and we found a couple of issues. None of these had an, oh my gosh, remote code execution or extreme takeover without some other necessary precondition, like session hijacking. So that was really good, actually. But what we found for Kibitzr is the SMTP TLS downgrade: if you tried to log in and TLS failed, it would just try again without TLS. Okay. We also had the Firefox profile isolation. If you use Firefox as your default browser, and then you use Kibitzr, and it also uses Firefox, it might already have other session cookies that it shouldn't. We also had the unbounded body recursion and timeout hardening. For Apache Superset, we had the SQL Lab and row-level security gap. This was explained, but you need to be absolutely sure about it. And we have the secret keys, the cookies, and the API key lifetimes that we found. We specifically looked at the embedded chart sharing secret, the JWT token secret, making sure that if we're enabling that feature, Apache Superset wouldn't start with an insecure value there, just like it wouldn't for the top-level secret key. And finally, we had Paperless-ngx, which is document management. We had the cookie and the HSTS hardening gaps. Remember, you could not set those values from the environment. They were just defaulted, and that was that.
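For the SMTP TLS downgrade mentioned above, the general shape of the fix is to fail closed instead of retrying in plaintext. This is a sketch using the standard library's `smtplib` and `ssl`, not Kibitzr's actual code. The function name `open_smtp_strict` and the `smtp_factory` parameter are assumptions added for illustration and testing.

```python
import smtplib
import ssl


def open_smtp_strict(host, port=587, timeout=10.0, smtp_factory=smtplib.SMTP):
    """Open an SMTP connection that must upgrade to TLS, or fail.

    smtp_factory is a hypothetical injection point so the function can be
    exercised in tests without a real mail server.
    """
    # The default context verifies the server certificate and hostname.
    context = ssl.create_default_context()
    client = smtp_factory(host, port, timeout=timeout)
    try:
        client.ehlo()
        # starttls raises an SMTPException subclass if the server does not
        # offer STARTTLS, and ssl.SSLError if the handshake or certificate
        # verification fails.
        client.starttls(context=context)
        client.ehlo()
    except (smtplib.SMTPException, ssl.SSLError):
        client.close()
        raise  # fail closed: never fall back to a plaintext session
    return client
```

The caller would log in and send mail only on the returned client, so credentials are never sent over a connection that did not complete the TLS upgrade.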
Also, the password change without specifying the old password and the first-run superuser race condition, as well as some of the audit log gaps that we surfaced. So I think these are all really interesting. Like I said, none of these are enough that I'm like, oh, my gosh, I have to reach out to these people and let them know right away. Still, I think we could make some improvements to each one of these apps, and they're all very popular, which I think is pretty interesting. And if you're inspired, we have two more things you can do after this to really level up your agentic programming as well as your OWASP and security side of things. So on the OWASP side, I recently interviewed Tanya Janca about the OWASP top 10 2025 list. She was part of the group that basically debated and approved what was on the top 10 list for 2025. And we go into what made it, what didn't, and why it was there. So certainly check out talkpython.fm/545 if you're interested in diving a little bit deeper into the background of OWASP. And if you love the agentic programming aspect and you're like, I really could be better at this, or I'd really like to see it in action more, check out the agentic AI for Python course that I created over at Talk Python Training. Same site as right here. This is a hands-on course where we go from zero to an amazing Python app built out with specifications, built out with rules and agents that make building the app go really, really far beyond anything that vibe coding would come up with. It's really a nice engineering way to use AI. So if you're interested in more Python plus AI, check out this course. Now, go audit something. Ideally, start auditing your own projects. Surely, you've got a web app or an API or a mobile app or something out there that you've been working on. Turn the same technique against it, but go way deeper, explore it more, and really encourage Claude Code and the security lead to pull everything apart.
Add tests to demonstrate the issue, demonstrate the fix once the fix is there, and so on. So go out there and put what you've learned in this course to use. And just one more thing really quick: don't just start grabbing open source projects, running Claude Code through them, and then submitting PRs all over the place. Be very careful about not overwhelming projects with effectively meaningless PRs and issues. The response will be something like, well, technically this is an issue, but it's really not relevant for our project, so we're not going to fix it. People just don't want to deal with that. So be very, very careful to come up with small, crisp, example-driven submissions if you're going out to work with the broader world. But certainly, start by auditing your own things. And thank you for taking the course. See you around.
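As a sketch of what "add tests to demonstrate the issue, demonstrate the fix" can look like, here is the password change finding from above in miniature. Everything here is hypothetical and is not Paperless-ngx code: the `User` class, `change_password`, and the two test functions are illustrative, and the hashing uses only the standard library.

```python
import hashlib
import hmac
import os


def hash_password(password, salt):
    # PBKDF2 from the standard library; the iteration count is illustrative.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


class User:
    def __init__(self, password):
        self.salt = os.urandom(16)
        self.password_hash = hash_password(password, self.salt)

    def check_password(self, candidate):
        # Constant-time comparison of the stored and candidate hashes.
        return hmac.compare_digest(
            self.password_hash, hash_password(candidate, self.salt)
        )


def change_password(user, old_password, new_password):
    # The fix: require proof of the current password, so a hijacked
    # session alone is not enough to take over the account.
    if not user.check_password(old_password):
        raise PermissionError("current password is incorrect")
    user.salt = os.urandom(16)
    user.password_hash = hash_password(new_password, user.salt)


def test_change_rejected_with_wrong_old_password():
    user = User("correct horse")
    try:
        change_password(user, "a guess", "attacker-chosen")
    except PermissionError:
        pass
    else:
        raise AssertionError("password changed without the old password")
    # The original password must still work after the rejected attempt.
    assert user.check_password("correct horse")


def test_change_accepted_with_correct_old_password():
    user = User("correct horse")
    change_password(user, "correct horse", "battery staple")
    assert user.check_password("battery staple")
    assert not user.check_password("correct horse")
```

The first test fails against code that skips the old-password check, which demonstrates the issue, and passes once the check is in place. Functions named `test_*` like these are also collected by pytest if you use it.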
|