Async Techniques and Examples in Python Transcripts
Chapter: Thread safety
Lecture: Visualizing the need for thread safety

0:00 So I told you that threading
0:01 requires extra safety, that it can introduce these bugs
0:05 so let's talk about why they primarily exist.
0:09 So let's take a step back and just think about the state
0:12 of a program as we call an individual function.
0:16 So this little box here this represents a function call
0:19 and top to bottom is time, I guess.
0:22 The blue stuff on the right, these are our data structures.
0:25 We actually have two pointers, one to the next: let's say
0:27 a class that holds another class that then points
0:29 at an item in a list,
0:31 and then another one that actually holds a list.
0:34 And what we're going to do is we're going to say
0:35 blue represents a valid state.
0:38 So if you were to stop running
0:40 and let some other part of the system run
0:42 or return from the function,
0:44 your program is left in a valid state
0:46 and unless you do something terribly wrong
0:48 while you're writing code, this is always the case.
0:50 You don't have to worry about some kind of weird situation
0:53 coming up where you exit your function and your program is left
0:55 in an invalid state.
0:56 You normally don't write code like that.
0:58 But you do often put your program into temporarily invalid states.
1:03 Let's see how that works.
1:04 So here we're going to come along
1:06 we're going to call this function.
1:07 A little bit of work happens.
1:09 Everything's fine.
1:10 We're going to make a bunch of changes
1:12 to these data structures to evolve our program
1:15 from one state to another.
1:16 This happens all the time.
1:19 We can't make them all at once.
1:20 We can only change, say, one part now
1:24 and then we're going to go along
1:25 and run some more code and make another decision.
1:28 And we're going to change another part here.
1:31 And then finally we're going to make one
1:33 more change that's going to put this all back
1:35 into a valid state.
1:37 What's the problem?
1:38 Well, normally our code runs from top
1:42 to bottom like that, and at the end
1:43 it's back in some, probably new
1:46 but still valid state.
1:48 However along the way it enters
1:49 these temporarily invalid states.
1:52 The problem with threading is you have two
1:54 of these things running at the same time, potentially.
1:57 If they share any of these data structures
2:00 and one of the functions looks at one
2:02 while another function has put it
2:03 into this temporarily invalid state...
2:05 Wham!
2:06 You have a threading bug.
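To make that concrete, here is a minimal sketch of one thread observing another thread's temporarily invalid state. Everything in it is a hypothetical illustration rather than code from the lecture: the `accounts` dictionary, the `transfer` and `audit` functions, and the rule that the two balances must always sum to 100. The `threading.Event` objects only force the unlucky timing so the demo is repeatable; in real code the same interleaving would happen by chance.

```python
import threading

# Hypothetical shared state. The invariant: the two balances sum to 100.
accounts = {"a": 50, "b": 50}

mid_update = threading.Event()   # set when the writer is halfway through
reader_done = threading.Event()  # set when the reader has taken its snapshot
observed = []                    # totals seen by the reader


def transfer(amount):
    accounts["a"] -= amount      # step 1: money has left "a"
    mid_update.set()             # the state is now temporarily invalid
    reader_done.wait()           # demo only: hold the invalid state open
    accounts["b"] += amount      # step 2: the invariant holds again


def audit():
    mid_update.wait()            # demo only: look exactly at the wrong moment
    observed.append(accounts["a"] + accounts["b"])
    reader_done.set()


def run_demo():
    writer = threading.Thread(target=transfer, args=(10,))
    reader = threading.Thread(target=audit)
    writer.start()
    reader.start()
    writer.join()
    reader.join()
    # What the reader saw mid-update, and the total once the writer finished.
    return observed[0], accounts["a"] + accounts["b"]
```

Calling `run_demo()` returns the total the reader saw while the transfer was half done, followed by the total after the writer finished. The first value breaks the invariant even though the program ends in a valid state.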
2:09 So your goal with thread safety
2:12 is to make sure that any time you're going
2:15 to evolve your program into a temporarily invalid state
2:19 during that step you don't let other parts of the program
2:22 interact with the parts of data that you're working with.
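One common way to do this in Python is a `threading.Lock` held for the whole multi-step update. The sketch below is an assumption-laden illustration, not the lecture's code: `accounts`, `accounts_lock`, `transfer`, and `audit` are hypothetical names, and the invariant is that the two balances always sum to 1000. Because both the writer and the reader take the same lock, the reader can only run before or after a transfer, never in the middle of one.

```python
import threading

# Hypothetical shared state. The invariant: the balances sum to 1000.
accounts = {"a": 1000, "b": 0}
accounts_lock = threading.Lock()


def transfer(amount):
    # The temporarily invalid state exists only while the lock is held.
    with accounts_lock:
        accounts["a"] -= amount
        accounts["b"] += amount


def audit():
    # Readers take the same lock, so they never see a half-done transfer.
    with accounts_lock:
        return accounts["a"] + accounts["b"]


def run_demo(rounds=500):
    totals = []

    def writer():
        for _ in range(rounds):
            transfer(1)

    def reader():
        for _ in range(rounds):
            totals.append(audit())

    threads = [threading.Thread(target=writer), threading.Thread(target=reader)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return totals
```

Note that the lock only helps if every piece of code that touches `accounts` agrees to take it. A reader that skips the lock can still see the invalid state.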
2:26 This can be very coarse-grained
2:27 like, don't let anything else happen
2:29 in the program while it's in this red state.
2:31 Or it can be very fine-grained.
2:33 Don't let anybody touch those two red pointers
2:36 while they're in that state, but if they have other things
2:38 to do, let them just carry on.
2:40 There's this trade-off between how much work
2:42 and management you want to do
2:43 to try to juggle that very fine-grained stuff, and
2:46 how much overhead there is in having tons
2:48 of little locks and other checks around all
2:51 that kind of stuff.
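As a rough sketch of the fine-grained end of that trade-off, the example below gives every object its own lock instead of using one lock for the whole program. The `Account` class and `transfer` function are hypothetical names chosen for illustration. Transfers that touch different accounts can proceed at the same time, but the extra management shows up immediately: the two locks have to be taken in a consistent order (here, by account name), otherwise two threads transferring in opposite directions could deadlock.

```python
import threading


class Account:
    """Hypothetical account that carries its own lock (fine-grained)."""

    def __init__(self, name, balance):
        self.name = name
        self.balance = balance
        self.lock = threading.Lock()


def transfer(src, dst, amount):
    if src is dst:
        return  # nothing to do, and locking the same Lock twice would hang
    # Always acquire the two locks in the same order to avoid deadlock.
    first, second = sorted((src, dst), key=lambda acct: acct.name)
    with first.lock, second.lock:
        src.balance -= amount
        dst.balance += amount


def run_demo(rounds=50):
    a, b, c, d = (Account(name, 100) for name in "abcd")

    def worker(src, dst):
        for _ in range(rounds):
            transfer(src, dst, 1)

    threads = [
        threading.Thread(target=worker, args=(a, b)),
        threading.Thread(target=worker, args=(b, a)),  # opposite direction
        threading.Thread(target=worker, args=(c, d)),  # unrelated accounts
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return [acct.balance for acct in (a, b, c, d)]
```

The coarse-grained alternative would be a single module-level lock wrapped around every transfer: simpler to reason about and impossible to deadlock on its own, but the `c` to `d` transfers would then wait on the unrelated `a` and `b` transfers.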
2:52 So, it depends on what you're doing
2:54 how you're going to do this. But the key takeaway is, you cannot help
2:58 but put your program into temporarily invalid states.
3:01 Think of traditional C code that
3:03 even just swaps two variables.
3:05 That requires three steps.
3:07 In Python we have tuple unpacking
3:09 and so you can sort of do it in a line
3:11 but probably in terms of operations
3:13 that still could introduce some kind of threading problem.
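One way to check this is to look at the bytecode with the standard `dis` module. The sketch below uses a hypothetical `Pair` class and `swap` function and swaps two attributes on a shared object. The exact instructions differ between CPython versions, so treat the listing as something to inspect rather than a fixed result, but the single-line swap is carried out as several instructions, including two separate attribute stores. Between those two stores the object briefly holds the same value in both attributes.

```python
import dis


class Pair:
    """Hypothetical shared object with two fields to swap."""

    def __init__(self, left, right):
        self.left = left
        self.right = right


def swap(pair):
    # One line of source, but not one operation for the interpreter.
    pair.left, pair.right = pair.right, pair.left


# Collect the instruction names that CPython generated for swap().
ops = [ins.opname for ins in dis.get_instructions(swap)]
stores = [op for op in ops if op.startswith("STORE_ATTR")]

if __name__ == "__main__":
    dis.dis(swap)  # print the full listing for your interpreter version
```

Whether another thread can actually run between those instructions depends on the interpreter and its scheduling, so the safe assumption is that a one-line swap on shared data still needs the same protection as the three-step version.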
3:17 You have to have these temporarily invalid states.
3:19 That's how programs work.
3:20 Your goal is to isolate the data
3:22 that is temporarily invalid
3:24 when that's happening, and then, once it goes back
3:27 you can let all the threads have at it again.
3:29 Then, of course, our program returns
3:32 and everything is left as it should be.