Async Techniques and Examples in Python Transcripts
Chapter: Thread safety
Lecture: Visualizing the need for thread safety
0:00
So I told you that threading requires extra safety, that it can introduce these bugs, so let's talk about why they primarily exist.
0:10
So let's take a step back and just think about the state of a program as we call an individual function.
0:17
So this little box here represents a function call, and top to bottom is time, I guess. The blue stuff on the right, these are our data structures.
0:26
We actually have two pointers, one to the next: let's say a class that holds another class, which then points at an item in a list.
0:32
And then another one that actually holds a list. And what we're going to do is we're going to say blue represents a valid state.
0:39
So if you were to stop running and let some other part of the system run, or return from the function, your program is left in a valid state
0:47
and unless you do something terribly wrong while you're writing code, this is always the case.
0:51
You don't have to worry about some kind of weird situation coming up where you exit your program and it's in an invalid state.
0:57
You normally don't write code like that. But you do often put your program into invalid states. Let's see how that works.
1:05
So here we're going to come along we're going to call this function. A little bit of work happens. Everything's fine.
1:11
We're going to make a bunch of changes to these data structures to evolve our program from one state to another. This happens all the time.
1:20
We can't make them all at once. We can only change, say, one part now and then we're going to go along
1:26
and run some more code and make another decision. And we're going to change another part here. And then finally we're going to make one
1:34
more change that's going to put this all back into a valid state. What's the problem? Well, normally our code runs from top
1:43
to bottom like that, and at the end it's back in some probably new, but still valid, state. However, along the way it enters
1:50
these temporarily invalid states. The problem with threading is you have two of these things running at the same time, potentially.
1:58
If they share any of these data structures, and one of the functions looks at them while another function has put them into this temporarily invalid state,
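To make that concrete, here is a minimal sketch of one thread observing another thread's temporarily invalid state. The `accounts` dictionary, the `transfer` and `audit` functions, and the rule that the balances must sum to 100 are all hypothetical, invented for illustration. The two `threading.Event` objects force the unlucky interleaving every time; in real code it would only happen occasionally.

```python
import threading

# Hypothetical shared state. The invariant: the two balances always sum to 100.
accounts = {"a": 50, "b": 50}

withdrawn = threading.Event()  # set once the writer is halfway through
observed = threading.Event()   # set once the reader has looked
seen_totals = []

def transfer(amount):
    accounts["a"] -= amount    # step 1: state is now temporarily invalid
    withdrawn.set()            # let the reader run at the worst moment
    observed.wait()            # stand-in for an unlucky thread switch
    accounts["b"] += amount    # step 2: state is valid again

def audit():
    withdrawn.wait()           # wait until the writer is mid-update
    seen_totals.append(accounts["a"] + accounts["b"])
    observed.set()             # let the writer finish

writer = threading.Thread(target=transfer, args=(10,))
reader = threading.Thread(target=audit)
writer.start()
reader.start()
writer.join()
reader.join()

print(seen_totals)                    # [90]  (the reader saw the broken invariant)
print(accounts["a"] + accounts["b"])  # 100   (valid again once transfer finished)
```
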
2:06
Wham! You have a threading bug. So your goal with thread safety is to make sure that any time you're going
2:16
to evolve your program into a temporarily invalid state, during that step you don't let other parts of the program
2:23
interact with the parts of data that you're working with. This can be very coarse-grained like, don't let anything else happen
2:30
in the program while it's in this red state. Or it can be very fine-grained. Don't let anybody touch those two red pointers
2:37
while it's in the state, but if they have other things to do, let them just carry on. There's this trade-off between how much work
2:43
and management you want to do to juggle that very fine-grained stuff, and how much overhead there is in having tons
2:49
of little locks and other checks around all that kind of stuff. So, it depends on what you're doing
2:55
how you're going to do this. But the key takeaway is, you cannot help but put your program into these temporarily invalid states.
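The coarse-grained versus fine-grained choice described above could be sketched like this. `CoarseBank` and `FineBank` are hypothetical classes made up for this illustration; both use `threading.Lock` from the standard library. The coarse version blocks every other transfer while one is in progress. The fine version only blocks transfers that touch the same accounts, at the cost of more locks to manage (here they are always acquired in sorted order so two transfers can't deadlock on each other).

```python
import threading

class CoarseBank:
    """One lock guards everything: simple, but all transfers wait in line."""
    def __init__(self):
        self.balances = {"a": 50, "b": 50, "c": 50}
        self.lock = threading.Lock()

    def transfer(self, src, dst, amount):
        with self.lock:
            self.balances[src] -= amount   # temporarily invalid...
            self.balances[dst] += amount   # ...and valid again

class FineBank:
    """One lock per account: unrelated transfers can proceed in parallel."""
    def __init__(self):
        self.balances = {"a": 50, "b": 50, "c": 50}
        self.locks = {name: threading.Lock() for name in self.balances}

    def transfer(self, src, dst, amount):
        # Assumes src != dst. Locks are taken in a fixed (sorted) order.
        first, second = sorted((src, dst))
        with self.locks[first], self.locks[second]:
            self.balances[src] -= amount
            self.balances[dst] += amount

def worker(bank, src, dst):
    for _ in range(1000):
        bank.transfer(src, dst, 1)

totals = []
for bank in (CoarseBank(), FineBank()):
    pairs = [("a", "b"), ("b", "c"), ("c", "a")]
    threads = [threading.Thread(target=worker, args=(bank, s, d)) for s, d in pairs]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    totals.append(sum(bank.balances.values()))

print(totals)  # [150, 150]
```
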
3:02
Think of traditional C code that even just swaps two variables. That requires three steps. In Python we have tuple unpacking
3:10
and so you can sort of do it in one line, but in terms of the underlying operations that probably could still introduce some kind of threading problem.
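One way to check that one-line swap is to look at the bytecode with the standard library's `dis` module. This is only a sketch: the variables `x` and `y` and the `swap` function are hypothetical, and the exact instruction names vary between CPython versions. The point is that the single line compiles to several separate load and store instructions, and in CPython a thread switch can happen between instructions.

```python
import dis

x, y = 1, 2  # hypothetical module-level variables shared between threads

def swap():
    global x, y
    x, y = y, x  # one line of source code

# Collect the instruction names the line compiles to.
ops = [ins.opname for ins in dis.get_instructions(swap)]
loads = ops.count("LOAD_GLOBAL")
stores = ops.count("STORE_GLOBAL")

print(ops)  # exact listing depends on the CPython version
print(loads, "loads and", stores, "stores")
```

Between the first store and the second store, `x` and `y` hold the same value, which is exactly the kind of temporarily invalid state another thread could observe.
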
3:18
You have to have these temporarily invalid states. That's how programs work. Your goal is to isolate the data that is temporarily invalid
3:25
when that's happening, and then, once it goes back to a valid state, you can let all the threads have at it again. Then, of course, our program returns
3:33
and everything is left as it should be.