Reactive Web Dashboards with Shiny Transcripts
Chapter: Reactivity
Lecture: Reactive calc

0:00 So far when we've been using reactivity, we've been using what I'd call shallow
0:05 reactive graphs, where each of the inputs is directly consumed by an output.
0:09 So here in this application we're looking at, we have these two inputs and
0:13 each of them is consumed by the rendering function of one of these outputs. This is great, but it doesn't give us a very nuanced way of building our
0:22 application. And what we actually want is something with more depth. We want to be
able to store intermediate values and have them re-execute intelligently, only when the values they depend on change. And there's something called reactive
0:33 calculations, which do just that. So let's take an example here to build up our intuition about what we want. So this is a second tab of this
0:42 application. This would be like if we had deployed our model and we are taking a
0:47 sample of the data from production, kind of a query to see like, okay how is our
0:50 model doing in production. And we have three inputs here. We have the same
0:55 account input that we had before, but we also have this dates input which lets
0:59 us select a time frame from the database. And we have a sample size that we're
able to change. You know, I want to sample 20,000 records or
1:08 25,000 records from the database. And what I want here is a few things. I want this sample to be stable. So for a given dates and sample size, I want it to be
1:20 stable, and I also want it to run not that often, right? This might be an expensive database query or something where, you know, I don't want to have it
1:27 rerun. So I want to like pull all the account information within this
1:33 sample and have that same sample be used by both of these plots. And then I want to
1:38 be able to do these filters in memory. So I want to pull the sample for all of the accounts into memory, into the memory of my application. And I
want to be able to change the account without retaking the sample. And lastly, whenever I make a change here
1:55 from sample size to sample size, I want this to be a fresh sample each time. So I
2:00 want the sample to be retaken whenever I change the dates or the sample size, but not when I change the account. So let's just write this down, what I
2:08 actually want. So I want to query the database for a sample between dates and
2:12 I want to filter that sample by the account name in memory. And lastly,
2:17 I want to use that same data for both of the plotting functions. Okay, so that's the basic idea of what I want the application to do. But I'm really
2:24 demanding, I have some more things. I'd like to cache the results of one and two.
2:28 What caching means is that I want this to, in certain circumstances, keep the copy of the data in memory somewhere and return the cached value
2:39 instead of recalculating it. So when I change the account name, I want to return
2:43 the cached value of the sample instead of taking a new sample each time. But I also want to invalidate the cache whenever the upstream inputs change.
2:53 So when the sample size gets changed or the dates get changed, I want to
2:57 invalidate the cached sample result. And lastly, I don't want to do any thinking or work. So this is kind of the demanding thing that I think Shiny
does that the other application frameworks don't do so well. So these last three. Getting this sort of caching and invalidation without actually
3:14 requiring me, the software engineer, to do much work or thinking about how that caching should happen is where Shiny really shines.
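For contrast, here is a minimal sketch of what the caching and invalidation requirements might look like if written by hand, without reactive calculations. It is plain Python, not Shiny, and every name in it (query_db, get_sample, filtered, the account labels) is hypothetical; the point is only to show the bookkeeping that the engineer would otherwise have to do.

```python
import random

# Hypothetical counter so we can see how often the "database" is hit.
stats = {"queries": 0}

# Hand-rolled cache: remembers the last (dates, sample_size) key and its rows.
_cache = {"key": None, "rows": None}


def query_db(dates, sample_size):
    """Stand-in for an expensive database query (assumption, not a real API)."""
    stats["queries"] += 1
    return [
        {"account": random.choice(["A", "B"]), "score": random.random()}
        for _ in range(sample_size)
    ]


def get_sample(dates, sample_size):
    """Return the cached sample unless dates or sample_size changed."""
    key = (dates, sample_size)
    if _cache["key"] != key:
        # Upstream inputs changed: invalidate the cache and take a fresh sample.
        _cache["key"] = key
        _cache["rows"] = query_db(dates, sample_size)
    return _cache["rows"]


def filtered(account, dates, sample_size):
    """Filter the (possibly cached) sample by account, in memory."""
    return [r for r in get_sample(dates, sample_size) if r["account"] == account]


dates = ("2023-01-01", "2023-03-31")
filtered("A", dates, 100)  # first call queries the database
filtered("B", dates, 100)  # account changed only: the sample is reused
filtered("A", dates, 200)  # sample size changed: a fresh sample is taken
```

Even in this small case the cache key has to be kept in sync with the inputs by hand, and nothing tells the plots that they need to redraw. That is the work the rest of the lecture hands over to the reactive graph.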
3:23 So reactive calculations are the tool that we use to do this. They're defined
3:27 with a new decorator, the @reactive.calc decorator, and they cache their values so it's cheap to call them repeatedly. And we get perfect cache
3:35 invalidation through reactivity. Since it takes place within that reactive graph, we know exactly when we should discard the cached value: whenever
3:44 its upstream elements change. And when that cached value is discarded, we know that we need to notify the downstream elements to cause them to
recalculate. This is the code that you would use to do this. So we have a reactive calc decorator on sample data. We have
4:03 another reactive calc decorator on filter data. So sample data is the one that takes the sample from the database. Filter data filters that sample in
4:10 memory. And then finally it's sent to the plotting function, the plotting renderer, at the end of the day. Reactive calcs are defined like
4:19 functions. They are functions. And they're called like functions. So we define sample data as a function. And then when we need its value,
4:27 we call it inside filter data, just like you would with an input. And for the purpose of adding these to the graph, I'm going to use hexagons for
4:35 reactive calculations. And they'll be blue when the value is retrieved from cache and orange when it's recalculating. Taking a look again at our
4:43 application, we have three inputs, account, dates, and sample size. And we have two outputs, the scores plot and the, what am I calling it, API
4:53 response time plot. And I have these two hidden reactive calculations, the sample that stores the data that's pulled from the database, and the
5:05 filtered data which stores the in-memory data which has been filtered by account. Again, when Shiny starts up, it doesn't know the relationship
5:12 between any of these things, although I've kind of drawn them so you can guess what the relationship is going to be. We first start with calculating
5:19 the model scores. So we calculate the model scores and we go and get the filtered reactive calculation. We try to calculate that reactive
5:27 calculation. We discover we need this other one. We need the account information and we also need the sample. Go and calculate sample and we
need to get those inputs, dates and sample size. So right there, in the same way we auto-detected the graph for the simpler
5:42 application, we've auto-detected the graph for this more complicated application, one with more depth. When we calculate API response though, since
5:51 filtered hasn't changed, like none of filtered's inputs have changed, it's still a valid reactive calculation, we don't recalculate it. We retrieve the
6:01 value from cache. This is always going to happen in this case because API response is going to calculate after model scores, so it's always going to
6:08 get its value from cache. When the account changes, so if I go up here and I change the account, but I don't change the sample size, I'm changing this
6:19 back and forth, what happens? Well, we do the same algorithm that we did before. So we have the account, we follow the arrows down, and we
6:27 invalidate its immediate descendants. But whenever we invalidate the descendant, we're going to also invalidate its descendants. So we're
going to invalidate. So account changes, filtered invalidates, the two plotting functions invalidate, and then once that's done, we recalculate. So model
6:43 scores calculates, then we go get filtered, go get the account, and we get the sample. But since sample's precedents didn't change, its
6:53 dependencies didn't change, we can get the value from cache. So we're not taking a new sample every time the account changes. And then API response
gets the filtered reactive calc from cache. When sample size changes, the whole graph invalidates. So sample
7:10 invalidates, filtered invalidates, the two plots invalidate, and we're back to the beginning. Calculate model scores, get the reactive calc, get its
dependencies, and then we get the other inputs, and API response fires, and it gets the filtered value from cache. So right there we have this reactive graph with
7:30 more depth. We're able to store these intermediary values as reactive calculations and reuse them in multiple places. And because all of those
calculations exist in this reactive graph, we can use that graph to do cache invalidation. We know exactly when those need to be
7:47 recalculated, and we know what should happen when they recalculate. So a calculation recalculates whenever its upstream inputs change, and when it recalculates,
7:57 it notifies its downstream elements, the things that depend on it, that they also need
8:01 to recalculate. We did this without the user really necessarily even needing to know that this is happening. But what this lets us do is build up
8:11 applications with a lot more complexity while maintaining the performance of
8:15 those applications. So you, the engineer, don't need to necessarily think about
8:20 how the application is re-rendering all the time. All you need to do is make sure that whenever you have repeated calculations, you're storing them in
8:27 reactive calculations and using them in downstream elements.

