Python 3, an Illustrated Tour Transcripts
Chapter: Numbers
Lecture: Statistics

0:01 In this video we're going to talk about the new statistics module
0:03 that came out in Python 3.4. It was introduced in PEP 450.
0:07 From the PEP we read: even simple statistical calculations
0:10 contain traps for the unwary,
0:12 this problem plagues users of many programming languages, not just Python
0:17 as coders reinvent the same numerically inaccurate code over and over again.
0:22 Here's an example of some of the issues that someone might run into
0:25 when trying to implement some numerical code.
0:28 This is a simple function for calculating the variance.
0:31 Variance measures the spread of a sequence of numbers,
0:36 how much the values vary around their mean, and here we are just calculating
0:40 the sum of the squares minus the square of the sum over the count,
0:45 and dividing by the count minus one.
0:47 So down below here, after we've defined variance,
0:49 we pass in a list of numbers and we get the variance,
0:52 and it says it's 2.5. It seems to be fine.
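The on-screen code is not part of this transcript, so what follows is only a minimal sketch of the kind of function described. The name `naive_variance` and the input list `[1, 2, 3, 4, 5]` are assumptions; that list was chosen because it is consistent with the 2.5 result mentioned above.

```python
def naive_variance(data):
    """Sample variance via the one-pass 'computational' formula.

    Hypothetical reconstruction: sum of squares minus (square of the sum
    over the count), divided by the count minus one.
    """
    n = len(data)
    ss = sum(x ** 2 for x in data) - (sum(data) ** 2) / n
    return ss / (n - 1)


small_data = [1, 2, 3, 4, 5]  # assumed input, not taken from the video
small_result = naive_variance(small_data)
print(small_result)  # 2.5
```

With small integers every intermediate value is exactly representable, which is why the formula looks fine here.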
0:55 The problem shows up when we add a large number to each value.
0:58 Here we're adding 1e13, that is 10 to the 13th,
1:01 and we're getting numbers that still should have the same variance
1:06 because shifting every value by the same constant doesn't change how far apart they are.
1:09 And when you run that through our calculation here
1:12 you get a large negative number,
1:15 and this illustrates some of the floating-point issues
1:18 that you might run into with simple naive calculations.
1:21 And so the impetus of this PEP is to help deal with some of these issues
1:26 and provide a pure Python implementation of some common statistical functions
1:30 that don't have these sorts of issues.
1:34 Here we're showing an example of using the library.
1:37 We simply import it, it's called statistics,
1:39 and inside of there, there are various functions.
1:41 One of them is variance.
1:43 We look at the variance of our same data
1:45 and we get 2.5. We add 1e13 to each of those numbers
1:50 and we still get 2.5.
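The usage described here can be sketched as follows. `statistics.variance` is the real standard-library function; the input list is again an assumption chosen to match the 2.5 result in the talk.

```python
import statistics

data = [1, 2, 3, 4, 5]                # assumed input, not taken from the video
shifted = [x + 1e13 for x in data]    # same values, offset by 10**13

plain_variance = statistics.variance(data)
shifted_variance = statistics.variance(shifted)

print(plain_variance)    # 2.5
print(shifted_variance)  # 2.5, per the talk
```

Note that `statistics.variance` computes the sample variance (dividing by n - 1); `statistics.pvariance` is the population version.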
1:52 There are various functions included in here.
1:54 I'm not going to go over them, but you can look at the functions,
1:56 and if you're dealing with statistical problems,
1:59 you can use this code.
2:02 Another nice thing to do is just to read through the code
2:05 and glean some insights on how you might do numerical processing code in Python
2:10 and deal with some of these issues.
2:13 This module is written in pure Python
2:15 and so you can simply load the module up and inspect it
2:19 and see what tools and techniques they're using.
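One way to do that inspection, as a small sketch: because the module is pure Python, the standard `inspect` module can print the source of any of its functions. Which function to look at is an arbitrary choice here.

```python
import inspect
import statistics

# Where the module lives on disk, in case you want to open it in an editor.
print(statistics.__file__)

# Pull the source text of one function and show the beginning of it.
variance_source = inspect.getsource(statistics.variance)
print(variance_source[:400])
```

From there you can follow the helper functions that `variance` calls to see how the module handles its arithmetic.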