“Only a master of evil formulas, Darth!”
”…hmm, which kinda DOES make you the master doesn’t it? Crap, I give up, just
chop me in half and get it over with, but make sure Luke sees you do it ok?”
Anakin’s Second Line of Questioning
Two weeks ago I posted about a particular student’s knack for making me learn new stuff, specifically the “bubble up ranks” method. (Ooh, I forgot to circle back and cover the “bubble up exceptions” part – making a note of that now).
Anyway, he also pressed me to teach STDDEVX(). That’s a first. “No worries,” I said, “it’s just another X function – but only you will know if the numbers are correct, because I don’t know what Standard Deviation truly means.”
Embarrassing isn’t it? To be a numbers guy and not know what Standard Deviation means? I was on the varsity calculus team in high school. I even have a Math major (to go along with my Computer Science and Philosophy majors – jack of all trades, master of none). How did I slip through the cracks on Standard Deviation?
OK yeah, it measures deviation, variability. I get that from the name. And yeah I even “learned” it in Statistics class in college, to the extent I ever showed up at 8 am. But I didn’t grok it. Didn’t breathe it in deeply. Until now.
It’s all fun and games until the Football Project Returns!
Standard Deviation (in Red) Only “Makes Sense” When Compared to the Average (Green)
Turns out, Standard Deviation is a simple concept! It has real world tangibility! No one should be scared of it, at all.
Before I wax philosophical, here are the formulas displayed above:
[Yards per Week]=
[Yards Measure] / DISTINCTCOUNT(Schedule[WeekNumber])
In English: “take the total number of yards this player earned over the season, and divide by the
number of weeks in the season.”
[StdDev Yards per Week]=
STDEVX.P(VALUES(Schedule[WeekNumber]), [Yards Measure])
In English: “step through each week of the schedule and see how many yards this player gained in
each week. That results in a set of numbers. Then take the standard deviation of that set of numbers.
Simple huh?
LaDainian Tomlinson had a Standard Deviation of 65.3 in 2003 – What Does that MEAN in the Real World? Can we TOUCH It?
This is the cool part. YES, we can absolutely touch it! Two crucial points “unlocked” the mystery for me forever:
- Standard Deviation is “measured” in the same units as the average. In our example, our average is measured in yards per week. (He averaged 139.4 yards per week). We can also say that Tomlinson had a standard deviation of 65.3 yards per week!
- We should expect “normal” occurrences to fall in a range that is plus or minus 1 standard deviation from the average. In our example, in a “normal” game from 2003, we should expect Tomlinson to gain somewhere between (139.4 – 65.3 = 74.2) and (139.4 + 65.3 = 204.7) yards!
An Aside on the Value of Education…
Anakin, my student, taught me those two things in about 15 seconds. Why (and how) was that simple explanation kept from me all these years?
Let’s put that in perspective. My four-year private school college tuition from 1992-1996 was valued at $72k (I was on scholarship & didn’t pay a dime, no way my family could have afforded that). Today that same university charges about $180k in tuition for 4 years.
And I estimate that I learned no more than 12 useful lessons over those four years.
So in today’s dollars, my student taught me a lesson worth $15k. Let’s not tell him that. (OK he’s probably reading this).
Let’s Put That In Practice: Upper and Lower Bounds
Let’s take point #2 from above and “implement” it in two new measures:
[Yards per Week Expected Upper Bound] =
[Yards per Week] + [StdDev Yards per Week]
[Yards per Week Expected Lower Bound] =
[Yards per Week] – [StdDev Yards per Week]
Tomlinson’s High Average is Built on Big Games, But If You Had to Stake your Life on Someone Gaining at Least 85 yards in a week, Priest Holmes is Your Guy.
Hrm, this smells like another technique that’s relevant to charting. More on this later.
Important: Average and STD DEV Must Have the Same “Granularity”
One last point. In this example, I was taking an average per week. And my std dev measure stepped through individual weeks. That is important:
[Yards Measure] / DISTINCTCOUNT(Schedule[WeekNumber])
STDEVX.P(VALUES(Schedule[WeekNumber]), [Yards Measure])
If I wanted to measure variability of yards per season, or yards per play, I would need to change both of these metrics to that same “per thing” methodology before I could use them together.
In other words, standard deviation is just as “tied” to a particular granularity as an average would be. It is not a “universal” measurement.