I have spent the past four years of my life enamored with the mathematics of probability theory and its statistical applications. When I was first introduced to the subject in high school, I was curious to determine whether flipping a coin was truly a fair game. Many, many flips (and plenty of lost quarters) later, I can safely attest that this is probably the case. Testing theoretical probability distributions is an empirical process, and this trial-and-error mindset is something that I believe escapes many individuals studying probability theory and statistics today. Good statisticians employ a rigorous scientific process, thereby creating actionable results. The best statisticians cultivate a robust analytical framework by assuming that they know nothing. To quote the late Stephen Hawking, “The greatest enemy of knowledge is not ignorance, but the illusion of knowledge.”

Statistics builds upon the assumption that we know nothing; it exists to answer questions in an empirical manner. It is not a calculator that outputs a binary “reject/accept the null hypothesis.” The interpretation of hypothesis testing is far more nuanced, yet many individuals studying statistics today believe that it yields binary outcomes. This lack of understanding is dangerous, and it helped facilitate the financial crisis of 2008, as financiers did not comprehend the intricacies of the models they were constructing. Credit rating agencies employed a faulty statistical process that provided them with the illusion of precision. On an ex-post basis (and ex-ante, for the initiated), that precision obviously did not exist.

Since humans are not the best decision-makers, some people enlist the help of statistical models to enhance their decision-making process. But, as evidenced above, dubious models can misinform rather than inform crucial decisions. I once had a friend approach me with the task of creating a linear regression model that would predict his company’s revenue. However, he faced a data issue: there were only 14 data points with which to construct the model, accompanied by 17 candidate predictors. Thus, if 13 predictors (plus an intercept) were included in the model, its 14 parameters would fit the 14 observations exactly, predicting the training data with 100% accuracy – the higher-dimensional analog of drawing a line through two points.
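To see why, here is a minimal sketch in Python (my own illustration with simulated data, not my friend’s actual numbers): with 14 parameters and 14 observations, ordinary least squares reproduces even pure noise exactly.

```python
import numpy as np

rng = np.random.default_rng(0)         # arbitrary seed for reproducibility
n = 14                                 # observations, as in the story

X = rng.standard_normal((n, 13))       # 13 random, meaningless predictors
X = np.column_stack([np.ones(n), X])   # add an intercept: 14 parameters total
y = rng.standard_normal(n)             # random "revenue" with no real signal

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.allclose(X @ beta, y))        # True: the training fit is exact
```

A perfect in-sample fit here says nothing about predictive power; the model has simply memorized the noise.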

My friend’s boss was pleased with a model that utilized 6 predictors (x1, x2, …, x6) to predict a value y. The R² of the model was approximately .89 (which his boss took to imply that the model would predict revenue accurately), and he was ready to use the regression model in the workplace. However, the model was fundamentally flawed. Take the following experiment: generate 6 predictors (x1, x2, …, x6), each normally distributed with 14 data points, along with a normally distributed random variable y. One can observe that – through a completely random process – x3 does a good job of predicting the outcome y. Even after my friend presented his boss with this argument, his employer still wanted to stick with the values provided by the regression model because it was “in line with his expectations and the company’s internal goals.” We should not massage data to support a preferred narrative. Rather, we should let the data inform our decisions.
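A minimal version of that experiment in Python (the seed and exact numbers are my own, so the “good” predictor may differ from x3 across runs):

```python
import numpy as np

rng = np.random.default_rng(7)         # arbitrary seed; results vary by seed
n, p = 14, 6                           # 14 observations, 6 candidate predictors

X = rng.standard_normal((n, p))        # purely random predictors
y = rng.standard_normal(n)             # purely random outcome

# Single-predictor R^2 is the squared sample correlation with y.
for j in range(p):
    r2 = np.corrcoef(X[:, j], y)[0, 1] ** 2
    print(f"x{j + 1}: R^2 = {r2:.2f}")

# With this few observations, the best of several random predictors will
# often show a deceptively high R^2 by chance alone.
```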

Recently, taking trips to visit physicians has grown to be one of my least favorite activities, largely because of their failure to understand basic risk management principles. For instance, I once visited an emergency room to obtain a pre-emptive rabies vaccine because I believed I had been bitten by a potentially rabid animal. The nurse who treated me recommended that I not obtain the shot, even though I had a non-zero probability of dying from the disease! Furthermore, she estimated the probability that the animal had rabies was 1 in 100 (or 1%). As soon as I was able to regain my train of thought, I asked her if that estimate was correct. She said it was a guess, but that she could provide me with the actual numbers. I then received a twelve-page report of documented rabies cases in the United States and estimated that the probability I had been infected was roughly 1 in 280,000 (or .000357%). Thus, this medical practitioner overestimated the probability of my infection by a factor of 2,800 and still suggested I did not need to be vaccinated.

This vignette is not out of the ordinary either. I believe medical professionals are sometimes quick to recommend surgery; the best doctors reserve such an extreme measure for circumstances where it is necessary. For example, 1 in every 365,000 people who get their wisdom teeth removed will suffer a brain injury or death. Any surgical procedure immediately increases your probability of dying – it is non-zero! I implore you to evaluate the efficacy of any surgery, and whether the operation is truly imperative. Your life is quite literally on the line. Some may counter and say that it is worth the risk – and that is fine, perhaps you’re willing to assume it. But the effects of tail risks compound as the number of surgical procedures one undergoes increases. Smoking one cigarette in isolation is not particularly harmful – there is minimal risk of sickness, and one receives a considerable amount of pleasure from the buzz. Thus, one-off risks are not life-threatening in isolation. But when viewed collectively, these risks significantly shorten one’s life expectancy.
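To make the compounding concrete, here is a quick sketch using the wisdom-teeth figure above (the independence assumption and the choice of exposure counts are mine): the probability of at least one catastrophic outcome across n independent exposures with per-event probability p is 1 − (1 − p)^n.

```python
# Probability of at least one catastrophic outcome over n independent
# exposures, each with per-event probability p: 1 - (1 - p)**n.
p = 1 / 365_000                        # per-procedure figure quoted above
for n in (1, 10, 100, 1_000):
    print(f"n = {n:>5}: {1 - (1 - p) ** n:.6%}")
```

Each individual exposure looks negligible, but the cumulative probability grows roughly linearly in n while p is small – take enough tail risks and the tail eventually finds you.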

Humanity’s failure to understand risk intuitively is exemplified by our misunderstanding of volatility. In a study conducted by Daniel Goldstein and Nassim Taleb, finance professionals underestimated the standard deviation of Gaussian random variables by 25%, and that value could approach “90% in some fat-tailed distributions.” Take the following question, which was presented to the 87 participants in the study:

“A stock (or a fund) has an average return of 0%. It moves on average 1% a day in absolute value; the average up move is 1% and the average down move is 1%. It does not mean that all up moves are 1% – some are .6%, others 1.45%, etc. Assume that we live in the Gaussian world in which the returns (or daily percentage moves) can be safely modeled using a Normal Distribution. Assume that a year has 256 business days. The following questions concern the standard deviation of returns (i.e., of the percentage moves), the “sigma” that is used for volatility in financial applications. What is the daily sigma? What is the yearly sigma?”

This is volatility’s analog of the mean dispersion question: “Can you cross a river that is 4 feet deep on average?” Volatility’s analog posits: “On average, how much do the returns of a stock deviate from its mean value every day if the stock’s average movement is 1% a day?” The following chart, obtained from the same Goldstein and Taleb study, shows the performance of the 87 finance professionals. Only three individuals obtained the correct daily standard deviation of 1.25%, for a success rate of ~3.45%.
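The derivation, which the excerpt above leaves implicit, is short: for a Gaussian, the mean absolute deviation equals σ·√(2/π), so σ = MAD·√(π/2) ≈ 1.25 × MAD, and the yearly sigma scales with the square root of the number of trading days. A quick check in Python:

```python
import math

mad = 0.01                                   # average absolute daily move: 1%
daily_sigma = mad * math.sqrt(math.pi / 2)   # Gaussian: sigma = MAD * sqrt(pi/2)
yearly_sigma = daily_sigma * math.sqrt(256)  # 256 i.i.d. business days
print(f"daily sigma:  {daily_sigma:.4%}")    # ~1.2533%
print(f"yearly sigma: {yearly_sigma:.2%}")   # ~20.05%
```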

The implications are larger for probability distributions that are subject to more kurtosis, or tailedness, than the Gaussian distribution. The graph below exemplifies kurtosis: the area underneath the “tails” of the red curve is far less than the area underneath the blue and green curves. In addition, the red curve has a higher peak density than the other distributions.
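Since the graph itself is not reproduced here, a quick computation makes the tail comparison concrete (the 3-unit cutoff and the scale-1 parameterization are my choices):

```python
from scipy import stats

# Probability mass beyond 3 units in either tail for standard (scale-1)
# distributions; fatter-tailed curves carry more mass far from the center.
for name, dist in [("Gaussian", stats.norm),
                   ("Student-t(3)", stats.t(3)),
                   ("Student-t(2)", stats.t(2))]:
    print(f"{name:>12}: P(|X| > 3) = {2 * dist.sf(3):.4f}")
```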

To exemplify how kurtosis affects volatility estimates, take the following example, which compares the Gaussian to Student-t distributions with 2 and 3 degrees of freedom:
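The original output is not reproduced here; the simulation below (my reconstruction, with an arbitrary seed and sample size) produces ratios close to the figures discussed next:

```python
import numpy as np

rng = np.random.default_rng(42)        # arbitrary seed for reproducibility
n = 1_000_000                          # large sample to stabilize estimates

samples = {
    "Gaussian":     rng.standard_normal(n),
    "Student-t(3)": rng.standard_t(3, size=n),
    "Student-t(2)": rng.standard_t(2, size=n),
}

for name, x in samples.items():
    mad = np.mean(np.abs(x - x.mean()))   # mean absolute deviation
    sd = x.std()                          # sample standard deviation
    print(f"{name:>12}: SD / MAD = {sd / mad:.2f}")

# Note: a t distribution with 2 degrees of freedom has infinite variance,
# so its sample SD/MAD ratio drifts upward as n grows.
```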

The output above is quite significant. Confusing mean absolute deviation (a value moves 1% up or down on average) with standard deviation (the square root of the average squared deviation from the mean) leads to a 25% underestimation of volatility in Gaussian random variables, a 56% underestimation in Student-t random variables with 3 degrees of freedom, and an approximately 213% underestimation in Student-t random variables with 2 degrees of freedom – assuming a 1% estimate was used for the standard deviation. This implies that our intuition about volatility grows worse under distributions that are more volatile, which is the antithesis of what any risk-taker wants. One hopes to obtain better estimates of volatility when a distribution is more volatile (which may not be practical) – not the opposite. This emphasizes that “We Don’t Quite Know What We Are Talking About When We Talk About Volatility,” as Goldstein and Taleb titled their paper. Thus, we drastically underestimate the risks we may be taking through intuition alone, especially when the range of outcomes is subject to more variation than previously anticipated.

This leads us to an important question: if we cannot reliably account for every outcome in a probabilistic process, how do risk-takers protect against catastrophic loss? The answer is simple: refrain from taking risks that could lead to such losses. This past week another hedge fund (Optionsellers.com) closed its doors after losing its clients’ money. Clearly, the people managing this fund did not respect the statistical characteristics of financial markets; they took undue risk and blew up as a result. Hopefully, this piece has served to enhance your ability to practically employ a statistical thought process. I caution you to be wary of results generated from statistical models. Numbers can quite literally lie, especially when the people sharing their results stand to benefit from the narrative their analysis supports. Additionally, think about risks in terms of the worst possible outcome, and ask yourself if you would be okay in the event that outcome materializes. Optionsellers.com’s managers did not, and unfortunately, other risk-takers will continue to make the same mistake for the foreseeable future.

Appendix:

Variable simulation, Multiple Regression/Single Regression output