The musings of an overwhelmed brain

Tuesday, March 30, 2021

The Wages of Wins Fallacy: The Illusion of Control

Back in the early 2010s, there were essentially two schools of basketball analytics: exclusively box-score-based stats (e.g., PER) and exclusively plus-minus-based stats (adjusted plus-minus). For those unfamiliar with basketball analytics, the basic idea behind adjusted plus-minus (APM) is that every player is assigned a value, and each player's value is calculated from the extremely complex system of equations formed from all the data. For example, suppose there are 3 players in a league of 2-on-2 (ignoring the "opponent" players) with the following results: 1) during the 100 possessions A+B were on the court, they outscored the opponent by 10; 2) during the 100 possessions A+C were on the court, they were even with the opponent; 3) during the 100 possessions B+C were on the court, they were outscored by the opponent by 10. The 3 equations in this system are A+B=10, A+C=0, and B+C=-10, and the APM values would be A=10, B=0, and C=-10. Meanwhile, the most vocal box-score-based stat community was the Wages of Wins people (and their stat Wins Produced, aka WP). Without going into too much of the specifics, the major argument between the 2 sides was the APM community's preference for lowering bias versus the WP community's preference for lowering variance.
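For anyone who wants to check the arithmetic in the toy example above, here's a minimal sketch in Python. The function and variable names are made up for illustration, and this is not how any real APM is computed: actual APM is fit by regression over thousands of lineup stints rather than solved exactly from three equations.

```python
from fractions import Fraction


def solve_linear_system(a, b):
    """Solve the square system a * x = b by Gauss-Jordan elimination.

    Uses exact rational arithmetic so the toy answer comes out clean.
    """
    n = len(a)
    # Augmented matrix [a | b].
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0:
            raise ValueError("system is singular")
        m[col], m[pivot] = m[pivot], m[col]
        # Scale the pivot row so the pivot becomes 1.
        pivot_value = m[col][col]
        m[col] = [v / pivot_value for v in m[col]]
        # Clear this column in every other row.
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [rv - factor * cv for rv, cv in zip(m[r], m[col])]
    return [row[n] for row in m]


# Rows are the lineups A+B, A+C, B+C; columns are players A, B, C.
lineups = [[1, 1, 0], [1, 0, 1], [0, 1, 1]]
# Point margin per 100 possessions for each lineup, from the example.
margins = [10, 0, -10]

values = solve_linear_system(lineups, margins)
apm = {player: int(v) for player, v in zip("ABC", values)}
print(apm)  # {'A': 10, 'B': 0, 'C': -10}
```

With real data there are far more lineups than players and the equations contradict each other, which is why the real thing is a least-squares fit rather than an exact solve.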

Basically, there is a bias-variance tradeoff in supervised learning (which is a subset of machine learning with the goal of predicting an output variable from a set of input variables), and APM was essentially on one end of the spectrum while WP was on the other. In layman's terms, reducing bias means reducing the number of variables not being measured by the model (or in the case of this discussion, reducing the number of skills and thus "categories" of players that the model misvalues) while reducing variance means reducing the random noise being measured by the model (maximizing the consistency of the metric over time). Ultimately, trying to minimize either bias or variance specifically isn't the goal of any all-in-one metric; the goal is to minimize the prediction error (though there can be value in statistics intended to specifically describe what happened, with an understanding that they're measuring some amount of luck that won't persist; for example, clutch stats are mostly descriptive, since luck is for the most part a larger factor than skill). When the debate between the two communities first started, both sides were too extreme on opposite sides of the tradeoff spectrum, in part because no one had publicly identified any methods for finding a compromise that lowered overall prediction error. However, the APM community was very cognizant of the flaws in their methods while the WP community was blissfully ignorant of the fact that reducing variance doesn't mean reducing error. This was evident as RAPM (regularized adjusted plus-minus) was introduced as a means of reducing the variance captured by APM, and it completely blew WP out of the water in terms of reducing prediction error (though, if I recall correctly, APM already beat WP), but the WP community still held on to their misguided virtue of minimizing variance. "Our stat is very good at predicting our stat in future years!" (Who cares that it doesn't predict who'll win the game...)
For those interested in learning more about how optimizing for variance is arbitrary (hey look a Russell Westbrook problem) and can be a misguided goal especially at the extremes, Ben Taylor actually wrote a very informative and comprehensive look at various all-in-one NBA metrics.
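To make the "regularized" part of RAPM a bit more concrete, here's a rough sketch using the same 3-player toy league from earlier. RAPM is commonly described as ridge regression, which adds a penalty (called `lam` here) that pulls every player's value toward zero. Everything below is an assumption for illustration: the function names are hypothetical, and real RAPM involves possession weighting, separate offense/defense terms, and a penalty chosen by cross-validation.

```python
from fractions import Fraction


def solve_linear_system(a, b):
    """Solve the square system a * x = b by Gauss-Jordan elimination (exact)."""
    n = len(a)
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0:
            raise ValueError("system is singular")
        m[col], m[pivot] = m[pivot], m[col]
        pivot_value = m[col][col]
        m[col] = [v / pivot_value for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [rv - factor * cv for rv, cv in zip(m[r], m[col])]
    return [row[n] for row in m]


def ridge_player_values(lineups, margins, lam):
    """Closed-form ridge regression: solve (X^T X + lam * I) beta = X^T y."""
    n_players = len(lineups[0])
    # X^T X with the ridge penalty added on the diagonal.
    xtx = [
        [
            sum(row[i] * row[j] for row in lineups) + (lam if i == j else 0)
            for j in range(n_players)
        ]
        for i in range(n_players)
    ]
    # X^T y.
    xty = [
        sum(row[i] * y for row, y in zip(lineups, margins))
        for i in range(n_players)
    ]
    return solve_linear_system(xtx, xty)


# Same toy league: lineups A+B, A+C, B+C and their margins.
lineups = [[1, 1, 0], [1, 0, 1], [0, 1, 1]]
margins = [10, 0, -10]

plain = ridge_player_values(lineups, margins, lam=0)   # no penalty: plain APM
shrunk = ridge_player_values(lineups, margins, lam=1)  # with penalty: RAPM-style

print([float(v) for v in plain])   # [10.0, 0.0, -10.0]
print([float(v) for v in shrunk])  # [5.0, 0.0, -5.0]
```

The penalty deliberately adds some bias (A is no longer credited with the full +10) in exchange for estimates that bounce around less on small samples. Whether that trade is worth it is judged by prediction error on games the model hasn't seen, not by how stable the numbers look.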

My best explanation for the insistence of the WP crowd even in light of clearly contrary evidence is the illusion of control, and this ties into the Russell Westbrook problem. Humans are wired to desire control, even at the expense of suboptimal decisions. Why do some people with high incomes/income potential (such as me 10 years ago) try to save every last penny when purchasing items from the grocery store but refuse to spend that time learning how to invest the money they make? Because the former is significantly easier and we receive consistent feedback that we're helping achieve our goal of maximizing net worth. It's psychologically difficult to stick with learning about investing when we don't know for sure we're doing it correctly (even assuming we have a short stretch of outperforming the market) or, even worse, we receive information that we might be doing it wrong. Why do most people who care about global warming (again, such as me) likely overobsess about recycling, at the expense of spending time/resources on things that might actually make a much more significant impact than the marginal plastic container? Because it gives us a sense that we're contributing, whereas it's not clear what these other activities we could do to help the planet are.

It's worth pointing out that even the least biased metrics are still very biased, simply because they're based off conditions that existed in the past. In the NBA, a post player pre-2000 likely drove very efficient offense (for his time) simply because the alternative was a perimeter player who took 20-footers, so low-bias metrics based off that data ended up undervaluing perimeter creators post-2010 because all of a sudden those players started shooting 3s. This ties into a point Nassim Taleb makes in The Black Swan: when a black swan event (an outlier event that wasn't predicted) occurs, people immediately prioritize preventing that exact black swan event from happening again, as they are under the illusion that THIS TIME they're reducing left-tailed outcomes. In reality, they're being given a false sense of security by their narrative fitting, which is exactly what will lead to future black swan events. As someone who used to work in Risk Management for a bank, I can tell you that risk management is specifically about reducing arbitrary risk metrics during "normal times," which offers no benefits during actual risk events. So even a stat like APM (or any of the various improvements on it) is still biased, but given that the people who value it are more comfortable with variance and less likely to be under the illusion of control, they're likely aware of these flaws. On that note, maybe it's unsurprising that we can't say the same about the founder of WP, Dave Berri; he is, after all, an economist.

Edit: I'm hearing the news about the J&J vaccine pause and how common the reaction to it has been (including by me, as I've made sure to get a different one). People are more afraid of the dangers they can identify and explain than the ones that are more common but that they can't. People are more afraid of flying in airplanes than eating a bunch of sugar all day. Seems like an extension of the illusion of control.

Friday, March 19, 2021

The Russell Westbrook Problem

I've historically exhibited extreme depth but very limited breadth in my interests, and so naturally, I find analogies in some of my areas of expertise to apply to life. One of these areas of expertise is in why Russell Westbrook is the most overrated basketball player of my viewing lifetime. In other words, he's probably the NBA player for whom there's the largest discrepancy between my opinion of his value and the opinions of everyone else--and by everyone else, I'm including both the median casual fan (the 50th percentile opinion) as well as other more serious analytical folks (the 10th percentile opinion). Basically, he was an outlier in terms of the statistics points, rebounds, and assists, and his tools (size, speed, strength, vision, etc.) were all very good. The reason I deviate from other people in terms of his value added to winning basketball is that either consciously (maximizing points, rebounds, and assists helps you get paid more, specifically because most people value those stats) or subconsciously, Westbrook ended up prioritizing metrics that have historically correlated with "winning" to such an extreme degree that he often did so at the expense of important tradeoffs.

And this is the phenomenon I'm calling "the Russell Westbrook Problem," which is a specific variant of "missing the forest for the trees." It's the simplification of a big picture goal (in this case, winning basketball games) to metrics that correlate with that big picture goal, which, due to the assumption of causation from correlation, leads to making suboptimal tradeoffs that then result in the dissolution of the original correlation. It's important to note that for the most part, there is very limited conscious intention in this practice, so I'm not necessarily assuming malicious selfishness in his behavior. In Westbrook's case, the most obvious example is that he often left opposing offensive players open away from the hoop because by staying closer to the hoop, he could 1) grab more rebounds, 2) save more energy for offense by not having to move as much on defense, and 3) start fastbreaks more easily (which is good for his team's offense) given he's the one grabbing the rebounds. A second example is that he dominated the ball on offense heavily, which, to put it simply, means he often only passed when a teammate was in an immediate position to shoot (which led to Westbrook only passing when he could get an assist). All of these things led to him being rated extremely highly by box score stats, be it the casual fan's "triple double" (he had many games of at least 10 points, at least 10 rebounds, and at least 10 assists) or various all-in-one box-score metrics such as box plus-minus (BPM). The latter is due to the fact that rebounding is often correlated with size and athleticism and assists are often correlated with high basketball awareness/basketball IQ, so a player who is athletic AND "understands" the game rates as basically one of the best players ever. However, he's historically the player who achieved these feats at the highest tradeoffs ever.
As Westbrook (and other players at a less extreme level) started to optimize for metrics that used to be correlated with winning at the expense of other less measurable aspects of the game, those metrics started to weaken in their correlation with winning. The Russell Westbrook Problem.

So what are examples of this problem in normal life? Well, one such example is in human health. People have long observed that being overweight leads to many physical ailments (diabetes, heart disease, dementia, etc.), so poor health (the opposite of the original goal) is heavily correlated with being overweight. What else is correlated with being overweight? Eating lots of high-fat foods like fried foods, burgers, and dairy products. So for much of the 20th century and into the 21st century, scientists and corporations have pushed the low-fat narrative for "heart health" and losing weight. However, as with Westbrook's points, rebounds, and assists, this was an overly simplified causation assumption that people gravitated towards because it was easy to understand and implement. What's not as easy to understand and implement is the additional fact that a lot of the bad foods people had correctly identified were bad for reasons other than/in addition to their high fat content. Fried foods (e.g., french fries), burgers (the buns), and dairy products (e.g., desserts such as ice cream) are also extremely high in refined carbohydrates, processed oils, and simply calories in general, which lead to consistently high insulin levels, which then lead to inflammation and all of the various associated health problems. Simplifying the problem to "fats = bad" led to generations of Americans eating large amounts of other unhealthy foods (such as sugar, breakfast cereal, refined grains, and processed seed oils). Add to all of that the incentives for the government, corporations, and individual farmers to produce high volumes of low-cost foods (which, given the mere existence of higher-cost foods such as avocados, grass-fed beef, and organic vegetables, must be less healthy, because if they weren't, nobody would eat the other foods that are "worse" in the other dimensions), and people ended up eating worse while blissfully assuming they were succeeding at their original goal of being healthy.
The initially simple and reasonably accurate proxy for being healthy, avoiding high-fat foods, no longer correlates with healthy eating, as people found loopholes to obtain their sugars and vegetable oils. The Russell Westbrook Problem.

Another example that specifically hits home for me exists in spirituality and the related issue of mental health. For much of human history, people were content with their lives (even if by today's Westbrookian standards of wealth, life expectancy, and power/influence, they were extremely unsuccessful people). They didn't really have many life decisions to make because they didn't have many options; they simply had to do whatever was available in order to feed themselves. And yet, mental health problems are at their highest levels by far in recent years, especially among "wealthy" members of the developed world. This problem is accelerating despite the overall higher levels of wealth, improved flexibility in career options, and higher life expectancies (due to medicine that specifically optimizes for keeping people alive) among those people. And why is this? It's because in times of less manufactured optimization, having lots of money correlated with (and often caused) people being healthier. Having more money meant one was no longer starving; having more money meant one likely was among the best at one's "profession" due in large part to hard work and commitment; having more money meant one could then devote more time and resources to other interests that also brought meaning to one's life. What's changed since? Well, the specific optimization for monetary wealth has dissolved its relationship with life contentment. People spend their entire able lives trying to make more and more money (from the billionaire to the corporate hamster), under the misguided assumptions that 1) they'll necessarily be alive, 2) they'll be healthy enough to enjoy their newfound luxury and freedom, and 3) they'll be willing to step off the treadmill of additional marginal wealth.
People go to graduate school when they can't figure out what they want to do with their lives (law school being the most egregious), assuming that spending additional years (years they could have spent trying to find themselves) working towards a well-defined path while accruing additional debt will allow them to make more money in the future and be happy. The most common regrets of people on their deathbeds are that they ended up living other people's lives and not lives true to themselves, and that they didn't spend enough of their lives around the people who are important to them. But it's a good thing their lives ended with very high monetary wealth scores, right? The Russell Westbrook Problem.

Intention

I'm starting this blog as I've noticed in recent months that my mind often feels overwhelmed by thoughts, and releasing them (often by talking to people but hopefully also just by journaling them) helps bring a state of calm back to me. I have some current intentions for this blog but I don't really want to restrict myself right now by expressing them, as I have a tendency to miss the forest when I find a specific easier-to-optimize tree. And I hope that keeping my options open will make this blog less stressful to write, since I've started so many different blogs over the years with different narrow purposes but have been unable to continue them, and that this will help it achieve its original goal of keeping my mind from overflowing with thoughts.