Analyzing Offensive Value in Context
Evaluating a player’s offensive output relative to his background has been one of the areas in which statistical analysis of the NBA has enjoyed the most success; analysts since the 1980s have endeavored to calculate the offensive value of players from various eras. To this day, the term “analytics” is synonymous with shooting efficiency in some people’s minds. Previous models have calculated players’ offensive volume (Oliver’s Points Produced, O-BPM, and the lion’s share of bygone metrics PER and WORP), offensive efficiency (Oliver’s Floor Rating, the offensive section of Wins Produced), offensive production rate (the NBA’s ORtg, OWS/48), and offensive impact (O-PIPM and the offensive components of RPM and RAPM). I have thoroughly studied individual player volume, efficiency, and production rate in The Basketball Bible. That study focused solely on the 2015-16 through 2018-19 seasons (the seasons for which full tracking data was available at the time).
There have been no significant efforts to ascertain offensive impact for players prior to the play-by-play era (1996-97 through the present). The only metrics that have claimed to measure offensive impact are plus-minus models, which rely on changes in point margin while a player is on the floor (sometimes augmented by the player’s box score statistics and/or prior plus-minus) to infer offensive impact.
It is possible, and indeed valuable, to measure a player’s offensive impact by comparing his performance against league average in a given season. We begin by determining each player’s volume (total points created), production rate (points created per possession), and efficiency (the ratio of points created to points created and lost). After calculating these values, we move to the comparison stage. Some easy-to-understand statistical methods can reveal powerful insights about a player’s relationship to his context using only these numbers. More to the point, I believe we can determine a player’s offensive greatness and how it affects his historical standing.
Why Subtracting League Average from a Player’s Statistics Doesn’t Tell You as Much as You’d Think It Would
Many analysts have compared players to league average by subtracting the league average value from the player’s value. This method is misleading, however, when we are dealing with either a player’s total production or his per-possession production. To understand why, imagine two cities – Seattle and Phoenix. Seattle gets tons of rainfall each year, but Phoenix hardly gets any rain. If we measured the rainfall for each year in these two cities, would it make sense to subtract the historical average rainfall for that city from the rainfall in that year? (i.e., 2008 Seattle – average Seattle) If you do that, Seattle will have both extremely high and extremely low “adjusted rainfall” values. In fact, in a relatively dry year, Seattle will show lower average-adjusted rainfall than Phoenix, even though far more rain actually fell in Seattle.
The reason we get nonsensical results like this is that when values are higher, there tends to be more distance between them. This spread is what statisticians call variance. When values are lower (like the annual rainfall in Phoenix), there is less distance between the values. A wet year in Phoenix might bring 10 inches of rain, with a dry year bringing only 1 inch. That means a wet year is only 9 inches more rain than a dry year.
Having larger values in one group leads to greater spread between the values. Thus, taking current value – average value will not truly illustrate the amount of rainfall in a given year. There is more rain in Seattle than in Phoenix, even when Seattle is having a relatively dry year. Failing to recognize that fact will lead us to conclude that a really wet year in Phoenix is not significant (after all, it differs from average by only a few inches), while a year that is barely above average in Seattle looks like a huge increase in rainfall.
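The Seattle/Phoenix problem can be made concrete with a small sketch. The rainfall figures below are invented for illustration, not actual weather data:

```python
# Invented annual rainfall totals, in inches
seattle = [48, 30, 55, 42, 60]
phoenix = [8, 1, 10, 6, 5]

def adjusted(values):
    """Subtract the group's own average from each value (the misleading method)."""
    avg = sum(values) / len(values)
    return [round(v - avg, 1) for v in values]

# Seattle's "adjusted" values swing wildly; Phoenix's barely move.
print(adjusted(seattle))  # Seattle's driest year looks catastrophically negative
print(adjusted(phoenix))  # Phoenix's wettest year barely registers

# Yet Seattle's driest year still had more rain than Phoenix's wettest:
print(min(seattle) > max(phoenix))
```

Subtraction erases the fact that every Seattle year is wetter than every Phoenix year; it only tells you where a year sits relative to its own group, without accounting for how spread out that group is.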
In NBA history, there have been periods when teams scored a lot of points across the league (the ’60s, ’70s, and ’80s). During other times (the early ’50s, ’90s, and ’00s), scoring was much lower. The high-scoring eras are like Seattle: there are a lot of points everywhere! Taking points – average points will lead you to the conclusion that the best scorers in the ’70s were better than the best scorers in the ’90s AND the worst scorers in the ’70s were worse than the worst scorers in the ’90s. That may or may not be true, but we won’t know either way simply by subtracting. What we really want to know is how much better or worse a player is than average, given the variance in his context.
A Better Way to Compare Player Performance Against League Average
How do you measure a player’s greatness in context? A very common way to show how far a value is from average is to find the standard deviation for a group of values. If we have the number of points each player scored during the 1991 season, we can figure out three things. 1) We can figure out the average points scored per player during the season. 2) We can determine how much distance there is between players’ point totals. Almost every player will have either more or fewer points than average; this raw difference between his points and the average is his distance from average. 3) If we take the absolute value of these differences (treating them all as positive numbers, even if the player is below average), we can also determine the average distance away from average.
It’s helpful to know how far from average a normal player’s points might be, since this tells us whether being 50 points above average is a lot or a little. A given year in Seattle will typically be farther from its average rainfall than a given year in Phoenix; once we know that, we can account for it by recognizing that Seattle’s annual rainfall values have greater variance than Phoenix’s. All we’re doing here is taking the same principle and applying it to NBA players instead of cities.
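The three steps above can be sketched with Python’s statistics module. The point totals below are invented for illustration:

```python
import statistics

# Hypothetical season point totals for a handful of players (invented numbers)
points = [2580, 1900, 1450, 1100, 870, 640, 400]

avg = statistics.mean(points)                  # 1) average points per player
diffs = [p - avg for p in points]              # 2) each player's raw distance from average
mad = statistics.mean(abs(d) for d in diffs)   # 3) average distance from average
sd = statistics.pstdev(points)                 # the standard deviation captures the same
                                               # idea of spread, weighting big gaps more
```

The standard deviation squares the distances before averaging, so it weighs large gaps more heavily than the simple average distance in step 3, but both are measuring the same underlying spread.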
As it happens, most players are very close to average. The distribution that you may know as a “bell curve” (at right) illustrates this fact. Most values fall in the white area in the middle, which represents the span from 1 standard deviation below average to 1 standard deviation above. This arrangement, often called a normal distribution, occurs at least approximately in many different cases: NBA statistics, test scores, rainfall, and any number of other things.
How to Determine a Player’s Offensive Value
What we have now is a way to tell how far away from average a player is, given how far away from average most players are in that season. Using a simple method called a z-score, we can learn a lot by asking the question, “how many standard deviations is this player away from average?” The reason this comes in so handy is that a player who is 2 standard deviations above average in 1991 has the same relationship to his era as a player who is 2 standard deviations above average in 1971, even though more points are available in 1971. Using this value, the z-score, we have an expression of a player’s value that is not dependent on his context.
Expressed in this way, we can compare seasons from low-scoring eras and high-scoring eras on a level playing field. We can also compare a player who played when there were 10 teams in the NBA with a player who played when there were 30 teams in the league. After all, we already accounted for the impact of having fewer players in step 3. This is how we evaluate the greatness of players from different eras.
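A minimal sketch of the z-score comparison, using invented point totals to stand in for a high-scoring era and a low-scoring era:

```python
import statistics

def z_score(value, population):
    """How many standard deviations is `value` above or below the average?"""
    avg = statistics.mean(population)
    sd = statistics.pstdev(population)
    return (value - avg) / sd

# Invented point totals for two league-wide distributions:
high_scoring = [2400, 2000, 1600, 1200, 800]   # e.g., a 1970s-style season
low_scoring = [1800, 1500, 1200, 900, 600]     # e.g., a 1990s-style season

# The top scorer in each season sits the same distance above his own context,
# so the z-scores match even though the raw point totals do not.
print(z_score(2400, high_scoring))
print(z_score(1800, low_scoring))
```

Both calls return the same value, which is the point: once expressed in standard deviations, the two players' relationships to their eras are directly comparable.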
What Are We Measuring?
We’ve accounted for how to measure greatness, but what are we measuring? Most offensive statistics are fairly similar (at least the ones that are based on individual player performance). My personal measuring rod, developed and explained in my Basketball Bible, is called Points Created. Points Created depends on modern tracking data for the highest level of discrimination, so I have used a similar formula to estimate Points Created for the entirety of NBA history. Using the estimate, we can include seasons where we do not know how many of a player’s baskets were assisted or how much impact a player had by setting screens.
Comparing a player’s Points Created against average in the manner described above gives us two ways to express his offensive performance: by volume (how much he contributed) and by production rate (how much he contributed per possession). Comparing a player’s total Points Created with the average total for that season gives us O_Score by Volume; for Michael Jordan in 1991, this number is 3.55. Comparing a player’s Points Created per possession with the average production rate for that season yields O_Score Per Possession; for MJ in 1991, this value is 5.93.
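A sketch of how the two scores might be computed, assuming we already have each player’s Points Created and possessions for a season. All the figures below are invented; the real calculation uses the Points Created estimate described above:

```python
import statistics

def z(value, population):
    """Standard deviations above or below the population average."""
    return (value - statistics.mean(population)) / statistics.pstdev(population)

# Invented league-wide figures for one season
points_created = [2900, 2200, 1700, 1300, 1000, 750, 500, 300]
possessions = [1900, 1700, 1500, 1300, 1200, 1000, 800, 600]
per_poss = [pc / p for pc, p in zip(points_created, possessions)]

player = 0  # index of the player being evaluated
o_score_volume = z(points_created[player], points_created)  # O_Score by Volume
o_score_per_poss = z(per_poss[player], per_poss)            # O_Score Per Possession
```

The same z-score machinery serves both versions; only the input changes, from season totals to per-possession rates.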
In fact, I’ve also calculated scores for central offensive statistics, which you can find on the Offensive Impact Dashboard.
You can also see differentials for each stat on the same page, determined by taking actual value – expected value. Here is what that looks like in the case of a single season:
It is also instructive to view players’ career differentials on the Career Dashboard (also on the homepage). Consider the way in which we are able to distinguish a player with a long career, but a below-average offensive impact, from a player with an above-average offensive impact in a shorter career:
To quantify a player’s efficiency on the offensive end, I use the following method:
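Based on the definition given earlier in this piece (efficiency as the ratio of points created to points created and lost), a minimal sketch of that ratio might look like the following. This is my reading of the definition, not necessarily the exact published formula:

```python
def offensive_efficiency(points_created, points_lost):
    """Ratio of Points Created to Points Created plus Points Lost
    (per the definition given earlier; illustrative sketch only)."""
    return points_created / (points_created + points_lost)

# e.g., a player who creates 1500 points while losing 500 sits at 0.75
ratio = offensive_efficiency(1500, 500)
```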
Average efficiency rates have changed over time, so how do we account for that fact? Z-scores are a good tool in many cases, but not for measuring a ratio as simple as this one. Some players will have a small amount of Points Created and zero Points Lost, or vice versa, and low-minute players will quite often play havoc with the distribution in one of these ways.
Moreover, trying to express a ratio in this way will lead you astray. The number of players in the league affects how far a normal value sits from average. Seasons with very few low-minute players (as in the ’50s or ’60s) will have less variance, while seasons with a lot of low-minute players (like the 2010s) will have loads of variance. These results won’t tell us about the distance between values for real players, though; their communicative ability is limited by the presence of players with unusual ratios of positive to negative.
Thus, for comparing a player’s Offensive Efficiency to league average, we do simply subtract: Offensive Efficiency – League Average Offensive Efficiency, contrary to the steps outlined above. To calculate a player’s career Relative Offensive Efficiency, we multiply the difference by the player’s Opportunities in each season (Opportunities = FGA + TOV + AST + (0.44*FTA)), add up the weighted totals for every season of his career, and divide the sum by his total career Opportunities. The result is the player’s Career Relative Offensive Efficiency.
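The career calculation described above can be sketched as follows. The tuple structure for seasons is hypothetical, chosen here just for illustration:

```python
def opportunities(fga, tov, ast, fta):
    """Opportunities = FGA + TOV + AST + (0.44 * FTA), as defined in the text."""
    return fga + tov + ast + 0.44 * fta

def career_relative_efficiency(seasons):
    """`seasons` holds one (player_eff, league_avg_eff, season_opportunities)
    tuple per season -- a hypothetical structure for illustration."""
    weighted_sum = sum((eff - lg) * opp for eff, lg, opp in seasons)
    total_opp = sum(opp for _, _, opp in seasons)
    return weighted_sum / total_opp

# Two invented seasons: +0.10 over league average on 2000 Opportunities,
# then +0.03 over league average on 1500 Opportunities.
career = career_relative_efficiency([(0.70, 0.60, 2000), (0.65, 0.62, 1500)])
```

Weighting each season’s differential by its Opportunities ensures that a high-efficiency season played in a small role doesn’t swing the career figure as much as a full workload would.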
Caveats and Next Steps
Because of the method I’ve used for comparing a player’s efficiency to league average, I should stress that the reader should not take Relative Offensive Efficiency values from one season and compare them one-to-one with values from a season far removed from it chronologically. When the two seasons’ parameters are vastly different, the raw difference between a player’s efficiency and league average efficiency is not a reliable barometer of how efficient one player was relative to another.
With that caveat in mind, let’s move forward to our next stop: measuring defensive greatness.