Introduction
In the first post of this series, I introduced my plan for conducting methodological reviews of major basketball statistical models. In the second installment, I looked at the Wins Produced model. In keeping with the theme of “Wins-named” metrics, we turn to Win Shares.
Win Shares is a basketball adaptation of the work Bill James did with baseball statistics over several decades. Justin Kubatko is the one who carried James’ methodology over to basketball. Kubatko’s explanation shows how to calculate the number of Win Shares for a player, so my review will rely on his exposition.
Basketball on Paper
We have already cited two influential figures in the development of Win Shares, but have yet to honor the true architect: Dean Oliver, whose ORating and DRating models are the real “meat” of Win Shares. Oliver’s work appears perhaps most completely in his book Basketball on Paper. Dr. Oliver’s methods are quite complex, which means that I need to delve into his book for a little while. By doing so, we will gain a deeper understanding of the perspective underlying Kubatko’s adaptation. In effect, we will be better able to answer the question “what are Win Shares?” if we first address the question “where do Win Shares come from?”
The Authors
Thankfully, Dr. Oliver is quite direct with his goals for Basketball on Paper. In fact, he makes abundantly clear what he hopes to accomplish with the book, even including sections for every type of audience that might conceivably have interest in the topic. In general, Oliver conceived of Basketball on Paper as a tool to help coaches. He devised formulas to evaluate teams and the contributions their players made. His goal for the book was never for it to be an in-depth discussion about using math as a tool for studying sports.
The stated goal stands in sharp contrast to the papers from the creators of the Wins Produced model. There, the focus was on the peer review process and on creating a simple model that would allow them to look at lies and myths common in basketball.
Justin Kubatko’s intent for Win Shares is much more in line with Dr. Oliver’s (which is probably a big reason behind his choice to use Dr. Oliver’s metrics as a framework). His purpose is to create a model whose outputs are estimates for a single player’s contribution to their team winning. And his target audience seems to be fans of NBA basketball who have an interest in statistical analyses.
The Model
Win Shares estimates how many “shares” of their team’s wins a player has, and divides the player’s contributions according to their defensive and offensive performance. The inputs for Win Shares are, broadly speaking, the box score stats for every player. Oliver’s Defensive Rating and Offensive Rating equations calculate a player’s productivity by adjusting for their team’s success on both ends in an attempt to provide context for the player’s box score record.
Finally, Win Shares takes the individual Defensive and Offensive Ratings and situates them according to the standard league-wide performance levels during the season in question. On this basis, the formula credits players for the points created above or below what was ‘expected’ of them and assigns the titular shares of credit to each player.
The Methodology
Possessions
The first concept Oliver introduces is possessions, which have featured prominently in many subsequent models. He approximates the number of possessions a team employs through the number of field goals attempted by that team that did not result in offensive rebounds, the number of times they turned the ball over to the opposition, and the number of free throws that end possessions. The last value was not tracked at the time; Dr. Oliver estimates that 40% of free throws end a team’s ball control, while later estimates have insisted the value is closer to 44%.
Though the author explains that he arrived at this estimate from his experience of analyzing score sheets, it is relevant to point out that it is an estimate. Since there are so many models based on his concept of possessions, it might well prove fruitful to further investigate the question of how many free throws end a possession now that we have access to play-by-play data (which was not publicly available when Basketball on Paper was published).
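The possession estimate described above can be sketched in a few lines. This is a minimal illustration rather than Oliver's exact published formula: the function name, the argument names, and the sample totals are my own, and it assumes that every offensive rebound extends a possession after a missed field goal. The free throw weight is a parameter so that the 40% and 44% figures can be compared.

```python
def estimate_possessions(fga, orb, tov, fta, ft_weight=0.4):
    """Approximate team possessions from box score totals.

    fga: field goal attempts
    orb: offensive rebounds (assumed to extend the possession)
    tov: turnovers
    fta: free throw attempts
    ft_weight: assumed share of free throws that end a possession
    """
    return fga - orb + tov + ft_weight * fta


# Hypothetical single-game totals, for illustration only.
game = {"fga": 85, "orb": 10, "tov": 14, "fta": 25}

poss_40 = estimate_possessions(**game)                  # 40% estimate
poss_44 = estimate_possessions(**game, ft_weight=0.44)  # 44% estimate
print(poss_40, poss_44)
```

With these made-up totals, the two weights differ by about one possession per game, which gives a sense of how much the choice of free throw estimate matters.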
Offensive Rating
To be clear, Oliver’s derivation of Offensive Rating is different from that of Justin Kubatko, who uses an approximation suggested by Dr. Oliver himself. I will focus on the formulas developed by Dr. Oliver, then, rather than on the approximation used to calculate Win Shares. The formula for Offensive Rating features terms for each major type of event. I will highlight the method of handling each event in what follows.
Assists
The formula starts by dividing the credit for a made field goal between shooter and passer (when the shot is assisted). At this point, Dr. Oliver presumes that the difficulty of the assist is proportional to the ease of the shot. Proceeding with his assumption, he estimates the number of times a player made a field goal after an assist by a teammate.
The value of assisted field goals is based on an approximation of how many assists a player’s four teammates would accumulate over their time together. Afterward, Oliver splits that quantity between the participants based on their playing time. Next, he assumes that every player on the court will share equally the number of assists available. Finally, a player gets credit for their assists based on the assumption that their teammates present the same shooting patterns whether they (the player) are on the court or not.
Free Throws
The “FT Part” of the Offensive Rating formula relies on the aforementioned estimate of 40% of free throws ending a team’s possession. It also hinges on the assumption that whether a player missed or made a free throw has no impact on their ability to hit their next foul shots. In the language of probability, Oliver presumes that free throw attempts are independent events.
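The independence assumption has a simple arithmetic consequence: the chance that a player scores at least one point on a two-shot trip follows directly from their free throw percentage. The sketch below illustrates only that step, not the full “FT Part” term; the function name and the 75% shooter are hypothetical.

```python
def prob_score_on_two_shots(ft_pct):
    """Probability of making at least one of two independent free throws."""
    # Independence lets us multiply the two miss probabilities.
    miss_both = (1 - ft_pct) ** 2
    return 1 - miss_both


# Hypothetical 75% free throw shooter.
p = prob_score_on_two_shots(0.75)
print(p)
```

If free throws were not independent (say, a miss made the next miss more likely), the probability of missing both would no longer be a simple product, and this shortcut would not hold.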
Offensive Rebounds
In simple terms, Dean Oliver credits the offensive rebounder according to the difficulty of their task (measured by how good their team is at offensive rebounding) and how likely their team is to score on a given play. He obtains the latter value by calculating the conditional probability of the team scoring, given that we know that they either didn’t get the rebound or didn’t score.
Deriving Win Shares: Marginal Offense
The offensive component of the Win Shares model depends on all the estimates so far. Together, they become the Scoring Possessions and Individual Possessions statistics. Then, Dr. Oliver adjusts for a player’s shooting profile (in order to account for the extra value of three-pointers). The final product is the number of Points Created by a player, which Kubatko uses in the Win Shares Model. He does so by calculating a player’s marginal offense, defined as player Points Created minus 92% of the points an average player should be able to create with that many possessions.
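The marginal offense definition above reduces to a single subtraction once Points Created and possessions are known. The following is a minimal sketch under stated assumptions: the function and parameter names are mine, the "average player" baseline is taken to be league points per possession times the player's possessions, and all the numbers are hypothetical.

```python
def marginal_offense(points_created, possessions, league_pts_per_poss,
                     baseline_factor=0.92):
    """Points Created minus 92% of what an average player would create."""
    expected = baseline_factor * league_pts_per_poss * possessions
    return points_created - expected


# Hypothetical season line, for illustration only.
mo = marginal_offense(points_created=1500, possessions=1300,
                      league_pts_per_poss=1.10)
print(round(mo, 1))
```

Keeping the 0.92 coefficient as a parameter makes it easy to see how sensitive a player's marginal offense is to that choice.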
Defensive Rating
The next step in calculating Win Shares depends on Defensive Ratings for specific players. When calculating Defensive Ratings, Dr. Oliver organizes his work under the concept of stops. He estimates how many plays a defender stopped based primarily on how many blocks, defensive rebounds, and steals they get.
Subsequently, Oliver uses the conditional probability of the team keeping its opponent from scoring, given that we know that they either a) didn’t let the opponent get the offensive rebound or b) didn’t let the opponent score. The author also has a secondary estimate of how many stops a player accomplishes, wherein he assumes that every player on a team is as good at stopping the opposition as the others, once blocks, defensive rebounds, and steals are taken out of the equation. In essence, defensive value not recorded in the box score is evenly distributed across a team under this second format.
Deriving Win Shares: Marginal Defense
Analogously to marginal offense, a player’s marginal defense comes from their Defensive Rating. Kubatko estimates how many points a player allowed through their DRating, then subtracts that from 108% of the points an average player would have allowed in the same number of possessions. The difference is the player’s marginal defense.
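The marginal defense calculation mirrors the offensive one, with the subtraction reversed. This sketch assumes that Defensive Rating is expressed as points allowed per 100 possessions and that the baseline is league points per possession times the player's defensive possessions; the names and numbers are hypothetical.

```python
def marginal_defense(def_rating, possessions, league_pts_per_poss,
                     baseline_factor=1.08):
    """108% of average points allowed minus the player's points allowed."""
    # Assumption: def_rating is points allowed per 100 possessions.
    points_allowed = def_rating / 100 * possessions
    baseline = baseline_factor * league_pts_per_poss * possessions
    return baseline - points_allowed


# Hypothetical season line, for illustration only.
md = marginal_defense(def_rating=105, possessions=1200,
                      league_pts_per_poss=1.10)
print(round(md, 1))
```

Because lower Defensive Ratings are better, a defender with a lower rating ends up with a larger marginal defense.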
Win Shares: Deriving Win Values
Kubatko uses the player’s marginal offense and defense to calculate their offensive and defensive Win Shares, respectively. First, one finds the point value of a win for each team. Since teams have different scoring paces, the formula values a point differently for each. Teams whose games see lots of points need more marginal points per win, so their victories “cost” more points, while the opposite holds for squads with slower paces.
As such, a player’s Offensive Win Shares equals their marginal offense divided by how many marginal points their team needs for a win. The same holds for Defensive Win Shares, with marginal defense. Total Win Shares are simply the sum of Offensive Win Shares and Defensive Win Shares.
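The final step can be sketched as follows. Since this post does not spell out how marginal points per win is computed, the sketch takes it as an input; the function name and all the values are hypothetical.

```python
def win_shares(marginal_off, marginal_def, marginal_pts_per_win):
    """Return (offensive, defensive, total) Win Shares."""
    ows = marginal_off / marginal_pts_per_win
    dws = marginal_def / marginal_pts_per_win
    return ows, dws, ows + dws


# Hypothetical inputs: a faster-paced team would have a larger
# marginal_pts_per_win, so the same marginal points buy fewer wins.
ows, dws, total = win_shares(marginal_off=184.4, marginal_def=165.6,
                             marginal_pts_per_win=35.0)
print(round(ows, 2), round(dws, 2), round(total, 2))
```

Doubling the marginal points per win halves every share, which is the sense in which wins "cost" more on high-scoring teams.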
Methodological Review
The Good
The model is highly elaborate in many ways that improve accuracy. Much of Dr. Oliver’s work was groundbreaking when it was published. For instance, his approach to estimating how many possessions a team or a player spends was quite significant, and remains in use. Indeed, the very concept of considering basketball through analytical, objective, and rigorous lenses was uncommon at the time and helped to give birth to the modern era of basketball analytics.
Kubatko’s contributions are also indispensable, since he took Oliver’s work and translated it into a model that divides credit for a team’s performance among its players on the season level. Furthermore, the model outputs are simple to understand: how many shares a player deserves of the number of wins their team achieved in a season. Last but not least, the offensive and defensive ratings allow for an easy comparison of players on the same team, a product of the time Dr. Oliver spent adjusting player results to those of their teams.
The Defensive Bias
This section would be identical to the one in the last post, which is why I will paste it below.
The first and most prominent bias is the way the model handles defense. I think I understand the rationale – that playing defense is a team activity. For the authors, this implies that a defense is only as weak as its weakest link. That is a very reasonable assumption. I would even argue that the concept probably holds true in many sports, like offensive line play in American football or the sport of football (soccer) in toto, according to the authors of The Numbers Game.
For all that, when we are talking about specific players, the assumption seems fishy. Based on the model, one would conclude that the worst defender of a team has about the same defensive impact as its best. That conclusion is not reasonable, though Wins Produced is not the only model which leads to conclusions like this. In fact, the limitation is shared with a few other models, and has a great deal to do with the kind of data that the modeler(s) have had access to. In 2006, when Wins Produced first appeared, better defensive data was not widely available.
Be that as it may, there is more access to defensive data now. Greg himself has a model, which I highly recommend, that tries to account for both the difficulty of a player’s defensive role and how efficient the player is within that role. Greg also has a book which includes his own methodological review of basketball models. His review helped a great deal with mine. The book, as a whole, would be a great read for anyone interested in further evaluating the strengths and weaknesses of various models.
The Assist Bias
Leaving aside how Dr. Oliver values assists, there is an implicit bias contained in how he divides credit for made shots. By dividing credit equally between passers and shooters, the valuation will frequently be unfair. In situations where a passer is the main reason a shot went in, Dr. Oliver’s method will underestimate the passer’s value. On the other hand, it will overestimate the value of the passer in situations where the shooter deserves the lion’s share of the credit.
In general, Dr. Oliver’s assumption only holds true if there is no reason to expect some individuals to consistently either outperform or underperform “normal” levels. That is to say that the premise is only veracious if every player has the same number of situations where the model will underestimate their value as they have of those where it will overestimate it. If that situation obtains, then the residuals will all “come out in the wash.” If the situation does not obtain, then credit for contributing to assisted baskets is liable to be incorrectly apportioned among players.
The Offensive Rebound Bias
In the offensive rebounds part of the formula, Oliver assumes that the act of grabbing boards has different associated values on different teams. He also asserts that one cannot estimate the benefits a player brings with their rebounding without measuring that value within their team’s context. Ultimately, by virtue of his system, Dr. Oliver asserts that a player’s rebounding ability is the same as their team’s rebounding ability. Thus, playing with bad rebounders devalues the feats of good ones.
The Position Bias
The last bias I will mention is the one Dr. Oliver introduces by not having a position adjustment added to his formulas. Hence, he imbues his results, especially the defensive ones, with a predisposition in favor of centers and power forwards. As a result, it is very commonly the case that a team’s big men will have the highest DRatings on the team, whether or not they are truly good defenders.
The Questionable
My key grievance with Win Shares and Basketball on Paper is the manner in which Dr. Oliver and Justin Kubatko make hard assumptions left and right, sometimes without offering explanations for those presumptions. Even though would-be reviewers like myself were not Oliver’s target audience, it is still frustrating that he does not show the evidence supporting many of his modeling decisions. The same applies to Kubatko’s adaptation, particularly when it comes to the 0.92 and 1.08 coefficients he uses to calculate the marginal production of a player.
I am also less than enthused with how deterministic many of Dr. Oliver’s calculations and systems are. For someone who is a proponent of probabilism, as I am, this seems like an opportunity lost.
The Bottom Line
Though I have some misgivings vis-à-vis the number of assumptions in Basketball on Paper and the Win Shares model, I greatly respect Dr. Oliver’s and Kubatko’s work. The fashion in which Dr. Oliver elucidates his beliefs about the sport is uniquely engaging and the depth of his knowledge is impressive. Both have made significant contributions to the analysis of basketball in the public sphere.
Up Next
In the next post of the series, I will do a methodological review of the WARP model, by Kevin Pelton.