Matchup-Based Defense

Defense is the unsolvable puzzle in NBA analytics. No matter how advanced the advanced stats get, defensive metrics continue to crash against the same conundrums. Better data often leads to better models, and recent years have seen a dramatic improvement in the quality of defensive data available for analysis. Tracking data, opponent shooting data, play-by-play data, and more have all played a hand in modern defensive analysis. In spite of the improvements, or perhaps in part because of the improvements, it is clear that defensive analysis is still not highly accurate.

Most defensive metrics in use today are based on one of two schools of thought. To understand why defensive analysis is still frequently inaccurate, it helps to examine the assumptions underlying most current models.

The Plus/Minus School of Thought

The most popular method by far is The Plus/Minus School, which counts BPM, RPM, RAPM, PIPM, and more among its adherents. The distinguishing precept of the Plus/Minus School is the belief that we can ascertain a player’s defensive value by evaluating the team’s performance with him on the court, if only we properly adjust for strength of opponent, the team’s talent level, the team’s performance with the player off the court, and the player’s performance level in seasons past. The adjustments made to raw plus/minus are attempts to extract reliable data by excising confounding variables.

Theoretically, plus/minus models consider everything that happens while a player is on the floor. While this approach has the advantage of capturing events that are not recorded in other statistics, it does require us to assume that every player on the floor has an impact on every possession. Though this is sometimes the case, it is also sometimes the case that a player gets a breakaway dunk, and none of the members of the opposing team has a chance to stop him. Sometimes there are possessions at the end of the quarter or end of the half where the majority of the players on both sides are simply cleared out for a one-on-one battle at the top of the key, and thus have minimal impact. Sometimes steals happen early in the possession, before defenders other than the one who records the steal have made any impact. The list goes on, but the idea should be clear.

Including everything that happens while the player is in the game has the benefit of capturing things that other statistics do not record, but it has the weakness of capturing events on which the player under consideration has no impact. In a very real sense, plus/minus models evaluate a player using data that is not properly associated with him. This makes it difficult to rely heavily on plus/minus models for accurately measuring defense.

The Player Data School of Thought

The second school of thought in NBA defensive analysis is the Player Data School. The defining characteristic of player data analysis is the attempt to use data associated with individual players as the basis for determining how much the player in question affected opposing teams’ offensive performance. Some models in this school of thought use purely box score data (blocks, steals, fouls, and defensive rebounds), while others use more recent statistics such as defensive field goal percentage, opponent points per possession derived from tracking data, and opponent field goal percentage differential. The problem that has become increasingly evident with these methods is that the data they use cannot identify a team’s best defender(s).

Focusing on misses against a player will lead you to conclude that a poor defender whom his coaches “hide” on a non-scorer is actually an elite defender. Opponents miss a lot of their shots when such a player is the nearest defender, but that is because he only defends opponents who miss more frequently than their peers anyway.


Focusing on opponent points per possession provides the advantage of being able to describe which types of action a player is good or bad at defending, but it obscures the impact of the player who denies his man the ball and forces a less-efficient shot to be taken against one of his teammates. Tracking data tells us who the nearest defender to the ball was at the end of the possession, but it omits what happened leading up to that point. A team’s best defender will often be tasked with forcing the opponents’ best scorer to give up the ball. When he succeeds in doing this, he is often erased from the record of the play: he is not the closest defender on the shot, he does not get credit for a steal if his opponent throws the ball away, and we do not recognize that he has lowered the opposing team’s expected points for that possession by forcing an inferior player to shoot.

Employing opponent field goal percentage differential, as we see in the new DRAYMOND model from 538, leads to the conclusion that rim defenders are vastly more valuable than any perimeter defender. This is true even when focusing on differential relative to shot location (as in DRAYMOND), rather than strictly by opponent FG%. While most defensive analyses tend to favor big men to some degree, the problem becomes more pronounced with models like DRAYMOND for a very simple reason: big men tend to defend low-usage, high-efficiency opposing big men (or: they tend to defend highly efficient shots, if you go by location). With a higher baseline opponent field goal percentage, big men are always going to have greater raw differentials than other players. If Center X defends an opponent who normally shoots 55% from the field, but he holds that player to 50% shooting, he has created a value of 5 percentage points per shot. If Guard Y on the same team defends an opponent who normally shoots 44% from the field, but he holds that player to 40% shooting, Guard Y has produced a value of only 4 percentage points per shot. By this measure, Center X looks like a noticeably more valuable defender.

The problem is that the two players probably have the same impact on the defense’s bottom line. Guard Y caused a 9.1% decrease in his opponent’s shooting efficiency (((40-44)/44)*100 = -9.1). Center X also caused a 9.1% decrease in his opponent’s shooting efficiency (((50-55)/55)*100 = -9.1). Failing to adjust for the baseline opponent field goal percentage results in a drastic overstating of centers’ defensive value.
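The arithmetic is easy to check. Here is a minimal sketch in Python using the Center X and Guard Y numbers from above; the function names are illustrative and do not come from any published model.

```python
def raw_differential(baseline_pct, allowed_pct):
    """Percentage-point change in opponent FG% (negative = defender helped)."""
    return allowed_pct - baseline_pct


def relative_change(baseline_pct, allowed_pct):
    """Percent change in opponent FG%, relative to the opponent's own baseline."""
    return (allowed_pct - baseline_pct) / baseline_pct * 100


# Hypothetical players from the example in the text.
center_x = {"baseline": 55.0, "allowed": 50.0}
guard_y = {"baseline": 44.0, "allowed": 40.0}

for name, p in (("Center X", center_x), ("Guard Y", guard_y)):
    raw = raw_differential(p["baseline"], p["allowed"])
    rel = relative_change(p["baseline"], p["allowed"])
    print(f"{name}: raw {raw:+.1f} points, relative {rel:+.1f}%")
# Center X: raw -5.0 points, relative -9.1%
# Guard Y: raw -4.0 points, relative -9.1%
```

The raw differential separates the two players, while the baseline-adjusted change treats them as equals.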

What Do We Need to Know?

In order to properly and accurately evaluate individual defense, we need to know what a player is asked to do on defense, how well he does his job, and how much effect it has on his team’s defensive performance. Surprisingly, I am not aware of any defensive analysis which has undertaken a description of what a player is asked to do on defense. Without that information, however, it will be impossible to truly determine how good a player is on defense. If we don’t know a player’s job, the only way to estimate how good he is at that job is to either a) assume everyone has the same job or b) arbitrarily assign jobs to players and evaluate them on that basis.

Most arbitrary defensive “roles,” including my own defensive position algorithm, use size and other proxies such as defensive rebounding rate, blocks, steals, etc. to determine a player’s “position” or “role.” I term these designations “arbitrary” because they do not make reference to the difficulty of the player’s matchup. Even including opponent shot distances as my algorithm does will only indicate where opposing players shoot from, not how great of a scoring threat the opposing player is. To make any progress in examining individual defensive performance, we need to be able to credibly define a player’s role within his team’s defense.

How Do We Find Out What We Need to Know?

A team’s purpose on defense is to stop the opposition from scoring. Individual players each play a part in that goal, and their actions interact with one another to affect the goal both positively and negatively. In order to achieve their goal, the defense needs to reduce the offense’s likelihood of scoring in each possession. The most important and effective way of doing so is by forcing the offense to take less efficient shots. While it is possible to glean a great deal of information about a team’s defense by reviewing types of shots taken against the team (distance, zone, time on the shot clock, whether or not the shot was open, etc.), such measures categorically ignore the quality of the shooter. Anyone who needs to be convinced that the quality of the shooter matters should conduct the following thought experiment: Your favorite team plays two games in the same location against the same opponent. In the first game, the other team’s star player takes 20 shots; in the second game, the star takes 10 shots. Which game is your team more likely to win?

Thus, we need to determine how good the defender’s opponents are in order to know how that defender fits into his team’s goal of preventing points. How do we find that out? Enter data science! Paul DeVos helped to develop this analysis by scraping game-based matchup data from stats.nba.com, and we quickly agreed that this data could be a useful tool for answering some key questions. By inspecting how many points per game each opposing player scores, and how many possessions a player guards that opponent, we can determine the defensive load placed on each player. The players who carry the greatest defensive weight for their teams by guarding the other team’s best player will be easy to distinguish from players who defend non-scorers.
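One plausible way to turn that description into a number is a possession-weighted average of the opponents’ points per game. The sketch below assumes exactly that weighting, which the text does not spell out, and uses made-up matchup numbers rather than real stats.nba.com data.

```python
def defensive_load(matchups):
    """Possession-weighted average of opponents' points per game.

    `matchups` is a list of (opponent_ppg, possessions_guarded) pairs.
    The weighting scheme is an assumption made for illustration.
    """
    total_poss = sum(poss for _, poss in matchups)
    if total_poss == 0:
        return 0.0
    return sum(ppg * poss for ppg, poss in matchups) / total_poss


# Made-up numbers: a stopper who mostly guards a 27-ppg scorer,
# and a hidden defender who mostly guards a 6-ppg role player.
stopper = [(27.0, 300), (12.0, 100)]
hidden = [(6.0, 300), (12.0, 100)]

print(defensive_load(stopper))  # 23.25
print(defensive_load(hidden))   # 7.5
```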

I used points per game as the variable to represent “opponent scoring danger” for a few reasons.

  • Points per minute (or per 36, or per 48) gives unstable results for low-minute players
  • Using shooting efficiency would imply that the players who make the highest percentage of their shots are always the most talented scorers. Due to usage and consequent defensive attention, this is not true.
  • Usage is also a rate stat. In addition to failing to distinguish between high-minute and low-minute players, it also depends on opponent assists. Since it is not at all clear that many or most assists can be prevented by the player assigned to defend the passer, it does not make sense to measure the defensive load a player carries by evaluating his opponent’s Usage.

Without further ado, here is the defensive load each player carried during the 2018-19 regular season:

Load


The “Defensive Load” column measures the average difficulty of a player’s defensive assignment, without any adjustment. The “Relative Defensive Load” column expresses the player’s defensive load within the context of his team. The team’s defensive depth can impact a player’s unadjusted defensive load, thereby making it unreliable when comparing players on different teams. If you want to know who the most important defender is on a certain team, use Defensive Load. If you want to compare players on different teams to figure out who has a heavier burden, use Relative Defensive Load. If you want to identify the key defender on a certain team, take the following steps: sort by the Team column, press and hold shift, sort by the Defensive Load column, use the navigation arrows at the bottom of the table to move through the teams in alphabetical order until you get to your team.
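To make the team-context idea concrete, here is a small sketch that expresses each player’s load as a ratio to his team’s average load. Dividing by the unweighted team mean is an assumption for illustration, as are the player names, team codes and loads; the actual normalization behind the Relative Defensive Load column is not described in detail.

```python
from collections import defaultdict


def relative_loads(players):
    """Return {name: load / team_mean_load} for each player.

    `players` is a list of (name, team, defensive_load) tuples.
    Normalizing by the unweighted team mean is an assumption.
    """
    by_team = defaultdict(list)
    for _, team, load in players:
        by_team[team].append(load)
    team_mean = {team: sum(v) / len(v) for team, v in by_team.items()}
    return {name: load / team_mean[team] for name, team, load in players}


# Made-up loads for two hypothetical teams.
players = [
    ("A1", "AAA", 18.0), ("A2", "AAA", 12.0), ("A3", "AAA", 6.0),
    ("B1", "BBB", 15.0), ("B2", "BBB", 15.0),
]
rel = relative_loads(players)
for name, ratio in rel.items():
    print(f"{name}: {ratio:.2f} ({(ratio - 1) * 100:+.0f}% vs. team average)")
```

In this toy data, A1 has a higher raw load than B1 and also stands out within his own team, while B1 and B2 split their team’s burden evenly.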


How Good is Each Player at Doing His Job?

Now that we have an idea of how much a player is asked to do on defense, we are in a better position to evaluate how well he fills his role. In order to do so, we first need to estimate how many points each player should be expected to give up. As it turns out, there is a predictable linear relationship between the points per possession scored by a player’s assigned matchup and the normal points per possession scored by that opponent regardless of defender (seems obvious, right?). When we apply the equation that describes this relationship to each player’s defensive load, we can determine how many points we should expect the player to allow, given who he is guarding.

Comparing the player’s expected points allowed per possession with his actual points allowed per possession gives us a player’s Effectiveness Ratio – how much he affected his opponent’s performance compared with how much we expected him to affect his opponent’s performance. As you might easily deduce from the table and the data underlying it, most players’ Effectiveness Ratio is close to 1.0.
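The two steps above can be sketched as a least-squares line fit followed by a ratio. Everything here is illustrative: the sample points are made up, the fit uses opponents’ normal points per possession as the input, and the orientation of the ratio (expected divided by actual, so that values above 1.0 mean the defender did better than expected) is an assumption rather than a documented definition.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def effectiveness_ratio(expected_ppp, actual_ppp):
    """Expected over actual points allowed per possession (assumed orientation)."""
    return expected_ppp / actual_ppp


# Made-up sample: x = opponents' normal PPP, y = PPP those opponents
# scored against their assigned defenders.
normal_ppp = [0.90, 1.00, 1.10, 1.20]
allowed_ppp = [0.92, 1.00, 1.08, 1.16]
slope, intercept = fit_line(normal_ppp, allowed_ppp)

# A hypothetical defender whose assignments normally score 1.15 PPP
# but scored 1.05 PPP against him.
expected = slope * 1.15 + intercept
ratio = effectiveness_ratio(expected, 1.05)
print(round(expected, 3), round(ratio, 3))
```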

Which Defenders Make the Biggest Impact?

The result of this computation is visible in the chart above: there is greater deviation for players with a light defensive load than for players with a heavy defensive load. If we do not want to conclude that “low-usage” defenders who radically diverge from their expected points allowed are their teams’ most important defenders, we need a term that will modify a player’s Effectiveness Ratio for impact. Making this modification will enable us to determine how much effect the player’s deviation from “normal” has on the team’s goal of preventing points. We accomplish this by multiplying a player’s Effectiveness Ratio by his Relative Defensive Load. Thus, a player’s performance relative to expectation based on his opponent is modified by how important his opponent is to the game plan. Making a 1% change to opponent performance when defending LeBron James is more important than making a 1% change when guarding Maxi Kleber.

The product of this adjustment, which we’ll call Load Adjusted Effectiveness, is the main ingredient in ascertaining how good a player is at on-ball defense. To appropriately contextualize the results, we need to divide the player’s Load Adjusted Effectiveness by the team’s sum of Load Adjusted Effectiveness. We multiply that quotient by the share of the team’s possessions in which the player participated, then multiply by the team’s total opponent unblocked misses (Opp FGA – Opp FGM – Blk). Just like that, we have the player’s contribution quantified on the possession level. In order to convert the contribution to a number of points prevented, we simply draw upon the game-based matchup data to determine the average points per made field goal by the player’s assigned opponents. Using this as a proxy for how many points would have been scored per shot if the player did not stop his man, we can calculate the Missed FG Points Saved for every player.
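For readers who prefer code, the chain of multiplications above can be sketched as follows. The dictionary keys, the function name and all of the numbers are hypothetical; Relative Defensive Load is assumed to be a ratio where 1.0 means team average.

```python
def base_shooting_defense(team, opp_fga, opp_fgm, team_blk):
    """Missed FG Points Saved per player, following the steps described above.

    `team` maps player name -> dict with (hypothetical) keys:
      eff_ratio     Effectiveness Ratio
      rel_load      Relative Defensive Load as a ratio (1.0 = team average)
      poss_share    share of team possessions the player took part in
      pts_per_make  average points per made FG by his assigned opponents
    """
    # Load Adjusted Effectiveness for each player.
    lae = {p: d["eff_ratio"] * d["rel_load"] for p, d in team.items()}
    team_lae = sum(lae.values())
    unblocked_misses = opp_fga - opp_fgm - team_blk
    return {
        p: (lae[p] / team_lae) * d["poss_share"] * unblocked_misses * d["pts_per_make"]
        for p, d in team.items()
    }


# Made-up two-player "team" to keep the example short.
team = {
    "Stopper": {"eff_ratio": 1.05, "rel_load": 1.30, "poss_share": 0.70, "pts_per_make": 2.2},
    "Helper": {"eff_ratio": 0.98, "rel_load": 0.80, "poss_share": 0.50, "pts_per_make": 2.1},
}
saved = base_shooting_defense(team, opp_fga=7000, opp_fgm=3200, team_blk=400)
for name, pts in saved.items():
    print(f"{name}: {pts:.1f} points saved")
```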

If you didn’t like the explanation in words, here is what that all looks like in math:
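Read off the verbal description above, the calculation for player i can be written as shown here. The symbol names are assumptions chosen for this sketch, not the author’s notation: ER is Effectiveness Ratio, RL is Relative Defensive Load, LAE is Load Adjusted Effectiveness, PossShare is the share of team possessions the player took part in, PtsPerFGM is the average points per made field goal by his assigned opponents, and BSD is Base Shooting Defense (Missed FG Points Saved).

```latex
\begin{aligned}
\mathrm{LAE}_i &= \mathrm{ER}_i \times \mathrm{RL}_i \\
\mathrm{BSD}_i &= \frac{\mathrm{LAE}_i}{\sum_{j \in \text{team}} \mathrm{LAE}_j}
  \times \mathrm{PossShare}_i
  \times \left(\mathrm{OppFGA} - \mathrm{OppFGM} - \mathrm{Blk}\right)
  \times \mathrm{PtsPerFGM}_i
\end{aligned}
```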

The results, which I’ve labeled as Base Shoot. Defense in the interest of saving space, are in the table below:

Base Shooting Defense


This table expresses Relative Load as a positive or negative integer in order to give a clearer idea of the distinction between defenders who are asked to carry a heavy load and those who are not. You can read the Relative Load column as “this player carries this percent more/less weight than normal on his team.”

Since we excluded blocked shots from the foregoing calculation, we can further expand our explanatory power by conjoining blocks with Base Shooting Defense to find Total Shooting Defense. To accomplish this marriage of blocks with misses, we simply take a player’s pace-adjusted blocks and multiply them by the same term used in Base Shooting Defense – the average opponent points per made field goal. The result tells us how many points a player saved by blocking shots. Adding the points a player kept off the board by preventing his matchup from scoring to the points a player saved by blocking shots gives us Total Shooting Defense.
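The block adjustment is a single multiplication and a sum. A minimal sketch, with hypothetical function names and made-up inputs:

```python
def block_points_saved(pace_adj_blocks, pts_per_make):
    """Points saved by blocks: pace-adjusted blocks times the average
    points per made field goal by the player's assigned opponents."""
    return pace_adj_blocks * pts_per_make


def total_shooting_defense(base_shooting_defense, pace_adj_blocks, pts_per_make):
    """Base Shooting Defense plus the points saved by blocking shots."""
    return base_shooting_defense + block_points_saved(pace_adj_blocks, pts_per_make)


# Made-up season line for a hypothetical rim protector.
total = total_shooting_defense(
    base_shooting_defense=180.0, pace_adj_blocks=60.0, pts_per_make=2.2
)
print(round(total, 1))
```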

Total Shooting Defense


What Else Do We Need to Consider?

These results can at long last give us an idea of how important a player is to his team’s defense based on who he is assigned to guard. Since this post is already quite lengthy, I will save the two other areas of defensive performance for later posts. In those posts, I will describe and explain:

  • How to properly assign credit for causing the offense to turn the ball over. I call this category “Non-Shooting Defense” to distinguish it from the statistics in this article.
  • Valuing defense both when we include defensive rebounding and when we do not. When I present this part of the model, I will provide data for each player’s amount of Defensive Rebounding Credit, and examine the difference occasioned by whether or not one chooses to include defensive rebounding in the analysis of defensive performance.

Before concluding our present investigation, I want to leave you with just one more table. In 2018-19, there were 56 players who saved their teams at least 150 points in Total Shooting Defense while carrying a Relative Load of at least +10%. They are listed in the table below, sorted by Total Shooting Defense per 100 Possessions.

 
