Wednesday, January 6, 2016

Using Data for My NBA All Star Vote

Starters in the NBA All Star game are decided by fan voting, which is open until January 18. I take my voting responsibility very seriously, like a presidential election or new potato chip flavors. For the All Star game I downloaded 2015-2016 data for player efficiency rating (PER) and real plus minus (RPM). I like using these two stats because they have fundamentally different sources: PER is a clever aggregation of box score statistics, while RPM measures how the score changes while a player is on the floor, controlling for the other players on the floor. The first graph shows the Western Conference.

Clearly my vote is Kobe x 5. (Kidding.) The first four starters are clear: Curry, Westbrook, Durant, and Kawhi. This leaves one more front court player. I choose Draymond: if you draw a 45 degree line perpendicular to the trend, he's ahead of Anthony Davis, Blake, and Cousins. He also plays for a better team, which I think is fair to consider. But even more convincing to me, I think Real Plus Minus is a better stat than PER. (More on this below.)
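The "line perpendicular to the trend" comparison can be made concrete. A minimal sketch, assuming both stats are first standardized so the trend runs along the 45 degree diagonal: each player's position along that diagonal is then the sum of the two z-scores (scaled by 1/sqrt(2)), and every line perpendicular to the trend is a line of equal score. The function name `diagonal_scores` and the player numbers are made up for illustration; they are not the real 2015-2016 stats.

```python
from statistics import mean, pstdev

def diagonal_scores(players):
    """Project each player onto the 45 degree diagonal of the standardized PER/RPM plane.

    players: dict mapping name -> (per, rpm).
    Returns a dict mapping name -> score; a higher score is further along the trend.
    """
    pers = [per for per, _ in players.values()]
    rpms = [rpm for _, rpm in players.values()]
    per_mu, per_sd = mean(pers), pstdev(pers)
    rpm_mu, rpm_sd = mean(rpms), pstdev(rpms)
    return {
        name: ((per - per_mu) / per_sd + (rpm - rpm_mu) / rpm_sd) / 2 ** 0.5
        for name, (per, rpm) in players.items()
    }

# Placeholder values only, chosen to exercise the function (not real player data).
example = {
    "Player A": (25.0, 6.0),
    "Player B": (27.0, 3.0),
    "Player C": (20.0, 1.0),
}
scores = diagonal_scores(example)
ranking = sorted(scores, key=scores.get, reverse=True)
```

Here Player A ranks ahead of Player B despite the lower PER, because the RPM gap is larger in standardized units. Weighting the RPM term more heavily would be one way to encode a preference for RPM.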

Next, consider the Eastern Conference:

The top players are less talented in the East. LeBron is an outlier, even though his stats put him below Kawhi in the West. Millsap and Lowry are next in the East, and Bosh is slightly ahead of Drummond for the third forward spot, especially given my preference for RPM. Jimmy Butler is the only guard in the next tier of players (Butler, Monroe, Gasol, George, Love), so he gets the fifth spot.

A few other observations from putting this together:

1. Real Plus Minus is getting really good. RPM has more promise as a stat because it captures the effect of things that don't show up in the box score: setting screens, boxing out, and most aspects of defense. But early estimates weren't very good. RPM is difficult to estimate because some players tend to be on the floor at the same time as certain other players, and it is hard to disentangle their effects (multicollinearity). Recent models use more advanced methods to control for this. (These models can be difficult to estimate, and it's a really cool service for ESPN to provide them.) Anyway, the results do much better against the eye test than they used to, and I found this graph very convincing:

This graph once again shows PER vs RPM, but is now color-coded by defensive RPM. PER is much better at measuring offense than defense. Notice that in almost every case where a player has a better PER than RPM (the upper left half of the graph), the player is bad at defense. RPM is capturing this, but PER is not.
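To illustrate the multicollinearity problem from point 1: a minimal sketch, assuming a simplified plus-minus regression where each row is a stint, each column marks whether a player was on the floor, and the target is the scoring margin for that stint. ESPN's actual RPM method is not described here; ridge regression is a common regularization technique that I am swapping in as an assumption. The names `solve` and `ridge_plus_minus` and the toy stints are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ridge_plus_minus(stints, margins, lam):
    """Estimate per-player effects by solving (X^T X + lam * I) beta = X^T y.

    stints[i][j] is 1 if player j was on the floor during stint i, else 0.
    margins[i] is the scoring margin for stint i.
    """
    k = len(stints[0])
    xtx = [[sum(s[a] * s[b] for s in stints) + (lam if a == b else 0.0)
            for b in range(k)] for a in range(k)]
    xty = [sum(s[a] * y for s, y in zip(stints, margins)) for a in range(k)]
    return solve(xtx, xty)

# Toy data (made up): players 0 and 1 always share the floor, so their columns are identical.
stints = [[1, 1, 0], [1, 1, 1], [0, 0, 1]]
margins = [4.0, 6.0, 2.0]

try:
    ridge_plus_minus(stints, margins, 0.0)  # no penalty: effects cannot be separated
    unpenalized_failed = False
except ValueError:
    unpenalized_failed = True

beta = ridge_plus_minus(stints, margins, 1.0)  # penalty makes the system solvable
```

With no penalty, the two players who always play together cannot be told apart and the system is singular. With a penalty, the model splits the credit evenly between them, which is the sensible default when the data give no reason to prefer one over the other.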

2. Filtering to a specific team is interesting. Look at the Warriors, Spurs, or Thunder to see interesting graphs. Or see the Cavs example below. LeBron and Love are doing most of the work, with positive contributions from Tristan and Delly. Mo and Jefferson were not great acquisitions. The returns of Kyrie (RPM = 3.4 last year) and Shump (2.3) will help, especially when they take minutes from Mo and Jefferson.

3. The Sixers, Nets, and Lakers are interesting for other reasons. For the Sixers, Stauskas has the second worst RPM in the East and the third worst PER (among players with at least 500 minutes). Also on the Sixers, Okafor has the worst RPM in the league by a large margin, even though he has an above-average PER. I interpret this as, "he puts up good stats, but everything else he does is terrible."

4. Unrelated, but I came across this while looking up the other data: the Spurs are on track to have the highest margin of victory of all time. It probably won't last, but I was surprised by the magnitude.


  1. Re: that last link - those 70/71/72 Bucks teams must have been amazing to watch.
