Converting PER to statistical offensive adjusted plus-minus

Last year, I followed Eli Witus’ instructions and Aaron Barzilai’s data-and-method to run 2006-2009 adjusted plus-minus (APM) numbers, split into offensive and defensive APM.

My results are a little different from others’ because I removed garbage time (defined as point spread > 10 + minutes left in game) and left each season unweighted in order to calculate a Statistical more easily.
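The garbage-time rule above is short enough to write out. This is only a sketch of the stated condition: the function name is made up for illustration, and treating the spread as an absolute margin (so it doesn't matter which team leads) is an assumption.

```python
def is_garbage_time(point_spread: float, minutes_left: float) -> bool:
    """Return True when the margin exceeds 10 plus the minutes left in the game."""
    # Assumption: the spread is an absolute margin, regardless of who is ahead.
    return abs(point_spread) > 10 + minutes_left


# Hypothetical game states: (point spread, minutes left)
for spread, minutes in [(25, 5), (12, 10), (-20, 3)]:
    print(spread, minutes, is_garbage_time(spread, minutes))
```

Under this reading, a 12-point lead with 10 minutes left still counts, while a 25-point lead with 5 minutes left gets dropped.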

Here’s the output, split into 2006-2009 and 2007-2009: combining0609-and-0709.xlsx

FROM APM TO STATISTICAL APM

I’ve always been interested in figuring out what made an offense work, but the main reason I tried making a Statistical is because I didn’t like PER.  I didn’t understand it, I didn’t like how hard it was to calculate, and I figured a stat had to be organic to have meaning.  That was before I tried to beat it.

A while back, I created a Statistical called EOPM by regressing Ilardi/Barzilai’s 5-year weighted APM numbers against rate data from basketball-reference: TS%, AST%, OREB%, TOV%, and USG%.  I figured the ease of use (copying and pasting the whole row of advanced stats into an Excel calculator) would outweigh any inaccuracies, as long as it was close.

But when I ran the r^2, it showed just .52, lower than Rosenbaum’s .57 from years back.  There were a few obvious problems with my method:

1) I used unweighted rates while Ilardi/Barzilai calculated their APM numbers by giving extra weight to recent years.  I wasn’t sure how to quickly and accurately weight rate stats, so I lazily just left them unweighted.

2) I just used 5 rates.  Obviously that left a lot out.

3) I used TOV%, a stat I’ve come to dislike since.  Jason Collins and Dennis Rodman had enormous TOV%’s despite low Per36 turnovers because they rarely shot the ball.

EOPM didn’t work how I wanted.  But with my new unweighted APM data in hand, I decided to create a new EOPM using Rosenbaum’s Per40 stats idea. I tried to improve it further by adjusting for estimated Pace and including TS%.  In all, I made the inputs TS% and pace-adjusted Per36 (Per36 * 100 / estimated pace) OREB, DREB, AST, STL, BLK, TOV, PF and PTS.
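For anyone who wants to reproduce the pace adjustment, here is a minimal sketch of Per36 * 100 / estimated pace. The function name and the sample numbers are hypothetical, and how the pace itself is estimated is left outside the sketch.

```python
def pace_adjusted_per36(stat_total: float, minutes: float, estimated_pace: float) -> float:
    """Scale a season total to per-36-minutes, then normalize to a pace of 100."""
    per36 = stat_total / minutes * 36
    return per36 * 100 / estimated_pace


# Hypothetical player: 200 assists in 1800 minutes on a team with an estimated pace of 90
print(round(pace_adjusted_per36(200, 1800, 90), 2))
```

A team slower than 100 possessions bumps the player's number up, and a faster team pulls it down.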

The Formula is: EOPM = -17.173 + 18.131 * TS% + 0.317 * OREB + 0.190 * DREB + 0.716 * AST + 0.334 * STL - 0.324 * BLK - 1.447 * TOV - 0.280 * PF + 0.461 * PTS

with an r^2 of .723.  Pretty good!  Next step, comparing it to PER.
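As a quick calculator, the formula can be sketched in Python like this. The coefficients are copied from the formula above. Everything else is an assumption for illustration: the names, the sample stat line, and reading TS% as a fraction (0.55 rather than 55), which is the only scale that gives sensible values with an 18.131 coefficient.

```python
# Coefficients from the EOPM formula; inputs other than TS% are pace-adjusted Per36.
EOPM_INTERCEPT = -17.173
EOPM_COEFFICIENTS = {
    "ts": 18.131,   # assumption: TS% entered as a fraction, e.g. 0.55
    "oreb": 0.317,
    "dreb": 0.190,
    "ast": 0.716,
    "stl": 0.334,
    "blk": -0.324,
    "tov": -1.447,
    "pf": -0.280,
    "pts": 0.461,
}


def eopm(stats: dict) -> float:
    """Apply the EOPM regression formula to one player's stat line."""
    return EOPM_INTERCEPT + sum(
        coefficient * stats[name] for name, coefficient in EOPM_COEFFICIENTS.items()
    )


# Hypothetical stat line, not a real player
sample = {"ts": 0.55, "oreb": 1.0, "dreb": 4.0, "ast": 5.0, "stl": 1.0,
          "blk": 0.5, "tov": 2.0, "pf": 2.5, "pts": 18.0}
print(round(eopm(sample), 2))
```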

COMPARING EOPM WITH PER

I figured, for EOPM to be useful, it should accurately find Team Offensive Ratings for past years given player EOPMs and minutes played.  By multiplying each player’s EOPM by his % of minutes played during that season, then summing those products for each team, I hoped to nail each team’s Offensive Rating within a few tenths of a point.
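Here is a rough sketch of that team-level sum, with a small r^2 helper for checking estimates against actual ratings. The names are hypothetical. "% of minutes played" is read here as the share of the team's available court time, so the five players on the floor add up to 5.0 shares. That is an assumption, and a different scaling would not change an r^2.

```python
def team_estimate(players):
    """Sum each player's EOPM weighted by his share of team court time.

    players: list of (eopm, minutes) tuples for one team-season.
    """
    total_minutes = sum(minutes for _, minutes in players)
    court_time = total_minutes / 5  # assumption: five players on the floor at once
    return sum(value * minutes / court_time for value, minutes in players)


def r_squared(xs, ys):
    """r^2 of a one-variable linear regression (squared Pearson correlation)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy * sxy / (sxx * syy)


# Hypothetical five-man roster where everyone plays the same minutes
roster = [(4.0, 100), (0.0, 100), (0.0, 100), (0.0, 100), (0.0, 100)]
print(team_estimate(roster))
```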

I performed the calculations for every 2009 NBA team, regressed my estimated O-Ratings against the actual O-Ratings and got an r^2 I liked: .800.  Then I did the same using PER: .914.  Wait, what?

Whoa.

After a bit of headscratching, I found that Fit is a big reason why adding EOPMs doesn’t work as well as PER.  For example, if a team is extremely good at offensive rebounding, the effect on each player’s EOPM is tiny, but for the team it leads to more shots and a higher ORating.  (When including offensive rebound rate in the team regression, EOPM’s r^2 jumped to .893.)

PER, though, already considers team rebounding in its calculation, so its r^2 barely moves when including team offensive rebound rate (it creeps up to .915).

So Hollinger beat me to a pulp.  And as a result, I’ve started using PER a lot more.  But sometimes I like to balance it out a bit…

COMPARING PER TO EOPM

Even though PER was a more accurate determinant of Team Offensive Rating, I still had reason to think EOPM fairly judged performance.  After all, the numbers are organic, and when adjusted for fit, its correlation nearly matches PER’s.

So my next step was seeing how the stats differed.  I regressed EOPM on PER, and the result was this formula: EOPM = -8.4 + .56 * PER, with an r^2 of .736.
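That formula is easy to turn into a quick check of how far a player's actual EOPM sits from what his PER predicts. This is a sketch with hypothetical function names. The two example rows use PER and EOPM values from the tables in this post, and because the published coefficients are rounded, not every row reproduces to the last decimal.

```python
def expected_eopm(per: float) -> float:
    """EOPM predicted from PER alone, using the rounded regression coefficients."""
    return -8.4 + 0.56 * per


def per_overestimate(per: float, actual_eopm: float) -> float:
    """Positive when PER implies more offense than EOPM credits the player with."""
    return expected_eopm(per) - actual_eopm


# (name, PER, EOPM)
for name, per, actual in [("DeSagana Diop", 11.1, -6.6), ("Steve Nash", 22.0, 6.2)]:
    print(name, round(expected_eopm(per), 1), round(per_overestimate(per, actual), 1))
```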

First off, that r^2 seemed oddly low for two stats that estimated the same thing, and I figured out why once I calculated ExpectedEOPM for each player using that formula.  Here were the players (who played >5000 minutes from 2006-2009) whose ExpectedEOPM differed most from their actual EOPM.

Biggest Overestimates:

Name PER ExpectedEOPM EOPM Difference
DeSagana Diop 11.1 -2.2 -6.6 4.4
Ben Wallace 14.5 -0.3 -4.0 3.7
Samuel Dalembert 14.9 -0.1 -3.5 3.4
Darko Milicic 13.1 -1.1 -4.4 3.3
Marcus Camby 18.6 2.0 -1.2 3.2
Dwight Howard 22.1 3.9 0.8 3.1
Joel Przybilla 13.5 -0.9 -3.8 2.9
Kendrick Perkins 12.4 -1.5 -4.3 2.8
Josh Smith 17.6 1.4 -1.4 2.8
Yao Ming 24.1 5.0 2.3 2.7
Chris Kaman 15.0 0.0 -2.7 2.6
Jason Collins 4.0 -6.2 -8.8 2.6

Biggest Underestimates:

Name PER ExpectedEOPM EOPM Difference
Jose Calderon 18.0 1.6 4.1 -2.4
Deron Williams 18.0 1.6 4.0 -2.4
Steve Nash 22.0 3.9 6.2 -2.3
Chauncey Billups 21.8 3.7 5.7 -1.9
Mike Bibby 16.4 0.7 2.5 -1.7
Michael Redd 20.3 2.9 4.6 -1.7
Raja Bell 11.6 -1.9 -0.4 -1.6
Eddie House 14.6 -0.3 1.3 -1.6
Antonio Daniels 13.8 -0.7 0.8 -1.5
Steve Blake 12.6 -1.4 0.1 -1.5
Leandro Barbosa 17.2 1.2 2.7 -1.5
Mike James 14.6 -0.3 1.2 -1.5

See a pattern?  If EOPM is to be believed, Hollinger overestimated big man stats (blocks and rebounds) to increase the PER of frontcourt players.  For a possible answer why, take a look at this chart.

Average APMs by position among players who played >5000 minutes from 2006-2009:

Pos Off Def
PG +1.2 -0.5
SG +1.0 -0.8
SF +0.5 -0.2
PF +0.4 +0.9
C -1.6 +1.5

One axiom of the NBA game is that guards generally make a bigger impact offensively because they have the ball in their hands the most, and PFs/Cs make a bigger impact defensively because they defend more shots by playing in the paint, and the chart seems to support that.

Now look at the EOPM and PER of the same sample of players:

Pos EOPM PER
PG +1.4 15.6
SG +1.1 15.5
SF +0.3 15.1
PF +0.2 16.6
C -1.4 15.9

While EOPM shows the same positional correlation as APM, PER shows the positions as equal, possibly even with a slight offensive edge to inside players.

I’m guessing Hollinger did this in an attempt to sell PER as an all-around player evaluator.  In his articles, he often analyzes transactions using only PER.  He created his Expected Wins Added statistic based purely off PER.  There’s even a Hollinger Analysis feature on the ESPN Trade Machine that estimates wins added using only PER.

But PER’s .70 correlation with Team Net Rating (Offensive Rating minus Defensive Rating) seems largely attributable to its ridiculous .914 correlation with offense.  For comparison, EOPM correlates at .69 with Team Net Rating.

By overrating Big Man categories (blocks and rebounds), Hollinger levels the positional playing field by accounting for the defensive impact of power forwards and centers.  Interestingly, that change doesn’t affect PER’s correlation with Team Offensive Rating because NBA lineups are generally built the same way, with two definite guards and two definite big men.  Therefore, each team receives roughly the same amount of over- and under-estimation.

CONVERTING PER TO EOPM

In spite of that oddity, I think PER’s pretty great.  Not only does it correlate impressively with Team Offensive Rating, but it’s widely available, and it’s easy to find PER across seasons at basketball-reference.com.

But what if you wanted to quickly estimate a player’s Offensive Statistical APM, and filter out Hollinger’s big man overestimates?

Remember that .736 correlation between PER and EOPM?  Insert Rebounds Per 36 and Blocks Per 36 into the regression, and the correlation skyrockets to .963, with the following formula:

EOPM (estimated) = -7.5 + .63 * PER - .19 * Rebounds36 - 1.22 * Blocks36

I wish it were a little easier to calculate, but it’s not bad.
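A few lines of Python take most of the sting out of it. This is just the formula above wrapped in a function. The function name and the sample inputs are made up for illustration.

```python
def estimated_eopm(per: float, rebounds36: float, blocks36: float) -> float:
    """Estimate EOPM from PER, discounting rebounds and blocks per 36 minutes."""
    return -7.5 + 0.63 * per - 0.19 * rebounds36 - 1.22 * blocks36


# Hypothetical big man: PER of 20, 10 rebounds and 2 blocks per 36 minutes
print(round(estimated_eopm(20.0, 10.0, 2.0), 2))

# Hypothetical guard with the same PER: 3 rebounds and 0.2 blocks per 36 minutes
print(round(estimated_eopm(20.0, 3.0, 0.2), 2))
```

Same PER, but the guard comes out well ahead once the rebound and block credit is backed out.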