A breakdown of NBA player pace

Jon Nichols has done a cool project the last two years on his blog.  He used play-by-play data to find players’ actual AST%, OREB%, DREB%, BLK%, STL%, and USG%.

I took the liberty of comparing his 2008-2009 data to the estimates found on basketball-reference.com.  Here’s the breakdown: nicholsrates (edit: excel probably gone, still searching).  Note that Nichols doesn’t filter out 3pt shots in BLK%, so his numbers are lower than the estimates across the board.

Turns out most of the estimates are accurate to within 0.5%.  The exception is AST%, where the league’s most overestimated players are a who’s who of offensive stars.  Here’s the Top 10, along with their AST% adjustment:

PlayerName AST%
Chris Paul -3.3
Jameer Nelson -2.9
Dwyane Wade -2.8
Rajon Rondo -2.6
Tony Parker -2.2
Jason Kidd -1.9
Steve Nash -1.8
Deron Williams -1.7
Steve Blake -1.7
LeBron James -1.7

AST% estimates teammate FGs using (Team FG) * (player’s % of minutes played) – (player’s FG).  If a player is great offensively, his team will score more FGs per minute when he’s on the court.  More FGs per minute means more assist opportunities, a bigger denominator, and a lower true AST%.
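A minimal sketch of that effect, in Python. It assumes the usual box-score form, where "% of minutes played" is the player's minutes divided by one fifth of total team minutes; the function names and every number in it are made up for illustration.

```python
def estimated_ast_pct(ast, fg, minutes, team_fg, team_minutes):
    """Box-score style AST%: assists per estimated teammate field goal."""
    # Assumption: share of minutes = player minutes / (team minutes / 5).
    share_of_minutes = minutes / (team_minutes / 5)
    # Teammate FGs are estimated from season totals, not counted on court.
    teammate_fg = team_fg * share_of_minutes - fg
    return 100 * ast / teammate_fg


def counted_ast_pct(ast, fg, team_fg_on_court):
    """Same ratio, but using team FGs actually made while the player played."""
    return 100 * ast / (team_fg_on_court - fg)


# Hypothetical star: his team makes more FGs than average while he is on court.
estimate = estimated_ast_pct(ast=800, fg=600, minutes=3000,
                             team_fg=3000, team_minutes=20000)
counted = counted_ast_pct(ast=800, fg=600, team_fg_on_court=2350)
print(estimate > counted)  # the estimate overstates AST% for this player
```

The only difference between the two functions is the denominator, which is where the overestimate for offensive stars comes from.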

But Nichols’ data got me thinking about another factor lost in b-r estimates: player pace.  Several advanced stats use the assumption that a player’s pace is his team’s pace, but that’s not always true.

PLAYER PACE

It dawned on me that I can estimate player pace using 82games on/off data.  82games provides team points scored when a player is on the court, along with the corresponding team offensive rating and player minutes played.  By

(1) Dividing team points scored when a player is on the court by player minutes played (to find POINTS PER MINUTE when a player is on the court), then

(2) Dividing team offensive rating by the result of (1), you get MINUTES PER 100 POSSESSIONS when a player is on the court.  Then,

(3) Adjust to 48 MINUTES, and you get pace.  It seems averaging offense and defense numbers would give an even more accurate number.
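The three steps above can be sketched in Python as follows. The function and argument names are hypothetical, and the inputs are assumed to be the on-court team points, player minutes, and on-court offensive rating (points per 100 possessions) described above.

```python
def player_pace(team_points_on, minutes_on, off_rating_on):
    """Estimate possessions per 48 minutes while a player is on the court."""
    # Step 1: team points per minute with the player on the court.
    points_per_minute = team_points_on / minutes_on
    # Step 2: (points per 100 possessions) / (points per minute)
    #         = minutes per 100 possessions.
    minutes_per_100_poss = off_rating_on / points_per_minute
    # Step 3: scale to a 48-minute game to get possessions per 48 minutes.
    return 48 / minutes_per_100_poss * 100


def averaged_pace(offense_pace, defense_pace):
    """Average the offensive and defensive estimates, as suggested above."""
    return (offense_pace + defense_pace) / 2


# Made-up inputs: 6000 team points in 3000 minutes at a 110 offensive rating.
print(player_pace(6000, 3000, 110))
```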

Well, I did this for every player in basketball this past year, and here are the results: playerpacerevised.  The top 5 and bottom 5:

Player ONpace OFFpace Diff
Dwyane Wade 90.4 84.1 6.4
Kevin Durant 93.1 88.3 4.9
Gilbert Arenas 94.1 89.4 4.7
Steve Nash 95.7 91.7 4.0
Russell Westbrook 93.4 89.5 3.8
John Salmons (MIL) 88.3 91.9 -3.6
Shane Battier 91.4 95.1 -3.6
Marc Gasol 91.2 94.9 -3.7
Goran Dragic 91.9 95.8 -4.0
Craig Smith 88.2 93.2 -5.0

The downside is that most of the fast paced guys play so many minutes it doesn’t make much difference in their rate numbers, since their pace is close to their team’s overall.  The players whose b-r estimates would theoretically be off are the bench players at either extreme.  My guess is Goran Dragic, for example, has a true AST% closer to 27% than his listed 24.1%.

The myth of post scoring?

One assumption I’ve made about offense is that there are two types of efficient shots: (1) Three pointers and (2) Inside Shots. I referenced that with my first post on this site (edit: that post is gone), citing the > 0.55 eFG%’s in those locations as evidence.

Now obviously layups and dunks make up a big chunk of the efficiency of close shots, but I’ve always thought that post scoring was efficient too. Maybe it was because I grew up watching Olajuwon, Robinson, Ewing and Shaq dominate inside, with the Bulls using three bodies and 18 fouls to stop them from seemingly scoring at will in the post.

But while messing around with Synergy recently, I started looking at points per possession (PPP) on post possessions.  %poss means the percentage of that player’s total scoring possessions:

Player %poss PPP
Shaq 63.8 0.85
Howard 60.9 0.91
Jefferson 56.8 0.92
Bynum 46.4 0.93
Bogut 45.5 0.80
Aldridge 44.5 0.91
Duncan 42.8 1.00
Pau 40.7 1.00
Marc 37.9 0.93
Lopez 37.6 0.91
Kaman 36.5 0.77
Bosh 34.9 1.09
Horford 34.8 0.94
West 33.8 0.94
Randolph 33.5 1.01
Nene 32.9 0.94
Garnett 32.3 0.90
Perkins 30.3 0.74
Scola 27.8 0.91
Nowitzki 26.6 1.06
Boozer 21.6 0.87
Stoudemire 19.2 0.99
Millsap 16.0 0.83
Noah 14.1 0.83
Lee 13.0 0.80

PPP looks like it’s just a simple calculation of TS% with turnovers added, the estimating formula being PTS/(FGA + .44*FTA + TOV) for every player’s possession that began in the post. Note that it doesn’t subtract possessions for an offensive rebound. Using that formula leaguewide, we find an NBA average of 0.94 PPP, with the Suns #1 at 1.01 and the Nets last at 0.88.
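As a quick sketch of that estimating formula in Python (the function name and the sample stat line are invented for illustration, and this is my reading of the calculation rather than Synergy's documented method):

```python
def points_per_possession(pts, fga, fta, tov):
    """PTS / (FGA + 0.44 * FTA + TOV), with no offensive rebound adjustment."""
    possessions = fga + 0.44 * fta + tov
    return pts / possessions


# Hypothetical post line: 100 points on 80 FGA, 25 FTA and 9 turnovers.
print(points_per_possession(pts=100, fga=80, fta=25, tov=9))
```

Because turnovers sit in the denominator, a turnover-prone post player can shoot well and still land below the league average.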

The raw average of the table above is 0.91 PPP, a few points lower than the NBA average. A select few players rank significantly above the league average, but many rank nowhere close. Why are these numbers so much lower than I expected? Here’s a further breakdown:

Player PPP FG% SF% TOV%
Bosh 1.09 52.5 16.4 10.0
Nowitzki 1.06 49.8 11.6 8.1
Randolph 1.01 48.6 9.2 7.3
Duncan 1.00 49.5 13.3 7.0
Pau 1.00 48.8 10.7 8.5
Stoudemire 0.99 50.8 13.8 14.3
Horford 0.94 45.8 9.3 8.6
West 0.94 48.5 8.0 12.5
Nene 0.94 47.5 9.8 11.8
Bynum 0.93 48.3 8.2 11.0
Marc 0.93 51.4 9.7 14.1
Jefferson 0.92 47.0 7.6 8.5
Howard 0.91 52.7 17.8 19.3
Aldridge 0.91 44.1 10.0 7.5
Lopez 0.91 42.6 16.2 13.5
Scola 0.91 49.8 6.9 14.1
Garnett 0.90 44.9 8.6 11.9
Boozer 0.87 47.5 9.0 15.8
Shaq 0.85 48.9 11.7 14.7
Millsap 0.83 45.7 10.7 19.7
Noah 0.83 40.7 11.6 8.9
Bogut 0.80 43.3 5.1 11.7
Lee 0.80 44.2 3.1 12.8
Kaman 0.77 41.5 5.7 15.0
Perkins 0.74 44.7 10.3 19.6

I guess I’m most surprised at how rare it is for players to make 50% of their post shots. Fairly average shooting foul percentages and slightly higher than normal turnover rates haven’t helped.  And the low FT% for some players (mainly Shaq and Howard) leaves them with ground to make up.

In fairness to the teams that keep feeding the post, it has its obvious benefits of spreading the floor, and there’s evidence to suggest that post misses have a better than usual chance of leading to an offensive rebound.  So players who are slightly below that 0.94 mark may still be helping their teams against the league average.

As a result, it’s hard for me to draw a clear conclusion from this.  My instinct is to say “Don’t doubleteam as much in the post”, but then again, maybe part of the reason the PPPs are surprisingly low is that post players are doubleteamed so often.  Perhaps it’s better to just realize that a very select few players are worth doubleteaming in the post.

Synergy Sports, defensive PPP

I used to come across Synergy Sports stats on DraftExpress, where I’d read “so-and-so was the tenth most efficient player in the country in isolation efficiency” and become increasingly jealous of their wealth of data that I couldn’t access.

Well now it’s accessible on the nba website, with many of the cool features available on the business package. Some of the things I can now do:

  • Find a player’s scoring efficiency on pick-and-roll, isolation, cut, and offensive rebound plays.
  • Find a player’s defensive efficiency when defending those same plays.
  • Watch a player’s video clips filtered by some query (example: Derrick Rose pick-and-roll plays)
  • Watch single possessions from any game over and over to analyze offensive and defensive schemes.

Right now the program runs slow, and it only contains data from a quarter of the season (UPDATE: Andy Graham from Synergy emailed me saying “I wanted to let you know that the dataset is actually from the entire NBA season, not just a quarter of it. We are including every possession of every NBA regular season and playoff game in the offering.” – glad to hear it, my mistake), so any numbers come with a high degree of error.  But some, like defensive efficiency (points per possession), return results like this:

NBA Power Forwards and Centers: Defensive Points Per Possession (of the man guarded by listed player)

	        PPP
Al Horford	0.75
Nene Hilario	0.81
Dwight Howard	0.82
K. Perkins	0.82
Taj Gibson	0.82
Tim Duncan	0.82
Andrew Bynum	0.83
Mehmet Okur	0.83
Kevin Love	0.83
Lamar Odom	0.84
Nick Collison	0.84
Robin Lopez	0.84
A. Stoudemire	0.84
Elton Brand	0.85
Kevin Garnett	0.85
Jermaine O'Neal	0.85
Carlos Boozer	0.85
Brendan Haywood	0.86
Paul Millsap	0.86
Pau Gasol	0.86
Brook Lopez	0.86
Nenad Krstic	0.86
Ben Wallace	0.87
Andrew Bogut	0.87
S. Dalembert	0.87
Marc Gasol	0.88
Channing Frye	0.88
Jason Thompson	0.88
L. Aldridge	0.88
Antawn Jamison	0.88
Serge Ibaka	0.89
Joakim Noah	0.89
Chris Bosh	0.89
Rashard Lewis	0.89
Zach Randolph	0.89
Udonis Haslem	0.89
Chuck Hayes	0.90
A. Varejao	0.90
Marcus Camby	0.91
Emeka Okafor	0.91
Erick Dampier	0.91
Dirk Nowitzki	0.91
Chris Andersen	0.92
Matt Bonner	0.92
Carl Landry	0.92
Andrea Bargnani	0.92
Josh Smith	0.93
Amir Johnson	0.93
Al Jefferson	0.93
Roy Hibbert	0.94
Luis Scola	0.94
DeJuan Blair	0.96
Troy Murphy	0.98
David Lee	0.98

Defensive PPP leaves out a lot of what a player can do on defense, like rebounding, help defense (Synergy only credits a defensive play to a player when he’s the sole defender of a shot), steals, hedging off screens, etc etc etc. What it does seem to estimate is how well a player defends in isolation. And unlike the 82games Opponent PER stats, Synergy tracks who a player was guarding on each possession. So if Chris Bosh switches to guard Derrick Rose off a screen and Rose misses a jumpshot, that miss is credited to Bosh’s defense.

I haven’t played around with Synergy enough to figure out the accuracy of these numbers, but the list above passes my eye test. Most of the league’s top man defenders rank high on the list. The league’s poorest interior defenders rank low. And the Josh Smith / Joakim Noah types who are phenomenal help defenders rate out below their defensive reputations as expected.

I am famous

The Chicago sportsradio show Boers and Bernstein picked up on a post I wrote projecting Joe Johnson over the next five years, and talked about it at length here.

Converting PER to statistical offensive adjusted plus-minus

Last year, I followed Eli Witus’ instructions and Aaron Barzilai’s data-and-method to run 2006-2009 adjusted plus-minus (APM) numbers, split into offensive and defensive APM.

My results are a little different than others’ because I removed garbage time (defined as point spread > 10 + minutes left in game) and left each season unweighted in order to calculate a Statistical APM more easily.

Here’s the output, split into 2006-2009 and 2007-2009.  combining0609-and-0709.xlsx

FROM APM TO STATISTICAL APM

I’ve always been interested in figuring out what made an offense work, but the main reason I tried making a Statistical is because I didn’t like PER.  I didn’t understand it, I didn’t like how hard it was to calculate, and I figured a stat had to be organic to have meaning.  That was before I tried to beat it.

A while back, I created a Statistical called EOPM by regressing Ilardi/Barzilai’s 5-year weighted APM numbers against rate data from basketball-reference: TS%, AST%, OREB%, TOV%, and USG%.  I figured the ease of use (copying and pasting the whole row of advanced stats into an excel calculator) would outweigh any inaccuracies, as long as it was close.

But when I ran the r^2, it showed just .52, lower than Rosenbaum’s .57 from years back.  There were a few obvious problems with my method:

1) I used unweighted rates while Ilardi/Barzilai calculated their APM numbers by giving extra weight to recent years.  I wasn’t sure how to quickly and accurately weight rate stats, so I lazily just left them unweighted.

2) I just used 5 rates.  Obviously that left a lot out.

3) I used TOV%, a stat I’ve come to dislike since.  Jason Collins and Dennis Rodman had enormous TOV%’s despite low Per36 turnovers because they rarely shot the ball.

EOPM didn’t work how I wanted.  But with my new unweighted APM data in hand, I decided to create a new EOPM using Rosenbaum’s Per40 stats idea. I tried to improve it further by adjusting for estimated Pace and including TS%.  In all, I made the inputs TS%, and Per36 pace adjusted (Per36 * 100/estimatedpace) OREB, DREB, AST, STL, BLK, TOV, PF and PTS.

The formula is: EOPM = -17.173 + 18.131 * TS% + 0.317 * OREB + 0.190 * DREB + 0.716 * AST + 0.334 * STL - 0.324 * BLK - 1.447 * TOV - 0.280 * PF + 0.461 * PTS

with an r^2 of .723.  Pretty good!  Next step, comparing it to PER.
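Here is a rough Python version of that calculator. The coefficients come straight from the formula above; the function names, the dictionary layout, and the sample stat line are hypothetical. It also assumes TS% is entered as a fraction (0.55 rather than 55), which is the only scale on which an 18.131 coefficient gives sensible results.

```python
INTERCEPT = -17.173
TS_COEFF = 18.131  # assumption: applied to TS% as a fraction, e.g. 0.55
PER36_COEFFS = {
    "OREB": 0.317, "DREB": 0.190, "AST": 0.716, "STL": 0.334,
    "BLK": -0.324, "TOV": -1.447, "PF": -0.280, "PTS": 0.461,
}


def pace_adjust(per36_value, estimated_pace):
    """Per36 * 100 / estimated pace, as described above."""
    return per36_value * 100 / estimated_pace


def eopm(ts_pct, per36, estimated_pace):
    """Apply the regression to one player's TS% and Per36 box-score stats."""
    total = INTERCEPT + TS_COEFF * ts_pct
    for stat, coeff in PER36_COEFFS.items():
        total += coeff * pace_adjust(per36[stat], estimated_pace)
    return total


# Made-up stat line for illustration only.
sample = {"OREB": 1, "DREB": 4, "AST": 5, "STL": 1,
          "BLK": 0.5, "TOV": 2, "PF": 2, "PTS": 15}
print(round(eopm(0.55, sample, estimated_pace=100), 2))
```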

COMPARING EOPM WITH PER

I figured, for EOPM to be useful, it should accurately find Team Offensive Ratings for past years given player EOPMs and minutes played.  By multiplying each player’s EOPM by his % of minutes played during that season, then summing those products for each team, I hoped to nail each team’s Offensive Rating within a few tenths of a point.
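A small sketch of that team-level sum in Python. It assumes "% of minutes played" means a player's minutes divided by one fifth of the team's total minutes, so that the shares add up to five players on the floor; the function name and the roster are invented.

```python
def team_eopm_sum(players):
    """Sum each player's EOPM weighted by his share of team minutes.

    players: list of (eopm, minutes) tuples for one team's season.
    """
    team_minutes = sum(minutes for _, minutes in players)
    total = 0.0
    for player_eopm, minutes in players:
        # Assumption: share = minutes / (team minutes / 5).
        share = minutes / (team_minutes / 5)
        total += player_eopm * share
    return total


# Hypothetical five-man team where everyone plays every minute.
roster = [(2.0, 100), (-1.0, 100), (0.5, 100), (0.0, 100), (1.5, 100)]
print(team_eopm_sum(roster))
```

The result is the predictor that gets regressed against actual Offensive Rating; it is not an Offensive Rating by itself.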

I performed the calculations for every 2009 NBA team, regressed my estimated O-Ratings against the actual O-Ratings and got an r^2 I liked: .800.  Then I did the same using PER: .914.  Wait, what?

Whoa.

After a bit of headscratching, I found that Fit is a big reason why adding EOPMs doesn’t work as well as PER.  For example, if a team is extremely good at offensive rebounding, the effect on each player’s EOPM is tiny, but for the team it leads to more shots and a higher ORating.  (When including offensive rebound rate in the team regression, EOPM’s r^2 jumped to .893.)

PER, though, already considers team rebounding in its calculation, so its r^2 barely moves when including team offensive rebound rate (from .914 to .915).

So Hollinger beat me to a pulp.  And as a result, I’ve started using PER a lot more.  But sometimes I like to balance it out a bit…

COMPARING PER TO EOPM

Even though PER was a more accurate determinant of Team Offensive Rating, I still had reason to think EOPM fairly judged performance.  After all, the numbers are organic, and when adjusted for fit, its correlation nearly matches PER’s.

So my next step was seeing how the stats differed.  I regressed PER against EOPM, and the result was this formula: EOPM = -8.4 + .56 * PER, with an r^2 of .736.
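For anyone who wants to apply that line, a minimal Python sketch follows. The function names are made up; the gap is defined here as expected minus actual, so a positive number means PER is more generous to the player than EOPM.

```python
def expected_eopm(per):
    """EOPM implied by PER alone, using the fitted line above."""
    return -8.4 + 0.56 * per


def per_overestimate(per, actual_eopm):
    """Expected EOPM minus actual EOPM; positive means PER rates him higher."""
    return expected_eopm(per) - actual_eopm


# DeSagana Diop: 11.1 PER, -6.6 EOPM (from the 2006-2009 sample).
print(round(expected_eopm(11.1), 1))           # -2.2
print(round(per_overestimate(11.1, -6.6), 1))  # 4.4
```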

First off, that r^2 seemed oddly low for two stats that estimated the same thing, and I figured out why once I calculated ExpectedEOPM for each player using that formula.  Here were the players (who played >5000 minutes from 2006-2009) whose ExpectedEOPM differed most from their actual EOPM.

Biggest Overestimates:

Name PER ExpectedEOPM EOPM Difference
DeSagana Diop 11.1 -2.2 -6.6 4.4
Ben Wallace 14.5 -0.3 -4.0 3.7
Samuel Dalembert 14.9 -0.1 -3.5 3.4
Darko Milicic 13.1 -1.1 -4.4 3.3
Marcus Camby 18.6 2.0 -1.2 3.2
Dwight Howard 22.1 3.9 0.8 3.1
Joel Przybilla 13.5 -0.9 -3.8 2.9
Kendrick Perkins 12.4 -1.5 -4.3 2.8
Josh Smith 17.6 1.4 -1.4 2.8
Yao Ming 24.1 5.0 2.3 2.7
Chris Kaman 15.0 0.0 -2.7 2.6
Jason Collins 4.0 -6.2 -8.8 2.6

Biggest Underestimates:

Name PER ExpectedEOPM EOPM Difference
Jose Calderon 18.0 1.6 4.1 -2.4
Deron Williams 18.0 1.6 4.0 -2.4
Steve Nash 22.0 3.9 6.2 -2.3
Chauncey Billups 21.8 3.7 5.7 -1.9
Mike Bibby 16.4 0.7 2.5 -1.7
Michael Redd 20.3 2.9 4.6 -1.7
Raja Bell 11.6 -1.9 -0.4 -1.6
Eddie House 14.6 -0.3 1.3 -1.6
Antonio Daniels 13.8 -0.7 0.8 -1.5
Steve Blake 12.6 -1.4 0.1 -1.5
Leandro Barbosa 17.2 1.2 2.7 -1.5
Mike James 14.6 -0.3 1.2 -1.5

See a pattern?  If EOPM is to be believed, Hollinger overestimated big man stats (blocks and rebounds) to increase the PER of frontcourt players.  For a possible answer why, take a look at this chart.

Average APMs by position among players who played >5000 minutes between 2006-2009:

Pos Off Def
PG +1.2 -0.5
SG +1.0 -0.8
SF +0.5 -0.2
PF +0.4 +0.9
C -1.6 +1.5

One axiom of the NBA game is that guards generally make a bigger impact offensively because they have the ball in their hands the most, and PFs/Cs make a bigger impact defensively because they defend more shots by playing in the paint, and the chart seems to support that.

Now look at the EOPM and PER of the same sample of players:

Pos EOPM PER
PG +1.4 15.6
SG +1.1 15.5
SF +0.3 15.1
PF +0.2 16.6
C -1.4 15.9

While EOPM shows the same positional correlation as APM, PER shows the positions as equal, possibly even with a slight offensive edge to inside players.

I’m guessing Hollinger did this in an attempt to sell PER as an all-around player evaluator.  In his articles, he often analyzes transactions using only PER.  He created his Expected Wins Added statistic based purely off PER.   There’s even a Hollinger Analysis feature on the ESPN Trade Machine that estimates wins added using only PER.

But PER’s .70 correlation with Team Net Rating (Offensive Rating minus Defensive Rating) seems largely attributable to its ridiculous .914 correlation with offense.  For comparison, EOPM correlates at .69 with Team Net Rating.

By overrating Big Man categories (blocks and rebounds), Hollinger levels the positional playing field by accounting for the defensive impact of power forwards and centers.  Interestingly, that change doesn’t affect PER’s correlation with Team Offensive Rating because NBA lineups are generally built the same way, with two definite guards and two definite big men.  Therefore, each team receives roughly the same amount of over- and under-estimation.

CONVERTING PER TO EOPM

In spite of that oddity, I think PER’s pretty great.  Not only does it correlate impressively with Team Offensive Rating, but it’s widely available, and it’s easy to find PER across seasons at basketball-reference.com.

But what if you wanted to quickly estimate a player’s Offensive Statistical APM, and filter out Hollinger’s big man overestimates?

Remember that .736 correlation between PER and EOPM?  Insert Rebounds Per 36 and Blocks Per 36 into the regression, and the correlation skyrockets to .963, with the following formula:

EOPM (estimated) = -7.5 + .63 * PER – .19 * Rebounds36 – 1.22 * Blocks36

I wish it were a little easier to calculate, but it’s not bad.
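To take some of the sting out of the arithmetic, here is the same formula as a short Python function. The function name and the example inputs are hypothetical; the coefficients are the ones above.

```python
def eopm_from_per(per, rebounds36, blocks36):
    """Estimate EOPM from PER, discounting rebounds and blocks per 36."""
    return -7.5 + 0.63 * per - 0.19 * rebounds36 - 1.22 * blocks36


# Made-up big man: 20 PER, 10 rebounds and 1 block per 36 minutes.
print(round(eopm_from_per(per=20, rebounds36=10, blocks36=1), 2))
```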



Copyright © 2004–2009. All rights reserved.
