Using Beta Regression to Better Model Norms in Political Psychology

November 28, 2017

Update 2018-08-23: The link below is outdated. A full, more detailed paper can be found at my GitHub.

I recently wrote a short working paper on how to use beta regression and how it helps take into account norms in correlational studies of ideology, politics, and prejudice. It is a little long for a blog post (and this platform does not support LaTeX), so I uploaded it as a working paper. Click here to download the paper.

I hope it is instructive and informative, and that it fills in a few gaps from previous papers. Please email me if you have any questions about it. As always, the code can be found over at my GitHub.

Updated 2017-12-11


In Support of Open Seeding in the NBA

September 24, 2017

This post argues that the NBA should adopt open seeding in the playoffs: Instead of taking the top 8 teams in each conference, the top 16 teams in the NBA should make the playoffs.

The first thing I wanted to do was diagnose the problem. I looked at every year from 1984 (when the NBA adopted the 16-team playoff structure) through 2017. For each year, I tallied the number of playoff teams that had a lower win percentage than a lottery (non-playoff) team in the other conference. These data come from Basketball-Reference.com, and the code for scraping, cleaning, and visualizing these data can be found over at GitHub.
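The tally described above can be sketched in a few lines of base R. This is only an illustration under assumptions: the `standings` data frame, its column names, and the `tally_worse` helper are hypothetical and are not taken from the code on GitHub. Each playoff team is compared against the best lottery team in the other conference.

```r
# Hypothetical standings for one season (made-up teams and records)
standings <- data.frame(
  team     = c("A", "B", "C", "D", "E", "F"),
  conf     = c("East", "East", "East", "West", "West", "West"),
  win_pct  = c(0.600, 0.450, 0.400, 0.700, 0.585, 0.500),
  playoffs = c(TRUE, TRUE, FALSE, TRUE, FALSE, FALSE),
  stringsAsFactors = FALSE
)

# For each conference, count its playoff teams whose win percentage is lower
# than that of the best non-playoff (lottery) team in the other conference
tally_worse <- function(d) {
  confs <- unique(d$conf)
  counts <- sapply(confs, function(cf) {
    made    <- d$win_pct[d$conf == cf & d$playoffs]
    lottery <- d$win_pct[d$conf != cf & !d$playoffs]
    if (length(lottery) == 0) return(0L)
    sum(made < max(lottery))
  })
  names(counts) <- confs
  counts
}

tally_worse(standings)
```

In this toy season, the East has one playoff team (B, at .450) that finished below the West's best lottery team (E, at .585), so the East's tally is 1 and the West's is 0.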

The following figure shows this tally per year, with years colored by the conference that sent the worse team (or teams) to the playoffs. The years between the dashed vertical lines are those in which division winners were guaranteed a top three or four seed.

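[Figure: number of playoff teams per season with a worse record than a lottery team in the other conference, colored by conference]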

This analysis spans 34 seasons. Of these 34 seasons:

There were 10 seasons where the 16 teams with the best records were the 16 in the playoffs (although not seeded as such, since the playoff bracket is split by conference).
In the other 24 seasons (71%), at least one team made the playoffs with a worse win percentage than a lottery team in the other conference. The offending conference was the East 16 times and the West 8 times.
The worst year was 2008, when more than half of the playoff teams in the Eastern Conference had a worse record than the 9th-place Golden State Warriors, who went .585. In fact, the 10th-place Portland Trail Blazers went .500, placing them ahead of the 7th- and 8th-seeded Eastern Conference teams, and the 11th-place Sacramento Kings had a better record than the 8th-seeded Atlanta Hawks (a team that won only 37 games).

I have heard the argument that we should not worry about unbalanced conferences in any one year, because “Sometimes the East is better, sometimes the West is better, and it balances out in the long run!” While my analyses don't control for strength of schedule in each conference, it simply isn't true that the conference imbalance evens out over time. Over the past 34 seasons, the East was the worse conference twice as often as the West (at least in terms of worse teams making the playoffs).

That argument also doesn't make sense to me because championships are not decided over multiple years. They are awarded at the end of every season. So even if the imbalance evened out between conferences over time, it would not matter, because in any given season a weaker team can still make the playoffs over a stronger one. And from these data, we can see that 71% of the last 34 seasons had at least one playoff team with a worse record than a lottery team in the opposite conference.

The Importance of Blocks for NBA Defenses, Over Time

September 22, 2017

Introduction

I have always been ambivalent about blocked shots. On one hand, they are really cool. Seeing someone send a shot into the fifth row or watching Anthony Davis recover and get a hand on a shot just in time is fun. Perhaps the most important play of LeBron's entire career was a block. At the same time, going for blocks means more shooting fouls, and blocks don't often result in a defensive rebound.

My interest in blocks continues in this post: I will be looking at the relationship between blocks and defensive rating, both measured at the player level. Both are per 100 possessions statistics: How many blocks did they have in 100 possessions? How many points did they allow in 100 possessions (see the Basketball-Reference glossary)?

My primary interest here is looking at the relationship over time. Are blocks more central to the quality of a defense now than they used to be? The quality of defense will be measured by defensive rating—the number of points a player is estimated to have allowed per 100 possessions. For this reason, a better defensive rating is a smaller number.

There is some reason to think blocks might be more important now than they used to be. In the 2001-2002 NBA season, the NBA changed a bunch of rules, many relating to defense. There were two rules in particular that changed how people could play defense (quoting from the link above): “Illegal defense guidelines will be eliminated in their entirety,” and “A new defensive three-second rule will prohibit a defensive player from remaining in the lane for more than three consecutive seconds without closely guarding an offensive player.” This meant that teams could start playing zone defense, as long as they didn't park guys in the paint for longer than three seconds. It also meant that blocking shots became a skill that took self-control: You had to be able to slide into place when necessary to block shots, because you couldn't just stand in the paint and wait for shots to swat.

Other rules made it more likely that guys were going to be able to drive past backcourt players and reach the paint: “No contact with either hands or forearms by defenders except in the frontcourt below the free throw line extended in which case the defender may use his forearm only.” So, have blocks grown more important for a defense over time?

Analysis Details

I looked at every NBA player in every season from 1989 to 2016, but players were only included if they played more than 499 minutes in a season. If you are interested in getting these data, you can see my code for scraping Basketball-Reference, along with the rest of the code for this post, over at GitHub.

I fit a Bayesian multilevel model to look at this, using the brms package, which interfaces with Stan from R. I predicted a player's defensive rating from how many blocks they had per 100 possessions, what season it was, and the interaction between the two (which tests the idea that the relationship has changed over time). I also allowed the intercept and the relationship between blocks and defensive rating to vary by player. I made the first year in the data, 1989, the zero point of the year variable (i.e., I subtracted 1989 from it), so that the intercept and the blocks slope refer to 1989. I also specified some pretty standard, weakly informative priors:

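library(brms)

# Defensive rating predicted by blocks per 100 possessions, year (0 = 1989),
# and their interaction; intercepts and block slopes vary by player
brm(drtg ~ blkp100*year + (1+blkp100|player),
    prior=c(set_prior("normal(0,5)", class="b"),         # regression coefficients
            set_prior("student_t(3,0,10)", class="sd"),   # player-level standard deviations
            set_prior("lkj(2)", class="cor")),            # intercept-slope correlation
    data=dat, warmup=1000, iter=2000, chains=4)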
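For readers who want to see the data preparation implied by the description above, here is a minimal sketch in base R. The `raw` data frame and its values are made up, and the `minutes` and `season` column names are assumptions; only `drtg`, `blkp100`, `year`, and `player` appear in the model formula.

```r
# Toy player-season rows (made-up values; `minutes` and `season` are assumed names)
raw <- data.frame(
  player  = c("p1", "p1", "p2", "p3"),
  season  = c(1989, 1990, 1989, 2016),
  minutes = c(1200, 450, 500, 2100),
  blkp100 = c(1.2, 0.8, 0.3, 2.5),
  drtg    = c(106, 108, 110, 101),
  stringsAsFactors = FALSE
)

# Keep only player-seasons with more than 499 minutes played
dat <- raw[raw$minutes > 499, ]

# Center year so that 0 corresponds to 1989, the first season in the data
dat$year <- dat$season - 1989

dat[, c("player", "season", "year", "blkp100", "drtg")]
```

With year coded this way, the coefficient on blkp100 is the blocks slope in 1989, and the interaction coefficient is how much that slope changes per season.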

Results

The probability that the relationship between blocks and defensive rating depends on year, given the data (and the priors), is greater than 0.9999. So what does this look like? Let's look at the relationship at 1989, 1998, 2007, and 2016:

Year	Slope
1989	-1.04
1998	-1.48
2007	-1.92
2016	-2.36

Our model says that, in 1989, for each block someone had in 100 possessions, they allowed about 1.04 fewer points per 100 possessions. In 2016? This estimate more than doubled: For every block someone had in 100 possessions, they allowed about 2.36 fewer points per 100 possessions! Here is what these slopes look like graphically:

plot of chunk plot
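Because the model is linear in year, the slope for blocks in any season is the 1989 slope plus the interaction coefficient times the number of seasons since 1989. The sketch below reproduces the table under that formula. Note that `b_int` is back-calculated from the rounded table values; it is an assumption, not a coefficient reported from the fitted model, and `slope_at` is a hypothetical helper.

```r
# Slope of blocks in 1989 (year = 0), taken from the table above
b_blk <- -1.04

# Implied per-season change in the slope, back-calculated from the
# 1989 and 2016 rows of the table (27 seasons apart)
b_int <- (-2.36 - (-1.04)) / 27

# Conditional slope of blocks on defensive rating in a given season
slope_at <- function(season) b_blk + b_int * (season - 1989)

round(slope_at(c(1989, 1998, 2007, 2016)), 2)
# -1.04 -1.48 -1.92 -2.36
```

The implied change is roughly 0.05 additional points prevented per block with each passing season.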

However, the model assumes that the relationship between blocks and defensive rating changes linearly from year to year. What if there was an “elbow” (i.e., a sudden change in slope) in this curve in 2002, when the rules changed? That is, maybe the relationship between blocks and defensive rating was relatively stable until 2002 and then quickly became stronger. To investigate this, I correlated blocks per 100 possessions and defensive rating in each year separately. Then, I graphed those correlations by year. (Note: Some might argue a piecewise latent growth curve model is the appropriate test here; for brevity and simplicity's sake, I chose to take a more descriptive approach.)
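The year-by-year correlations described above could be computed along these lines. This is a sketch on simulated data, not the real player-seasons; the `toy` data frame and its `season` column are assumptions made for illustration.

```r
set.seed(1)

# Simulated player-seasons (not the real data): 50 players in each of 3 seasons
toy <- data.frame(
  season  = rep(c(1989, 2002, 2016), each = 50),
  blkp100 = runif(150, 0, 5)
)
# Defensive rating falls as blocks rise, plus noise
toy$drtg <- 108 - 1.5 * toy$blkp100 + rnorm(150, sd = 3)

# Pearson correlation between blocks and defensive rating within each season
cor_by_year <- sapply(split(toy, toy$season),
                      function(d) cor(d$blkp100, d$drtg))
cor_by_year
```

The result is a named vector with one correlation per season, which can then be plotted against year.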

plot of chunk next-plot

It is hard to tell specifically where the elbow is occurring, but it does look like blocks were becoming less important before the rule change (i.e., the correlation was getting closer to zero), whereas after the rule change, the correlation was getting stronger (i.e., more and more negative).

Conclusion

It looks like blocks are not only cool, but especially relevant for players to perform well defensively in recent years. A consequence of the rule changes in 2001-2002 is that blocks are more important now for individual defensive performance than they were in the past.
