Simulating Data in R: Examples in Writing Modular Code

Simulating data is an invaluable tool. I use simulations to conduct power analyses, probe how robust methods are to violating assumptions, and examine how different methods handle different types of data. If I’m learning something new or writing a model from scratch, I’ll simulate data so that I know the correct answer—and make sure my model gives me that answer.

But simulations can be complicated. Many other programming languages require for loops to do a process multiple times; nesting many conditional statements and other for loops within for loops can quickly be difficult to read and debug. In this post, I’ll show how I do modular simulations by writing R functions and using the apply family of R functions to repeat processes. I use examples from Paul Nahin’s book, Digital Dice: Computational Solutions to Practical Probability Problems, and I show how his MATLAB code differs from what is possible in R.

My background is in the social sciences; I learned statistics as a tool to answer questions about psychology and behavior. Despite being a quantitative social scientist professionally now, I was not on the advanced math track in high school, and I never took a proper calculus class. I don’t know the theoretical math or how to derive things, but I am good at R programming and can simulate instead! All of these problems have derivations and theoretically-correct answers, but Nahin writes the book to show how simulation studies can achieve the same answer.

Example 1: Clumsy Dishwasher

Imagine 5 dishwashers work in a kitchen. Let’s name them dishwashers A, B, C, D, and E. One week, they collectively broke 5 plates. And dishwasher A was responsible for 4 of these breaks. His colleagues start referring to him as clumsy, but he says that this was a fluke and could happen to any of them. This is the first example in Nahin’s book, and he tasks us with finding the probability that dishwasher A was responsible for 4 or more of the 5 breaks. We are to do our simulation assuming that each dishwasher is of equal skill; that is, the probability of any dishwasher breaking a dish is the same.

What I’ll do first is define some parameters we are interested in. Let N be the number of dishwashers and K be the number of broken dishes. We will run 5 million simulations:

iter <- 5000000 # number of simulations
n <- 5          # number of dishwashers
k <- 5          # number of dish breaks

First, I adapted Nahin’s solution from MATLAB code to R code. It looks like this:

set.seed(1839)
clumsy <- 0
for (zzz in seq_len(iter)) {
  broken_dishes <- 0
  for (yyy in seq_len(k)) {
    r <- runif(1)
    if (r < (1 / n))
      broken_dishes <- broken_dishes + 1
  }
  if (broken_dishes > 3)
    clumsy <- clumsy + 1
}
clumsy / iter
## [1] 0.0067372

First, he sets clumsy to zero. This will be a variable that counts how many times dishwasher A broke more than 3 of the plates. We see a nested for loop here. The first one loops through all 5 million iterations; the second loops through all broken dishes. We draw a random number between 0 and 1. If this is less than 1 / N (the probability of any one dishwasher breaking a dish), we assign the broken dish to dishwasher A. If there are more than 3 of these, we call them “clumsy” and increment the clumsy vector by 1. At the end, we divide how many times dishwasher A was clumsy and divide that by the number of iterations to get the probability that this dishwasher broke 4 or 5 plates, given that all of the dishwashers have the same skill. We arrive at about .0067.

These nested for loops and if statements can be difficult to handle when simulations get more complicated. What would a modular simulation look like? I break this into two functions. First, we simulate which dishwashers broke the plates in a given week. sim_breaks will give us a sequence of N letters from the first K letters of the alphabet. Each letter is drawn with equal probability, simulating the situation where all dishwashers are at the same skill level. Then, a_breaks counts up how many times dishwasher A was at fault. Note that this function has no arguments of its own; it only has ..., which passes all arguments to sim_breaks. The sapply function tells R to apply a function to all numbers 1 through iter. Since we don’t actually want to use those values—we just want them as dummy numbers to do something iter many times—I put a dummy argument of zzz in the function that we will be applying to each number 1 through iter. This function is a_breaks(n, k) > 3). result will be a logical vector, where TRUE denotes dishwasher A broke more than 3 dishes and FALSE denotes otherwise. Since R treats TRUE as numeric 1 and FALSE as numeric 0, we can get the mean of result to tell us the probability of A breaking more than 3 dishes, given that all dishwashers are at the same skill level:

# simulate k dishwashers making n breaks in a week:
sim_breaks <- function(n, k) {
  sample(letters[seq_len(k)], n, replace = TRUE)
}
# get the number of breaks done by the target person:
a_breaks <- function(...) {
  sum(sim_breaks(...) == "a")
}
# how often will dishwasher a be responsible for 4 or 5 breaks?
set.seed(1839)
result <- sapply(seq_len(iter), function(zzz) a_breaks(n, k) > 3)
mean(result)
## [1] 0.0067372

We again arrive at about .0067.

Lastly, R gives us functions to draw randomly from distributions. Simulating how many dishes were broken by dishwasher A can also be seen as coming from a binomial distribution with K trials and a probability of 1 / N. We can make iter draws from that distribution and see how often the number is 4 or 5:

set.seed(1839)
mean(rbinom(iter, k, 1 / n) > 3)
## [1] 0.0067196

All three simulations give us about the same answer, which basically agree with the mathematically-derived answer of .00672. How do we interpret this? If you have a background in classical frequentist statistics (that is, focusing on p-values), you’ll notice that our interpretation is about the same as a p-value. If all dishwashers had the same probability of breaking a dish, the probability that dishwasher A broke 4 or 5 of them is .0067. Note that we are simulating from what could be called a “null hypothesis” that all dishwashers are equally clumsy. What we observed (dishwasher A breaking 4)—or more extreme data than we observed (i.e., 4 or more dishes) had a probability of .0067 of occurring. In most situations, we would “reject” this null hypothesis, because what we observed would have been so rare under the null. Thus, most people would say that dishwasher A’s 4 breaks in a week was not due to chance, but probably due to A having a higher latent clumsiness.

Example 2: Curious Coin Flip Game

Nahin tells us that this was originally a challenge question from the August-September 1941 issue of American Mathematical Monthly, and it was not solved until 1966. Imagine there are three people playing a game. Each of these three people have a specific number of quarters. One person has L quarters, another has M quarters, and another has N quarters. Each round involves all three people flipping one of their quarters. If all three coins come up the same (i.e., three heads or three tails), then nothing happens during that round. Otherwise, two of the players will have their coins come up the same, and one person will be different. The one that is different takes the other two players coins from that round.

So, for example, let’s say George has 3 quarters, Elaine has 3 quarters, and Jerry has 3 quarters. They all flip. George and Elaine get heads, while Jerry gets tails. George and Elaine would give those quarters to Jerry. So after that round, George and Elaine would have 2 quarters, while Jerry would have 5.

When someone runs out of coins, they lose the game. The challenge is to find the average number of rounds it takes until someone loses the game (i.e., runs out of coins). We are tasked with doing this at various values of initial starting quarter coins of L, M, and N. This is Nahin’s MATLAB solution:

broke.m-1.JPG
broke.m-2.JPG

For my taste, there’s too many for, while, and if else statements nested within one another. This can make it really easy to get confused while you’re writing the code, harder to debug, even harder to read, and a pain if you want to change something later on. Let’s make this modular with R functions. To make it easier to read, I’ll also add some documentation in roxygen2 style.

#' Simulate a Round of Coin Flips
#'
#' This function simulates three coin flips, one for each player in the game.
#' A 1 corresponds to heads, while 0 corresponds to tails.
#'
#' @param p Numeric value between 0 and 1, representing the probability of 
#' flipping a heads.
#' @return A numeric vector of length 3, containing 0s and 1s
sim_flips <- function(p) {
  rbinom(3, 1, p)
}

#' Simulate the Winner of a Round
#' 
#' This function simulates the winner of a round of the curious coin flip game.
#' 
#' @param ... Arguments passed to sim_flips.
#' @return Either a number (1, 2, or 3) denoting which player won the round or
#' NULL, denoting that the round was a tie and had no winner.
sim_winner <- function(...) {
  x <- sim_flips(...)
  x <- which(x == as.numeric(names(table(x))[table(x) == 1]))
  if (length(x) == 0) 
    
   else 
    
  
}

#' Simulate an Entire Game of the Curious Coin Flip Game
#' 
#' This function simulates an entire game of the curious coin flip game, and it
#' returns to the user how many rounds happened until someone lost.
#' 
#' @param l Number of starting coins for Player 1.
#' @param m Number of starting coins for Player 2.
#' @param n Number of starting coins for Player 3.
#' @param ... Arguments passed to sim_winner.
#' @return A numeric value, representing how many rounds passed until a player 
#' lost.
sim_game <- function(l, m, n, ...) {
  lmn <- c(l, m, n)
  counter <- 0
  while (all(lmn > 0)) {
    winner <- sim_winner(...)
    if (!is.null(winner)) {
      lmn[winner] <- lmn[winner] + 2
      lmn[-winner] <- lmn[-winner] - 1
    }
    counter <- counter + 1
  }
  return(counter)
}

Nahin asks for the answer with a number of different combinations of starting quarter counts L, M, and N. Below, I run the sim_game function iter number of times for the starting values: 1, 2, and 3; 2, 3, and 4; 3, 3, and 3; and 4, 7, and 9. Giving a vector of calls to sapply will return a matrix where each row represents a different combination of starting quarter values and each column represents a result from that simulation. We can get the row means to give us the average values until someone loses the game for each combination:

set.seed(1839)
iter <- 100000 # setting lower iter, since this takes longer to run
results <- sapply(seq_len(iter), function(zzz) {
  c(
    sim_game(1, 2, 3, .5),
    sim_game(2, 3, 4, .5),
    sim_game(3, 3, 3, .5),
    sim_game(4, 7, 9, .5)
  )
})
rowMeans(results)
## [1]  2.00391  4.56995  5.13076 18.64636

These values are practically the same as the theoretical, mathematically-derived solutions of 2, 4.5714, 5.1428, and 18.6667. I find creating the functions and then running them repeatedly through the sapply function to be cleaner, more readable, and easier to adjust or debug than using a series of nested for loops, while loops, and if else statements.

Example 3: Gamow-Stern Elevator Problem

As a last example, consider physicists Gamow and Stern. They both had an office in a building seven stories tall. The building had just one elevator. Gamow was on the second floor, Stern on the sixth. Gamow often wanted to visit his colleague Stern and vice versa. But Gamow felt like the elevator was always going down when it first got to his floor, and he wanted to go up. Stern, ostensibly paradoxically, always felt like the elevator was going up when he wanted to go down. Assuming that the elevator is going up-and-down all day, this makes sense: 5/6 of the other floors relative to Gamow (on the second floor) were above him, so 5/6 of the time the elevator would be on its way down. And the same is true for Stern, albeit in the opposite direction.

Nahin tells us, then, that the probability that the elevator is going down when it gets to Gamow on the second floor is 5/6 (.83333). Interestingly, Gamow and Stern wrote that this probability holds when there is more than one elevator—but they were mistaken. Nahin challenges us to find the probability in the case of two and three elevators. Again, I write R functions with roxygen2 documentation:

#' Simulate the Floor on Elevator Was On, and What Direction It Is Going
#' 
#' Given the floor someone is on and the total number of floors in the building,
#' this function returns to a user (a) what floor the elevator was on when
#' a potential passenger hits the button and (b) if the elevator is on its way
#' up or down when it reaches the potential passenger.
#'
#' @param f The floor a someone wanting to ride the elevator is on
#' @param h The total number of floors in the building; its height
#' @return A named numeric vector, indicating where the elevator started from
#' when the person waiting hit the button as well as if the elevator is going
#' down when it reaches that person (1 if yes, 0 if not)
sim_lift <- function(f, h) {
  floors <- 1:h
  start <- sample(floors[floors != f], 1)
  going_down <- start > f
  return(c(start = start, going_down = going_down))
}

#' Simulate Direction of First-Arriving Elevator
#' 
#' This function uses sim_lift to simulate N number of elevators. It takes the
#' one closest to the floor of the person who hit the button and returns
#' whether (1) or not () that elevator was going down.
#' 
#' @param n Number of elevators
#' @param f The floor a someone wanting to ride the elevator is on
#' @param h The total number of floors in the building; its height
#' @return 1 if the elevator is on its way down or 0 if its on its way up
sim_gs <- function(n, f, h) {
  tmp <- sapply(seq_len(n), function(zzz) sim_lift(f, h))
  return(tmp[2, which.min(abs(tmp["start", ] - f))])
}

First, let’s make sure sim_lift gives us about 5/6 (.83333):

set.seed(1839)
iter <- 2000000
mean(sapply(seq_len(iter), function(zzz) sim_lift(2, 7)[[2]]))
## [1] 0.8334135

Great. Now, we can run the simulation for when there are two and three elevators:

set.seed(1839)
results <- sapply(seq_len(iter), function(zzz) {
  c(sim_gs(2, 2, 7), sim_gs(3, 2, 7))
})
rowMeans(results)
## going_down going_down 
##  0.7225405  0.6480000

Nahin tells us that the theoretical probability of the elevator going down with two elevators is .72222 and three is .648148; our simulation adheres close to this.

I prefer my modular R functions to Nahin’s MATLAB solution:

 
gs.m-1.JPG
 
 
gs.m-2.JPG
 

Conclusion

Simulate data! Use it for power simulations, testing assumptions, verifying models, and solving fun puzzles. But make it modular by writing a few R functions, with documentation, and combine them all in a call to an apply-family function to do them in a way that is readable, clean, and easier to debug and modify.

Star Wars Fandom Survey, Part 5: Importance of Movie Characteristics

Welcome from Part 1, where I talked mainly about methods; Part 2, where I discussed the three major types of Star Wars fans; Part 3, where I discussed sexism and political attitudes; and Part 4, where I discussed age and nostalgia. In this part, I will focus on age and nostalgia. As always, email sw.survey.2019@gmail.com with questions about analyses, methods, results, and so on.


People enjoy movies for different reasons. Some want to have fun or want the film to challenge how they think, some want to be emotionally moved or to see compelling action—and others want a combination of these things. I wrote a questionnaire about “movie importance” to gauge what respondents want from their movie-watching experience. (I briefly mentioned this in Part 1.) And I wanted to know how each of these correlated with favorability toward each Star Wars trilogy. I asked participants, “How important are each of these to you when watching a movie?” and presented them with this list:

  • Fun: “Having fun while watching the movie.”

  • Meaningful: “Finding the movie meaningful.”

  • Emotionally Moving: “Being emotionally moved by the movie.”

  • Complex Characters: “That the movie has complex characters.”

  • Thought-Provoking: “That the movie be thought-provoking.”

  • Action: “That the movie has engaging action.”

  • Artistically Valuable: “That the movie is artistically valuable.”

  • Twists and Unexpected: “That the movie has twists and unexpected events.”

  • Feel-Good Ending: “It has a feel-good ending.”

  • Costumes and Setting: “The costumes and sets are aesthetically appealing.”

  • Logical Worldbuilding: “That the movie builds a logical world and lore.”

Participants answered each on a 1 (not at all important) to 7 (very important) scale. Every item correlated positively with one another (minimum correlation: .09, maximum: .58, average = .27), so looking at correlations between each item and favorability toward each movie could surface illusory correlations. For example, the “fun” question correlated with the “action” question at .39, and if fun correlates with enjoying one of the trilogies, it could be due to the overlapping correlation with action. What I did here, then, was use all these questions as simultaneous predictors in a multiple regression equation. Then I looked at any movie importance item that was significant at p < .01.

As I’ve done in other parts, I averaged how favorably people feel toward each of the main Star Wars films by trilogy. This created favorability scores for the originals, prequels, and sequels. There were three regression models, one for each saga. I plotted the standardized regression coefficients below.

 
get-coefs-1.png
 

Wanting action and logical worldbuilding were positive predictors of enjoying the originals, while needing complex characters and a feel-good ending predicts feeling unfavorably toward movies of the original trilogy.

We see more significant predictors for the prequels. People who enjoy the prequels also tend to like action, a feel-good ending, well-made costumes and settings, meaningful films, twists and unexpected events, and logical worldbuilding. Much like the originals, wanting complex characters predicted not enjoying the prequels.

We see different relationships for the sequels, which has been a recurring theme in each part. Yet again, the data showed how this trilogy is divisive and breaks from the other two Star Wars trilogies. People who want a feel-good ending, twists and unexpected events, complex characters, to be emotionally moved, to have fun, and to watch something with artistic value are all more likely to enjoy the sequels. For originals and prequels, wanting complex characters predicted disliking the movies; conversely, finding complex characters important to a film predicts enjoyment of the sequel movies.

The biggest relationship here, however, is that those who wanted logical worldbuilding and lore tended not to enjoy the sequel trilogy. This also flips relationships that we see for the originals and prequels, where logical worldbuilding predicted enjoyment. This flip is likely because of the bold character and narrative choices Rian Johnson made in The Last Jedi.

These data do not necessarily mean that, for example, the originals had non-complex characters (e.g., Lando’s actions on Bespin in Empire Strikes Back are neither deplorable nor laudable). It also doesn’t mean that the sequels lack logical worldbuilding (e.g., Leia’s Force pull in space in The Last Jedi has canonical precedent from Rebels). What these data do show, however, is the psychological relationship between what people want from a movie and how much they enjoy each trilogy. And yet again, we see the sequel trilogy is empirically separated from the other two trilogies.

Star Wars Fandom Survey, Part 4: Age and Nostalgia

Welcome from Part 1, where I talked mainly about methods; Part 2, where I discussed the three major types of Star Wars fans; and Part 3, where I discussed sexism and political attitudes. In this part, I will focus on age and nostalgia. As always, email sw.survey.2019@gmail.com with questions about analyses, methods, results, and so on.


Star Wars was a big part of many fans’ childhoods. In 2005, George Lucas told BBC News that the Star Wars “movies are for children but [fans] don’t want to admit that.” Star Wars has a massive adult fanbase, but my survey’s sample of over five thousand fans suggests that these fans largely became such as children. The median age when first watching a Star Wars film was six, 90% of respondents watched one for the first time before the age of 13, and 96% of the current sample did so before the age of 18.

 
age-first-1.png
 

I also wanted to know if fans felt particularly warm toward the movies that came out when they were children. I looked at this by plotting participants birth year against how favorably they reported feeling toward each of the trilogies. I averaged scores for movies within trilogies to get this overall favorability score.

I did not, however, draw a typical, straight regression line. Instead, I drew what are known as “cubic regression splines.” Put simply, the lines try to be more flexible to the data than typical regression lines. They allow more bends in the line, while still being smooth so as to not read too much into noise.

 
age-fav-1.png
 

In the left panel, we see that people who feel most favorably toward the originals are the people who were children when they first released. The same thing is in the middle panel: A bump in favorability for the prequels for people born in 1990 and afterward, since they grew up with these movies (whereas older generations did not). The right panel shows that those who were born around the time of the original trilogy dislike the sequels the most. We are still a decade or two from getting good data on the kids who grew up during the sequel trilogy, but I hypothesize that they will feel more positively toward it than the other age groups.

I interpret this as a sign of nostalgia for participants’ childhoods. Nostalgia is a “sentimental longing or affection for the past” (Baldwin, Biernat, & Landau, 2015). As mentioned in Part 1, I asked respondents how much they “feel a nostalgic and warm feeling” for things from their personal past: friends, family, pets, toys, TV shows, movies, and music. Unfortunately, since these questions did not form a cohesive scale together, I looked at each separately. As many people told me at the end of the survey, one cannot feel nostalgic for pets if they did not have pets; for that reason, I do not include that item here.

In the figure below, I show the correlations between favorability for each of the trilogies and how nostalgic people are. Each box represents a correlation, which can range from -1 (an exact, negative relationship) to +1 (an exact, positive relationship). Empty tiles represent correlations that were not significant, p > .01.

 
nost-1.png
 

Many of these are considered “small” correlations (< .10) in psychology, so I focus on the larger correlations. In general, we see that nostalgia is correlated with how favorably one sees the originals, while the relationship between the prequels and nostalgia is smaller. The only negative correlation is between the sequels and nostalgia toward toys. Many of the survey’s respondents were referred from toy collectors’ websites, so it makes some intuitive sense that nostalgia in this domain would be powerful. The more nostalgic one reports being toward the toys from their past, the less they like the sequel trilogy. This again shows how the sequel trilogy—particularly The Last Jedi—have broken with tradition, to the chagrin of some nostalgic fans.

I also wanted to compare those born before 1990 and those born in and after 1990, since that is when we see positive attitudes toward the prequels start to increase in the age plot above. The two panels of this plot are mostly the same; it seems like the nostalgia for the original trilogy carried over to the prequels for those born before 1989, even though they were largely adults upon those movie’s releases.

The biggest difference again shows the polarization of the sequel trilogy. Nostalgia is largely unrelated to the sequel trilogy for those born in and after 1990; the negative correlation between toy nostalgia and sequel-trilogy favorability is only present for those born before 1990. It seems the nostalgia that carried over from the original trilogy to the prequel trilogy has not also carried over to the sequel trilogy, which does not directly involve George Lucas and has broken with tradition in casting and narrative decisions.

 
comp-ages-1.png
 

Many Star Wars fans start young, as did the majority in this sample. This allows nostalgia to be a powerful lens through which people perceive these movies. The more nostalgic people report being, the more they enjoy the Star Wars films. The only exception to this is older fans and the sequel trilogy. In The Last Jedi, Kylo Ren implores Rey to, “Let the past die. Kill it, if you have to.” This line had some meta-contextual meaning: many unexpected narrative choices in The Last Jedi made it break from what one might expect from a Star Wars film. These data suggest some older, nostalgic fans would rather not kill the past.

Star Wars Fandom Survey, Part 3: Sexism and Political Attitudes

Welcome from Part 1, where I talked mainly about methods, and Part 2, where I discussed the three major types of Star Wars fans. In this part, I will focus on sexism and political attitudes. As always, email sw.survey.2019@gmail.com with questions about analyses, methods, results, and so on.


It is not inherently sexist to dislike Disney’s Star Wars films. There have been many thoughtful, intelligent criticisms of these movies. That being said, sexism played a major role in the backlash against these movies, particularly The Last Jedi. A vocal minority has published a great many articles and videos condemning The Last Jedi as feminist, politically-correct (“PC”) propaganda. Tweets about female characters contained hate speech, which drove actor Kelly Marie Tran (who plays Rose Tico in the sequel trilogy) from social media; she has since responded to this online abuse in a New York Times op-ed. And the actor who plays Rey, Daisy Ridley, responded to criticism of Rey’s competence—the “Mary Sue” critique—by calling the term sexist. The Force Awakens director J.J. Abrams and Mark Hamill (who plays Luke Skywalker) both spoke out against the sexist rhetoric used to criticize the Disney movies.

I am not here to litigate the gender politics in the Star Wars movies. My focus in this part is on the empirical, psychological relationship between favorability toward Star Wars films and sexist attitudes. I also look at the closely related concepts of “political correctness” and political identification. I focus on the survey’s questions of hostile sexism, benevolent sexism, PC beliefs, and political identification. I briefly discussed these in Part 1.

These concepts are all related to one another. If we see a relationship between how conservative someone is and their attitudes toward The Last Jedi, then how do we know it isn’t one of these other variables that is responsible for the relationship? To address this, I ran regression equations with each as a simultaneous predictor of fan-cluster membership, movie favorability, and character favorability. In these equations, I will focus only on variables that were significant predictors (p < .01).

I’ll start by looking at how fan clusters (from Part 2) differ on these items. Then I’ll turn to the relationships between these items and favorability toward Star Wars films and characters.


Fan Clusters

In Part 2, I found three major types of Star Wars fans: Prequel Skeptics, who love the saga but rate the prequels lower than the rest; Saga Lovers, who rate everything highly; and TLJ Disowners, who rate only The Last Jedi very negatively.


Sexism

I measured hostile sexism with two statements: “Most women interpret innocent remarks or acts as being sexist,” and “Feminists are making unreasonable demands of men.” I averaged these together to get a general picture of hostile sexism. Benevolent sexism was measured with: “Women should be cherished and protected by men,” and “Many women have a quality of purity that few men possess.” Again, I averaged these together for a general picture of benevolent sexism.

In the plots below, each dot represents a response. The black circle is the mean of the group, and the horizontal lines above and below each dot are the 95% confidence intervals. The confidence interval represents a plausible range of values we can expect the mean to truly be.

 
sexism-clust-1.png
 

TLJ Disowners reported more sexism than Saga Lovers, who reported more sexism than Prequel Skeptics. Comparing both means and medians, all comparisons were statistically significant, (p < .01). Those disowning The Last Jedi tended to score higher on sexism, though, as you can see, not everyone who hates The Last Jedi is sexist. This demonstrates some empirical evidence that sexism plays a role in attitudes toward The Last Jedi.


PC Beliefs and Conservatism

I asked respondents how much they agreed with: “Needing to be ‘politically correct’ creates an atmosphere in which the free exchange of ideas is impossible” to measure their “PC” beliefs. Participants also rated themselves on a scale from very liberal to very conservative.

 
pc-con-1.png
 

The same pattern of results is found here as above: TLJ Disowners reported more negative attitudes toward being PC and more conservatism than Saga Lovers, who reported more of both than Prequel Skeptics. All comparisons were statistically significant, (p < .01). The biggest differences were between TLJ Disowners and the other two clusters. TLJ Disowners are more likely to believe political correctness is a negative force in society and are less politically liberal. Once again, we see a lot of variance within these groups, showing us that the clusters are not in lockstep with these political beliefs.


Trilogy Correlations

I asked respondents how much they liked each Star Wars episode on a ten-point scale. Movies in the same trilogy tended to correlate highly with one another, so I averaged attitudes toward movies of the same trilogy together for these analyses. In this and the next section, I look only at the sexism and PC questions because political identification was no longer a significant predictor after taking these attitudes into account. That is, the relationship between conservatism and movie favorability could have been due to sexism and PC beliefs.

The plots below show sexism and PC scores on the x-axis and favorability toward the trilogy on the y-axis. Each point is someone’s response, and I drew a line showing the relationship between the two attitudes through the points.

Each graph shows the same pattern. There were small, positive relationships between each attitude and favorability for the original and prequel trilogies. The more sexism and anti-PC beliefs one reported, the more they rated the movies favorably. However, we see the opposite relationship for the sequels, especially with hostile sexism and PC beliefs. The more sexism and negative PC beliefs someone reports, the less likely they are to like the sequels.

 
host-trils-1.png
 
 
benev-trils-1.png
 
 
pc-trils-1.png
 

There has been a lot of cultural discussion about Star Wars and sexism. These data show the empirical relationship: Sexism is correlated with disliking the sequel films, as is thinking political correctness harms society. Again, however, I will point to the variability around these lines; although this relationship exists, not everyone who dislikes the sequels is a sexist or bemoans PC culture.


Character Correlations

Three sequel-trilogy characters have faced the brunt of sexist criticism: Vice Admiral Amilyn Holdo, Rey, and Rose Tico. We see the same relationship across all three characters and three attitudes (sexisms and PC beliefs): The more sexism someone reports, the less they like Holdo, Rey, and Rose, and the more one dislikes political correctness, the less favorably they feel toward the characters.

 
wom-sexism-1.png
 

This aligns with what we know about psychological theories of sexism and political correctness. Each of these women defy traditional gender stereotypes. Both types of sexisms are based on traditional gender stereotypes, and violations of these anger those who believe in the stereotypes. Those who believe political correctness harms free exchange of ideas might not see traditional stereotype-defying women as a free artistic expression, but as a decision made solely to appeal to political correctness.

These results also replicate work a colleague of mine and I did after the release of The Force Awakens. We found that hositle sexism was a positive predictor of the typical, “Mary Sue” complaints about Rey, i.e. that she was “too competent” throughout The Force Awakens.


Conclusion

These data shed some insight onto the still-ongoing conversation about sexism, politics, and The Last Jedi. Looking at responses from over five thousand Star Wars fans, it is clear that sexism and disliking political correctness are positively related to disliking Disney’s sequels, though it is also not a one-to-one relationship: Some people defy this trend and dislike the movies while holding progressive attitudes about women.

These data support the excess anecdotal evidence (tweets, comments, articles) that sexism plays a major role in the backlash to Disney’s sequels. While some criticism of the movie is in good faith, these data suggest some of the backlash to the film is likely not. Given their social and political attitudes, some people might have been predisposed to hate it—regardless of the film’s quality—due to main female characters demonstrating skill, bravery, and leadership.

Confidence Interval Coverage in Weighted Surveys: A Simulation Study

My colleague Isaac Lello-Smith and I wrote a paper on how to obtain valid confidence intervals in R for weighted surveys. You can read the .pdf here and check out the code at GitHub.

There are many of schools of thought out there on methods for estimating standard errors from weighted survey data, and many books don’t tell you why a method may or may not be valid. So, we decided to simulate a situation that we often see in our work and see what worked best. A few highlights:

Screen Shot 2019-04-09 at 6.06.57 PM.png
  • Be careful with “weights” arguments in R! Read the documentation carefully. There are many types of weights in statistics, and your standard errors, p-values, and confidence intervals can be wildly wrong if you supply the wrong type of weight.

  • Use bootstrapping to calculate standard errors. We found that even estimation methods that were made for survey weights underestimated standard errors.

  • Be skeptical of confidence intervals in weighted survey contexts. We found that standard errors were underestimated when simulating real-world imperfections with the data, such as measurement error and target error. 95% confidence intervals were only truly 95% in best-case scenarios with low error.