This code borrows heavily from the following sources:

Brown VA. An Introduction to Linear Mixed-Effects Modeling in R. Advances in Methods and Practices in Psychological Science. 2021;4(1). doi:10.1177/2515245920960351
- Associated OSF page with data and code: https://osf.io/v6qag/
Winter, B. (2013). Linear models and linear mixed effects models in R with linguistic applications. arXiv. http://arxiv.org/ abs/1308.5499
For a more in-depth guide, see: Brauer, M., & Curtin, J. (2018). Linear mixed-effects models and the analysis of nonindependent data: A unified framework to analyze categorical and continuous independent variables that vary within-subjects and/or within-items. Psychological Methods, 23(3), 389-411. https://doi.org/10.1037/met0000159

Preliminaries

Install packages if they aren’t already installed (notice the nice compact code here that avoids re-installing the package if it already is found in your R environment).

if (!("knitr" %in% installed.packages())) install.packages("knitr")
if (!("lme4" %in% installed.packages())) install.packages("lme4")
if (!("lmerTest" %in% installed.packages())) install.packages("lmerTest")
if (!("tidyverse" %in% installed.packages())) install.packages("tidyverse")
if (!("broom.mixed" %in% installed.packages())) install.packages("broom.mixed")
if (!("afex" %in% installed.packages())) install.packages("afex")
if (!("papaja" %in% installed.packages())) install.packages("papaja")
if (!("here" %in% installed.packages())) install.packages("here")
if (!("performance" %in% installed.packages())) install.packages("performance")

Load libraries and set up paths

library(lme4)
library(lmerTest)
library(tidyverse)
library(broom.mixed)
library(afex)
library(papaja)
library(knitr)
library(here)
library(performance)

data_path <- here("data")

General problem

Many of the datasets we work with in cognitive science have a particular hierarchical structure: they are clustered within units, such that

Inspect the data

Load data, and name that object “rt_data”

rt_data <- read_csv(here(data_path,"rt_dummy_data.csv"))

## Rows: 21679 Columns: 4
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (2): modality, stim
## dbl (2): PID, RT
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

View the first six rows of the data frame

head(rt_data)

## # A tibble: 6 × 4
##     PID    RT modality   stim 
##   <dbl> <dbl> <chr>      <chr>
## 1   301  1024 Audio-only gown 
## 2   301   838 Audio-only might
## 3   301  1060 Audio-only fern 
## 4   301   882 Audio-only vane 
## 5   301   971 Audio-only pup  
## 6   301  1064 Audio-only rise

Fit a linear mixed-effects model

rt_mod <- lmer(RT ~ 1 + modality + (1+modality|PID), 
                    data = rt_data)

Inspect the summary output.

summary(rt_mod)

## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: RT ~ 1 + modality + (1 + modality | PID)
##    Data: rt_data
## 
## REML criterion at convergence: 302400.8
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.3640 -0.6991 -0.0141  0.5903  5.0118 
## 
## Random effects:
##  Groups   Name                Variance Std.Dev. Corr 
##  PID      (Intercept)         28541    168.9         
##           modalityAudiovisual  7709     87.8    -0.17
##  Residual                     65715    256.3         
## Number of obs: 21679, groups:  PID, 53
## 
## Fixed effects:
##                     Estimate Std. Error      df t value Pr(>|t|)    
## (Intercept)          1044.08      23.34   52.04  44.733  < 2e-16 ***
## modalityAudiovisual    83.11      12.56   51.87   6.617 2.04e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr)
## modltyAdvsl -0.179

Let’s try to make sense of some of the key pieces.

Focus first on the fixed effects model summary. This looks a lot like the output we get from a linear regression model.

Generate pretty output

There’s a growing number of excellent options for quickly printing lovely APA-style model outputs and regression tables

Option 1: `tidy()` from `broom.mixed`

First, let’s use some nice R functions for getting cleaner output of the model. For this, we can use the nifty broom package.

rt_mod %>%
  tidy() %>%
  filter(effect=="fixed") %>%
  select(-group,-effect) %>%
  kable(digits=8)

term	estimate	std.error	statistic	df	p.value
(Intercept)	1044.0786	23.34039	44.732692	52.04062	0e+00
modalityAudiovisual	83.1101	12.56097	6.616533	51.86638	2e-08

Option 2: `apa_print()` from `papaja`

First, use the apa_print() method from papaja to get a structured object representing the model outputs in APA-friendly format

lmer_output <- apa_print(rt_mod)

This object now contains summaries of the model output formatted in a way that they directly print out in APA format in an R markdown document. For example, we found a significant effect of modality on reaction time, \(\hat{\beta} = 83.11\), 95% CI \([58.49, 107.73]\), \(t(51.87) = 6.62\), \(p < .001\).

We can also output a (basically) publication-ready regression table.

lmer_output %>%
  #optionally: give columns desired names
  label_variables(
  Term = "Model Coefficient"
  ) %>%
  apa_table(
    caption = "Estimates for linear mixed-effects model predicting",
    note = "Degrees of freedom estimated using the Satterthwaite approximation.",
    results="asis")

(#tab:unnamed-chunk-9) Estimates for linear mixed-effects model predicting
Term	\(\hat{\beta}\)	95% CI	\(t\)	\(\mathit{df}\)	\(p\)
Intercept	1,044.08	[998.33, 1,089.82]	44.73	52.04	< .001
ModalityAudiovisual	83.11	[58.49, 107.73]	6.62	51.87	< .001

Note. Degrees of freedom estimated using the Satterthwaite approximation.

Interpret the output

Fixed Effects

The fixed effects can be interpreted very similarly to a typical linear regression model. Intercepts reflect the model estimate when all other predictors are set to zero. Estimates for simple predictors are slopes, reflecting how the DV is estimated to increase for a unit-change in the predictor (if there are interactions, this is the slope at when all other predictors are set to zero). Interactions reflect how the slope associated with a given predictor changes as a function of another variable.

Note that in addition to the summary command, R also gives us a function to extract the fixed effects directly.

fixef(rt_mod)

##         (Intercept) modalityAudiovisual 
##           1044.0786             83.1101

Estimating p-values for fixed effects

Approximating df

We are using lmerTest() to estimate the p-values for the fixed effects. The underlying problem for estimating p-values is that it’s not clear what degrees of freedom make sense for a mixed-effects model The lmerTest uses a particular method to estimate the p-values, called the Satterthwaite approximation. Another possible method for estimating df’s (probably more robust) is the Kenward-Roger approximation, which is implemented in the pbkrtest package.

Null model approach (w/ likelihood ratio test)

Another common way that people often use to estimate p-values in mixed-effects models is to compare the full model to a null model that includes all of the same parameters except the key parameter of interest (in this instance, modality condition).

Full model: \(RT \sim 1 + modality + (1 + modality|PID)\)

Null model: \(RT \sim 1 + (1 + modality|PID)\)

# fit the null model
rt_mod_null <- lmer(RT ~ 1 + 
                      (1 + modality|PID), 
                    data = rt_data)

Now, we use the anova() command to compare the full model to the null model using a likelihood ratio test.

anova(rt_mod_null,rt_mod)

## refitting model(s) with ML (instead of REML)

## Data: rt_data
## Models:
## rt_mod_null: RT ~ 1 + (1 + modality | PID)
## rt_mod: RT ~ 1 + modality + (1 + modality | PID)
##             npar    AIC    BIC  logLik deviance  Chisq Df Pr(>Chisq)    
## rt_mod_null    5 302458 302498 -151224   302448                         
## rt_mod         6 302428 302476 -151208   302416 32.375  1  1.271e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Let’s inspect this a bit and see if we can make sense of the main bits of input.

Just for fun and to demystify a bit, here’s where the chisquared statistic and the p-value are coming from.

#compute chi-squared value
chi_sq <- 2 * (logLik(rt_mod, REML=FALSE) - logLik(rt_mod_null, REML=FALSE))
chi_sq

## 'log Lik.' 32.36581 (df=6)

#get p-value (why df=1?)
pchisq(chi_sq, df=1, lower.tail=FALSE)

## 'log Lik.' 1.277137e-08 (df=6)

Note that the summary() command also outputs the results of a comparison of the full model to a null model “under the hood”. This approach just makes that process more explicit, and then conducts a slightly different test for estimating the p-value (the likelihood ratio test).

Compare this outcome to the p-value estimated using the Satterthwaite approximation. How similar are the results?

Example description

To investigate the effect of modality on reaction time, we fit a linear mixed-effects model predicting reaction time from modality condition (dummy coded; 0 = audio-only; 1 = audio-visual). We included a by-participant random intercept and by-participant random slope for modality. Degrees of freedom were estimated using the Satterthwaite approximation. Reaction times were higher in the audio-visual condition than in the audio-only condition, \(\hat{\beta} = 83.11\), 95% CI \([58.49, 107.73]\), \(t(51.87) = 6.62\), \(p < .001\).

Recoding variables to make output more interpretable

A good general heuristic to keep in mind:

Always center your predictors and code them such that a unit change is meaningfully interpretable.

rt_data <- rt_data %>%
  mutate(
    modality_c = ifelse(modality=="Audio-only",-1,1)
  )

Let’s now refit the model with modality centered.

rt_mod_c <- lmer(RT ~ 1 + modality_c + 
                      (1 + modality_c|PID), 
                    data = rt_data)
summary(rt_mod_c)

## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: RT ~ 1 + modality_c + (1 + modality_c | PID)
##    Data: rt_data
## 
## REML criterion at convergence: 302402.2
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.3640 -0.6991 -0.0141  0.5903  5.0118 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr
##  PID      (Intercept) 28010    167.4        
##           modality_c   1927     43.9    0.10
##  Residual             65715    256.3        
## Number of obs: 21679, groups:  PID, 53
## 
## Fixed effects:
##             Estimate Std. Error       df t value Pr(>|t|)    
## (Intercept) 1085.634     23.056   52.022  47.087  < 2e-16 ***
## modality_c    41.555      6.281   51.865   6.616 2.05e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##            (Intr)
## modality_c 0.091

What’s changed compared to the model in which modality is not centered? (hint: check out and compare the estimates between the two models)

Some advantages of centering include:

improves interpretability
avoids worst risks of misinterpretation when fitting interaction models - improves likelihood that linear mixed-effects models will converge (helps reduce covariance between random effects parameters)
depending on centering strategy (within- vs. between-participants), can ensure that fixed effects are clearly interpretable as within-participant effects (see also Simpson’s paradox).

Random Effects

Let’s dig deeper into the model to understand what exactly the random effects are doing. The random effects capture how much the effects vary at different levels of my random units. In other words, in this case, the random effects capture estimates at the level of individual participants. We’re going to use the lattice package to easily visualize this.

First, R provides a few convenience functions for extracting the random effects. The first we’ll look at is coef().

coef(rt_mod_c)

## $PID
##     (Intercept) modality_c
## 301   1014.9281  -8.748539
## 302   1045.0072   1.057262
## 303    911.5878  28.643556
## 304   1218.2331 -14.254925
## 306   1058.6560  17.031122
## 307   1106.7267  -4.802868
## 308   1286.1686  35.453911
## 309    803.0457   7.871501
## 310   1227.8618  52.509060
## 311   1028.2030  14.924405
## 312   1186.3287  76.666989
## 313   1129.2855  14.790379
## 314   1132.6228 -36.619027
## 315    886.0573   8.519083
## 316   1400.1383 -19.478545
## 318    974.8713  28.959640
## 319   1066.7398  49.260264
## 320   1037.9520  50.909023
## 321   1094.7379  69.762004
## 324   1099.4306  68.263813
## 325    844.4737  17.741190
## 326   1068.6178  20.178014
## 328   1301.1244  64.174670
## 329   1047.6241   5.536363
## 331   1487.3441  80.468895
## 333   1672.2863  27.896086
## 334    990.2461  46.528235
## 335   1203.2158  32.212791
## 337    957.6077  84.840739
## 338    867.4615  60.884613
## 340   1231.9904  52.370119
## 341    929.8112  61.507809
## 342   1268.4359  15.057271
## 343   1091.3845 104.207017
## 344   1076.3856  48.568813
## 346    913.2044  18.026367
## 348    849.5420  94.237291
## 349    968.4023  27.221037
## 350   1224.4601 151.365898
## 351   1228.1187 107.265191
## 352    835.7700  39.188684
## 353   1181.2475  77.642717
## 354   1152.3405  77.159157
## 355   1237.5573  44.326990
## 356   1105.9765 198.423178
## 357    951.2166  41.197069
## 358    979.7722  16.497627
## 359   1067.7778 -19.809301
## 360   1083.8725  13.428596
## 361   1047.8516  65.186997
## 362    981.4705  28.240624
## 363    942.5167  21.833948
## 364   1040.8955  38.094841
## 
## attr(,"class")
## [1] "coef.mer"

ranef is similar, but it instead shows the variation for each participant around the fixed effect.

ranef(rt_mod_c)

## $PID
##     (Intercept) modality_c
## 301  -70.705564 -50.303589
## 302  -40.626414 -40.497788
## 303 -174.045818 -12.911494
## 304  132.599481 -55.809975
## 306  -26.977678 -24.523928
## 307   21.093035 -46.357918
## 308  200.534932  -6.101139
## 309 -282.587939 -33.683549
## 310  142.228105  10.954010
## 311  -57.430632 -26.630645
## 312  100.695022  35.111939
## 313   43.651883 -26.764671
## 314   46.989141 -78.174077
## 315 -199.576348 -33.035967
## 316  314.504648 -61.033595
## 318 -110.762321 -12.595410
## 319  -18.893887   7.705214
## 320  -47.681631   9.353973
## 321    9.104267  28.206954
## 324   13.796933  26.708764
## 325 -241.159932 -23.813860
## 326  -17.015885 -21.377036
## 328  215.490707  22.619620
## 329  -38.009605 -36.018687
## 331  401.710474  38.913845
## 333  586.652675 -13.658964
## 334  -95.387521   4.973185
## 335  117.582095  -9.342259
## 337 -128.025985  43.285689
## 338 -218.172183  19.329563
## 340  146.356697  10.815069
## 341 -155.822413  19.952760
## 342  182.802200 -26.497779
## 343    5.750852  62.651967
## 344   -9.248082   7.013763
## 346 -172.429277 -23.528683
## 348 -236.091681  52.682241
## 349 -117.231359 -14.334013
## 350  138.826409 109.810848
## 351  142.485085  65.710141
## 352 -249.863619  -2.366366
## 353   95.613808  36.087667
## 354   66.706862  35.604107
## 355  151.923594   2.771940
## 356   20.342824 156.868128
## 357 -134.417071  -0.357981
## 358 -105.861472 -25.057423
## 359  -17.855847 -61.364351
## 360   -1.761148 -28.126454
## 361  -37.782069  23.631947
## 362 -104.163211 -13.314426
## 363 -143.117007 -19.721102
## 364  -44.738129  -3.460209
## 
## with conditional variances for "PID"

Visualizing random effects

Now, we can visualize the variation in random effects using lattice’s dotplot.

lattice::dotplot(coef(rt_mod_c))

## $PID

For each row in the plot, we see:

the intercept estimate for each participant, i.e., their estimated average reaction time (note that this interpretation hinges on our centering strategy! What would be the interpretation of the random intercepts for our initial rt_mod model?)
the slope estimate for each participant, i.e. how much their reaction time increases on average between the audio-only condition and audio-visual condition.

What values should the intercepts/slopes (approximately) be centered around?

#check
mean(coef(rt_mod_c)$PID$modality_c)

## [1] 41.55505

mean(coef(rt_mod_c)$PID$`(Intercept)`)

## [1] 1085.634

We can also use ranef and dotplot to focus on variability around the fixed effects. Notice how the effects are now centered around zero.

lattice::dotplot(ranef(rt_mod_c))

## $PID

Visualizing participant-level variation

Let’s visualize the effect for each participant. Can you try to map onto this figure what the random intercept and the random slope for modality represent?

ggplot(rt_data,aes(modality_c,RT))+
  geom_point()+
  geom_smooth(method="lm",se=F)+
  facet_wrap(~PID)

## `geom_smooth()` using formula = 'y ~ x'

Multiple random effects

If you take a close look at the data, you might notice that there is another potential source of non-independence: the particular words (in the stim column). We have multiple observations for each word in our dataset. (Incidentally, this is a problem that has long been recognized in psycholinguistics. One of the first to point this out was Herb Clark in the ’70s. See Clark, H. (1973). “The Language-as-Fixed-Effect Fallacy: A Critique of Language Statistics in Psychological Research”, Journal of Verbal Learning and Verbal Behavior, 12, 335-359.) Ideally, we’d like to have a way to simultaneously account not only for non-independence due to participants AND non-independence due to words.

rt_data %>%
  arrange(stim) %>%
  head()

## # A tibble: 6 × 5
##     PID    RT modality   stim  modality_c
##   <dbl> <dbl> <chr>      <chr>      <dbl>
## 1   302   729 Audio-only babe          -1
## 2   304  1426 Audio-only babe          -1
## 3   306  1320 Audio-only babe          -1
## 4   308  1279 Audio-only babe          -1
## 5   310  1008 Audio-only babe          -1
## 6   312  1531 Audio-only babe          -1

The elegance of linear mixed-effects regression is that we can easily extend our model to include as many additional random effects units as we need in order to accurately capture the clustered nature of our design. Note that we can have random effects for basically any factor! Participants and stimuli, but also schools, countries, dataset sources (think of ManyLabs/ ManyBabies studies), …. Here’s how we can include random effects due to item.

rt_mod_c_full <- lmer(RT ~ 1 + modality_c + 
                      (1 + modality_c|PID) + (1 + modality_c|stim), 
                    data = rt_data)
summary(rt_mod_c_full)

## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: RT ~ 1 + modality_c + (1 + modality_c | PID) + (1 + modality_c |  
##     stim)
##    Data: rt_data
## 
## REML criterion at convergence: 302387.1
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.3646 -0.6964 -0.0140  0.5886  5.0003 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr
##  stim     (Intercept)   398.15  19.954      
##           modality_c     54.15   7.359  0.51
##  PID      (Intercept) 28031.13 167.425      
##           modality_c   1927.47  43.903  0.10
##  Residual             65258.80 255.458      
## Number of obs: 21679, groups:  stim, 543; PID, 53
## 
## Fixed effects:
##             Estimate Std. Error       df t value Pr(>|t|)    
## (Intercept) 1085.726     23.080   52.160  47.042  < 2e-16 ***
## modality_c    41.590      6.288   52.094   6.615 2.02e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##            (Intr)
## modality_c 0.092

Compare the output between this model and previous models. What’s changed?

Notice that we included both a random intercept for stimuli and a random slope for condition. Do you have an intuition about why we might have made this choice? (more below)

Note that we can extend this logic to include as many types of random effects as we need to capture the structure of our data! The only limiting factor is whether we have enough data to capture the complexity of our model structure (see more below for dealing with convergence issues and pruning random effects).

Choosing random effects structure: keep it maximal

A general rule of thumb (Barr et al., 2013) is to keep the random effects structure “maximal”, that is, to include the full random effects structure that is justified by the design of your experiment/ what you know about the data generating process. Otherwise, you are likely going to severely inflate Type I error (see e.g. this blog post for a helpful simulation: https://benediktehinger.de/blog/science/lmm-type-1-error-for-1condition1subject/.

How do you determine the maximal random effects structure? As a general rule, you can apply the following principles:

If there are multiple observations for a given unit (e.g., multiple observations per participant), you need a by-unit random intercept.
If there are multiple observations for a given predictor (e.g. condition) within unit, you need a by-unit random slope for that predictor (e.g., if condition varies within participant, include a by-participant random slope for condition. If condition varies between participant, do not include a by-participant random slope for condition).
Include by unit random slopes for interactions in which all participating predictors vary within-unit (e.g., for an interaction between two predictors that vary within participant, include a by-participant random slope for their interaction).

See also Table 11 in Brauer & Curtin (2018).

Investigating model performance

It’s good practice to check the extent to which model assumptions for our mixed-effects model are fulfilled. The performance package provides a bunch of nice convenience tools for assessing the most important model assumptions. Many of the key assumptions are similar to a linear regression model, namely - linearity of residuals - variance is constant - normality of residuals The main difference is that we can now inspect assumptions at multiple levels of the model (also for random effects).

It’s also useful to get a diagnosis of influential observations (observations that, when removed, lead to large changes in the estimates).

performance::check_model(rt_mod_c,detrend=FALSE)

Check out details in the function documentation

?performance::check_model

Why do you think the residuals might be non-normal? What might we do to try to address it?

#RT data is skewed
rt_data %>%
  ggplot(aes(RT))+
  geom_histogram()

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

#try log-transforming RT data
rt_data <- rt_data %>%
   mutate(
     log_rt = log(RT)
   )

#replot
rt_data %>%
  ggplot(aes(log_rt))+
  geom_histogram()

## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.

rt_mod_log_c <- lmer(log_rt ~ 1 + modality_c + 
                      (1 + modality_c|PID), 
                    data = rt_data)

## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## Model failed to converge with max|grad| = 0.00683441 (tol = 0.002, component 1)

check_model(rt_mod_log_c,detrend=FALSE)

#now a little better - still some non-normality in the left tail of the residuals

Interactions

rt_data_interaction <- read_csv(here(data_path,"rt_dummy_data_interaction.csv"))

## Rows: 21679 Columns: 5
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (3): SNR, modality, stim
## dbl (2): PID, RT
## 
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

head(rt_data_interaction)

## # A tibble: 6 × 5
##     PID    RT SNR   modality   stim 
##   <dbl> <dbl> <chr> <chr>      <chr>
## 1   301  1024 Easy  Audio-only gown 
## 2   301   838 Easy  Audio-only might
## 3   301  1060 Easy  Audio-only fern 
## 4   301   882 Easy  Audio-only vane 
## 5   301   971 Easy  Audio-only pup  
## 6   301  1064 Easy  Audio-only rise

Note that the data now contains an extra parameter, SNR (signal-to-noise ratio), at two levels, easy and hard.

Fit a model with levels dummy coded

First, let’s fit a naive interaction model. For ease of discussion, I’m going to stick with just the by-participant random effects, but in principle, we want to include the random effects for stim too. [if there’s time: discuss the misspecification of the random effects structure in the Brown introduction, and the consequences for significance.]

rt_int <- lmer(RT ~ 1 + modality * SNR + 
                      (1 + modality * SNR|PID), 
                    data = rt_data_interaction)
summary(rt_int)

## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: RT ~ 1 + modality * SNR + (1 + modality * SNR | PID)
##    Data: rt_data_interaction
## 
## REML criterion at convergence: 300397.7
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.1276 -0.7052  0.0078  0.6062  4.9068 
## 
## Random effects:
##  Groups   Name                        Variance Std.Dev. Corr             
##  PID      (Intercept)                 25252    158.9                     
##           modalityAudiovisual         18800    137.1    -0.13            
##           SNRHard                     17519    132.4    -0.06  0.26      
##           modalityAudiovisual:SNRHard 42633    206.5     0.15 -0.76 -0.65
##  Residual                             59120    243.1                     
## Number of obs: 21679, groups:  PID, 53
## 
## Fixed effects:
##                             Estimate Std. Error     df t value Pr(>|t|)    
## (Intercept)                   998.39      22.08  51.97  45.226  < 2e-16 ***
## modalityAudiovisual            98.79      19.40  52.10   5.093 4.94e-06 ***
## SNRHard                        91.60      18.79  51.56   4.874 1.09e-05 ***
## modalityAudiovisual:SNRHard   -27.44      29.14  51.77  -0.942    0.351    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) mdltyA SNRHrd
## modltyAdvsl -0.152              
## SNRHard     -0.082  0.278       
## mdltyA:SNRH  0.161 -0.753 -0.653

Note that by default, R will dummy code categorical variables alphabetically, so the alphabetically first name will be the reference level (coded as 0) and the second level will be coded as 1.

What is the consequence of this coding approach for the interpretation of the interaction term? What is the consequence for the interpretation of the main effects?

Fit a model with levels flipped

Let’s reverse the dummy coding, so now the other level serves as the reference level.

rt_data_interaction$modality_rev <- ifelse(rt_data_interaction$modality == "Audiovisual", 0, 1)
rt_data_interaction$SNR_rev <- ifelse(rt_data_interaction$SNR == "Hard", 0, 1)

Fit the model

rt_int_rev <- lmer(RT ~ 1 + modality_rev * SNR_rev + 
                      (1 + modality_rev * SNR_rev|PID), 
                    data = rt_data_interaction)
summary(rt_int_rev)

## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: RT ~ 1 + modality_rev * SNR_rev + (1 + modality_rev * SNR_rev |  
##     PID)
##    Data: rt_data_interaction
## 
## REML criterion at convergence: 300397.7
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.1276 -0.7052  0.0078  0.6062  4.9067 
## 
## Random effects:
##  Groups   Name                 Variance Std.Dev. Corr             
##  PID      (Intercept)          37067    192.5                     
##           modality_rev         18598    136.4    -0.29            
##           SNR_rev              24640    157.0    -0.39  0.39      
##           modality_rev:SNR_rev 42626    206.5     0.21 -0.75 -0.77
##  Residual                      59120    243.1                     
## Number of obs: 21679, groups:  PID, 53
## 
## Fixed effects:
##                      Estimate Std. Error      df t value Pr(>|t|)    
## (Intercept)           1161.34      26.66   51.99  43.564  < 2e-16 ***
## modality_rev           -71.35      19.34   51.54  -3.689 0.000542 ***
## SNR_rev                -64.16      22.06   52.00  -2.908 0.005338 ** 
## modality_rev:SNR_rev   -27.44      29.14   51.78  -0.942 0.350694    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) mdlty_ SNR_rv
## modality_rv -0.303              
## SNR_rev     -0.394  0.391       
## mdlty_:SNR_  0.218 -0.751 -0.764

What changes? What doesn’t change?

Fit the model with levels centered (right way!!)

Now, let’s fit the model with centered predictors. This is the way you should typically fit an interaction model. This is because once the predictor variables are centered, we can now safely interpret the lower-order effects as “main” effects, i.e. as the effect averaging across the other predictor.

rt_data_interaction$modality_c <- ifelse(rt_data_interaction$modality == "Audiovisual", 0.5, -0.5)
rt_data_interaction$SNR_c <- ifelse(rt_data_interaction$SNR == "Hard", 0.5, -0.5)

Fit the model

rt_int_c <- lmer(RT ~ 1 + modality_c * SNR_c + 
                      (1 + modality_c * SNR_c|PID), 
                    data = rt_data_interaction)
summary(rt_int_c)

## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: RT ~ 1 + modality_c * SNR_c + (1 + modality_c * SNR_c | PID)
##    Data: rt_data_interaction
## 
## REML criterion at convergence: 300397.7
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -4.1276 -0.7052  0.0078  0.6062  4.9067 
## 
## Random effects:
##  Groups   Name             Variance Std.Dev. Corr             
##  PID      (Intercept)      27955    167.20                    
##           modality_c        8042     89.68    0.10            
##           SNR_c            10420    102.08    0.20 -0.45      
##           modality_c:SNR_c 42625    206.46   -0.12 -0.01  0.17
##  Residual                  59120    243.15                    
## Number of obs: 21679, groups:  PID, 53
## 
## Fixed effects:
##                  Estimate Std. Error      df t value Pr(>|t|)    
## (Intercept)       1086.72      23.03   52.02  47.193  < 2e-16 ***
## modality_c          85.07      12.76   51.91   6.666 1.70e-08 ***
## SNR_c               77.88      14.41   51.98   5.403 1.66e-06 ***
## modality_c:SNR_c   -27.44      29.13   51.78  -0.942    0.351    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##             (Intr) mdlty_ SNR_c 
## modality_c   0.100              
## SNR_c        0.198 -0.424       
## mdlty_:SNR_ -0.113 -0.003  0.159

Compare the modality effect to the effects in the centered model including only modality. What do you notice?

summary(rt_mod_c)

## Linear mixed model fit by REML. t-tests use Satterthwaite's method [
## lmerModLmerTest]
## Formula: RT ~ 1 + modality_c + (1 + modality_c | PID)
##    Data: rt_data
## 
## REML criterion at convergence: 302402.2
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -3.3640 -0.6991 -0.0141  0.5903  5.0118 
## 
## Random effects:
##  Groups   Name        Variance Std.Dev. Corr
##  PID      (Intercept) 28010    167.4        
##           modality_c   1927     43.9    0.10
##  Residual             65715    256.3        
## Number of obs: 21679, groups:  PID, 53
## 
## Fixed effects:
##             Estimate Std. Error       df t value Pr(>|t|)    
## (Intercept) 1085.634     23.056   52.022  47.087  < 2e-16 ***
## modality_c    41.555      6.281   51.865   6.616 2.05e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##            (Intr)
## modality_c 0.091

Session Info

sessionInfo()

## R version 4.3.2 (2023-10-31)
## Platform: aarch64-apple-darwin20 (64-bit)
## Running under: macOS Sonoma 14.7.2
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib 
## LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## time zone: America/Los_Angeles
## tzcode source: internal
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] performance_0.13.0  here_1.0.1          knitr_1.45         
##  [4] papaja_0.1.3        tinylabels_0.2.4    afex_1.4-1         
##  [7] broom.mixed_0.2.9.6 lubridate_1.9.3     forcats_1.0.0      
## [10] stringr_1.5.1       dplyr_1.1.4         purrr_1.0.2        
## [13] readr_2.1.5         tidyr_1.3.1         tibble_3.2.1       
## [16] ggplot2_3.5.1       tidyverse_2.0.0     lmerTest_3.1-3     
## [19] lme4_1.1-35.3       Matrix_1.6-5       
## 
## loaded via a namespace (and not attached):
##  [1] tidyselect_1.2.1    farver_2.1.1        fastmap_1.1.1      
##  [4] TH.data_1.1-2       bayestestR_0.15.2   digest_0.6.35      
##  [7] timechange_0.3.0    estimability_1.5    lifecycle_1.0.4    
## [10] survival_3.5-8      magrittr_2.0.3      compiler_4.3.2     
## [13] rlang_1.1.3         sass_0.4.9          tools_4.3.2        
## [16] utf8_1.2.4          yaml_2.3.8          labeling_0.4.3     
## [19] bit_4.0.5           plyr_1.8.9          abind_1.4-5        
## [22] multcomp_1.4-25     withr_3.0.0         numDeriv_2016.8-1.1
## [25] datawizard_1.0.0    grid_4.3.2          fansi_1.0.6        
## [28] xtable_1.8-4        colorspace_2.1-0    future_1.33.2      
## [31] globals_0.16.3      emmeans_1.10.1      scales_1.3.0       
## [34] MASS_7.3-60.0.1     insight_1.0.1       cli_3.6.2          
## [37] mvtnorm_1.2-4       crayon_1.5.2        rmarkdown_2.26     
## [40] generics_0.1.3      rstudioapi_0.16.0   reshape2_1.4.4     
## [43] tzdb_0.4.0          parameters_0.21.6   minqa_1.2.6        
## [46] cachem_1.0.8        splines_4.3.2       parallel_4.3.2     
## [49] effectsize_0.8.7    vctrs_0.6.5         boot_1.3-30        
## [52] sandwich_3.1-0      jsonlite_1.8.8      carData_3.0-5      
## [55] car_3.1-2           patchwork_1.2.0     hms_1.1.3          
## [58] ggrepel_0.9.5       bit64_4.0.5         listenv_0.9.1      
## [61] see_0.8.3           jquerylib_0.1.4     glue_1.7.0         
## [64] parallelly_1.37.1   nloptr_2.0.3        codetools_0.2-20   
## [67] stringi_1.8.3       gtable_0.3.4        munsell_0.5.1      
## [70] furrr_0.3.1         pillar_1.9.0        htmltools_0.5.8.1  
## [73] R6_2.5.1            rprojroot_2.0.4     vroom_1.6.5        
## [76] evaluate_0.23       lattice_0.22-6      highr_0.10         
## [79] backports_1.4.1     broom_1.0.5         bslib_0.7.0        
## [82] Rcpp_1.0.12         coda_0.19-4.1       nlme_3.1-164       
## [85] mgcv_1.9-1          xfun_0.43           zoo_1.8-12         
## [88] pkgconfig_2.0.3

Introduction to Mixed-Effects Models

Martin Zettersten

25 February, 2025

Preliminaries

General problem

Inspect the data

Fit a linear mixed-effects model

Generate pretty output

Option 1: tidy() from broom.mixed

Option 2: apa_print() from papaja

Interpret the output

Fixed Effects

Estimating p-values for fixed effects

Approximating df

Null model approach (w/ likelihood ratio test)

Example description

Recoding variables to make output more interpretable

Random Effects

Visualizing random effects

Visualizing participant-level variation

Multiple random effects

Choosing random effects structure: keep it maximal

Investigating model performance

Interactions

Fit a model with levels dummy coded

Fit a model with levels flipped

Fit the model with levels centered (right way!!)

More advanced topics

Dealing with convergence issues

Testing out alternative optimizers

General remedies

Pruning random effects

Demonstrating robustness of key effects across various random effects specifications

Power Analysis

Effect sizes

Logistic mixed-effects models

Plotting model predictions

Plot the predictions for the interaction model

Testing against chance using offsets

Categorical variables with more than 2 levels

Session Info

Option 1: `tidy()` from `broom.mixed`

Option 2: `apa_print()` from `papaja`