Multivariate MLMs and MELSM
Models 1, 2, and 4 are linked throughout this lecture.
Models 3 and 5 are too big to share via GitHub. If you’ve registered for the class, you can find those files on our Canvas site.
Multivariate MLMs (M-MLM)
Mixed Effects Location Scale Models (MELSM)
We’ll use data provided by Williams and colleagues (2021).
d <- read_csv("https://raw.githubusercontent.com/sjweston/uobayes/refs/heads/main/files/data/external_data/williams.csv")
# scaled time variable
d <- d %>% mutate(day01 = (day - 2) / max((day - 2)))
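A quick check of what this transform does, using a made-up vector of days (not the course data): it maps the first day (day 2) to 0 and the largest day to 1.

```r
# illustrate the day01 transform with hypothetical day values
day <- c(2, 26, 50, 100)
day01 <- (day - 2) / max(day - 2)
round(day01, 3)  # approximately 0.000 0.245 0.490 1.000
```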
distinct(d, id) %>% count()
# A tibble: 1 × 1
      n
  <int>
1   193
# A tibble: 1 × 3
  median   min   max
   <int> <int> <int>
1     74     8    99
mean sd 5.5% 94.5% histogram
id 98.61029694 63.7493291 10.00000000 207.0000000 ▇▇▇▃▅▅▅▃▃▃▂▂
female 0.57016803 0.4950710 0.00000000 1.0000000 ▅▁▁▁▁▁▁▁▁▇
PA.std 0.01438236 1.0241384 -1.68129708 1.7514660 ▁▁▃▇▇▃▁▁
day 44.17962096 27.6985612 6.00000000 92.0000000 ▇▇▇▇▅▅▅▃▃▃
PA_lag 0.01992587 1.0183833 -1.72043510 1.7170361 ▁▂▃▅▇▇▅▃▂▁
NA_lag -0.04575229 0.9833161 -0.87507181 1.9904675 ▇▃▂▁▁▁▁▁▁▁▁▁▁▁
steps.pm 0.05424387 0.6298941 -1.02580678 1.0113559 ▁▂▇▇▃▁▁▁
steps.pmd 0.02839160 0.7575395 -1.12359512 1.2299739 ▁▁▇▇▁▁▁▁▁▁▁▁▁▁
NA.std -0.07545093 0.9495660 -1.04842926 1.8110607 ▁▁▁▇▂▁▁▁▁▁
day01 0.43040430 0.2826384 0.04081633 0.9183673 ▇▇▇▇▅▅▅▃▃▃
set.seed(14)
d %>%
nest(data=-id) %>%
slice_sample(n = 6) %>%
unnest(data) %>%
ggplot(aes(x = day, y = NA_lag)) +
geom_line(color = "grey") +
geom_point(color = "#1c5253", size = 1/2) +
geom_line(aes(y = PA_lag), color = "darkgrey") +
geom_point(aes(y = PA_lag), color = "#e07a5f", size = 1/2) +
ylab("affect (standardized)") +
facet_wrap(~ id)

Let’s fit an MLM with varying intercepts and slopes.
Likelihood function and linear model
\[\begin{align*} \text{NA}_i &\sim \text{Normal}(\mu_i, \sigma) \\ \mu_i &= \alpha_{\text{id}[i]} + \beta_{1\text{id}[i]}\text{NA_lag}_i + \beta_{2\text{id}[i]}\text{PA_lag}_i \end{align*}\]
Varying intercepts and slopes:
\[\begin{align*} \alpha_{\text{id}[i]} &= \alpha + u_{\alpha,\text{id}[i]} \\ \beta_{1\text{id}[i]} &= \beta_1 + u_{\beta_1,\text{id}[i]} \\ \beta_{2\text{id}[i]} &= \beta_2 + u_{\beta_2,\text{id}[i]} \end{align*}\]
Random effects:
\[\begin{align*} \begin{bmatrix} u_{\alpha,\text{id}[i]} \\ u_{\beta_1,\text{id}[i]} \\ u_{\beta_2,\text{id}[i]} \end{bmatrix} &\sim \text{MVNormal}\left(\begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}, \mathbf{S}\right) \\ \mathbf{S} &= \begin{pmatrix} \sigma_{\alpha} & 0 & 0 \\ 0 & \sigma_{\beta_1} & 0 \\ 0 & 0 & \sigma_{\beta_2} \end{pmatrix}\mathbf{R}\begin{pmatrix} \sigma_{\alpha} & 0 & 0 \\ 0 & \sigma_{\beta_1} & 0 \\ 0 & 0 & \sigma_{\beta_2} \end{pmatrix} \end{align*}\]
Priors: \[\begin{align*} \alpha &\sim \text{Normal}(0,0.2) \\ \beta_1, \beta_2 &\sim \text{Normal}(0,1) \\ \sigma &\sim \text{Exponential}(1) \\ \sigma_{\alpha} &\sim \text{Exponential}(1) \\ \sigma_{\beta_1}, \sigma_{\beta_2} &\sim \text{Exponential}(1) \\ \mathbf{R} &\sim \text{LKJcorr}(2) \end{align*}\]
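The decomposition of the covariance matrix into standard deviations and a correlation matrix can be checked numerically. A minimal sketch with made-up values for the SDs and correlations (not estimates from any model):

```r
# hypothetical SDs for the varying intercept and two slopes
sds <- c(alpha = 0.5, beta1 = 0.2, beta2 = 0.1)

# hypothetical correlation matrix R
R <- matrix(c( 1.0, 0.3, -0.1,
               0.3, 1.0,  0.2,
              -0.1, 0.2,  1.0), nrow = 3)

# S = diag(sds) %*% R %*% diag(sds): pre- and post-multiplying by the
# diagonal SD matrix converts correlations into covariances
S <- diag(sds) %*% R %*% diag(sds)

# the diagonal of S holds the variances, the off-diagonals the covariances
all.equal(diag(S), sds^2, check.attributes = FALSE)  # TRUE
```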
m1 <-
brm(data = d,
family = gaussian,
NA.std ~ 1 + NA_lag + PA_lag + (1 + NA_lag + PA_lag | id),
prior = c(prior(normal(0, 0.2), class = Intercept),
prior(normal(0, 1), class = b),
prior(exponential(1), class = sd),
prior(exponential(1), class = sigma),
prior(lkj(2), class = cor)),
iter = 5000, warmup = 1000, chains = 4, cores = 4,
seed = 14,
file = here("files/models/m82.1"))

What if I wanted to model positive affect? Usually, I’d have to fit a second model.
But what if I told you I could estimate both positive and negative affect simultaneously? Let’s start with an extremely basic, intercept-only model.
Likelihood function
\[\begin{align*} \text{NA}_i &\sim \text{Normal}(\mu_{\text{NA},i}, \sigma_{\text{NA}}) \\ \mu_{\text{NA},i} &= \alpha_{\text{NA},\text{id}[i]} \\ \text{PA}_i &\sim \text{Normal}(\mu_{\text{PA},i}, \sigma_{\text{PA}}) \\ \mu_{\text{PA},i} &= \alpha_{\text{PA},\text{id}[i]} \end{align*}\]
Varying intercepts:
\[\begin{align*} \alpha_{\text{NA},\text{id}[i]} &= \alpha_{\text{NA}} + u_{\alpha,\text{NA},\text{id}[i]} \\ \alpha_{\text{PA},\text{id}[i]} &= \alpha_{\text{PA}} + u_{\alpha,\text{PA},\text{id}[i]} \end{align*}\]
Random Effects: Multivariate Distribution
\[\begin{align*} \mathbf{u}_{\text{id}[i]} = \begin{bmatrix} u_{\alpha,\text{NA},\text{id}[i]} \\ u_{\alpha,\text{PA},\text{id}[i]} \end{bmatrix} &\sim \text{MVNormal}(\mathbf{0}, \boldsymbol{\Sigma}) \\ \boldsymbol{\Sigma} &= \mathbf{L} \mathbf{R} \mathbf{L} \quad \\ \text{with} \quad \mathbf{L} &= \text{diag}(\sigma_{\alpha,\text{NA}}, \sigma_{\alpha,\text{PA}}) \end{align*}\]
Priors
\[\begin{align*} \alpha_{\text{NA}}, \alpha_{\text{PA}} &\sim \text{Normal}(0, 0.2) \\ \sigma_{\text{NA}}, \sigma_{\text{PA}} &\sim \text{Exponential}(1) \\ \sigma_{\alpha,\text{NA}}, \sigma_{\alpha,\text{PA}} &\sim \text{Exponential}(1) \\ \mathbf{R} &\sim \text{LKJcorr}(2) \end{align*}\]
bf_na = bf(NA.std ~ 1 + (1 | c | id))
bf_pa = bf(PA.std ~ 1 + (1 | c | id))
m2 <-
brm(data = d,
family = gaussian,
bf_na + bf_pa + set_rescor(TRUE),
prior = c(prior(normal(0, 0.2), class = Intercept),
prior(lkj(2), class = cor)),
iter = 5000, warmup = 1000, chains = 4, cores = 4,
seed = 14,
file = here("files/models/m82.2"))

This model has partitioned our data into several key pieces. First, we can get the between-person variance and between-person correlations.
Family: MV(gaussian, gaussian)
Links: mu = identity; sigma = identity
mu = identity; sigma = identity
Formula: NA.std ~ 1 + (1 | c | id)
PA.std ~ 1 + (1 | c | id)
Data: d (Number of observations: 13033)
Draws: 4 chains, each with iter = 5000; warmup = 1000; thin = 1;
total post-warmup draws = 16000
Multilevel Hyperparameters:
~id (Number of levels: 193)
Estimate Est.Error l-95% CI u-95% CI Rhat
sd(NAstd_Intercept) 0.73 0.04 0.65 0.81 1.00
sd(PAstd_Intercept) 0.81 0.04 0.73 0.90 1.00
cor(NAstd_Intercept,PAstd_Intercept) -0.05 0.07 -0.20 0.09 1.02
Bulk_ESS Tail_ESS
sd(NAstd_Intercept) 434 1010
sd(PAstd_Intercept) 509 866
cor(NAstd_Intercept,PAstd_Intercept) 299 572
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
NAstd_Intercept -0.03 0.05 -0.14 0.07 1.01 231 486
PAstd_Intercept 0.00 0.06 -0.11 0.11 1.02 251 566
Further Distributional Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma_NAstd 0.63 0.00 0.63 0.64 1.00 20087 11406
sigma_PAstd 0.60 0.00 0.59 0.61 1.00 17390 10555
Residual Correlations:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
rescor(NAstd,PAstd) -0.05 0.01 -0.06 -0.03 1.00 17687 11771
Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
between-person

- sd(NAstd_Intercept): variability in average negative affect
- sd(PAstd_Intercept): variability in average positive affect
- cor(NAstd_Intercept,PAstd_Intercept): correlation between average NA and average PA

within-person

- sigma_NAstd: variability within participants in NA
- sigma_PAstd: variability within participants in PA
- rescor(NAstd,PAstd): within-person correlation of PA and NA

Now we’ll add in some predictors. Let’s predict affect from lagged affect.
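One way to use this partition is to ask what proportion of the total variance in NA lies between persons (an intraclass-correlation-style quantity). A rough sketch from the point estimates in the summary above (0.73 for sd(NAstd_Intercept), 0.63 for sigma_NAstd); for real inference you would compute this across posterior draws rather than from point estimates:

```r
sd_between <- 0.73  # sd(NAstd_Intercept) from the summary above
sd_within  <- 0.63  # sigma_NAstd from the summary above

# proportion of total NA variance that is between-person
icc_na <- sd_between^2 / (sd_between^2 + sd_within^2)
round(icc_na, 2)  # 0.57
```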
Likelihood function
\[\begin{align*} \text{NA}_i &\sim \text{Normal}(\mu_{\text{NA},i}, \sigma_{\text{NA}}) \\ \mu_{\text{NA},i} &= \alpha_{\text{NA},\text{id}[i]} + \beta_{1\text{NA},\text{id}[i]}\,\text{PA_lag}_i + \beta_{2\text{NA},\text{id}[i]}\,\text{NA_lag}_i \\ \\ \text{PA}_i &\sim \text{Normal}(\mu_{\text{PA},i}, \sigma_{\text{PA}}) \\ \mu_{\text{PA},i} &= \alpha_{\text{PA},\text{id}[i]} + \beta_{1\text{PA},\text{id}[i]}\,\text{PA_lag}_i + \beta_{2\text{PA},\text{id}[i]}\,\text{NA_lag}_i \end{align*}\]
Varying intercepts and slopes:
\[\begin{align*} \alpha_{\text{NA},\text{id}[i]} &= \alpha_{\text{NA}} + u_{\alpha,\text{NA},\text{id}[i]} \\ \beta_{1\text{NA},\text{id}[i]} &= \beta_{1\text{NA}} + u_{\beta,1\text{NA},\text{id}[i]} \\ \beta_{2\text{NA},\text{id}[i]} &= \beta_{2\text{NA}} + u_{\beta,2\text{NA},\text{id}[i]} \\ \alpha_{\text{PA},\text{id}[i]} &= \alpha_{\text{PA}} + u_{\alpha,\text{PA},\text{id}[i]} \\ \beta_{1\text{PA},\text{id}[i]} &= \beta_{1\text{PA}} + u_{\beta,1\text{PA},\text{id}[i]} \\ \beta_{2\text{PA},\text{id}[i]} &= \beta_{2\text{PA}} + u_{\beta,2\text{PA},\text{id}[i]} \end{align*}\]
Random Effects: Multivariate Distribution
\[\begin{align*} \mathbf{u}_{\text{id}[i]} = \begin{bmatrix} u_{\alpha,\text{NA},\text{id}[i]} \\ u_{\beta,1\text{NA},\text{id}[i]} \\ u_{\beta,2\text{NA},\text{id}[i]} \\ u_{\alpha,\text{PA},\text{id}[i]} \\ u_{\beta,1\text{PA},\text{id}[i]} \\ u_{\beta,2\text{PA},\text{id}[i]} \end{bmatrix} &\sim \text{MVNormal}(\mathbf{0}, \boldsymbol{\Sigma}) \\ \boldsymbol{\Sigma} &= \mathbf{L} \mathbf{R} \mathbf{L} \quad \\ \text{with} \quad \mathbf{L} &= \text{diag}(\sigma_{\alpha,\text{NA}}, \sigma_{\beta,1\text{NA}}, \sigma_{\beta,2\text{NA}}, \sigma_{\alpha,\text{PA}}, \sigma_{\beta,1\text{PA}}, \sigma_{\beta,2\text{PA}}) \end{align*}\]
Priors
\[\begin{align*} \alpha_{\text{NA}}, \alpha_{\text{PA}} &\sim \text{Normal}(0, 0.2) \\ \beta_{1\text{NA}}, \beta_{2\text{NA}}, \beta_{1\text{PA}}, \beta_{2\text{PA}} &\sim \text{Normal}(0, 1) \\ \sigma_{\text{NA}}, \sigma_{\text{PA}} &\sim \text{Exponential}(1) \\ \sigma_{\alpha,\text{NA}}, \sigma_{\beta,1\text{NA}}, \sigma_{\beta,2\text{NA}}, \sigma_{\alpha,\text{PA}}, \sigma_{\beta,1\text{PA}}, \sigma_{\beta,2\text{PA}} &\sim \text{Exponential}(1) \\ \mathbf{R} &\sim \text{LKJcorr}(2) \end{align*}\]
bf_na = bf(NA.std ~ 1 + NA_lag + PA_lag + (1 + NA_lag + PA_lag | c | id))
bf_pa = bf(PA.std ~ 1 + NA_lag + PA_lag + (1 + NA_lag + PA_lag | c | id))
m3 <-
brm(data = d,
family = gaussian,
bf_na + bf_pa + set_rescor(TRUE),
prior = c(prior(normal(0, 0.2), class = Intercept),
prior(normal(0, 1), class = b),
prior(lkj(2), class = cor)),
iter = 5000, warmup = 1000, chains = 4, cores = 4,
seed = 14,
file = here("files/models/m82.3"))

Family: MV(gaussian, gaussian)
Links: mu = identity; sigma = identity
mu = identity; sigma = identity
Formula: NA.std ~ 1 + NA_lag + PA_lag + (1 + NA_lag + PA_lag | c | id)
PA.std ~ 1 + NA_lag + PA_lag + (1 + NA_lag + PA_lag | c | id)
Data: d (Number of observations: 13033)
Draws: 4 chains, each with iter = 5000; warmup = 1000; thin = 1;
total post-warmup draws = 16000
Multilevel Hyperparameters:
~id (Number of levels: 193)
Estimate Est.Error l-95% CI u-95% CI Rhat
sd(NAstd_Intercept) 0.44 0.03 0.39 0.50 1.00
sd(NAstd_NA_lag) 0.15 0.01 0.12 0.18 1.00
sd(NAstd_PA_lag) 0.07 0.02 0.03 0.10 1.00
sd(PAstd_Intercept) 0.43 0.03 0.38 0.48 1.00
sd(PAstd_NA_lag) 0.05 0.02 0.01 0.08 1.00
sd(PAstd_PA_lag) 0.11 0.01 0.09 0.14 1.00
cor(NAstd_Intercept,NAstd_NA_lag) 0.09 0.10 -0.10 0.29 1.00
cor(NAstd_Intercept,NAstd_PA_lag) 0.30 0.15 -0.00 0.60 1.00
cor(NAstd_NA_lag,NAstd_PA_lag) 0.07 0.18 -0.30 0.40 1.00
cor(NAstd_Intercept,PAstd_Intercept) -0.13 0.08 -0.28 0.02 1.00
cor(NAstd_NA_lag,PAstd_Intercept) -0.00 0.11 -0.22 0.22 1.00
cor(NAstd_PA_lag,PAstd_Intercept) -0.12 0.16 -0.42 0.18 1.01
cor(NAstd_Intercept,PAstd_NA_lag) -0.06 0.17 -0.39 0.28 1.00
cor(NAstd_NA_lag,PAstd_NA_lag) -0.03 0.21 -0.43 0.38 1.00
cor(NAstd_PA_lag,PAstd_NA_lag) 0.26 0.24 -0.24 0.69 1.00
cor(PAstd_Intercept,PAstd_NA_lag) -0.24 0.18 -0.59 0.12 1.00
cor(NAstd_Intercept,PAstd_PA_lag) -0.18 0.11 -0.39 0.04 1.00
cor(NAstd_NA_lag,PAstd_PA_lag) 0.10 0.15 -0.19 0.39 1.00
cor(NAstd_PA_lag,PAstd_PA_lag) -0.12 0.18 -0.48 0.24 1.00
cor(PAstd_Intercept,PAstd_PA_lag) 0.09 0.11 -0.11 0.30 1.00
cor(PAstd_NA_lag,PAstd_PA_lag) 0.32 0.21 -0.13 0.68 1.00
Bulk_ESS Tail_ESS
sd(NAstd_Intercept) 3584 6292
sd(NAstd_NA_lag) 6665 10373
sd(NAstd_PA_lag) 1557 1372
sd(PAstd_Intercept) 4387 7836
sd(PAstd_NA_lag) 2112 2250
sd(PAstd_PA_lag) 7127 10096
cor(NAstd_Intercept,NAstd_NA_lag) 6685 9729
cor(NAstd_Intercept,NAstd_PA_lag) 7006 7448
cor(NAstd_NA_lag,NAstd_PA_lag) 7576 7022
cor(NAstd_Intercept,PAstd_Intercept) 2349 4989
cor(NAstd_NA_lag,PAstd_Intercept) 897 1718
cor(NAstd_PA_lag,PAstd_Intercept) 280 469
cor(NAstd_Intercept,PAstd_NA_lag) 11866 10018
cor(NAstd_NA_lag,PAstd_NA_lag) 8743 9256
cor(NAstd_PA_lag,PAstd_NA_lag) 2757 5270
cor(PAstd_Intercept,PAstd_NA_lag) 14554 10682
cor(NAstd_Intercept,PAstd_PA_lag) 12167 12720
cor(NAstd_NA_lag,PAstd_PA_lag) 4607 8122
cor(NAstd_PA_lag,PAstd_PA_lag) 2163 3286
cor(PAstd_Intercept,PAstd_PA_lag) 14121 12777
cor(PAstd_NA_lag,PAstd_PA_lag) 1588 2031
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
NAstd_Intercept -0.03 0.03 -0.10 0.03 1.00 1348 2892
PAstd_Intercept -0.01 0.03 -0.07 0.05 1.00 2691 5026
NAstd_NA_lag 0.35 0.02 0.32 0.38 1.00 7233 10519
NAstd_PA_lag 0.01 0.01 -0.01 0.03 1.00 7186 11061
PAstd_NA_lag 0.04 0.01 0.02 0.06 1.00 16764 13101
PAstd_PA_lag 0.45 0.01 0.43 0.48 1.00 12268 12423
Further Distributional Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma_NAstd 0.59 0.00 0.59 0.60 1.00 27696 10725
sigma_PAstd 0.54 0.00 0.54 0.55 1.00 31518 11100
Residual Correlations:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
rescor(NAstd,PAstd) -0.02 0.01 -0.04 -0.01 1.00 32445 12034
Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
These models can help us address many questions. First, let’s examine the between- and within-person variance of these outcomes.
# A tibble: 2 × 7
name value .lower .upper .width .point .interval
<chr> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
1 between -0.133 -0.284 0.0217 0.95 mean qi
2 within -0.0249 -0.0423 -0.00733 0.95 mean qi
To clarify: this is the within-person correlation after accounting for the predictors in the model. This can help us detect any additional patterns in the data.
m3 %>%
gather_draws(r_id__NAstd[id, term], r_id__PAstd[id, term]) %>%
mutate(.variable=str_remove(.variable, "r_id__")) %>%
pivot_wider(names_from = .variable, values_from = .value) %>%
# just for presenting to class
arrange(.draw, id, term) %>% select(-.chain, -.iteration) %>%
mutate(across(c(NAstd, PAstd), \(x) round(x, 2)))
# A tibble: 9,264,000 × 5
# Groups: id, term [579]
id term .draw NAstd PAstd
<int> <chr> <int> <dbl> <dbl>
1 1 Intercept 1 0.14 0.05
2 1 NA_lag 1 0.11 0.08
3 1 PA_lag 1 0.07 0.13
4 2 Intercept 1 -0.4 0.88
5 2 NA_lag 1 -0.03 0
6 2 PA_lag 1 -0.04 0.16
7 3 Intercept 1 0.82 0
8 3 NA_lag 1 0.33 -0.09
9 3 PA_lag 1 0.1 -0.2
10 4 Intercept 1 0.2 -0.08
# ℹ 9,263,990 more rows
nd = data.frame(
PA_lag = seq(from = min(d$PA_lag), to=max(d$PA_lag), length.out=50),
NA_lag = 0,
id = max(d$id) + 1
)
nd %>% add_epred_draws(m3, allow_new_levels=T) %>%
filter(.draw <= 50) %>%
ggplot(aes(x = PA_lag, y = .epred)) +
geom_line(aes(group = .draw, color = .category), alpha=.2) +
scale_color_manual(values = c("#1c5253" , "#e07a5f")) +
facet_wrap(~.category) +
guides(color="none") +
labs(x = "Positive Affect (lagged)", y = "Expected score")

nd = data.frame(
NA_lag = seq(from = min(d$NA_lag), to=max(d$NA_lag), length.out=50),
PA_lag = 0,
id = max(d$id) + 1
)
nd %>% add_epred_draws(m3, allow_new_levels=T) %>%
filter(.draw <= 50) %>%
ggplot(aes(x = NA_lag, y = .epred)) +
geom_line(aes(group = .draw, color = .category), alpha=.2) +
scale_color_manual(values = c("#1c5253" , "#e07a5f")) +
facet_wrap(~.category) +
guides(color="none") +
labs(x = "Negative Affect (lagged)", y = "Expected score")

But of course, we’ve allowed these effects to vary for each individual.
nd = expand.grid(
PA_lag = seq(from = min(d$PA_lag), to=max(d$PA_lag), length.out=50),
NA_lag = 0,
id = sample(d$id, size = 4, replace=F)
)
nd %>% add_epred_draws(m3) %>%
filter(.draw <= 20) %>%
ggplot(aes(x = PA_lag, y = .epred)) +
geom_line(aes(group = .draw, color = .category), alpha=.2) +
scale_color_manual(values = c("#1c5253" , "#e07a5f")) +
facet_grid(.category~id) +
guides(color="none") +
labs(x = "Positive Affect (lagged)", y = "Expected score")

We may even want to see whether there’s a correlation between the effect of positive affect on PA and on NA.
But of course, each of these points is a summary of a distribution! The real benefit of modeling these outcomes jointly is that we can also get a distribution of the correlations between these effects.
m3 %>% spread_draws(`cor.*`, regex = T) %>%
pivot_longer(starts_with("cor"),
names_prefix = "cor_id__") %>%
group_by(name) %>%
mean_qi(value) %>%
arrange(desc(abs(value)))
# A tibble: 15 × 7
name value .lower .upper .width .point .interval
<chr> <dbl> <dbl> <dbl> <dbl> <chr> <chr>
1 PAstd_NA_lag__PAstd_PA_lag 0.318 -0.132 0.681 0.95 mean qi
2 NAstd_Intercept__NAstd_PA_l… 0.303 -0.00369 0.601 0.95 mean qi
3 NAstd_PA_lag__PAstd_NA_lag 0.261 -0.240 0.692 0.95 mean qi
4 PAstd_Intercept__PAstd_NA_l… -0.241 -0.588 0.123 0.95 mean qi
5 NAstd_Intercept__PAstd_PA_l… -0.183 -0.391 0.0356 0.95 mean qi
6 NAstd_Intercept__PAstd_Inte… -0.133 -0.284 0.0217 0.95 mean qi
7 NAstd_PA_lag__PAstd_PA_lag -0.123 -0.479 0.242 0.95 mean qi
8 NAstd_PA_lag__PAstd_Interce… -0.121 -0.419 0.180 0.95 mean qi
9 NAstd_NA_lag__PAstd_PA_lag 0.102 -0.190 0.393 0.95 mean qi
10 PAstd_Intercept__PAstd_PA_l… 0.0950 -0.113 0.296 0.95 mean qi
11 NAstd_Intercept__NAstd_NA_l… 0.0946 -0.103 0.291 0.95 mean qi
12 NAstd_NA_lag__NAstd_PA_lag 0.0685 -0.300 0.405 0.95 mean qi
13 NAstd_Intercept__PAstd_NA_l… -0.0645 -0.393 0.283 0.95 mean qi
14 NAstd_NA_lag__PAstd_NA_lag -0.0334 -0.433 0.383 0.95 mean qi
15 NAstd_NA_lag__PAstd_Interce… -0.00442 -0.223 0.220 0.95 mean qi
Traditionally, psycho-social research has focused on identifying trait-like behaviors - for example, how negative a person is on average. But more recent work has broadened to include states - that is, whether and how much individuals fluctuate in their thoughts, feelings, and behaviors over time.
On the macro time scale, we typically assume stable traits, but on the micro time scale (day to day), we often observe substantial variability in these same traits. This presents a methodological challenge: how do we simultaneously model both the stable personality traits (the “location”) and the fluctuations (the “scale”)? The standard approach has been a two-stage process:
1. Estimate individual means (iMs) in a mixed effects model
2. Compute individual standard deviations (iSDs) from the residuals

But this approach has several drawbacks.
In a standard linear mixed effects model with repeated measurements, we have:
\[ y_i = X_i\beta + Z_ib_i + \epsilon_i \] Where:

- \(y_i\) is the vector of repeated measurements for person \(i\)
- \(X_i\) is the design matrix for the fixed effects \(\beta\)
- \(Z_i\) is the design matrix for the person-specific random effects \(b_i\)
- \(\epsilon_i\) is the residual error, assumed \(\text{Normal}(0, \sigma^2)\)
The key limitation of this standard model is that it assumes the same error variance \((\sigma^2)\) for everyone. The MELSM, instead, allows \(\sigma^2\) to differ at the individual level and even at different time points. The scale component is modeled as:
\[ \varphi_i = \text{exp}(W_i\eta + V_it_i) \] Where:

- \(\varphi_i\) is the within-person residual variability for person \(i\)
- \(W_i\) is the design matrix for the scale fixed effects \(\eta\)
- \(V_i\) is the design matrix for the scale random effects \(t_i\)
- the exponential link keeps the variability positive for any value of the linear predictor
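A minimal numeric sketch of that exponential link, using made-up values for the fixed scale intercept and the person-level deviations (these are illustrations, not model estimates):

```r
# hypothetical fixed scale intercept, on the log scale
eta0 <- -0.6

# hypothetical person-level deviations on the log scale
u_sigma <- c(person1 = -0.5, person2 = 0, person3 = 0.8)

# each person's residual SD: exp() maps any real number to a positive value
sigma_i <- exp(eta0 + u_sigma)
round(sigma_i, 2)  # 0.33 0.55 1.22 -- all positive
```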
m4 <-
brm(data = d,
family = gaussian,
bf(
# location
PA.std ~ 1 + (1 | c | id),
# scale (the c in |c| allows for correlation between location and scale)
sigma ~ 1 + (1 | c | id)
),
prior = c(prior(normal(0, 0.2), class = Intercept),
prior(exponential(1), class = sd),
prior(normal(0,1), class = Intercept, dpar=sigma),
prior(exponential(1), class = sd, dpar=sigma),
prior(lkj(2), class = cor)),
iter = 5000, warmup = 1000, chains = 4, cores = 4,
seed = 14,
file = here("files/models/m82.4"))

Family: gaussian
Links: mu = identity; sigma = log
Formula: PA.std ~ 1 + (1 | c | id)
sigma ~ 1 + (1 | c | id)
Data: d (Number of observations: 13033)
Draws: 4 chains, each with iter = 5000; warmup = 1000; thin = 1;
total post-warmup draws = 16000
Multilevel Hyperparameters:
~id (Number of levels: 193)
Estimate Est.Error l-95% CI u-95% CI Rhat
sd(Intercept) 0.81 0.04 0.73 0.90 1.00
sd(sigma_Intercept) 0.43 0.02 0.39 0.48 1.00
cor(Intercept,sigma_Intercept) -0.34 0.07 -0.47 -0.21 1.00
Bulk_ESS Tail_ESS
sd(Intercept) 400 789
sd(sigma_Intercept) 1517 3427
cor(Intercept,sigma_Intercept) 962 1969
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept 0.01 0.06 -0.10 0.12 1.03 215 408
sigma_Intercept -0.64 0.03 -0.70 -0.57 1.00 666 1753
Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
We’ve seen the parameters for the location section before: these are the same “fixed” parameters we’ve been covering in MLMs the past 4 lectures. Let’s talk about what the sigma part means.
Estimate Est.Error Q2.5 Q97.5
-0.64 0.03 -0.70 -0.57
As a reminder, this is the intercept of the log of sigma (\(\text{log}(\sigma)\)). So we need to exponentiate before we can interpret it as a standard deviation.
This is just the “average” variability of participants. Some of them vary a lot, and some vary a little.
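For instance, exponentiating the posterior mean and interval bounds from the output above puts them back on the standard deviation scale:

```r
# back-transform the log-sigma intercept and its 95% interval
exp(c(estimate = -0.64, lower = -0.70, upper = -0.57))
# roughly 0.53, 0.50, 0.57: the typical within-person SD of standardized PA
```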
m4 |> spread_draws(b_sigma_Intercept, r_id__sigma[id, term] ) |>
filter(term == "Intercept") |>
mutate(sigma=b_sigma_Intercept+r_id__sigma) |>
mean_qi(sigma) |>
mutate(across(c(sigma, .lower, .upper), exp)) |>
arrange(sigma) |>
mutate(rank=row_number()) |>
ggplot( aes(x=rank, y = sigma, ymin=.lower, ymax=.upper) ) +
geom_pointrange(linewidth = 0.4, fatten = 0.3, color = "#5e8485") +
scale_x_continuous("participants ranked by posterior mean sigma", breaks = NULL) +
ylab(expression(exp(eta[0]+italic(u)[2][italic(i)])))

Our posterior distribution contains both locations (means) and scales \(\text{log}(\sigma)\) for each of our clusters (participants). We can plot these to see whether they’re related.
Of course, the correlation between these values can also be found in our model output.
The real power of these models comes when we incorporate predictors for both location and scale.
m5 <-brm(
data = d,
family = gaussian,
bf(
# location
PA.std ~ 1 + day + PA_lag + NA_lag + steps.pm*steps.pmd +
(1 + day + steps.pmd | c | id),
sigma ~ 1 + PA_lag + NA_lag + steps.pm*steps.pmd + (1 + steps.pmd | c | id)
),
prior = c(prior(normal(0, 0.2), class = Intercept),
prior(normal(0, 1), class = b),
prior(exponential(1), class = sd),
prior(normal(0,1), class = Intercept, dpar=sigma),
prior(exponential(1), class = sd, dpar=sigma),
prior(lkj(2), class = cor)),
iter = 5000, warmup = 1000, chains = 4, cores = 4,
seed = 14,
file = here("files/models/m82.5"))

Our model clearly has not mixed – be sure to run this model for many more iterations if you actually want to use it. As it was, this model took 80 minutes to run on my computer, so I didn’t have time to go back and run it for longer.
Family: gaussian
Links: mu = identity; sigma = log
Formula: PA.std ~ 1 + day + PA_lag + NA_lag + steps.pm * steps.pmd + (1 + day + steps.pmd | c | id)
sigma ~ 1 + PA_lag + NA_lag + steps.pm * steps.pmd + (1 + steps.pmd | c | id)
Data: d (Number of observations: 13033)
Draws: 4 chains, each with iter = 5000; warmup = 1000; thin = 1;
total post-warmup draws = 16000
Multilevel Hyperparameters:
~id (Number of levels: 193)
Estimate Est.Error l-95% CI u-95% CI Rhat
sd(Intercept) 0.61 0.13 0.49 0.90 1.60
sd(day) 0.03 0.11 0.00 0.23 1.60
sd(steps.pmd) 0.25 0.18 0.12 0.57 1.56
sd(sigma_Intercept) 0.38 0.11 0.13 0.49 1.60
sd(sigma_steps.pmd) 0.13 0.04 0.03 0.18 1.46
cor(Intercept,day) -0.08 0.38 -0.43 0.67 1.60
cor(Intercept,steps.pmd) -0.29 0.37 -0.91 0.08 1.60
cor(day,steps.pmd) -0.14 0.11 -0.35 0.08 1.19
cor(Intercept,sigma_Intercept) -0.47 0.28 -0.94 -0.18 1.60
cor(day,sigma_Intercept) -0.14 0.11 -0.40 0.07 1.37
cor(steps.pmd,sigma_Intercept) 0.41 0.34 0.07 0.98 1.57
cor(Intercept,sigma_steps.pmd) -0.09 0.46 -0.52 0.70 1.59
cor(day,sigma_steps.pmd) -0.08 0.12 -0.28 0.15 1.10
cor(steps.pmd,sigma_steps.pmd) -0.27 0.32 -0.79 0.12 1.58
cor(sigma_Intercept,sigma_steps.pmd) -0.10 0.44 -0.85 0.33 1.58
Bulk_ESS Tail_ESS
sd(Intercept) 7 11
sd(day) 7 11
sd(steps.pmd) 7 18
sd(sigma_Intercept) 7 11
sd(sigma_steps.pmd) 9 11
cor(Intercept,day) 7 11
cor(Intercept,steps.pmd) 7 11
cor(day,steps.pmd) 14 22
cor(Intercept,sigma_Intercept) 7 11
cor(day,sigma_Intercept) 9 12
cor(steps.pmd,sigma_Intercept) 7 11
cor(Intercept,sigma_steps.pmd) 7 11
cor(day,sigma_steps.pmd) 24 1751
cor(steps.pmd,sigma_steps.pmd) 7 11
cor(sigma_Intercept,sigma_steps.pmd) 7 11
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
Intercept 0.91 2.28 -0.04 6.52 1.59 7
sigma_Intercept -0.19 1.09 -0.85 2.76 1.60 7
day -0.01 0.04 -0.10 -0.00 1.58 7
PA_lag 0.11 0.44 -0.76 0.38 1.59 7
NA_lag 0.28 0.43 0.01 1.06 1.59 7
steps.pm 0.05 0.19 -0.27 0.26 1.56 7
steps.pmd -0.39 0.87 -1.97 0.13 1.59 7
steps.pm:steps.pmd 0.25 0.49 -0.07 1.13 1.59 7
sigma_PA_lag 0.02 0.09 -0.12 0.35 1.60 7
sigma_NA_lag 0.04 0.04 0.00 0.14 1.51 7
sigma_steps.pm -0.12 0.15 -0.52 0.05 1.59 7
sigma_steps.pmd -0.06 0.06 -0.24 -0.00 1.59 7
sigma_steps.pm:steps.pmd 0.01 0.08 -0.07 0.21 1.59 7
Tail_ESS
Intercept 11
sigma_Intercept 12
day 11
PA_lag 11
NA_lag 12
steps.pm 20
steps.pmd 11
steps.pm:steps.pmd 11
sigma_PA_lag 11
sigma_NA_lag 11
sigma_steps.pm 11
sigma_steps.pmd 11
sigma_steps.pm:steps.pmd 11
Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
We can visualize the fixed effects:
m5 |>
gather_draws(`^b_.*`, regex=T) |>
mutate(
out = ifelse( str_detect(.variable, "sigma"), "var", "location"),
.variable = str_remove(.variable, "^b_sigma_"),
.variable = str_remove(.variable, "^b_")
) |>
filter(.variable != "Intercept") |>
ggplot( aes(x=.value, y=.variable) ) +
stat_halfeye() +
facet_wrap(~out, nrow=2, scales="free") +
labs(x="coef estimate", y=NULL, title="posterior")

People who walk more have higher PA
Estimate Est.Error Q2.5 Q97.5
0.05 0.19 -0.27 0.26
and are less variable:
means = d |>
select(PA_lag, day, NA_lag, steps.pmd) |>
summarise(across(everything(),mean))
tibble(steps.pm = seq( from=min(d$steps.pm), to=max(d$steps.pm), length=100)) |>
cross_join(means) |>
add_predicted_draws(m5, re_formula = NA) |>
ggplot( aes(x=steps.pm, y=.prediction) ) +
geom_point(alpha=.5) +
labs(x="Average steps", y="PA.std")

Now that you know the basic MLM structure, you can expand these models in complex ways.
These models demand a lot more attention, forethought, and horsepower. But they can be worth it. Model-implied estimates appropriately account for uncertainty, whereas hand-calculated estimates do not.