week 9: advanced methods

measurement

library(tidyverse)
library(psych)
library(cowplot)
library(patchwork)
library(here)
library(brms) 
library(tidybayes)
library(ggdag)
library(ggrepel) # for putting labels on figures
library(broom) # for the tidy functions

measurement error

Measurement is the principled assignment of numbers to qualities. And no matter what you measure or how you measure it, you’ll have some error.

Some tools tend to produce very little error (e.g., the length of this table in inches). Other tools tend to produce more error. In the social sciences, our job is made harder by the fact that we often measure qualities that do not have an objective physical reality, like happiness or well-being. Measurement error is exacerbated when there is little available data.

Unfortunately, our statistical models will assume your measure has no error… unless you can tell the model how much error there is.

In measurement theory, we may assume that

\[ X_i = T_i + \epsilon_i \]

Meaning that for any observation \(i\), the observed score \(X\) on some measure is the sum of the true score \(T\) and error \(\epsilon\). Classical test theory assumes that \(\epsilon_i\) is randomly distributed, but other theories (e.g., item response theory) disagree. Regardless, we can move forward with the assumption that observed scores are caused by some true score and some error.
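To make the decomposition concrete, here is a minimal simulation of the classical test theory model (the sample size and standard deviations are arbitrary choices for illustration): observed scores are true scores plus independent noise, so the observed variance is the sum of the two components.

```r
# simulate X_i = T_i + epsilon_i under classical test theory
# (all values are made up for illustration)
set.seed(15)
n     <- 1e4
true  <- rnorm(n, mean = 0, sd = 1)    # T_i, the true scores
error <- rnorm(n, mean = 0, sd = 0.5)  # epsilon_i, random error independent of T
obs   <- true + error                  # X_i, what we actually record

# T and epsilon are independent, so var(X) is about var(T) + var(epsilon) = 1.25
var(obs)

# reliability: the share of observed variance attributable to true scores (~0.80 here)
cor(obs, true)^2
```

The larger the error variance, the lower the reliability, and any model that treats `obs` as if it were `true` will understate its own uncertainty.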

marriage example

data(WaffleDivorce, package="rethinking")
d <- WaffleDivorce

rethinking::precis(d) 
                          mean           sd     5.5%        94.5%
Location                   NaN           NA       NA           NA
Loc                        NaN           NA       NA           NA
Population        6.119600e+00 6.876156e+00  0.65780 1.897690e+01
MedianAgeMarriage 2.605400e+01 1.243630e+00 24.26950 2.826100e+01
Marriage          2.011400e+01 3.797905e+00 15.20850 2.649150e+01
Marriage.SE       1.399400e+00 7.969749e-01  0.54950 2.902200e+00
Divorce           9.688000e+00 1.820814e+00  6.66950 1.273050e+01
Divorce.SE        9.618000e-01 5.253675e-01  0.34085 1.893050e+00
WaffleHouses      3.234000e+01 6.578959e+01  0.00000 1.357450e+02
South             2.800000e-01 4.535574e-01  0.00000 1.000000e+00
Slaves1860        7.937834e+04 1.497309e+05  0.00000 4.355531e+05
Population1860    6.287293e+05 7.813127e+05  0.00000 1.903357e+06
PropSlaves1860    9.405132e-02 1.744486e-01  0.00000 4.561000e-01
                       histogram
Location                        
Loc                             
Population              ▇▃▁▁▁▁▁▁
MedianAgeMarriage ▁▁▂▂▃▇▅▃▁▁▂▁▁▁
Marriage              ▁▃▇▇▇▅▂▁▁▁
Marriage.SE             ▁▇▅▃▁▂▁▁
Divorce                 ▂▃▅▅▇▂▃▁
Divorce.SE          ▂▇▇▃▃▂▁▂▂▁▁▁
WaffleHouses            ▇▁▁▁▁▁▁▁
South                 ▇▁▁▁▁▁▁▁▁▂
Slaves1860            ▇▁▁▁▁▁▁▁▁▁
Population1860          ▇▃▂▁▁▁▁▁
PropSlaves1860      ▇▁▁▁▁▁▁▁▁▁▁▁
Code
dag_coords <-
  tibble(name = c("A", "M", "D", "Dobs", "eD"),
         x    = c(1, 2, 2, 3, 4),
         y    = c(2, 3, 1, 1, 1))

dagify(M    ~ A,
       D    ~ A + M,
       Dobs ~ D + eD,
       coords = dag_coords) %>%
  tidy_dagitty() %>% 
  mutate(color = ifelse(name %in% c("D", "eD"), "a", "b")) %>% 
  
  ggplot(aes(x = x, y = y, xend = xend, yend = yend)) +
  geom_dag_point(aes(color = color),
                 size = 10, show.legend = F) +
  geom_dag_text(parse = T, label = c("A", "D", expression(D[obs]), 
  "M",expression(italic(e)[D]))) +
  geom_dag_edges() +
  theme_void()
Code
p1 <- d %>%
  ggplot(aes(x = MedianAgeMarriage, 
             y = Divorce,
             ymin = Divorce - Divorce.SE, 
             ymax = Divorce + Divorce.SE)) +
  geom_pointrange(shape = 20, alpha = 2/3, color="#1c5253") +
  labs(x = "Median age marriage" , 
       y = "Divorce rate")

p2 <-
  d %>%
  ggplot(aes(x = log(Population), 
             y = Divorce,
             ymin = Divorce - Divorce.SE, 
             ymax = Divorce + Divorce.SE)) +
  geom_pointrange(shape = 20, alpha = 2/3, color="#e07a5f") +
  scale_y_continuous(NULL, breaks = NULL) +
  xlab("log population")

p1 | p2

How do you incorporate the standard error into the model? First, we’ll think through this logically, then we’ll use the code.

To start, measurement error refers to the amount of variability we would expect to see in a statistic across repeated studies. In other words, the error describes the spread of a distribution; what’s unknown in this case is the mean of that distribution.

\[ D_{\text{obs},i} \sim \text{Normal}(D_i, D_{\text{SE},i}) \]

Well hey, isn’t that the kind of parameter that Bayesian analysis has been trying to estimate all along?
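Before writing the model, a toy simulation shows why it’s worth telling the model about error at all (all values here are assumed for illustration): when a predictor is measured with noise and we ignore it, the regression slope shrinks toward zero, a bias known as attenuation.

```r
# hypothetical simulation: ignoring error in a predictor attenuates the slope
set.seed(15)
n      <- 1e4
x_true <- rnorm(n)                          # the latent true predictor
y      <- 1 * x_true + rnorm(n, sd = 0.5)   # true slope is exactly 1
x_obs  <- x_true + rnorm(n, sd = 1)         # what we measure, with error sd = 1

coef(lm(y ~ x_true))["x_true"]  # close to 1
coef(lm(y ~ x_obs))["x_obs"]    # close to 0.5: slope * var(T) / (var(T) + var(e))
```

With equal true-score and error variances, the expected attenuation factor is \(1/(1+1) = 0.5\), so the naive regression reports roughly half the true effect.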

mathematical model

\[\begin{align} D_{\text{obs},i} &\sim \text{Normal}(D_i, D_{\text{SE},i}) \\ D_i &\sim \text{Normal}(\mu_i, \sigma) \\ \mu_i &= \alpha + \beta_A A_i + \beta_M M_i \\ \alpha &\sim \text{Normal}(0, 0.2) \\ \beta_A &\sim \text{Normal}(0, 0.5) \\ \beta_M &\sim \text{Normal}(0, 0.5) \\ \sigma &\sim \text{Exponential}(1) \end{align}\]

For the sake of easier priors, we’ll standardize all our variables first. Note that we scale the standard errors by the same factor, which keeps them in the same units as the variable they refer to.

d <-
  d %>% 
  mutate(D_obs = (Divorce - mean(Divorce)) / sd(Divorce),
         D_sd  = Divorce.SE / sd(Divorce),
         M     = (Marriage - mean(Marriage)) / sd(Marriage),
         A     = (MedianAgeMarriage - mean(MedianAgeMarriage)) / sd(MedianAgeMarriage),
         M_obs = M,
         M_sd  = Marriage.SE / sd(Marriage))
m0 <- 
  brm(data = d, 
      family = gaussian,
      D_obs ~ 1 + A + M,
      prior = c(prior(normal(0, 0.2), class = Intercept),
                prior(normal(0, 0.5), class = b),
                prior(exponential(1), class = sigma)),
      iter = 2000, warmup = 1000, cores = 4, chains = 4,
      seed = 15,
      # note this line
      save_pars = save_pars(latent = TRUE),
      file = here("files/models/m92.0"))
m1 <- 
  brm(data = d, 
      family = gaussian,
      D_obs | mi(D_sd) ~ 1 + A + M,
      prior = c(prior(normal(0, 0.2), class = Intercept),
                prior(normal(0, 0.5), class = b),
                prior(exponential(1), class = sigma)),
      iter = 2000, warmup = 1000, cores = 4, chains = 4,
      seed = 15,
      # note this line
      save_pars = save_pars(latent = TRUE),
      file = here("files/models/m92.1"))
m1
 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: D_obs | mi(D_sd) ~ 1 + A + M 
   Data: d (Number of observations: 50) 
  Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup draws = 4000

Regression Coefficients:
          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept    -0.05      0.10    -0.24     0.14 1.00     4597     3090
A            -0.62      0.16    -0.92    -0.30 1.00     3829     3423
M             0.05      0.17    -0.27     0.38 1.00     3671     3297

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     0.58      0.11     0.39     0.81 1.00     1682     1720

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
posterior_summary(m1)
                Estimate  Est.Error         Q2.5        Q97.5
b_Intercept  -0.05287903 0.09560518  -0.23886745   0.13726794
b_A          -0.61700465 0.15906799  -0.92245919  -0.29941922
b_M           0.05250207 0.16596210  -0.27093761   0.38138566
sigma         0.58407876 0.10676964   0.38881281   0.81491478
Intercept    -0.05287903 0.09560518  -0.23886745   0.13726794
Yl[1]         1.16380958 0.36399705   0.45678248   1.88019251
Yl[2]         0.68798528 0.55545971  -0.40532881   1.78866913
Yl[3]         0.42646352 0.33696468  -0.22714592   1.07800780
Yl[4]         1.41990481 0.45735398   0.51982522   2.33835163
Yl[5]        -0.90062131 0.12857833  -1.15488620  -0.64688189
Yl[6]         0.65994433 0.39848512  -0.10032271   1.45295620
Yl[7]        -1.36526810 0.35504558  -2.05486652  -0.66823571
Yl[8]        -0.33840244 0.47247230  -1.25609775   0.58214114
Yl[9]        -1.90150049 0.58817751  -3.05336435  -0.72470074
Yl[10]       -0.62132190 0.17186028  -0.95587821  -0.27755085
Yl[11]        0.76341246 0.28108009   0.20629632   1.31262617
Yl[12]       -0.55322097 0.48671358  -1.53032059   0.41006406
Yl[13]        0.18014145 0.50648969  -0.85327426   1.15270187
Yl[14]       -0.87210178 0.22594215  -1.31849712  -0.42656000
Yl[15]        0.56087538 0.30848278  -0.03307509   1.16024922
Yl[16]        0.28374664 0.40123574  -0.53788724   1.07797302
Yl[17]        0.49757257 0.42622575  -0.34424766   1.32234855
Yl[18]        1.25343126 0.34850197   0.58866869   1.93645181
Yl[19]        0.43293128 0.38544033  -0.30272767   1.20778192
Yl[20]        0.40469307 0.54221317  -0.60265472   1.49375005
Yl[21]       -0.55866095 0.31505583  -1.18076309   0.06387204
Yl[22]       -1.10085094 0.26189265  -1.60983803  -0.58917295
Yl[23]       -0.26811533 0.26452671  -0.78206170   0.24950601
Yl[24]       -1.00154932 0.30528291  -1.62109423  -0.40641259
Yl[25]        0.43228569 0.39575018  -0.33219269   1.22640132
Yl[26]       -0.03375897 0.31355089  -0.66626319   0.58620335
Yl[27]       -0.01176553 0.50233364  -0.99239479   1.00288051
Yl[28]       -0.14619945 0.38981782  -0.92315210   0.61324160
Yl[29]       -0.26257719 0.50330444  -1.20212247   0.74789655
Yl[30]       -1.80061034 0.23677345  -2.25647365  -1.33209152
Yl[31]        0.17265156 0.41087733  -0.63822136   0.99652013
Yl[32]       -1.66073596 0.16260915  -1.98035225  -1.34235191
Yl[33]        0.11852901 0.24138612  -0.36122545   0.60586708
Yl[34]       -0.05418447 0.51141807  -1.11458891   0.91539123
Yl[35]       -0.12291325 0.22737764  -0.57408527   0.32393579
Yl[36]        1.28479965 0.40761437   0.48882609   2.11017466
Yl[37]        0.22368318 0.35026426  -0.46485333   0.91459730
Yl[38]       -1.02508669 0.21833912  -1.44644004  -0.59395760
Yl[39]       -0.92956225 0.54071523  -1.96249272   0.16346135
Yl[40]       -0.67944819 0.32342300  -1.31483203  -0.05196159
Yl[41]        0.24407462 0.54826453  -0.81917624   1.32536575
Yl[42]        0.74595423 0.33110341   0.09928354   1.38997465
Yl[43]        0.19636347 0.18408346  -0.16880604   0.55807729
Yl[44]        0.80183926 0.42335058  -0.05159185   1.60462754
Yl[45]       -0.40557248 0.52661056  -1.42459914   0.68427256
Yl[46]       -0.39240888 0.25684943  -0.89652352   0.09167503
Yl[47]        0.13440823 0.30195290  -0.46337386   0.72571630
Yl[48]        0.55957284 0.44795065  -0.31515399   1.44366592
Yl[49]       -0.63550849 0.27979049  -1.18884521  -0.08282839
Yl[50]        0.85003581 0.58270897  -0.33293414   1.97866945
lprior       -1.36690837 0.44820845  -2.44940868  -0.71213312
lp__        -78.15515223 6.57052200 -91.46921227 -65.53798980
Code
states <- c("AL", "AR", "ME", "NH", "RI", "DC", "VT", "AK", "SD", "UT", "ID", "ND", "WY")

d_est <-
  posterior_summary(m1) %>% 
  data.frame() %>% 
  rownames_to_column("term") %>% 
  mutate(D_est = Estimate) %>% 
  select(term, D_est) %>% 
  filter(str_detect(term, "Yl")) %>% 
  bind_cols(d)


d_est %>%
  ggplot(aes(x = D_sd, y = D_est - D_obs)) +
  geom_hline(yintercept = 0, linetype = 2) +
  geom_point(alpha = 2/3) +
  geom_text_repel(data = . %>% filter(Loc %in% states),
                  aes(label = Loc),
                  size = 3, seed = 15)
Code
states <- c("AR", "ME", "RI", "ID", "WY", "ND", "MN")

as_draws_df(m1) %>% 
  expand_grid(A = seq(from = -3.5, to = 3.5, length.out = 50)) %>% 
  mutate(fitted = b_Intercept + b_A * A) %>% 
  
  ggplot(aes(x = A)) +
  stat_lineribbon(aes(y = fitted),
                  .width = .95, size = 1/3,
                  color = "grey20",
                  fill = "grey80") +
  geom_segment(data = d_est,
               aes(xend = A, y = D_obs, yend = D_est),
               linewidth = 1/5) +
  geom_point(data = d_est,
             aes(y = D_obs)) +
  geom_point(data = d_est,
             aes(y = D_est),
             shape = 1, stroke = 1/3) +
  geom_text_repel(data = d %>% filter(Loc %in% states),  
                  aes(y = D_obs, label = Loc), 
                  size = 3, seed = 15) +
  labs(x = "median age marriage (std)",
       y = "divorce rate (std)") +
  coord_cartesian(xlim = range(d$A), 
                  ylim = range(d$D_obs))

predictors have error too

Of course, predictors can have error too. Luckily, brms has a way to handle this. Here’s an updated mathematical model that incorporates measurement error in the marriage variable.

\[\begin{align} D_{\text{obs},i} &\sim \text{Normal}(D_i, D_{\text{SE},i}) \\ D_i &\sim \text{Normal}(\mu_i, \sigma) \\ \mu_i &= \alpha + \beta_A A_i + \beta_M M_{\text{obs},i} \\ M_{\text{obs},i} &\sim \text{Normal}(M_i, M_{\text{SE},i}) \\ M_i &\sim \text{Normal}(0, 1) \\ \alpha &\sim \text{Normal}(0, 0.2) \\ \beta_A &\sim \text{Normal}(0, 0.5) \\ \beta_M &\sim \text{Normal}(0, 0.5) \\ \sigma &\sim \text{Exponential}(1) \end{align}\]

And here we incorporate the measurement error into the model code:

m2 <- 
  brm(data = d, 
      family = gaussian,
      D_obs | mi(D_sd) ~ 1 + A + me(M, M_sd),
      prior = c(prior(normal(0, 0.2), class = Intercept),
                prior(normal(0, 0.5), class = b),
                prior(normal(0, 1), class = meanme),
                prior(exponential(1), class = sigma)),
      iter = 2000, warmup = 1000, cores = 4, chains = 4,
      seed = 15,
      # note this line
      save_pars = save_pars(latent = TRUE),
      file = here("files/models/m92.2"))
m2
 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: D_obs | mi(D_sd) ~ 1 + A + me(M, M_sd) 
   Data: d (Number of observations: 50) 
  Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup draws = 4000

Regression Coefficients:
          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept    -0.03      0.10    -0.22     0.17 1.00     2692     2785
A            -0.52      0.16    -0.84    -0.21 1.00     1800     2492
meMM_sd       0.27      0.23    -0.17     0.71 1.01     1311     2335

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     0.55      0.11     0.35     0.79 1.00      969     1345

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

Here’s an example of a case where the more defensible model is worse at out-of-sample prediction.

m0 <- add_criterion(m0, "loo")
m1 <- add_criterion(m1, "loo")
m2 <- add_criterion(m2, "loo")
loo_compare(m0, m1, m2)
   elpd_diff se_diff
m0    0.0       0.0 
m1  -89.6      46.7 
m2 -140.7      65.6 

But that’s ok! Remember, it’s not always in your best interest to prioritize prediction. We’re introducing more uncertainty in our model so that we’re not fooled into thinking we have precision. The noise is the point in this case.

Let’s compare our coefficient estimates.

Code
# get posterior draws for coefficients starting with b_
post0 = as_draws_df(m0) |> select(starts_with("b")) |> mutate(model="0")
post1 = as_draws_df(m1) |> select(starts_with("b")) |> mutate(model="1")
post2 = as_draws_df(m2) |> select(starts_with("b")) |> mutate(model="2")

full_join(post0, post1) |>
  full_join(post2) |> 
  pivot_longer(-model) |> 
  with_groups(c(model, name), mean_qi, value) |>
  select(model, name, value, .lower, .upper) |>
  mutate(estimate = sprintf("%.2f [%.2f, %.2f]", value, .lower, .upper)) |>
  select(model, name, estimate) |>
  pivot_wider(names_from = model, values_from = estimate) |>
  rename("Model 0" = "0", "Model 1" = "1", "Model 2" = "2") |>
  mutate(name = str_remove(name, "b_")) |>
  knitr::kable(col.names = c("Parameter", "Model 0", "Model 1", "Model 2"),
               caption = "Coefficient estimates [95% CI] for each model")
Coefficient estimates [95% CI] for each model
Parameter Model 0 Model 1 Model 2
A -0.61 [-0.91, -0.29] -0.62 [-0.92, -0.30] -0.52 [-0.84, -0.21]
Intercept 0.00 [-0.20, 0.19] -0.05 [-0.24, 0.14] -0.03 [-0.22, 0.17]
M -0.06 [-0.37, 0.25] 0.05 [-0.27, 0.38] NA [NA, NA]
bsp_meMM_sd NA [NA, NA] NA [NA, NA] 0.27 [-0.17, 0.71]

The marriage-rate coefficient is stronger in the final model because we’ve regularized both our outcome and our predictor.

Code
color_y <- "#1c5253"  
color_p <- "#e07a5f"

# wrangle
full_join(
  # D
  tibble(Loc   = d %>% pull(Loc),
         D_obs = d %>% pull(D_obs),
         D_est = posterior_summary(m2) %>% 
           data.frame() %>% 
           rownames_to_column("term") %>% 
           filter(str_detect(term, "Yl")) %>% 
           pull(Estimate)) %>% 
    pivot_longer(-Loc, values_to = "d") %>% 
    mutate(name = if_else(name == "D_obs", "observed", "posterior")),
  # M
  tibble(Loc   = d %>% pull(Loc),
         M_obs = d %>% pull(M_obs),
         M_est = posterior_summary(m2) %>% 
           data.frame() %>% 
           rownames_to_column("term") %>% 
           filter(str_detect(term, "Xme_")) %>% 
           pull(Estimate)) %>% 
    pivot_longer(-Loc, values_to = "m") %>% 
    mutate(name = if_else(name == "M_obs", "observed", "posterior")),
  by = c("Loc", "name")
)  %>% 
  
  # plot!
  ggplot(aes(x = m, y = d)) +
  geom_line(aes(group = Loc),
            linewidth = 1/4) +
  geom_point(aes(color = name)) +
  scale_color_manual(values = c(color_p, color_y)) +
  labs(subtitle = "Shrinkage of both divorce rate and marriage rate", 
       x = "Marriage rate (std)" , 
       y = "Divorce rate (std)")

Modeling with error

- incorporating measurement error into your model results in regularization: observations associated with more error are less influential on the model, and the estimated true scores of those points can be (but are not necessarily) quite different from the observed values. 

- this will result in a model that is more conservative but one you can have more faith in. 

- of course, there's an obvious application of this kind of modeling...

meta analysis

The goal of a meta-analysis is to estimate meaningful parameters. A set of broadly comparable studies is either…

1. identical replications of each other, in that all studies are identical samples of the same population with the same outcome measures, etc., or 

2. so different that the results of any one study provide no information about the results of any of the others, or

3. exchangeable but not necessarily identical or completely unrelated; in other words, we allow differences from study to study, but the differences are not expected a priori to have predictable effects favoring one study over another.

In other words, you can answer the question with complete pooling, no pooling, or partial pooling. And generally speaking, the third of these is probably the one you want.
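The first two options can be sketched directly. Complete pooling collapses every study into a single inverse-variance-weighted average, while no pooling keeps each study’s estimate untouched; the multilevel model we fit below lands in between. The `pool()` helper is a hypothetical illustration, and the example numbers are the first three site estimates reported further down.

```r
# complete pooling: one fixed-effect estimate, weighting each study
# by its precision (1 / se^2); no pooling would just keep `est` as-is
pool <- function(est, se) {
  w <- 1 / se^2                          # precision weights
  c(estimate = sum(w * est) / sum(w),    # inverse-variance-weighted mean
    se       = sqrt(1 / sum(w)))         # standard error of the pooled mean
}

# illustrative log-odds estimates and standard errors for three studies
pool(est = c(2.32, 3.64, 2.37), se = c(0.475, 0.644, 0.399))
# estimate ~ 2.59, se ~ 0.28
```

Notice how the pooled estimate sits closest to the most precise study; partial pooling applies the same logic but lets each study retain some of its own identity.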

Our data come from the second large-scale replication project by the Many Labs team (Klein et al., 2018). Of the 28 studies replicated in the project, we will focus on the replication of the trolley experiment from Hauser et al. (2007).

From Klein and colleagues:

According to the principle of double effect, an act that harms other people is more morally permissible if the act is a foreseen side effect rather than the means to the greater good. Hauser et al. (2007) compared participants’ reactions to two scenarios to test whether their judgments followed this principle. In the foreseen-side-effect scenario, a person on an out-of-control train changed the train’s trajectory so that the train killed one person instead of five. In the greater-good scenario, a person pushed a fat man in front of a train, killing him, to save five people. Whereas 89% of participants judged the action in the foreseen-side-effect scenario as permissible (95% CI = [87%, 91%]), only 11% of participants in the greater-good scenario judged it as permissible (95% CI = [9%, 13%]). The difference between the percentages was significant, \(\chi^2(1, N = 2{,}646) = 1{,}615.96\), \(p < .001\), \(w = .78\), \(d = 2.50\), 95% CI = \([2.22, 2.86]\). Thus, the results provided evidence for the principle of double effect. (p. 459, emphasis in the original)

You can download the data from OSF. You want the .csv file in Trolley Dilemma 1 > Hauser.1 > By Order > Data > Hauser_1_study_by_order_all_CLEAN_CASE.csv

d <- read_csv(here("files/data/external_data/Hauser_1_study_by_order_all_CLEAN_CASE.csv"))
glimpse(d)
Rows: 6,842
Columns: 27
$ uID              <dbl> 65, 68, 102, 126, 145, 263, 267, 298, 309, 318, 350, …
$ variable         <chr> "Yes", "Yes", "Yes", "Yes", "Yes", "No", "Yes", "Yes"…
$ factor           <chr> "SideEffect", "SideEffect", "SideEffect", "SideEffect…
$ .id              <chr> "ML2_Slate1_Brazil__Portuguese_execution_illegal_r.cs…
$ source           <chr> "brasilia", "brasilia", "brasilia", "wilfredlaur", "w…
$ haus1.1          <dbl> 1, 1, 1, 1, 1, 2, 1, 1, 2, 1, 1, 1, 1, 1, 1, 2, 1, 1,…
$ haus1.1t_1       <dbl> 39.054, 36.792, 56.493, 21.908, 25.635, 50.633, 58.66…
$ haus2.1          <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
$ haus2.1t_1       <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
$ Source.Global    <chr> "brasilia", "brasilia", "brasilia", "wilfredlaur", "w…
$ Source.Primary   <chr> "brasilia", "brasilia", "brasilia", "wilfredlaur", "w…
$ Source.Secondary <chr> "brasilia", "brasilia", "brasilia", "wilfredlaur", "w…
$ Country          <chr> "Brazil", "Brazil", "Brazil", "Canada", "Canada", "Ca…
$ Location         <chr> "Social and Work Psychology Department, University of…
$ Language         <chr> "Portuguese", "Portuguese", "Portuguese", "English", …
$ Weird            <dbl> 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
$ Execution        <chr> "illegal", "illegal", "illegal", "illegal", "illegal"…
$ SubjectPool      <chr> "No", "No", "No", "Yes", "Yes", "Yes", "Yes", "Yes", …
$ Setting          <chr> "In a classroom", "In a classroom", "In a classroom",…
$ Tablet           <chr> "Computers", "Computers", "Computers", "Computers", "…
$ Pencil           <chr> "No, the whole study was on the computer (except mayb…
$ StudyOrderN      <chr> "Hauser|Ross.Slate1|Rottenstrich|Graham|Kay|Inbar|And…
$ IDiffOrderN      <chr> "ID: Global self-esteem SISE|ID: Mood|ID: Subjective …
$ study.order      <dbl> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
$ analysis.type    <chr> "Order", "Order", "Order", "Order", "Order", "Order",…
$ subset           <chr> "all", "all", "all", "all", "all", "all", "all", "all…
$ case.include     <lgl> TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE,…
d <- d %>% 
  mutate(y   = ifelse(variable == "Yes", 1, 0),
         loc = factor(Location,
                      levels = distinct(d, Location) %>% pull(Location),
                      labels = 1:59))

You could pool all the observations into a single model. The original study used a chi-square test to compare two percentages, but what we’re really talking about is a trial with a binary outcome (yes or no) and predicting that response from a binary predictor (greater good vs side effect). So the better way to model these data is with a binomial regression with a logit link.

glm0 = glm(data = d, y ~ factor, family = binomial(logit)) 

glm0 |> summary()

Call:
glm(formula = y ~ factor, family = binomial(logit), data = d)

Coefficients:
                 Estimate Std. Error z value Pr(>|z|)    
(Intercept)      -1.57272    0.04575  -34.38   <2e-16 ***
factorSideEffect  2.44640    0.05894   41.51   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 9397.7  on 6841  degrees of freedom
Residual deviance: 7304.6  on 6840  degrees of freedom
AIC: 7308.6

Number of Fisher Scoring iterations: 4

But in a meta-analysis, you don’t have all of the data. Let’s pretend each of these sites did a separate study and shared only their effect size and standard error.

glms <- d %>% 
  select(loc, y, factor) %>% 
  nest(data = c(y, factor)) %>% 
  mutate(glm = map(data, ~update(glm0, data = .))) %>% 
  mutate(coef = map(glm, tidy)) %>% 
  select(-data, -glm) %>% 
  unnest(coef) %>% 
  filter(term == "factorSideEffect") %>%
  mutate(OR = exp(estimate))

what did we do?

glms %>% 
  mutate_if(is.double, round, digits = 3)
# A tibble: 59 × 7
   loc   term             estimate std.error statistic p.value    OR
   <fct> <chr>               <dbl>     <dbl>     <dbl>   <dbl> <dbl>
 1 1     factorSideEffect     2.32     0.475      4.89       0 10.2 
 2 2     factorSideEffect     3.64     0.644      5.64       0 37.9 
 3 3     factorSideEffect     2.37     0.399      5.96       0 10.7 
 4 4     factorSideEffect     2.24     0.263      8.54       0  9.43
 5 5     factorSideEffect     2.02     0.505      4.00       0  7.52
 6 6     factorSideEffect     2.49     0.571      4.36       0 12.0 
 7 7     factorSideEffect     2.53     0.658      3.84       0 12.5 
 8 8     factorSideEffect     1.78     0.459      3.87       0  5.93
 9 9     factorSideEffect     1.81     0.378      4.79       0  6.12
10 10    factorSideEffect     2.37     0.495      4.79       0 10.7 
# ℹ 49 more rows
Code
glms %>% 
    arrange(estimate) |> 
    mutate(rank = row_number()) |> 
    ggplot(aes(x = rank, y = estimate)) +
    geom_point() +
    geom_errorbar(aes(ymin = estimate - 1.96 * std.error,
                      ymax = estimate + 1.96 * std.error),
                  alpha = .5, width = .5) +
    coord_flip() +
    labs(y = expression(italic(y[j])~("log odds")))
Code
glms %>% 
    arrange(estimate) |> 
    mutate(OR   = exp(estimate),
           rank = row_number()) |> 
    ggplot(aes(x = rank, y = OR)) +
    geom_point() +
    geom_errorbar(aes(ymin = exp(estimate - 1.96 * std.error),
                      ymax = exp(estimate + 1.96 * std.error)),
                  alpha = .5, width = .5) +
    coord_flip() +
    labs(y = expression(italic(y[j])~("odds ratio")))
Code
glms %>% 
  ggplot(aes(x = std.error, y = estimate)) +
  geom_point() +
  labs(x = expression(sigma[italic(j)]~("log-odds")),
       y = expression(italic(y[j])~("log-odds")))

Given what we’ve done already, conducting a meta-analysis on this is relatively straightforward. The key here is to recognize that the coefficient estimates are Gaussian distributed. Recall from our lecture on binary outcomes: the distribution on the outcome was Bernoulli, but the priors on the coefficients were normal. So we’ll work with the coefficients for the prediction of log odds; we can transform at the end to see how this translates into odds ratios.

\[\begin{align*} \text{estimate}_j &\sim \text{Normal}(\theta_j, \text{std.error}_j) \\ \theta_j &\sim \text{Normal}(\mu, \tau) \\ \mu &\sim \text{Normal}(0, 1.5) \\ \tau &\sim \text{Exponential}(1) \end{align*}\]

m3 <- 
  brm(data = glms, 
      family = gaussian,
      estimate | se(std.error) ~ 1 + (1 | loc),
      prior = c(prior(normal(0, 1.5), class = Intercept),
                prior(exponential(1), class = sd)),
      iter = 2000, warmup = 1000, cores = 4, chains = 4,
      seed = 15,
      file = here("files/models/m92.3"))

From Solomon Kurz:

se() is one of the brms helper functions designed to provide additional information about the criterion variable. Here it informs brm() that each estimate value has an associated measurement error defined in the std.error column. Unlike the mi() function, which we used earlier in the chapter to accommodate measurement error and the Bayesian imputation of missing data, the se() function is specially designed to handle meta-analyses. se() contains a sigma argument which is set to FALSE by default. This will return a model with no estimate for sigma, which is what we want. The uncertainty around the estimate-value for each study, \(j\), has already been encoded in the data as std.error.

m3
 Family: gaussian 
  Links: mu = identity; sigma = identity 
Formula: estimate | se(std.error) ~ 1 + (1 | loc) 
   Data: glms (Number of observations: 59) 
  Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup draws = 4000

Multilevel Hyperparameters:
~loc (Number of levels: 59) 
              Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.43      0.09     0.26     0.63 1.00     1665     2474

Regression Coefficients:
          Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept     2.55      0.09     2.38     2.72 1.00     3487     2793

Further Distributional Parameters:
      Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma     0.00      0.00     0.00     0.00   NA       NA       NA

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

Remember that while our regression coefficients are likely to be Gaussian distributed, they’re predictors for a binomial outcome with a logit link. So to estimate the relative likelihoods, we need to transform them back.

These effect sizes represent a relative probability. Relative probabilities are expressed as odds ratios, and a coefficient in a binomial regression is a log odds ratio. To get back to the odds-ratio scale, just undo the log (i.e., exponentiate).

m3 |> 
    as_draws_df() |> 
    mutate(odds_ratio = exp(b_Intercept)) |> 
    mean_qi(odds_ratio)
# A tibble: 1 × 6
  odds_ratio .lower .upper .width .point .interval
       <dbl>  <dbl>  <dbl>  <dbl> <chr>  <chr>    
1       12.8   10.8   15.2   0.95 mean   qi       

For comparison, the replication project found an odds ratio of 11.54.