generalized linear madness
Today’s class is less of a coherent topic and more of a plea to think about your data. Complex statistics are never a substitute for understanding the relationships between your constructs and drawing out the causal model.
It’s quite easy to learn about a model or technique and look for a reason to throw it at data. (I’m guilty of that. I’m probably doing that right now.) A problem with that approach is that we typically start with the model or framework and bend the data to fit that, or at least, start by assuming the assumptions of the model. But the best models are the ones that come from the question and theory. We should start with the assumptions of our questions and theory, then find a model that fits with those.
McElreath provides several examples in this lecture/chapter about how to start from the theory and work towards a model that actually makes sense, instead of throwing a standard model at a problem. Along the way, he uses Stan code, so it’s as good a time as any to look at that. We’ll use two ways to integrate Stan with brms.
Recall the developmental psychology experiment from the lecture. There are 629 children in this dataset, between the ages of 4 and 14. The outcome y here takes the values 1, 2, and 3. It indicates which of the three options was chosen, where 1 indicates the unchosen color, 2 indicates the majority demonstrated color, and 3 indicates the minority demonstrated color.
mean sd 5.5% 94.5% histogram
y 2.1208267 0.7279860 1 3 ▃▁▁▁▇▁▁▁▁▅
gender 1.5055644 0.5003669 1 2 ▇▁▁▁▁▁▁▁▁▇
age 8.0302067 2.4979055 5 13 ▇▃▅▃▃▃▂▂▂▁
majority_first 0.4848967 0.5001696 0 1 ▇▁▁▁▁▁▁▁▁▇
culture 3.7519873 1.9603189 1 8 ▃▂▁▇▁▂▁▂▁▂▁▁▁▁
# wrangle
d %>%
mutate(color = factor(y,
levels = 1:3,
labels = c("unchosen", "majority", "minority"))) %>%
mutate(y = str_c(y, " (", color, ")")) %>%
count(y) %>%
mutate(percent = (100 * n / sum(n))) %>%
mutate(label = str_c(round(percent, digits = 1), "%")) %>%
# plot
ggplot(aes(y = y)) +
geom_col(aes(x = n)) +
geom_text(aes(x = n + 4, label = label), hjust = 0) +
scale_x_continuous(expression(italic(n)), expand = expansion(mult = c(0, 0.1))) +
labs(title = "The criterion variable y indexes three kinds of color choices",
subtitle = "Children tended to prefer the 'majority' color.",
y = NULL) +
theme(axis.text.y = element_text(hjust = 0),
      axis.ticks.y = element_blank())

Consider a group of children in which half of them choose at random and the other half follow the majority. If we simulate choices for these children, we can figure out how often we might see the “2” choice, the one that indicates the majority color.
set.seed(5)
N <- 30 # number of children
# half are random: sample from 1,2,3 at random for each
y1 <- sample( 1:3 , size=N/2 , replace=TRUE )
# half follow majority
y2 <- rep( 2 , N/2 )
# combine and shuffle y1 and y2
y <- sample( c(y1,y2) )
# count the 2s
sum(y==2)/N
[1] 0.6333333
About two-thirds of the choices are for the majority color, but only half the children are actually following the majority.
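We can confirm this with a quick back-of-the-envelope calculation: half the children always choose the majority color, and the random half choose it one-third of the time.

```r
# expected proportion of majority ("2") choices:
# half always choose it, half choose it 1/3 of the time
0.5 * 1 + 0.5 * (1/3)
#> [1] 0.6666667
```

So even when only half the sample follows the majority, two-thirds of the choices land on the majority color.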
We’ll consider 5 different strategies children might use.
- Follow the Majority: Copy the majority demonstrated color.
- Follow the Minority: Copy the minority demonstrated color.
- Maverick: Choose the color that no demonstrator chose.
- Random: Choose a color at random, ignoring the demonstrators.
- Follow First: Copy the color that was demonstrated first. This was either the majority color (when majority_first equals 1) or the minority color (when 0).
An important step here is enumerating the likelihood of each choice (1, 2, 3) for each strategy.
| Strategy | Choice 1 | Choice 2 | Choice 3 |
|---|---|---|---|
| Majority | 0 | 1 | 0 |
| Minority | 0 | 0 | 1 |
| Maverick | 1 | 0 | 0 |
| Random | 1/3 | 1/3 | 1/3 |
| Follow First | 0* | 1 or 0** | 0 or 1** |
Notes:
- \* The Follow First strategy never chooses option 1, so its probability for Choice 1 is always 0.
- \*\* The Follow First strategy depends on majority_first[i]: if majority_first[i] == 1, then Choice 2 = 1 and Choice 3 = 0; otherwise Choice 2 = 0 and Choice 3 = 1.
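To make the table concrete, here is one way to express it as a small R function (the function name is just for illustration); each row gives $\Pr(y \mid s)$ for one strategy:

```r
# probability of each choice (1, 2, 3) under each of the five strategies;
# follow-first depends on whether the majority color was demonstrated first
strategy_probs <- function(majority_first) {
  rbind(
    majority     = c(0, 1, 0),
    minority     = c(0, 0, 1),
    maverick     = c(1, 0, 0),
    random       = c(1/3, 1/3, 1/3),
    follow_first = if (majority_first == 1) c(0, 1, 0) else c(0, 0, 1)
  )
}
strategy_probs(majority_first = 1)
```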
We can’t directly measure each child’s strategy. It is an unobserved variable. But each strategy has a specific probability of producing each choice. We can use that fact to compute the probability of each choice, given parameters which specify the probability of each strategy. Then we let Bayes loose and get the posterior distribution of each strategy back. Before we can let Bayes loose, we’ll need to enumerate the parameters, assign priors to each, and also figure out some technical issues for coding. For the prior on the strategy proportions, we’ll use a Dirichlet distribution, which we’ve used before when studying categorical outcomes.
$$p \sim \text{Dirichlet}([4, 4, 4, 4, 4])$$
library(gtools) # for rdirichlet function
set.seed(123)
# Draw samples from both distributions
samples_high <- rdirichlet(30, c(4,4,4,4,4))
samples_low <- rdirichlet(30, c(2,2,2,2,2))
# Convert to data frames and add identifiers
df_high <- data.frame(samples_high) %>%
mutate(sample_id = 1:30,
distribution = "Dirichlet([4,4,4,4,4])") %>%
setNames(c(paste0("X", 1:5), "sample_id", "distribution"))
df_low <- data.frame(samples_low) %>%
mutate(sample_id = 1:30,
distribution = "Dirichlet([2,2,2,2,2])") %>%
setNames(c(paste0("X", 1:5), "sample_id", "distribution"))
# Combine data frames
df_combined <- rbind(df_high, df_low)
# Reshape to long format for plotting
df_long <- df_combined %>%
pivot_longer(cols = X1:X5,
names_to = "component",
values_to = "probability") %>%
mutate(component = as.numeric(gsub("X", "", component)))
# Create the plot
ggplot(df_long, aes(x = component, y = probability, group = sample_id)) +
geom_line(alpha = 0.6, color = "steelblue") +
geom_hline(yintercept = 0.2, linetype = "dashed", color = "red", alpha = 0.7) +
facet_wrap(~ distribution, ncol = 2) +
labs(title = "Comparison of Dirichlet Distribution Samples",
subtitle = "30 samples from each distribution (red line shows uniform expectation)",
x = "Component",
y = "Probability") +
scale_x_continuous(breaks = 1:5) +
theme_minimal() +
theme(panel.grid.minor = element_blank(),
strip.text = element_text(size = 12, face = "bold"))

Great, now we need to express the likelihood. For each observed choice $y_i$, each strategy $s$ implies a probability of seeing $y_i$.
$$\Pr(y_i) = \sum_{s=1}^{5} p_s \Pr(y_i \mid s)$$
Read this as: the probability of $y_i$ is the weighted average of the probabilities of $y_i$ conditional on each strategy $s$. This expression is a mixture, as in earlier chapters. Sometimes you’ll read that this marginalizes out the unknown strategy. This just means averaging over the strategies, using some probability of each strategy to get its weight in the average. Above, the values in $p$ provide these weights.
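As a quick numeric illustration with hypothetical, equal strategy weights (and assuming the majority color was demonstrated first), the mixture works like this:

```r
# weights on the five strategies: majority, minority, maverick, random, follow-first
p <- rep(0.2, 5)

# Pr(y | s) for each strategy (rows) and each choice 1, 2, 3 (columns),
# assuming the majority color was demonstrated first
Pr_y_s <- rbind(c(0,   1,   0),
                c(0,   0,   1),
                c(1,   0,   0),
                c(1/3, 1/3, 1/3),
                c(0,   1,   0))

# weighted average over strategies = probability of each choice
as.vector(p %*% Pr_y_s)
#> [1] 0.2666667 0.4666667 0.2666667
```

Even with equal strategy weights, the majority choice (“2”) comes out most probable, because two of the five strategies point at it.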
We can put all these pieces together in a (very simple) statistical model.
$$
\begin{aligned}
y_i &\sim \text{Categorical}(\theta) \\
\theta &= \sum_{s=1}^{5} p_s \Pr(y_i \mid s) \\
p &\sim \text{Dirichlet}([4, 4, 4, 4, 4])
\end{aligned}
$$
On to the Stan code…
First, we list the data. Recall our outcome is y, and we’re using the variable majority_first to indicate the order in which children saw the two groups of peers. We also declare the sample size N, which Stan uses to index the loop over observations.
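Here is the data block (it also appears in the full listing below):

```stan
data{
    int N;
    array[N] int y;
    array[N] int majority_first;
}
```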
Next, we list our parameters. In this case, we’re trying to estimate the proportion of children who use each strategy:
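The parameters block declares p as a simplex, so the five proportions are constrained to be non-negative and sum to 1:

```stan
parameters{
    simplex[5] p;
}
```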
Now for the model. This bit is chunky.
model{
vector[5] phi;
// prior
p ~ dirichlet( rep_vector(4,5) );
// probability of data
for ( i in 1:N ) {
if ( y[i]==2 ) phi[1]=1; else phi[1]=0; // majority
if ( y[i]==3 ) phi[2]=1; else phi[2]=0; // minority
if ( y[i]==1 ) phi[3]=1; else phi[3]=0; // maverick
phi[4]=1.0/3.0; // random
if ( majority_first[i]==1 ) // follow first
if ( y[i]==2 ) phi[5]=1; else phi[5]=0;
else
if ( y[i]==3 ) phi[5]=1; else phi[5]=0;
// compute log( p_s * Pr(y_i|s) )
for ( j in 1:5 ) phi[j] = log(p[j]) + log(phi[j]);
// compute average log-probability of y_i
target += log_sum_exp( phi );
}
}

To run this code in R, add all of it to a character string. The function cat() allows you to see the string with the formatting (line breaks, etc.).
boxes_stan = "
data{
int N;
array[N] int y;
array[N] int majority_first;
}
parameters{
simplex[5] p;
}
model{
vector[5] phi;
// prior
p ~ dirichlet( rep_vector(4,5) );
// probability of data
for ( i in 1:N ) {
if ( y[i]==2 ) phi[1]=1; else phi[1]=0; // majority
if ( y[i]==3 ) phi[2]=1; else phi[2]=0; // minority
if ( y[i]==1 ) phi[3]=1; else phi[3]=0; // maverick
phi[4]=1.0/3.0; // random
if ( majority_first[i]==1 ) // follow first
if ( y[i]==2 ) phi[5]=1; else phi[5]=0;
else
if ( y[i]==3 ) phi[5]=1; else phi[5]=0;
// compute log( p_s * Pr(y_i|s) )
for ( j in 1:5 ) phi[j] = log(p[j]) + log(phi[j]);
// compute average log-probability of y_i
target += log_sum_exp( phi );
}
}
"
cat(boxes_stan)
data{
int N;
array[N] int y;
array[N] int majority_first;
}
parameters{
simplex[5] p;
}
model{
vector[5] phi;
// prior
p ~ dirichlet( rep_vector(4,5) );
// probability of data
for ( i in 1:N ) {
if ( y[i]==2 ) phi[1]=1; else phi[1]=0; // majority
if ( y[i]==3 ) phi[2]=1; else phi[2]=0; // minority
if ( y[i]==1 ) phi[3]=1; else phi[3]=0; // maverick
phi[4]=1.0/3.0; // random
if ( majority_first[i]==1 ) // follow first
if ( y[i]==2 ) phi[5]=1; else phi[5]=0;
else
if ( y[i]==3 ) phi[5]=1; else phi[5]=0;
// compute log( p_s * Pr(y_i|s) )
for ( j in 1:5 ) phi[j] = log(p[j]) + log(phi[j]);
// compute average log-probability of y_i
target += log_sum_exp( phi );
}
}
Let’s integrate this with brms so we can make use of some of the helper functions.
First, we’ll create an empty brmsfit object.
Next, we prepare the data for Stan:
The function stan_model will convert our character string to something that Stan can recognize.
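Here is a sketch of that workflow with rstan (object names are illustrative, and I’m assuming the boxes data live in a data frame called d with columns y and majority_first):

```r
library(rstan)

# the data as a named list, matching the Stan data block
stan_data <- list(N              = nrow(d),
                  y              = d$y,
                  majority_first = d$majority_first)

# compile the model from the character string
boxes_mod <- stan_model(model_code = boxes_stan)

# sample from the posterior
fit <- sampling(boxes_mod, data = stan_data, chains = 4, cores = 4, seed = 16)
```

From there, you can drop the result into the empty brmsfit from above (e.g., boxes_empty$fit <- fit, followed by brms::rename_pars(boxes_empty)), which is what lets the brms helper functions find the draws.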
And that’s it! You can now use (some) of the usual functions to evaluate your model.
Note that some functions will work, but others will not.
Estimate Est.Error Q2.5 Q97.5
p[1] 0.2571846 0.03657030 0.18430188 0.3266894
p[2] 0.1379903 0.03226541 0.07350504 0.1988042
p[3] 0.1474092 0.03055084 0.08339295 0.2033196
p[4] 0.1981705 0.08014053 0.06273866 0.3693370
p[5] 0.2592454 0.03187899 0.19698956 0.3205356
lp__ -667.1548403 1.50296677 -670.99662701 -665.3105596
We’ve already discussed the phenomenon of time, and we included some models that regressed an outcome (like emotion) on the most recent emotion from the same individual. This is an example of AUTOREGRESSION. In an autoregressive model, the value of the outcome in the previous time step is called a lag variable and added as a predictor to the right side of a linear model.
$$\text{Positive Emotion}_{\text{today}} = \alpha + \beta \, (\text{Positive Emotion}_{\text{yesterday}})$$
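A minimal version of that lag model in R, with simulated data (purely illustrative):

```r
library(dplyr)

set.seed(1)
# simulate 100 days of an AR(1) "emotion" series with true coefficient 0.5
d <- tibble(emotion = as.numeric(arima.sim(list(ar = 0.5), n = 100)))

# add yesterday's value as a lagged predictor, then regress on it
d <- mutate(d, emotion_lag = lag(emotion))
coef(lm(emotion ~ emotion_lag, data = d))
```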
These are extremely common, especially in the social sciences. And, they have some problems.
- It’s not really causal.
- If the states are measured with error (they probably are), then the model propagates that error.
- There is no biological, economic, or physical interpretation of the parameters.
What should we do instead? Ordinary Differential Equations (ODE)
Rows: 21
Columns: 3
$ Year <int> 1900, 1901, 1902, 1903, 1904, 1905, 1906, 1907, 1908, 1909, 1910,…
$ Lynx <dbl> 4.0, 6.1, 9.8, 35.2, 59.4, 41.7, 19.0, 13.0, 8.3, 9.1, 7.4, 8.0, …
$ Hare <dbl> 30.0, 47.2, 70.2, 77.4, 36.3, 20.6, 18.1, 21.4, 22.0, 25.4, 27.1,…
Based on example by Bob Carpenter (2018) (worth reading!).
# for annotation
text <-
tibble(name = c("Hare", "Lynx"),
label = c("Lepus", "Lynx"),
Year = c(1913.5, 1915.5),
value = c(78, 52))
# wrangle
Lynx_Hare %>%
pivot_longer(-Year) %>%
# plot!
ggplot(aes(x = Year, y = value)) +
geom_line(aes(color = name),
linewidth = 1/4) +
geom_point(aes(fill = name),
size = 3, shape = 21, color = "white") +
geom_text(data = text,
aes(label = label, color = name),
hjust = 0) +
scale_fill_grey(start = 0, end = .5) +
scale_color_grey(start = 0, end = .5) +
scale_y_continuous("thousands of pelts", breaks = 0:4 * 20, limits = c(0, 90)) +
theme(legend.position = "none")

Let’s start by modeling hares. You’ll be tempted to model the number of hares, $H$, at a given time point, but consider modeling the rate of change in hares instead:
$$\frac{dH}{dt} = H_t \times (\text{birth rate}) - H_t \times (\text{death rate})$$
For the sake of simplicity, let’s say the birth and death rates are constants.
$$\frac{dH}{dt} = H_t b_H - H_t m_H = H_t (b_H - m_H)$$
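As a sanity check, with constant rates this ODE has the familiar closed-form solution: exponential growth when $b_H > m_H$ and exponential decay when $b_H < m_H$.

```latex
H_t = H_0 \, e^{(b_H - m_H) \, t}
```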
And the presence of lynx modifies the mortality rate:

$$\frac{dH}{dt} = H_t b_H - H_t L_t m_H = H_t (b_H - L_t m_H)$$
Similar logic gives us the rate of change for lynx:

$$\frac{dL}{dt} = L_t (H_t b_L - m_L)$$
Now, this model is powerful because we can learn from it without data!
sim_lynx_hare <- function(n_steps, init, theta, dt = 0.002) {
#empty vectors to hold the lynx and hares
L <- rep(NA, n_steps)
H <- rep(NA, n_steps)
# set initial values
L[1] <- init[1]
H[1] <- init[2]
for (i in 2:n_steps) {
H[i] <- H[i - 1] + dt * H[i - 1] * (theta[1] - theta[2] * L[i - 1])
L[i] <- L[i - 1] + dt * L[i - 1] * (theta[3] * H[i - 1] - theta[4])
}
# return a tibble
tibble(t = 1:n_steps,
H = H,
L = L)
}

# set the four theta values
theta <- c(0.5, 0.05, 0.025, 0.5)
# simulate
z <- sim_lynx_hare(n_steps = 1e4,
init = c(filter(Lynx_Hare, Year == 1900) %>% pull("Lynx"),
filter(Lynx_Hare, Year == 1900) %>% pull("Hare")),
theta = theta)
# what did we do?
glimpse(z)
Rows: 10,000
Columns: 3
$ t <int> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 2…
$ H <dbl> 30.00000, 30.01800, 30.03600, 30.05401, 30.07203, 30.09005, 30.10807…
$ L <dbl> 4.000000, 4.002000, 4.004005, 4.006014, 4.008028, 4.010046, 4.012069…
z %>%
pivot_longer(-t) %>%
ggplot(aes(x = t, y = value, color = name)) +
geom_line(linewidth = 1/4) +
scale_color_grey(start = 0, end = .5) +
scale_x_continuous(expression(time~(italic(t))), breaks = NULL) +
scale_y_continuous("number (thousands)", breaks = 0:4 * 10, limits = c(0, 45)) +
theme(legend.position = "none")

Anyone notice a problem with our data? We don’t have the number of hares or lynx – we have the number of pelts! So we have a different causal model.
We want a statistical model that can connect $h_t$ to $H_t$ and connect $l_t$ to $L_t$. Part of the model would include the probability a hare was trapped in a given year, $p_H$, and a similar probability for lynx, $p_L$. To make things worse, further imagine the number of pelts for each, in a given year, was rounded to the nearest 100 and divided by 1,000. Those are our values.
We can do this, though. Suppose, for example, there is a population of $H_t = 10^4$ hares. Suppose also that the annual trapping rate varies according to a beta distribution, $p_H \sim \text{Beta}(2, 18)$. This means the average is 10%, but it is very rarely double that. We get a binomial count of pelts sampled from the population of hares, and then that is rounded to the nearest 100 and divided by 1,000. Let’s see what this sort of distribution looks like:
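A quick simulation of that measurement process, following the description above (the constants come from the text; the number of simulated measurements is my choice):

```r
set.seed(16)
n_sim <- 1e4   # number of simulated measurements
Ht    <- 1e4   # true number of hares

# trapping rate varies year to year: Beta(2, 18), mean 0.1
p <- rbeta(n_sim, 2, 18)

# binomial count of pelts, rounded to the nearest 100, rescaled to thousands
h <- rbinom(n_sim, size = Ht, prob = p)
h <- round(h / 100) * 100 / 1000
```

Plotting a density of h shows the strictly positive, right-skewed shape that motivates the Log-Normal likelihood below.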
This is our distribution for the observed pelt counts. We should consider what kind of likelihood function approximates it. A Log-Normal does a good job: the right constraint (not below 0) and the right skew (positive).
$$h_t \sim \text{Log-Normal}(\log(p_H H_t), \sigma_H)$$
Here $h_t$ is the observed number of pelts, and $p_H H_t$ is the expected number of hares trapped.
An important fact about this measurement process is that there is no good way to estimate p, not without lots of data at least. So we’re going to just fix it with a strong prior. If this makes you uncomfortable, notice that the model has forced us to realize that we cannot do any better than relative population estimates, unless we have a good way to know p. A typical time series model would just happily spin on its epicycles, teaching us nothing this useful.
probabilities of observed variables
$$
\begin{aligned}
h_t &\sim \text{Log-Normal}(\log(p_H H_t), \sigma_H) \\
l_t &\sim \text{Log-Normal}(\log(p_L L_t), \sigma_L)
\end{aligned}
$$
unobserved variables

$$
\begin{aligned}
H_1 &\sim \text{Log-Normal}(\log 10, 1) \\
L_1 &\sim \text{Log-Normal}(\log 10, 1) \\
H_{T>1} &= H_1 + \int_1^T H_t (b_H - L_t m_H) \, dt \\
L_{T>1} &= L_1 + \int_1^T L_t (H_t b_L - m_L) \, dt
\end{aligned}
$$
priors
$$
\begin{aligned}
\sigma_H &\sim \text{Exponential}(1) \\
\sigma_L &\sim \text{Exponential}(1) \\
p_H &\sim \text{Beta}(\alpha_H, \beta_H) \\
p_L &\sim \text{Beta}(\alpha_L, \beta_L) \\
b_H &\sim \text{Half-Normal}(1, 0.5) \\
b_L &\sim \text{Half-Normal}(0.05, 0.05) \\
m_H &\sim \text{Half-Normal}(0.05, 0.05) \\
m_L &\sim \text{Half-Normal}(1, 0.5)
\end{aligned}
$$
For this example, we’ll transform our data into a long-format to fit with some constraints on brms.
Lynx_Hare_long <-
Lynx_Hare %>%
pivot_longer(-Year,
names_to = "population",
values_to = "pelts") %>%
mutate(delta = if_else(population == "Lynx", 1, 0),
t = Year - min(Year) + 1) %>%
arrange(delta, Year)
# what did we do?
Lynx_Hare_long |> arrange(Year) |> head()
# A tibble: 6 × 5
Year population pelts delta t
<int> <chr> <dbl> <dbl> <dbl>
1 1900 Hare 30 0 1
2 1900 Lynx 4 1 1
3 1901 Hare 47.2 0 2
4 1901 Lynx 6.1 1 2
5 1902 Hare 70.2 0 3
6 1902 Lynx 9.8 1 3
Instead of writing the entire Stan model, we’ll just write the bit that tells Stan how to fit this kind of model, a Lotka-Volterra model.
LotkaVolterra <- "
// Specify dynamical system (ODEs)
array[] real ode_LV(real t, // time
array[] real y, // the system rate
array[] real theta, // the parameters (i.e., the birth and mortality rates)
array[] real x_r, // data constant, not used here
array[] int x_i) { // data constant, not used here
// the outcome
array[2] real dydt;
// differential equations
dydt[1] = (theta[1] - theta[2] * y[2]) * y[1]; // Hare process
dydt[2] = (theta[3] * y[1] - theta[4]) * y[2]; // Lynx process
return dydt; // return a 2-element array
}
// Integrate ODEs and prepare output
real LV(real t, real Hare0, real Lynx0,
real brHare, real mrHare,
real brLynx, real mrLynx,
real delta) {
array[2] real y0; // Initial values
array[4] real theta; // Parameters
array[1, 2] real y; // ODE solution
// Set initial values
y0[1] = Hare0;
y0[2] = Lynx0;
// Set parameters
theta[1] = brHare;
theta[2] = mrHare;
theta[3] = brLynx;
theta[4] = mrLynx;
// Solve ODEs
y = integrate_ode_rk45(ode_LV,
y0, 0, rep_array(t, 1), theta,
rep_array(0.0, 0), rep_array(1, 1),
0.001, 0.001, 100); // tolerances, steps
// Return relevant population values based on our dummy-variable coding method
return (y[1, 1] * (1 - delta) +
y[1, 2] * delta);
}
"

Now, we can incorporate this function into brms. For the sake of making everything fit, we’ll do this in pieces. First, we have our formula.
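Reconstructed from the formula echoed in the model summary below, the bf() call might look like this (a sketch; lv_formula is just my name for the object):

```r
library(brms)

lv_formula <- bf(
  # observed pelts are Log-Normal with median eta * p
  pelts ~ log(eta * p),
  # eta comes from the Lotka-Volterra ODE solver defined above
  nlf(eta ~ LV(t, H1, L1, bh, mh, bl, ml, delta)),
  # intercept-only non-linear parameters
  H1 ~ 1, L1 ~ 1, bh ~ 1, mh ~ 1, bl ~ 1, ml ~ 1,
  # separate residual scale and trapping probability for each population
  sigma ~ 0 + population,
  p ~ 0 + population,
  nl = TRUE
)
```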
And our priors:
lv_priors <- c(
prior(lognormal(log(10), 1), nlpar = H1, lb = 0),
prior(lognormal(log(10), 1), nlpar = L1, lb = 0),
prior(normal(1, 0.5), nlpar = bh, lb = 0),
prior(normal(0.05, 0.05), nlpar = bl, lb = 0),
prior(normal(0.05, 0.05), nlpar = mh, lb = 0),
prior(normal(1, 0.5), nlpar = ml, lb = 0),
prior(exponential(1), dpar = sigma, lb = 0),
prior(beta(40, 200), nlpar = p, lb = 0, ub = 1)
)

And now we fit:
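The fitting call might look something like this (a sketch: lv_formula stands in for the model formula above, and the sampler settings are read off the summary output below):

```r
m2 <- brm(
  formula = lv_formula,
  data    = Lynx_Hare_long,
  family  = lognormal(),
  prior   = lv_priors,
  # make the Lotka-Volterra functions available to Stan
  stanvars = stanvar(scode = LotkaVolterra, block = "functions"),
  chains = 4, cores = 4, iter = 2000, warmup = 1000, seed = 16
)
```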
Importantly, each of these parameters has a biological meaning. That’s not only cool, it’s useful. If only more of our models had meaningful parameters.
Family: lognormal
Links: mu = identity; sigma = identity
Formula: pelts ~ log(eta * p)
eta ~ LV(t, H1, L1, bh, mh, bl, ml, delta)
H1 ~ 1
L1 ~ 1
bh ~ 1
mh ~ 1
bl ~ 1
ml ~ 1
sigma ~ 0 + population
p ~ 0 + population
Data: Lynx_Hare_long (Number of observations: 42)
Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup draws = 4000
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
H1_Intercept 132.51 21.36 96.59 181.13 1.00 2460
L1_Intercept 38.75 6.90 27.37 54.80 1.00 2564
bh_Intercept 0.55 0.06 0.43 0.67 1.00 2096
mh_Intercept 0.00 0.00 0.00 0.01 1.00 2099
bl_Intercept 0.00 0.00 0.00 0.01 1.00 1810
ml_Intercept 0.80 0.09 0.64 1.01 1.00 2119
p_populationHare 0.18 0.02 0.13 0.23 1.00 2260
p_populationLynx 0.17 0.02 0.13 0.22 1.00 2523
sigma_populationHare 0.25 0.05 0.18 0.36 1.00 3555
sigma_populationLynx 0.26 0.05 0.18 0.37 1.00 3347
Tail_ESS
H1_Intercept 2610
L1_Intercept 2488
bh_Intercept 2487
mh_Intercept 2263
bl_Intercept 2395
ml_Intercept 2613
p_populationHare 2858
p_populationLynx 2127
sigma_populationHare 2479
sigma_populationLynx 1983
Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
The parameters cooperate to produce population dynamics that are not easy to read in the table. Luckily, we can get the posterior predictive distributions.
expose_functions(m2, vectorize=TRUE)
pred = predicted_draws(m2, newdata=Lynx_Hare_long)
pred |>
filter(.draw <= 20) |>
ggplot(aes(x = Year, y = .prediction)) +
geom_line(aes(group = interaction(.draw, population), color = population),
linewidth = 1/3, alpha = 1/2) +
scale_color_manual(values = c("#1c5253" , "#e07a5f")) +
scale_y_continuous("thousands of pelts", breaks = 0:6 * 20) +
scale_x_continuous(NULL) +
coord_cartesian(ylim = c(0, 120)) +
theme(legend.position = "top")

If we want to get the population of animals, rather than the number of pelts, we have to work backwards. From our predictions we have the number of pelts, which is:
$$\text{pelts} = \text{population} \times p_{\text{trapped}}$$
We estimated the value of $p$ for each population – albeit with a tight prior – in the model.
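The post object below holds the posterior draws from the fitted model; one way to create it:

```r
library(brms)

# one row per posterior draw; columns include
# b_p_populationHare and b_p_populationLynx
post <- as_draws_df(m2)
```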
post |>
select(starts_with("b_p_")) |>
rowid_to_column(var = ".draw") |>
pivot_longer(
-.draw,
names_to = "population",
names_prefix = "b_p_population",
values_to = "ptrapped") |>
full_join(pred) |>
mutate(pop_est = .prediction/ptrapped) |>
ggplot(aes(x = Year, y = pop_est)) +
geom_line(aes(group = interaction(.draw, population), color = population),
linewidth = 1/3, alpha = 1/2) +
scale_color_manual(values = c("#1c5253" , "#e07a5f")) +
scale_y_continuous("thousands of animals") +
scale_x_continuous(NULL) +
theme(legend.position = "top")