out.fect <- fect(Y ~ D + X1 + X2, data = sim_base, index = c("id","time"),
  method = "fe", force = "two-way", se = TRUE,
  parallel = TRUE, cores = 16, nboots = 1000)

8 Effect Heterogeneity
We provide several methods for exploring heterogeneous treatment effects (HTE). These methods help distinguish between effect modification (how the treatment effect varies across subpopulations) and causal moderation (whether changing the moderator causally alters the treatment effect). This chapter demonstrates both the descriptive HTE tools and the formal causal moderation framework. The R script used in this chapter can be downloaded here.
8.1 Basic HTE Visualization
We start with descriptive tools using sim_base. These work with any estimation method; here we demonstrate with the FE estimator.
8.1.1 Box plot
One way to understand HTE is to use a series of box plots to visualize the estimated individual treatment effects of observations under the treatment condition (by setting type = "box"). Although these effects are not identified at the level of the individual observation, their dispersion is informative about treatment effect heterogeneity at different (relative) time periods, as well as about model performance.
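As a rough illustration of what such a box plot summarizes, the base-R sketch below simulates individual effect estimates for treated observations and groups them by period relative to treatment onset. The data frame `eff`, its columns, and the simulated values are hypothetical stand-ins for the effects that fect() estimates; this is not the package's plotting code.

```r
set.seed(1)
# Hypothetical individual effect estimates for treated observations:
# 50 observations in each of 5 post-treatment periods.
eff <- data.frame(
  rel_time = rep(1:5, each = 50),
  effect   = rnorm(250, mean = rep(1:5, each = 50) * 0.5, sd = 1)
)

# One box per relative period; the spread within a box reflects the
# dispersion of the individual effect estimates in that period.
bp <- boxplot(effect ~ rel_time, data = eff,
              xlab = "Time since treatment onset",
              ylab = "Estimated individual effect")
abline(h = 0, lty = 2)

# Interquartile range by period, a simple numeric summary of dispersion.
iqr_by_period <- tapply(eff$effect, eff$rel_time, IQR)
print(round(iqr_by_period, 2))
```

A period whose box is much wider than the others would be a natural place to look for heterogeneity or for poor model fit.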
8.1.2 By calendar time
Another way to explore HTE is to investigate how the treatment effect evolves over time. In the plot below, the point estimates represent the ATTs by calendar time; the blue curve and band represent a lowess fit of the estimates and its 95% confidence interval, respectively; and the red horizontal dashed line represents the ATT (averaged over all time periods).
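The elements of this plot can be sketched in base R as follows. The data are simulated, the object names are hypothetical, and `loess()` is used as the smoother by assumption; the sketch only shows how period-level ATTs, a smoothed fit with a pointwise 95% band, and a pooled ATT relate to one another.

```r
set.seed(2)
# Hypothetical treated observations: calendar period and an individual effect.
obs <- data.frame(
  time   = sample(21:35, 600, replace = TRUE),
  effect = rnorm(600, mean = 3, sd = 2)
)

# ATT by calendar period: average the individual effects within each period.
att_time <- aggregate(effect ~ time, data = obs, FUN = mean)

# Pooled ATT: average over all treated observations in all periods.
att_all <- mean(obs$effect)

# Smoothed fit of the period-level ATTs with a pointwise 95% band.
fit   <- loess(effect ~ time, data = att_time)
pred  <- predict(fit, newdata = att_time, se = TRUE)
upper <- pred$fit + qt(0.975, pred$df) * pred$se.fit
lower <- pred$fit - qt(0.975, pred$df) * pred$se.fit

plot(att_time$time, att_time$effect, pch = 16,
     ylim = range(c(lower, upper, att_time$effect)),
     xlab = "Calendar time", ylab = "ATT")
lines(att_time$time, pred$fit, col = "blue")
lines(att_time$time, upper, col = "blue", lty = 3)
lines(att_time$time, lower, col = "blue", lty = 3)
abline(h = att_all, col = "red", lty = 2)
```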
8.1.3 By a covariate
By setting type = "hte" or type = "heterogeneous", we can also plot the HTE by any covariate that is unaffected by the treatment. As before, the blue curve and band represent a lowess fit of the estimates and its 95% confidence interval, respectively. The red dashed line represents the ATT. The histogram at the bottom of the figure shows the distribution of the covariate and can be turned off using show.count = FALSE. In our simulated case, the effect size is unrelated to the values of covariate X1.
plot(out.fect, type = "hte", covariate = "X1")

We can also plot the CATT when a covariate is discrete. To demonstrate this, we artificially create a moderating variable X3, which must be included in the outcome model and then specified in the heterogeneous treatment effect plot.
sim_base$X3 <- sample(1:3, size = nrow(sim_base), replace = TRUE)
out.fect.X3 <- fect(Y ~ D + X1 + X2 + X3, data = sim_base, index = c("id","time"),
  method = "fe", se = TRUE, seed = 123,
  nboots = 1000, parallel = TRUE, cores = 16)
#>
#> +----------------------------------------------------------+
#> | Parallel computing: using 16 of 14 available cores. |
#> | |
#> | To change: set cores = <n> in fect(). |
#> | Default: min(available - 2, 8). |
#> +----------------------------------------------------------+

As expected, there is not much effect heterogeneity along X3. In the resulting figure, we can also assign labels to the discrete values of the moderator using covariate.labels.
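For intuition, a group-wise conditional ATT with labeled categories can be computed by hand. The sketch below uses simulated effects and a randomly drawn three-level moderator like X3; the labels "Low", "Medium", and "High" and all object names are made up for illustration, and the normal-approximation interval is a simplification rather than the bootstrap inference used by fect().

```r
set.seed(3)
# Hypothetical treated observations with a three-level moderator that is
# unrelated to the effect, like the randomly drawn X3.
n   <- 900
X3  <- sample(1:3, size = n, replace = TRUE)
eff <- rnorm(n, mean = 3, sd = 2)

# Attach readable labels to the discrete moderator values (labels are made up).
grp <- factor(X3, levels = 1:3, labels = c("Low", "Medium", "High"))

# Group-wise average effect with a normal-approximation 95% interval.
catt <- as.numeric(tapply(eff, grp, mean))
se   <- as.numeric(tapply(eff, grp, function(x) sd(x) / sqrt(length(x))))
out  <- data.frame(group = levels(grp), catt = catt,
                   lower = catt - 1.96 * se, upper = catt + 1.96 * se)
print(out)
```

Because the moderator is drawn independently of the effects, the three intervals should largely overlap.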
8.2 Causal Moderation
A prevalent approach for examining treatment effect heterogeneity in political science is the two-way fixed effects (TWFE) model with a multiplicative interaction term. Despite its widespread use, this model conflates two conceptually distinct quantities:
- Effect modification: How does the average treatment effect vary across subpopulations defined by the moderator? This is a correlational relationship.
- Causal moderation: Does exogenously changing the moderator causally alter the treatment effect? This is a mechanistic relationship.
Under the ignorability of \(M\) assumption (that is, there are no unobserved links between the moderator and confounders), the two estimands coincide. In practice, a clear divergence between the two estimates is therefore a useful diagnostic: it suggests that this assumption may not hold.
8.2.1 Estimation with cm = TRUE
The cm (causal moderation) option in fect() fits two separate imputation models: one for untreated potential outcomes \(\hat{g}_0\) and one for treated potential outcomes \(\hat{g}_1\). The causal moderation CME is then \(\hat{\theta}(m) = \hat{g}_1(m) - \hat{g}_0(m)\).
Currently cm is available for the "fe" and "ife" methods.
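The dual-imputation idea can be sketched in a few lines of base R. This is a deliberately simplified cross-sectional illustration on simulated data: it omits unit and time fixed effects, uses plain `lm()` fits, and all object names are hypothetical. It is not how fect() implements cm = TRUE internally.

```r
set.seed(4)
n <- 2000
M <- rnorm(n)            # moderator
D <- rbinom(n, 1, 0.3)   # treatment indicator
# Simulated outcome: the effect of D is 1 + 0.5 * M by construction.
Y <- 2 + 0.8 * M + D * (1 + 0.5 * M) + rnorm(n)
dat <- data.frame(Y, D, M)

# g0: outcome model fitted on untreated cells only.
g0 <- lm(Y ~ M, data = dat[dat$D == 0, ])
# g1: outcome model fitted on treated cells only.
g1 <- lm(Y ~ M, data = dat[dat$D == 1, ])

# theta(m) = g1(m) - g0(m), evaluated on a grid of moderator values.
grid  <- data.frame(M = seq(-2, 2, by = 1))
theta <- predict(g1, newdata = grid) - predict(g0, newdata = grid)
print(round(cbind(m = grid$M, theta = theta), 2))
```

In this simulation the estimated \(\hat{\theta}(m)\) should track the true effect \(1 + 0.5m\) up to sampling noise.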
out.cm <- fect(Y ~ D + X1 + X2, data = sim_base, index = c("id", "time"),
  method = "fe", force = "two-way", se = TRUE,
  cm = TRUE, parallel = TRUE, cores = 16, nboots = 1000)

8.2.2 Effect modification vs. causal moderation
The default HTE plot shows the effect modification estimate (projecting \(\hat{\delta}_{it}\) onto the moderator):
plot(out.cm, type = "hte", covariate = "X1",
  xlab = "Moderator (X1)", ylab = "Effect on Y")

By adding cm = TRUE to the same plot call, we get the causal moderation estimate.
When the ignorability assumption holds (as in the simulated data), the two curves should be similar.
8.2.3 Scatter without smoothing
To inspect the raw relationship without loess smoothing, set loess.fit = FALSE in the plot call.
8.3 Diagnostics
8.3.1 Placebo test (pre-treatment HTE)
The placebo test applies the HTE estimation to pre-treatment periods only. Under the conditional parallel trends assumption (CPTA), there should be no relationship between the moderator and the “treatment effect” before treatment onset:
plot(out.cm, type = "hte", covariate = "X1",
  pretreatment = TRUE, num.pretreatment = 3,
  xlab = "X1", ylab = "Placebo Effect")

If the curve is flat around zero, it supports CPTA. A significant non-zero pattern signals that the moderator co-varies with pre-existing trends. The same test can be applied to the causal moderation estimate.
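The logic of the placebo check can be mimicked with a simple regression. In the sketch below, the pre-treatment "effects" are simulated to be unrelated to the moderator, which is the pattern CPTA implies; the variable names are hypothetical and a linear fit stands in for the smoothed curve in the plot.

```r
set.seed(5)
n <- 1500
M <- rnorm(n)
# Hypothetical pre-treatment placebo effects with no relation to M.
placebo <- rnorm(n, mean = 0, sd = 1)

# Regress the placebo effects on the moderator and inspect the slope.
fit      <- lm(placebo ~ M)
slope    <- unname(coef(fit)["M"])
slope_ci <- confint(fit)["M", ]
print(round(c(slope = slope, slope_ci), 3))
```

A slope interval that sits well away from zero would be the numerical counterpart of a non-flat placebo curve.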
8.4 Over-Identification Test
The linear CM model assumes \(M\) enters linearly. If the true relationship involves \(M^2\) or \(M \times X\), the linear model is misspecified. fect_iden() tests this by regressing residuals on nonlinear terms:
\[
H_0: \text{Nonlinear terms have no explanatory power} \quad \Rightarrow \quad n \times R^2 \sim \chi^2(\text{df})
\]
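A minimal base-R sketch of an \(n \times R^2\) test of this kind is given below. It follows the verbal description above (residuals regressed on nonlinear terms only) on simulated data; the choice of terms, the variable names, and the resulting df = 2 are assumptions for illustration and do not reproduce the internals of fect_iden().

```r
set.seed(6)
n <- 1000
M <- rnorm(n)
X <- rnorm(n)
# Outcome that is truly linear in M and X, so the null should hold.
Y <- 1 + 0.5 * M + 0.3 * X + rnorm(n)

# Step 1: residuals from the linear specification.
res <- resid(lm(Y ~ M + X))

# Step 2: auxiliary regression of the residuals on nonlinear terms
# (one quadratic and one interaction term, so df = 2 here).
aux <- lm(res ~ I(M^2) + I(M * X))
r2  <- summary(aux)$r.squared

# Step 3: compare n * R^2 with a chi-squared distribution.
stat <- n * r2
df   <- 2
p    <- pchisq(stat, df = df, lower.tail = FALSE)
cat("stat =", round(stat, 3), " df =", df, " p =", round(p, 4), "\n")
```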
iden.test <- fect_iden(out.cm, moderator = "X1")
cat("=== Treated cells (e1) ===\n")
#> === Treated cells (e1) ===
cat(" n =", iden.test$e1$n, "\n")
#> n = 1493
cat(" R-squared =", round(iden.test$e1$r2, 4), "\n")
#> R-squared = 7e-04
cat(" Test stat =", round(iden.test$e1$stat, 3), "\n")
#> Test stat = 1.03
cat(" df =", iden.test$e1$df, "\n")
#> df = 1
cat(" p-value =", round(iden.test$e1$p, 4), "\n\n")
#> p-value = 0.3102
cat("=== Control cells (e0) ===\n")
#> === Control cells (e0) ===
cat(" n =", iden.test$e0$n, "\n")
#> n = 5497
cat(" R-squared =", round(iden.test$e0$r2, 4), "\n")
#> R-squared = 0.0012
cat(" Test stat =", round(iden.test$e0$stat, 3), "\n")
#> Test stat = 6.519
cat(" df =", iden.test$e0$df, "\n")
#> df = 1
cat(" p-value =", round(iden.test$e0$p, 4), "\n")
#> p-value = 0.0107

- Large p-values (\(> 0.05\)): No evidence against the linear specification.
- Small p-values (\(< 0.05\)): Evidence of nonlinear effects. Consider more flexible specifications.
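The printed p-values can be checked directly from the printed test statistics, since each is an upper-tail chi-squared probability. The sketch below uses only the rounded numbers shown in the output, so small rounding differences are expected; the variable names are hypothetical.

```r
# Test statistics and degrees of freedom as printed in the output.
stat_e1 <- 1.03;  df_e1 <- 1   # treated cells
stat_e0 <- 6.519; df_e0 <- 1   # control cells

# Upper-tail chi-squared probabilities.
p_e1 <- pchisq(stat_e1, df = df_e1, lower.tail = FALSE)
p_e0 <- pchisq(stat_e0, df = df_e0, lower.tail = FALSE)
print(round(c(treated = p_e1, control = p_e0), 4))

# Decision at the 5% level: TRUE means the linear specification is rejected.
reject <- c(treated = p_e1 < 0.05, control = p_e0 < 0.05)
print(reject)
```

Read against the rule of thumb above, the treated-cell test gives no evidence against linearity, while the control-cell p-value falls below 0.05.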
You can also test quadratic and interaction terms separately:
# Quadratic terms only (no interactions)
iden.quad <- fect_iden(out.cm, moderator = "X1", interaction = FALSE)
cat("Quadratic-only: p =",
  round(iden.quad$e1$p, 4), "(treated),",
  round(iden.quad$e0$p, 4), "(control)\n")
#> Quadratic-only: p = NA (treated), NA (control)
# Interactions only (no quadratics)
iden.inter <- fect_iden(out.cm, moderator = "X1", quadratic = FALSE)
cat("Interaction-only: p =",
  round(iden.inter$e1$p, 4), "(treated),",
  round(iden.inter$e0$p, 4), "(control)\n")
#> Interaction-only: p = 0.4804 (treated), 0.8892 (control)

8.5 Discrete Moderator with Causal Moderation
Since X3 is randomly assigned and unrelated to the data-generating process (DGP), both the effect modification and the causal moderation estimates should show no significant heterogeneity.
8.6 Summary of Parameters
| Parameter | Description |
|---|---|
| cm = TRUE in fect() | Estimate causal moderation model (dual imputation) |
| cm = TRUE in plot() | Plot causal moderation effect \(\hat{\theta}(m)\) |
| type = "hte" | HTE plot: individual effects vs. covariate |
| covariate | Specify the moderator variable |
| pretreatment = TRUE | Restrict to pre-treatment periods (placebo test) |
| num.pretreatment | Number of pre-treatment periods to include |
| loess.fit = FALSE | Scatter only, without loess smoothing |
| covariate.labels | Labels for discrete moderator categories |
| fect_iden() | Over-identification test for linearity |