User’s Guide – interflex

An R package that produces “flexible” marginal effect estimates with multiplicative interaction models

Reference: How Much Should We Trust Estimates from Multiplicative Interaction Models? Simple Tools to Improve Empirical Practice

Authors: Jens Hainmueller (Stanford); Jonathan Mummolo (Stanford); Yiqing Xu (UCSD)

Maintainer: Yiqing Xu [yiqingxu@ucsd.edu]

Date: March 5, 2017

Version: 1.0.3.

Changes in this version: CRAN Version.

Changes in Version 1.0.2: (1) Fix bugs; (2) Integrate the Wald test into inter.binning; (3) Complete R help files; (4) Updated fast fixed effects algorithm.

Changes in Version 1.0.1: (1) Enhanced speed (with C++ code embedded); (2) Support linear fixed effect models; (3) Improved cross-validation procedure; (4) Improved visualization.

R source files can be found on Github. R code appearing in this demonstration can be downloaded from here.

Comments and suggestions will be greatly appreciated!


Contents

  1. Installation
  2. Load package and toy examples
  3. Raw plots (and GAM plots)
  4. Binning estimates
  5. Kernel estimates
  6. Fixed effects

Installation

You can install the interflex package from CRAN:

install.packages('interflex', type = "source", repos = 'http://cran.us.r-project.org') 

Or you can install the up-to-date development version from Github:

install.packages('devtools', repos = 'http://cran.us.r-project.org') # if not already installed
library(devtools)
install_github('xuyiqing/interflex')

interflex depends on the following packages, which will be installed AUTOMATICALLY when interflex is being installed; if not, please install them manually:

require(Rcpp) # for processing C++ code
require(ggplot2)  # for plotting
require(plotrix) # for plotting
require(mgcv) # for GAM
require(pcse) # in case panel-corrected standard errors will be used
require(foreach)  # for parallel computing in kernel estimation
require(doParallel) # for parallel computing in kernel estimation
require(lmtest) # for wald test
require(Lmoments) # for L-kurtosis measure

Mac users who encounter “-lgfortran” or “-lquadmath” error during installation, please check out the solution here. Typing the following two lines of code in your Terminal should solve this problem.

curl -O http://r.research.att.com/libs/gfortran-4.8.2-darwin13.tar.bz2
sudo tar fvxz gfortran-4.8.2-darwin13.tar.bz2 -C /

Load Package and Toy Examples

Now let’s load the package as well as four simulated toy datasets:

library(interflex)
data(interflex)
ls()
## [1] "s1" "s2" "s3" "s4"

s1 is a case of dichotomous treatment indicator with linear marginal effects; s2 is a case of continuous treatment indicator with linear marginal effects; s3 is a case of dichotomous treatment indicator with nonlinear marginal effects; and s4 is a case of dichotomous treatment indicator, nonlinear marginal effects, with additive two-way fixed effects. They are generated using the following code.

set.seed(1234)
n<-200
d1<-sample(c(0,1),n,replace=TRUE) # dichotomous treatment
d2<-rnorm(n,3,1) # continuous treatment
x<-rnorm(n,3,1) # moderator
z<-rnorm(n,3,1) # covariate
e<-rnorm(n,0,1) # error term

## linear marginal effect
y1<-5 - 4 * x - 9 * d1 + 3 * x * d1 + 1 * z + 2 * e
y2<-5 - 4 * x - 9 * d2 + 3 * x * d2 + 1 * z + 2 * e
s1<-cbind.data.frame(Y = y1, D = d1, X = x, Z1 = z)
s2<-cbind.data.frame(Y = y2, D = d2, X = x, Z1 = z)

## quadratic marginal effect
x3 <- runif(n, -3,3) # uniformly distributed moderator
y3 <- d1*(x3^2-2.5) + (1-d1)*(-1*x3^2+2.5) + 1 * z + 2 * e
s3 <- cbind.data.frame(D=d1, X=x3, Y=y3, Z1 = z)

## adding two-way fixed effects
n  <- 500
d4 <-sample(c(0,1),n,replace=TRUE) # dichotomous treatment
x4 <- runif(n, -3,3) # uniformly distributed moderator
z4 <- rnorm(n, 3,1) # covariate
alpha <- 20 * rep(rnorm(n/10), each = 10)
xi <- rep(rnorm(10), n/10)
y4 <- d4*(x4^2-2.5) + (1-d4)*(-1*x4^2+2.5) + 1 * z4 + alpha + xi + 2 * rnorm(n,0,1)
s4 <- cbind.data.frame(D=d4, X=x4, Y=y4, Z1 = z4, unit = rep(1:(n/10), each = 10), year = rep(1:10, (n/10)))

The implied population marginal effects for the DGPs of s1 and s2 are \[ME(X) = 3X - 9;\] the implied population marginal effects for the DGPs of s3 and s4 are \[ME(X) = 2X^{2} - 5.\]

The interflex package ships four functions: inter.raw, inter.gam, inter.binning, and inter.kernel, which we will explain in detail below.


Raw Plots

The first step of the diagnostics is to plot raw data. We supply the function inter.raw with the variable names of the outcome Y, the treatment D, and the moderator X. You can also supply labels for these variables. If you supply a variable name to the weights option, the linear and LOESS fits will be adjusted based on the weights. Note that the correlations between covariates Z and Y are NOT partialed out.

inter.raw(Y = "Y", D = "D", X = "X", data = s1, weights = NULL, Ylabel = "Outcome", Dlabel = "Treatment", Xlabel="Moderator")

inter.raw(Y = "Y", D = "D", X = "X", data = s2, Ylabel = "Outcome", Dlabel = "Treatment", Xlabel="Moderator")