The following functions are wrappers and may differ slightly from the original code; they should NOT be considered the official source code for replication. For an exact replication of the original paper, please refer to the source GitHub repository.
1.1 Installation
Several R packages are required for the subsequent data analysis and visualization. The code below checks for all required packages and installs any that are missing. After installation, call library() to load the packages.
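A minimal sketch of such a check; the package list below is inferred from the functions documented in this section and may not match the official script exactly:

# Inferred package list; adjust to match the analysis you run
pkgs <- c("ggplot2", "grf", "Matching", "estimatr", "CBPS", "hbal",
          "DoubleML", "mlr3learners", "qte", "sensemakr")
missing <- pkgs[!pkgs %in% rownames(installed.packages())]
if (length(missing) > 0) install.packages(missing)
invisible(lapply(pkgs, library, character.only = TRUE))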
Next, we outline the 10 wrapper functions designed to address 6 research objectives. The table below offers a brief overview of each category. Clicking “More” will direct you to a subsequent section with detailed explanations as well as the full code.
Note
To use the wrapper functions, the simplest method is to source the R script with the following code. Please note, this is NOT the official replication file, and you should carefully review the code before using it.
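For instance, assuming the script has been saved locally under a placeholder file name:

# Placeholder file name; substitute the actual script path or URL
source("causal_inference_wrappers.R")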
1.2.1 love.plot()
love.plot() produces a summary plot of covariate balance before and after conditioning, popularized by Dr. Thomas E. Love.
Code
love.plot <- function(data_pre, data_post, treat, covar,
                      threshold = 0.1, title = "Love.Plot") {
  # standardized mean difference for each covariate
  standardized_diff <- function(data, treat, covar) {
    treated <- data[data[[treat]] == 1, ]
    control <- data[data[[treat]] == 0, ]
    std_diff <- sapply(covar, function(var) {
      mean_treated <- mean(treated[[var]], na.rm = TRUE)
      mean_control <- mean(control[[var]], na.rm = TRUE)
      sd_pooled <- sqrt((var(treated[[var]], na.rm = TRUE) +
                           var(control[[var]], na.rm = TRUE)) / 2)
      (mean_treated - mean_control) / sd_pooled
    })
    return(std_diff)
  }
  std_diff_pre <- standardized_diff(data_pre, treat, covar)
  std_diff_post <- standardized_diff(data_post, treat, covar)
  love_data <- data.frame(
    Variable = rep(covar, 2),
    Std_Diff = c(std_diff_pre, std_diff_post),
    Matching = rep(c("Pre-Matching", "Post-Matching"), each = length(covar))
  )
  p <- ggplot(love_data, aes(x = Variable, y = Std_Diff, color = Matching)) +
    geom_point(size = 3) +
    geom_hline(yintercept = 0, linetype = "dashed", color = "black") +
    geom_hline(yintercept = threshold, linetype = "dashed", color = "red") +
    geom_hline(yintercept = -threshold, linetype = "dashed", color = "red") +
    coord_flip() +
    labs(title = title, x = "Covariates", y = "Standardized Mean Differences") +
    theme_minimal() +
    theme(panel.border = element_rect(color = "black", fill = NA, size = 1))
  return(p)
}
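As a hypothetical illustration, where the data frame d, its matched subset d_matched (e.g., from psmatch(), introduced below), and the variable names are placeholders:

# Placeholder data and variable names; substitute your own
covars <- c("age", "educ", "re74", "re75")
love.plot(data_pre = d, data_post = d_matched,
          treat = "treat", covar = covars,
          threshold = 0.1, title = "Covariate Balance")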
1.2.2 assess_overlap()
assess_overlap() estimates the overlap in propensity scores between the treatment and control groups. It fits a probability forest to estimate propensity scores based on the covariates (cov) and the treatment indicator in the input dataset, then bounds propensity scores that are essentially zero away from zero to facilitate further calculations. Finally, it calls plot_hist to visualize the distribution of the propensity scores or their log odds, depending on the odds parameter.
Code
assess_overlap <- function(data, treat, cov, odds = TRUE, num.trees = NULL,
                           seed = 1234, breaks = 50, xlim = NULL, ylim = NULL) {
  if (is.null(num.trees)) {
    p.forest1 <- probability_forest(X = data[, cov],
                                    Y = as.factor(data[, treat]), seed = seed)
  } else {
    p.forest1 <- probability_forest(X = data[, cov],
                                    Y = as.factor(data[, treat]), seed = seed,
                                    num.trees = num.trees)
  }
  data$ps_assoverlap <- p.forest1$predictions[, 2]
  # bound propensity scores away from zero
  data$ps_assoverlap[which(abs(data$ps_assoverlap) <= 1e-7)] <- 1e-7
  if (odds == TRUE) {
    plot_hist(data, "ps_assoverlap", treat, odds = TRUE, breaks = breaks,
              density = TRUE, main = "", xlim = xlim, ylim = ylim)
  } else {
    plot_hist(data, "ps_assoverlap", treat, odds = FALSE, breaks = breaks,
              density = TRUE, main = "", xlim = c(0, 1), ylim = ylim)
  }
  return(data)
}
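A hypothetical call (d and the variable names are placeholders; plot_hist() must also be sourced, since assess_overlap() calls it internally):

# Returns d with a new ps_assoverlap column and draws the histogram
d_ps <- assess_overlap(data = d, treat = "treat",
                       cov = c("age", "educ", "re74", "re75"),
                       odds = TRUE, breaks = 40)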
Arguments
Data
data: The targeted dataset.
var: The variable of interest to plot.
treat: The (binary) treatment indicator documented in the dataset (usually 0 for control and 1 for treated).
Analysis
odds: If TRUE, the function transforms the variable into log odds.
cov: Covariates used to estimate the propensity score.
num.trees: Number of trees to use in the probability forest. If NULL, a default is used.
seed: Seed for reproducibility.
Plotting
breaks, density, xlim, ylim, xlab: Parameters for histogram aesthetics and scaling.
text.size: The size of the text for additional information on the plot.
1.2.3 psmatch()
The psmatch() function matches observations in the treatment group to those in the control group on their propensity scores. Matching is one-to-one and, by default, without replacement. The function returns a subset of the original dataset containing only the matched cases.
Code
psmatch <- function(data, Y, treat, cov, num.trees = 4000, seed = 1234,
                    replace = FALSE, estimand = "ATT") {
  set.seed(seed)  # need to set seed b/c tie-breaking is random
  data$psmatch <- probability_forest(X = data[, cov],
                                     Y = as.factor(data[, treat]),
                                     seed = seed,
                                     num.trees = num.trees)$predictions[, 2]
  mout <- Match(Y = data[, Y], Tr = data[, treat], X = data$psmatch,
                estimand = estimand, M = 1, BiasAdjust = FALSE,
                replace = replace, ties = FALSE)
  data <- data[c(mout$index.treated, mout$index.control), ]
  return(data)
}
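A hypothetical call (placeholder data and variable names):

# One-to-one matching without replacement on forest-based propensity scores
d_matched <- psmatch(data = d, Y = "re78", treat = "treat",
                     cov = c("age", "educ", "re74", "re75"))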
Arguments
The parameters used in plot_hist and assess_overlap carry over as inputs to psmatch. Note that psmatch uses Y to denote the outcome variable of interest. Additional parameters are:
replace: A boolean indicating whether sampling of controls is with replacement (default is FALSE).
estimand: The quantity to be estimated, defaulting to the ATT.
1.2.4 estimate_all() & plot_coef()
estimate_all() is a comprehensive tool for estimating the Average Treatment Effect on the Treated (ATT) with observational data. Here, we condense several estimators into a single call, including:
Difference in Means
Regression
Oaxaca-Blinder (OM: Reg) and Generalized Random Forests (OM: GRF) as outcome models
Inverse Probability Weighting (IPW), Covariate Balancing Propensity Score (CBPS), and Entropy Balancing
Double/debiased machine learning using elastic net
Augmented Inverse Probability Weighting (AIPW) with GRF
Code
quiet <- function(x) {
  sink(tempfile())
  on.exit(sink())
  invisible(force(x))
}

# difference in means
diff <- function(data, Y, treat) {
  fml <- as.formula(paste(Y, "~", treat))
  out <- summary(lm_robust(fml, data = data,
                           se_type = "stata"))$coefficients[treat, c(1, 2, 5, 6)]
  return(out)  # extract coef, se, ci.lower, ci.upper
}

# regression adjustment
reg <- function(data, Y, treat, covar) {
  fml <- as.formula(paste(Y, "~", treat, "+", paste(covar, collapse = " + ")))
  out <- summary(lm_robust(fml, data = data,
                           se_type = "stata"))$coefficients[treat, c(1, 2, 5, 6)]
  return(out)  # extract coef, se, ci.lower, ci.upper
}

# nearest-neighbor matching on covariates
# library(Matching)
matching <- function(data, Y, treat, covar) {
  m.out <- Match(Y = data[, Y], Tr = data[, treat], X = data[, covar],
                 Z = data[, covar], estimand = "ATT", M = 5,
                 replace = TRUE, ties = TRUE, BiasAdjust = TRUE)
  out <- c(m.out$est[1], m.out$se[1],
           m.out$est[1] - 1.96 * m.out$se[1],
           m.out$est[1] + 1.96 * m.out$se[1])
  return(out)
}

# propensity score matching
psm <- function(data, Y, treat, covar) {
  ps <- probability_forest(X = data[, covar], Y = as.factor(data[, treat]),
                           seed = 1234, num.trees = 4000)$predictions[, 2]
  m.out <- Match(Y = data[, Y], Tr = data[, treat],
                 X = matrix(ps, nrow(data), 1), estimand = "ATT", M = 1,
                 replace = FALSE, ties = FALSE, BiasAdjust = FALSE)
  if (is.null(m.out$se) == FALSE) {
    se <- m.out$se[1]
  } else {
    se <- m.out$se.standard[1]
  }
  out <- c(m.out$est[1], se, m.out$est[1] - 1.96 * se, m.out$est[1] + 1.96 * se)
  return(out)
}

# OM (reg)
om.reg <- function(data, Y, treat, covar) {
  tr <- which(data[, treat] == 1)
  co <- which(data[, treat] == 0)
  fml <- as.formula(paste(Y, "~", paste(covar, collapse = " + ")))
  out.co <- lm(fml, data = data[co, ])
  Y.tr.hat <- predict(out.co, newdata = data[tr, covar, drop = FALSE])
  newdata <- cbind.data.frame(Y = c(data[tr, Y], Y.tr.hat),
                              treat = rep(c(1, 0), each = length(tr)))
  out <- summary(lm_robust(Y ~ treat, data = newdata,
                           se_type = "stata"))$coefficients["treat", c(1, 2, 5, 6)]
  return(out)
}

# OM (grf)
# library(grf)
om.grf <- function(data, Y, treat, covar) {
  tr <- which(data[, treat] == 1)
  co <- which(data[, treat] == 0)
  out.co <- regression_forest(X = data[co, covar, drop = FALSE],
                              Y = as.vector(data[co, Y]))
  Y.tr.hat <- as.vector(unlist(predict(out.co,
                                       newdata = data[tr, covar, drop = FALSE])))
  newdata <- cbind.data.frame(Y = c(data[tr, Y], Y.tr.hat),
                              treat = rep(c(1, 0), each = length(tr)))
  out <- summary(lm_robust(Y ~ treat, data = newdata,
                           se_type = "stata"))$coefficients["treat", c(1, 2, 5, 6)]
  return(out)
}

# IPW
ipw <- function(data, Y, treat, covar) {
  ps <- probability_forest(X = data[, covar, drop = FALSE],
                           Y = as.factor(data[, treat]),
                           seed = 1234)$predictions[, 2]
  fml <- as.formula(paste(Y, "~", treat))
  weights <- rep(1, nrow(data))
  co <- which(data[, treat] == 0)
  weights[co] <- ps[co] / (1 - ps[co])
  out <- summary(lm_robust(fml, data = data, weights = weights,
                           se_type = "stata"))$coefficients[treat, c(1, 2, 5, 6)]
  return(out)  # extract coef, se, ci.lower, ci.upper
}

# CBPS
# library("CBPS")
cbps <- function(data, Y, treat, covar) {
  fml <- as.formula(paste(treat, "~", paste(covar, collapse = " + ")))
  ps <- quiet(CBPS(fml, data = data, standardize = TRUE)$fitted.values)
  fml <- as.formula(paste(Y, "~", treat))
  weights <- rep(1, nrow(data))
  co <- which(data[, treat] == 0)
  weights[co] <- ps[co] / (1 - ps[co])
  out <- summary(lm_robust(fml, data = data, weights = weights,
                           se_type = "stata"))$coefficients[treat, c(1, 2, 5, 6)]
  return(out)
}

# ebal
# library(hbal)
ebal <- function(data, Y, treat, covar) {
  ebal.out <- hbal::hbal(Y = Y, Treat = treat, X = covar, data = data,
                         expand.degree = 1)
  out <- hbal::att(ebal.out, dr = FALSE)[1, c(1, 2, 5, 6)]
  return(out)
}

# hbal
# hbal <- function(data, Y, treat, covar) {
#   hbal.out <- hbal::hbal(Y = Y, Treat = treat, X = covar, data = data,
#                          expand.degree = 2, cv = TRUE)
#   out <- hbal::att(hbal.out, dr = FALSE)[1, c(1, 2, 5, 6)]
#   return(out)
# }

# AIPW
aipw <- function(data, Y, treat, covar) {
  # library("grf")
  for (var in c(Y, treat, covar)) {
    data[, var] <- as.vector(data[, var])
  }
  c.forest <- causal_forest(X = data[, covar, drop = FALSE], Y = data[, Y],
                            W = data[, treat], seed = 1234)
  att <- average_treatment_effect(c.forest, target.sample = "treated",
                                  method = "AIPW")
  att <- c(att, att[1] - 1.96 * att[2], att[1] + 1.96 * att[2])
  return(att)
}

aipw.match <- function(data, Y, treat, covar) {
  # match on ps
  ps <- probability_forest(X = data[, covar], Y = as.factor(data[, treat]),
                           seed = 1234)$predictions[, 2]
  m.out <- Match(Y = data[, Y], Tr = data[, treat], X = ps, estimand = "ATT",
                 M = 1, replace = FALSE, ties = FALSE, BiasAdjust = FALSE)
  mb <- quiet(MatchBalance(treat ~ ps, data = data, match.out = m.out,
                           nboots = 0))
  ks <- mb$AfterMatching[[1]]$ks$ks$statistic
  s <- data[c(m.out$index.treated, m.out$index.control), ]
  out <- aipw(s, Y, treat, covar)
  return(c(out, ks))
}

### This script checks for robustness by estimating the original model
### using double/debiased machine learning with the DoubleML package
dml <- function(data, Y = NULL, treat = NULL, covar = NULL, clust_var = NULL,
                ml_l = lrn("regr.lm"), ml_m = lrn("regr.lm")) {
  if (is.null(covar)) {
    stop("No controls in specification.")
  }
  # require(DoubleML); require(mlr3learners); require(fixest); require(ggplot2)
  if (is.null(clust_var) == TRUE) {
    dat <- data[, c(Y, treat, covar)]
    dat <- na.omit(dat)
    dml_dat <- DoubleMLData$new(dat, y_col = Y, d_cols = treat,
                                use_other_treat_as_covariate = FALSE,
                                x_cols = covar)
  } else {
    dat <- data[, c(Y, treat, covar, clust_var)]
    dat[, clust_var] <- as.numeric(factor(dat[, clust_var]))
    dat <- dat[is.na(dat[, Y]) == FALSE, ]
    dat <- dat[is.na(dat[, treat]) == FALSE, ]
    features <- data.frame(model.matrix(formula(paste(c("~ 1", treat, covar),
                                                      collapse = "+")), dat))
    dat <- cbind(dat[, c(Y, clust_var)], features)
    dml_dat <- DoubleMLClusterData$new(dat, y_col = Y, d_cols = treat,
                                       cluster_cols = clust_var,
                                       use_other_treat_as_covariate = FALSE,
                                       x_cols = covar)
  }
  # set active treatment
  dml_dat$set_data_model(treat)
  # estimate with DML
  set.seed(pi)
  dml_mod <- DoubleMLPLR$new(dml_dat, ml_l = ml_l, ml_m = ml_m)
  quiet(dml_mod$fit())
  out <- c(dml_mod$coef[treat], dml_mod$se[treat], dml_mod$confint()[treat, ])
  return(out)
}

# execute all estimators
estimate_all <- function(data, Y, treat, covar,
                         methods = c("diff", "reg", "om.reg", "om.grf",
                                     "matching", "psm", "ipw", "cbps",
                                     "ebal", "dml", "aipw_grf")) {
  results <- as.data.frame(matrix(NA, length(methods), 4))
  rownames(results) <- methods
  colnames(results) <- c("Estimate", "SE", "CI_lower", "CI_upper")
  m <- 1
  if ("diff" %in% methods) {
    results[m, ] <- diff(data, Y, treat)
    m <- m + 1
  }
  if ("reg" %in% methods) {
    results[m, ] <- reg(data, Y, treat, covar)
    m <- m + 1
  }
  if ("om.reg" %in% methods) {
    results[m, ] <- om.reg(data, Y, treat, covar)
    m <- m + 1
  }
  if ("om.grf" %in% methods) {
    results[m, ] <- om.grf(data, Y, treat, covar)
    m <- m + 1
  }
  if ("matching" %in% methods) {
    results[m, ] <- matching(data, Y, treat, covar)
    m <- m + 1
  }
  if ("psm" %in% methods) {
    results[m, ] <- psm(data, Y, treat, covar)
    m <- m + 1
  }
  if ("ipw" %in% methods) {
    results[m, ] <- ipw(data, Y, treat, covar)
    m <- m + 1
  }
  if ("cbps" %in% methods) {
    results[m, ] <- cbps(data, Y, treat, covar)
    m <- m + 1
  }
  if ("ebal" %in% methods) {
    results[m, ] <- quiet(ebal(data, Y, treat, covar))
    m <- m + 1
  }
  # if ("hbal" %in% methods) {
  #   results[m, ] <- quiet(hbal(data, Y, treat, covar))
  #   m <- m + 1
  # }
  if ("dml" %in% methods) {
    results[m, ] <- dml(data, Y, treat, covar)
    m <- m + 1
  }
  if ("aipw_grf" %in% methods) {
    results[m, ] <- aipw(data, Y, treat, covar)
    m <- m + 1
  }
  return(results)
}
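A hypothetical call (placeholder data and variable names); the returned data frame has one row per estimator:

out <- estimate_all(data = d, Y = "re78", treat = "treat",
                    covar = c("age", "educ", "re74", "re75"))
out  # columns: Estimate, SE, CI_lower, CI_upper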
1.2.4.1 Function calls
quiet: Suppresses output from a function call.
diff: Difference-in-means estimator. It runs a linear regression with robust standard errors and returns the coefficient, standard error, and confidence interval of the treatment variable.
reg: Regression adjustment. Similar to diff but includes additional covariates in the regression model.
matching: Nearest-neighbor matching using the Matching package. It matches treated units to control units directly on the covariates and returns the estimated ATT and its confidence interval.
psm: Propensity score matching using a probability forest, followed by matching and estimation of the ATT.
om.reg: Outcome modeling using regression. It predicts the outcome for the treated units based on the model fitted to the control units, and then estimates the ATT.
om.grf: Outcome modeling using generalized random forests, denoted as GRF.
ipw: Inverse probability weighting, denoted as IPW. It reweights control observations using their estimated propensity scores and computes the treatment effect with a weighted regression.
cbps: Covariate balancing propensity score, denoted as CBPS. It estimates propensity scores so as to balance the covariates across groups.
ebal: Entropy balancing. It reweights the data to balance the covariate distributions.
hbal: Hierarchical balancing. It is an extension of ebal with more complex balancing methods.
aipw: Augmented inverse probability weighting, denoted as AIPW.
aipw.match: Combines matching on propensity scores with AIPW.
dml: Double machine learning. It uses machine learning algorithms to control for confounders when estimating treatment effects.
plot_coef() plots the ATT estimates from the different estimators, allowing for visual comparison.
Code
plot_coef <- function(out,
                      methods = c("diff", "reg", "om.reg", "om.grf",
                                  "matching", "psm", "ipw", "cbps", "ebal",
                                  "dml", "aipw_grf"),
                      labels = c("Diff-in-Means", "Reg", "OM: Reg", "OM: GRF",
                                 "NN\nMatching", "PS\nMatching", "IPW", "CBPS",
                                 "Ebal", "DML\nElasnet", "AIPW-GRF"),
                      main = NULL, ylab = "Estimate", band = NULL, line = NULL,
                      grid = TRUE, main.pos = 1, main.line = -2, ylim = NULL,
                      textsize = 1) {
  if (is.null(methods) == TRUE) {
    methods <- rownames(out)
  }
  if (is.null(labels) == TRUE) {
    labels <- methods
  }
  # data for the plot
  data <- out
  rg <- range(data[, c(3, 4)], na.rm = TRUE)
  adj <- rg[2] - rg[1]
  if (is.null(ylim) == TRUE) {
    ylim <- c(min(0, rg[1] - 0.3 * adj), max(0, rg[2] + 0.35 * adj))
  }
  adj2 <- ylim[2] - ylim[1]
  # set up the plot
  ncoefs <- length(methods)
  par(mar = c(2.5, 4, 1, 2))
  plot(1:ncoefs, data[, 1], xlim = c(0.5, ncoefs + 0.5), ylim = ylim,
       ylab = "", xlab = "", main = "", axes = FALSE, xaxt = "n", yaxt = "n",
       type = "n")
  axis(1, at = 1:ncoefs, labels = labels, las = 1, cex.axis = 0.8)
  axis(2, cex.axis = 0.7)
  mtext(main, main.pos, line = main.line, cex = textsize)
  mtext(ylab, 2, line = 2.5)
  if (is.null(band) == FALSE) {
    rect(-0.5, band[1], ncoefs + 1, band[2], col = "#ff000030", border = "white")
  }
  if (is.null(line) == FALSE) {
    abline(h = line, col = "red", lty = 2)
  }
  if (grid == TRUE) {
    abline(h = axTicks(2), lty = "dotted", col = "gray50")  # horizontal grid
    abline(v = c(0.5, c(1:ncoefs) + 0.5), lty = "dotted", col = "gray50")
  }
  abline(h = 0, col = "red", lwd = 2, lty = "solid")
  segments(y0 = data[, 3], x0 = c(1:ncoefs), y1 = data[, 4],
           x1 = c(1:ncoefs), lwd = 2)  # CIs
  points(1:ncoefs, data[, 1], pch = 16, col = 1, cex = 1.2)  # point estimates
  box()
}
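A hypothetical call that visualizes the table returned by estimate_all() above:

# Rows of `out` must follow the default order of `methods`
plot_coef(out, main = "ATT Estimates Across Estimators")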
1.2.5 catt() & plot_catt()
These functions estimate and visualize the Conditional Average Treatment Effect on the Treated (CATT). Using robust standard errors (se_type = "stata") keeps the standard errors reliable even in the presence of heteroskedasticity or other violations of the classical linear regression assumptions.
catt() estimates the CATT using causal forests.
Code
catt <- function(data, Y, treat, covar) {
  tau.forest <- causal_forest(X = data[, covar], Y = data[, Y],
                              W = data[, treat], num.trees = 4000)
  tau0 <- average_treatment_effect(tau.forest, target.sample = "treated",
                                   method = "AIPW")
  tau <- tau.forest$predictions
  tau.tr <- tau[which(data[, treat] == 1)]
  return(list(catt = tau.tr, att = tau0))
}
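A hypothetical call (placeholder data and variable names):

ct <- catt(data = d, Y = "re78", treat = "treat",
           covar = c("age", "educ", "re74", "re75"))
summary(ct$catt)  # unit-level estimates for the treated units
ct$att            # AIPW ATT estimate and its standard error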
plot_catt() plots the CATT density and the ATT estimates, allowing for visual comparison.
Code
plot_catt <- function(catt1, catt2, att1, att2, xlab = NULL, ylab = NULL,
                      main = NULL, axes.range = NULL, file = NULL,
                      width = 7, height = 7) {
  if (is.null(axes.range) == TRUE) {
    axes.range <- range(c(catt1, catt2))
  }
  drange <- axes.range[2] - axes.range[1]
  axes.range <- axes.range + c(-0.1, 0.1) * drange
  den1 <- density(catt1)
  den2 <- density(catt2)
  max.den <- max(c(den1$y, den2$y))
  adj <- drange * 0.15 / max.den
  if (!is.null(file)) {
    pdf(file = file, width = width, height = height)
    par(mar = c(4, 4, 3, 2))
  }
  plot(1, xlim = axes.range, ylim = axes.range, type = "n",
       xlab = xlab, ylab = ylab, main = main)
  abline(h = 0, v = 0, col = "gray", lty = 3)
  abline(0, 1, col = "red", lty = 2)
  # density of catt1 along the x-axis
  y1 <- adj * den1$y + axes.range[1] - drange * 0.03
  polygon(den1$x, y1, col = "#AAAAAA50", border = NA)
  lines(den1$x, y1, col = "gray", lwd = 1)
  # density of catt2 along the y-axis
  y2 <- adj * den2$y + axes.range[1] - drange * 0.03
  polygon(y2, den2$x, col = "#AAAAAA50", border = NA)
  lines(y2, den2$x, col = "gray", lwd = 1)
  points(catt1, catt2, cex = 0.5, col = "#AAAAAA80", pch = 16)
  points(catt1, catt2, cex = 0.5, col = "#777777", pch = 1, lwd = 0.5)
  if (is.null(att1) == FALSE) {
    points(att1, att2, cex = 2, col = 2, pch = 3, lwd = 2)
  }
  box()
  if (!is.null(file)) {
    graphics.off()
  }
}
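For example, to compare the CATT estimates from two hypothetical specifications (placeholder data and variable names):

covars <- c("age", "educ", "re74", "re75")
ct1 <- catt(d, "re78", "treat", covars)
ct2 <- catt(d, "re78", "treat", c(covars, "married"))
plot_catt(ct1$catt, ct2$catt, ct1$att[1], ct2$att[1],
          xlab = "CATT (baseline)", ylab = "CATT (extended)")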
1.2.5.1 Function calls
causal_forest: The catt function begins by training a causal forest model using the causal_forest function from the grf package.
average_treatment_effect: After the causal forest is trained, the function estimates the ATT using the doubly robust AIPW method.
1.2.6 est_qte() & plot_qte()
The est_qte() function estimates the Quantile Treatment Effect on the Treated (QTET) using doubly robust methods. This effect is the difference in a particular quantile of the outcome distribution between the treated and untreated units.
Code
est_qte <- function(Y, treat, covar, data, ps = TRUE,
                    probs = seq(0.05, 0.95, 0.05), cores = 20, ylim = NULL) {
  # set up the formula
  if (is.null(covar)) {
    formla <- as.formula(paste(Y, "~", treat))
  } else {
    formla <- as.formula(paste(Y, "~", treat, "+",
                               paste(covar, collapse = "+")))
  }
  if (is.null(covar) | ps == FALSE) {
    mod <- ci.qtet(formla, xformla = NULL, data = data, probs = probs,
                   se = TRUE, iters = 1000, pl = TRUE, cores = cores)
  } else {
    xformla <- as.formula(paste("~", paste(covar, collapse = "+")))
    mod <- ci.qtet(formla, xformla = xformla, data = data, probs = probs,
                   se = TRUE, iters = 1000, pl = TRUE, cores = cores)
  }
  return(mod)
}
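A hypothetical call (placeholder data and variable names). The 1,000 bootstrap iterations can be slow; adjust cores to your machine:

qte.out <- est_qte(Y = "re78", treat = "treat",
                   covar = c("age", "educ", "re74", "re75"),
                   data = d, cores = 4)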
plot_qte() function visualizes the QTET estimates.
Code
plot_qte <- function(mod, mod2 = NULL, bm = NULL, main = "", ylim = NULL,
                     col = NULL) {
  # ylim
  if (is.null(ylim)) {
    ylim <- range(c(mod$qte.lower, mod$qte.upper))
  }
  # plot
  par(mar = c(3, 3, 1, 1))
  plot(1, type = "n", xlab = "", ylab = "", xlim = c(0, 1), ylim = ylim,
       axes = FALSE)
  box()
  axis(1, at = seq(0.1, 0.9, 0.2))
  axis(2)
  mtext("QTET", 2, line = 2)
  mtext("Quantile", 1, line = 2)
  abline(h = 0, lty = 1, lwd = 2, col = "gray")
  title(main, line = -1.5)
  # model 2
  if (is.null(mod2) == FALSE) {
    polygon(c(mod2$probs, rev(mod2$probs)),
            c(mod2$qte.lower, rev(mod2$qte.upper)),
            col = "#FC94AF50", border = NA)
  }
  # benchmark
  if (is.null(bm) == FALSE) {
    polygon(c(bm$probs, rev(bm$probs)),
            c(bm$qte.lower, rev(bm$qte.upper)),
            col = "#ADD8E680", border = NA)
  }
  # main model
  if (is.null(col) == TRUE) {
    col1 <- "gray30"
    col2 <- "#AAAAAA90"
  } else {
    col1 <- col[1]
    col2 <- col[2]
  }
  polygon(c(mod$probs, rev(mod$probs)),
          c(mod$qte.lower, rev(mod$qte.upper)), col = col2, border = NA)
  if (is.null(mod2) == FALSE) {
    lines(mod2$probs, mod2$qte, col = "#DC143C80", lwd = 2)
    points(mod2$probs, mod2$qte, col = "#DC143C", pch = 17, cex = 0.8)
  }
  if (is.null(bm) == FALSE) {
    lines(bm$probs, bm$qte, col = 4, lwd = 2)
    points(bm$probs, bm$qte, col = 4, pch = 16)
  }
  lines(mod$probs, mod$qte, col = col1, lwd = 2)
  lines(mod$probs, mod$qte.lower, col = col1, lty = 3, lwd = 1.5)
  lines(mod$probs, mod$qte.upper, col = col1, lty = 3, lwd = 1.5)
  points(mod$probs, mod$qte, col = col1, pch = 16)
}
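A hypothetical call, overlaying an adjusted estimate on an unadjusted benchmark (placeholder data and variable names):

qte.adj <- est_qte("re78", "treat", c("age", "educ"), data = d, cores = 4)
qte.raw <- est_qte("re78", "treat", covar = NULL, data = d, cores = 4)
plot_qte(qte.adj, bm = qte.raw, main = "QTET")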
Arguments
ps: A boolean argument; if set to TRUE, propensity scores are used in the estimation.
probs: A sequence of probabilities for which the quantile treatment effects are estimated.
cores: Number of cores to use for parallel computation, which speeds up the process.
mod, mod2, bm: Inputs to plot_qte(); mod is required, whereas mod2 and bm are optional and may be supplied for comparative analysis.
1.2.7 sens_ana()
The sens_ana() function conducts a sensitivity analysis on an estimated treatment effect to assess how susceptible the findings are to potential unobserved confounding.
probability_forest: A probability forest is trained on the covariates to estimate each unit's propensity score, i.e., the probability of receiving the treatment given the observed covariates.
sensemakr: The sensemakr() function from the sensemakr package runs the sensitivity analysis on a linear outcome model, given the treatment variable, optional benchmark covariates, and a kd multiplier specifying how many times stronger than the benchmark covariates the hypothesized confounder is allowed to be.
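The wrapper's code is not reproduced here; below is a minimal sketch of the sensemakr step alone, with placeholder data and variable names (it is not the original sens_ana()):

library(sensemakr)

# Minimal sketch, not the original wrapper: fit the outcome regression,
# then ask how strong an unobserved confounder would need to be
covars <- c("age", "educ", "re74", "re75")  # placeholder covariates
fml <- as.formula(paste("re78 ~ treat +", paste(covars, collapse = " + ")))
mod <- lm(fml, data = d)                    # d is a placeholder data frame
sens <- sensemakr(model = mod, treatment = "treat",
                  benchmark_covariates = "educ",  # placeholder benchmark
                  kd = 1:3)  # confounder 1-3 times as strong as benchmark
summary(sens)
plot(sens)  # contour plot of the sensitivity analysis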
Note
To use the above functions, we provide an additional R script. You can source the script and apply these functions. However, this is NOT the official replication file, so please review it carefully before use.