Fixed Effects Counterfactual Estimators
fect.Rd
Implements counterfactual estimators for time-series cross-sectional (TSCS) data analysis, along with statistical tools to test their identification assumptions.
Usage
fect(formula = NULL, data, Y, D, X = NULL, group = NULL,
na.rm = FALSE,
index, force = "two-way", r = 0, lambda = NULL, nlambda = 10,
CV = NULL, k = 10, cv.prop = 0.1, cv.treat = FALSE,
cv.nobs = 3, cv.donut = 0, criterion = "mspe",
binary = FALSE, QR = FALSE,
method = "fe",
se = FALSE, vartype = "bootstrap", nboots = 200, alpha = 0.05,
parallel = TRUE, cores = NULL, tol = 0.001, seed = NULL,
min.T0 = NULL, max.missing = NULL,
proportion = 0.3, pre.periods = NULL,
f.threshold = 0.5, tost.threshold = NULL,
knots = NULL, degree = 2,
sfe = NULL, cfe = NULL,
balance.period = NULL, fill.missing = FALSE,
placeboTest = FALSE, placebo.period = NULL,
carryoverTest = FALSE, carryover.period = NULL, carryover.rm = NULL,
loo = FALSE, permute = FALSE, m = 2, normalize = FALSE)
Arguments
- formula
an object of class "formula": a symbolic description of the model to be fitted, e.g., Y ~ D + X1 + X2.
- data
a data frame; it can be a balanced or unbalanced panel.
- Y
the outcome indicator.
- D
the treatment indicator. The treatment should be binary (0 and 1).
- X
time-varying covariates. Covariates that have perfect collinearity with specified fixed effects are dropped automatically.
- group
the group indicator. If specified, the group-wise ATT will be estimated.
- na.rm
a logical flag indicating whether to list-wise delete missing observations. Defaults to FALSE. If na.rm = FALSE, observations for which Y is missing but D is not missing are allowed. If na.rm = TRUE, observations whose Y, D, or X is missing are deleted list-wise.
- index
a two-element string vector specifying the unit and time indicators. Must be of length 2. Every observation should be uniquely defined by the pair of the unit and time indicator.
- force
a string indicating whether unit or time or both fixed effects will be imposed. Must be one of the following, "none", "unit", "time", or "two-way". The default is "two-way".
- r
an integer specifying the number of factors. If CV = TRUE, the cross-validation procedure will select the optimal number of factors from r to 5.
- lambda
a single positive number or a sequence of positive numbers specifying the hyper-parameter sequence for the matrix completion method. If lambda is a sequence and CV = TRUE, cross-validation will be performed.
- nlambda
an integer specifying the length of the hyper-parameter sequence for the matrix completion method. Default is nlambda = 10.
- CV
a logical flag indicating whether cross-validation will be performed to select the optimal number of factors or the hyper-parameter in the matrix completion algorithm. If r is not specified, the procedure will search through r = 0 to 5.
- k
an integer specifying the number of cross-validation rounds. Default is k = 10.
- cv.prop
a numerical value specifying the proportion of testing set compared to sample size during the cross-validation procedure.
- cv.treat
a logical flag specifying whether to use only observations of treated units as the testing set.
- cv.nobs
an integer specifying the length of continuous observations within a unit in the testing set. Default is cv.nobs = 3.
- cv.donut
an integer specifying the number of observations removed at the head and tail of the continuous observations specified by cv.nobs. These removed observations are neither used to fit the model nor included in the validation set for cross-validation. For example, if cv.nobs = 3 and cv.donut = 1, the first and the last observation in each triplet will not be included in the test set. Default is cv.donut = 0.
- criterion
criterion used for model selection. Default is "mspe". Use "mspe" for the mean squared prediction error, "gmspe" for the geometric-mean squared prediction error, and "pc" for the information criterion of the interactive fixed effects or generalized synthetic control model. If criterion = "moment", the residuals in the test sets are averaged by their period relative to treatment, and the squares of these period-wise deviations are then averaged, weighted by the number of observations in each period; this favors a better pre-trend fit on the test sets rather than better predictive ability.
- binary
a logical flag indicating whether a probit link function will be used. This option is not supported in this version.
- QR
a logical flag indicating whether QR decomposition will be used for factor analysis in the probit model. This option is not supported in this version.
- method
a string specifying which imputation algorithm will be used: "fe" for the fixed effects model, "ife" for the interactive fixed effects model, "mc" for the matrix completion method, "polynomial" for polynomial trend terms, "bspline" for regression splines, "gsynth" for the generalized synthetic control method, and "cfe" for the complex fixed effects method. Default is method = "fe".
- se
a logical flag indicating whether uncertainty estimates will be produced.
- vartype
a string specifying the type of variance estimator. Choose from vartype = c("bootstrap", "jackknife", "parametric"). Default value is "bootstrap".
- nboots
an integer specifying the number of bootstrap runs. Ignored if se = FALSE.
- alpha
significance level for hypothesis tests and confidence intervals. Default value is alpha = 0.05.
- parallel
a logical flag indicating whether parallel computing will be used in bootstrapping and/or cross-validation. Ignored if se = FALSE.
- cores
an integer indicating the number of cores to be used in parallel computing. If not specified, the algorithm will use the maximum number of logical cores of your computer (warning: this could prevent you from multi-tasking on your computer).
- tol
a positive number indicating the tolerance level.
- seed
an integer that sets the seed in random number generation. Ignored if se = FALSE and r is specified.
- min.T0
an integer specifying the minimum number of observed periods during which a unit is under control.
- max.missing
an integer. Units with more missing values than this number will be removed. Ignored if set to NULL (i.e., max.missing = NULL, the default setting).
- proportion
a numeric value. Pre-treatment periods whose number of observations exceeds this proportion of the number of observations at period 0 are used for the goodness-of-fit test. Ignored if se = FALSE. Default is proportion = 0.3.
- pre.periods
a vector specifying the range of pre-treatment periods used for the goodness-of-fit test. If left blank, all pre-treatment periods specified by proportion will be used. Ignored if se = FALSE.
- f.threshold
a numeric value specifying the threshold for the F-statistic in the equivalence test. Ignored if se = FALSE. Default is f.threshold = 0.5.
- tost.threshold
a numeric value specifying the threshold for the two one-sided t-tests (TOST). If alpha = 0.05, TOST checks whether the 90% confidence intervals for the estimated ATTs in the pre-treatment periods stay within this threshold. The default value is 0.36 times the standard deviation of the outcome variable after two-way fixed effects are partialed out.
- knots
a numeric vector specifying the knots for the b-spline curve trend term.
- degree
an integer specifying the order of either the b-spline or the polynomial trend term.
- sfe
a vector specifying other fixed effects, in addition to unit or time fixed effects, that are used when method = "cfe".
- cfe
a vector of lists specifying interactive fixed effects when method = "cfe". For each list, the value of the first element is the name of the group variable for which fixed effects are to be estimated. The value of the second element is the name of a regressor (e.g., a time trend).
- balance.period
a vector of length 2 specifying the range of periods for a balanced sample which has no missing observation in the specified range.
- fill.missing
a logical flag indicating whether to allow missing observations in this balanced sample. The default is FALSE.
- placeboTest
a logical flag indicating whether to perform a placebo test.
- placebo.period
an integer or a two-element numeric vector specifying the range of pre-treatment periods that will be assigned as pseudo treatment periods.
- carryoverTest
a logical flag indicating whether to perform a (no) carryover test.
- carryover.period
an integer or a two-element numeric vector specifying the range of post-treatment periods that will be assigned as pseudo treatment periods.
- carryover.rm
an integer specifying the range of post-treatment periods that will be assigned as pseudo treatment periods.
- loo
a logical flag indicating whether to perform the leave-one-period-out goodness-of-fit test, which is very time-consuming.
- permute
a logical flag indicating whether to perform a permutation test.
- m
an integer specifying the block length in the permutation test. Default value is m = 2.
- normalize
a logical flag indicating whether to scale the outcome and covariates. Useful for speeding up computation when the magnitude of the data is large. The default is normalize = FALSE.
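As a rough illustration of how the main arguments fit together, the sketch below simulates a small balanced panel and passes it to fect() with two-way fixed effects. The simulated data, the variable names (id, time, X1) and the effect size are assumptions made for this example only. The call is guarded so that it runs only when the fect package is installed.

```r
# Simulate a small balanced panel (all names and values are illustrative).
set.seed(1)
N  <- 30   # number of units
TT <- 20   # number of periods
panel <- expand.grid(time = 1:TT, id = 1:N)
alpha_i <- rnorm(N)[panel$id]      # unit fixed effects
xi_t    <- rnorm(TT)[panel$time]   # time fixed effects
panel$X1 <- rnorm(nrow(panel))     # one time-varying covariate
# Units 1 to 10 switch into treatment from period 11 onward (binary D).
panel$D <- as.numeric(panel$id <= 10 & panel$time >= 11)
panel$Y <- 1 + alpha_i + xi_t + 0.5 * panel$X1 + 2 * panel$D +
  rnorm(nrow(panel), sd = 0.5)

# Call fect() only if the package is available on this machine.
if (requireNamespace("fect", quietly = TRUE)) {
  out <- fect::fect(Y ~ D + X1, data = panel,
                    index = c("id", "time"),  # unit and time indicators
                    force = "two-way",        # unit and time fixed effects
                    method = "fe",            # fixed effects imputation
                    se = FALSE)               # skip uncertainty estimates
  print(out$att.avg)  # average treatment effect on the treated
}
```

Every (id, time) pair appears exactly once in this panel, which is what the index argument requires.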
Details
fect implements counterfactual estimators for TSCS data analysis. These estimators first impute counterfactuals for
each treated observation in a TSCS dataset by fitting an outcome model (fixed effects model, interactive fixed effects model, or
matrix completion) using the untreated observations. They then estimate the individual treatment effect for each treated
observation by subtracting the predicted counterfactual outcome from its observed outcome. Finally, the average treatment effect
on the treated (ATT) or period-specific ATTs are calculated. A placebo test and an equivalence test are included to evaluate the
validity of the identification assumptions behind these estimators. The treatment must be dichotomous.
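The three steps described above can be sketched in base R for the simplest case, a two-way fixed effects outcome model. This is a minimal illustration on simulated data, not the package's internal implementation; the data-generating process, the variable names and the use of lm() are assumptions made for the example.

```r
# Simulated balanced panel with a known treatment effect (illustrative only).
set.seed(42)
N  <- 40
TT <- 15
df <- expand.grid(time = 1:TT, id = 1:N)
unit_fe <- rnorm(N)[df$id]
time_fe <- rnorm(TT)[df$time]
# Units 1 to 15 are treated from period 9 onward.
df$D <- as.numeric(df$id <= 15 & df$time >= 9)
true_effect <- 3
df$Y <- unit_fe + time_fe + true_effect * df$D + rnorm(nrow(df), sd = 0.3)
df$id_f   <- factor(df$id)
df$time_f <- factor(df$time)

# Step 1: fit the outcome model using untreated observations only.
fit <- lm(Y ~ id_f + time_f, data = df[df$D == 0, ])

# Step 2: impute the counterfactual Y(0) for each treated observation.
treated <- df[df$D == 1, ]
y0_hat  <- predict(fit, newdata = treated)

# Step 3: individual effects, then the ATT and period-specific ATTs.
eff           <- treated$Y - y0_hat
att_avg       <- mean(eff)
att_by_period <- tapply(eff, treated$time, mean)
```

Because the treated units never enter the model fit while under treatment, the estimated unit and time effects come only from untreated observations. In this simulation att_avg should land close to the true effect of 3.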
Value
- Y.dat
a T-by-N matrix storing data of the outcome variable.
- D.dat
a T-by-N matrix storing data of the treatment variable.
- I.dat
a T-by-N matrix indicating whether each observation is observed or missing.
- Y
name of the outcome variable.
- D
name of the treatment variable.
- X
name of the time-varying control variables.
- index
name of the unit and time indicators.
- force
user-specified force option.
- T
the number of time periods.
- N
the total number of units.
- p
the number of time-varying observables.
- r.cv
the number of factors included in the model -- either supplied by users or automatically chosen via cross-validation.
- lambda.cv
the optimal hyper-parameter in matrix completion method chosen via cross-validation.
- beta
coefficients of time-varying observables from the interactive fixed effect model.
- sigma2
the mean squared error of interactive fixed effect model.
- IC
the information criterion.
- est
result of the interactive fixed effect model based on observed values.
- MSPE
mean squared prediction error of the cross-validated model.
- CV.out
result of the cross-validation procedure.
- niter
the number of iterations in the estimation of the interactive fixed effect model.
- factor
estimated time-varying factors.
- lambda
estimated loadings.
- lambda.tr
estimated loadings for treated units.
- lambda.co
estimated loadings for control units.
- mu
estimated grand mean.
- xi
estimated time fixed effects.
- alpha
estimated unit fixed effects.
- alpha.tr
estimated unit fixed effects for treated units.
- alpha.co
estimated unit fixed effects for control units.
- validX
a logical value indicating if multicollinearity exists.
- validF
a logical value indicating if a factor exists.
- id
a vector of unit IDs.
- rawtime
a vector of time periods.
- obs.missing
a matrix storing the status of each unit at each time point.
- Y.ct
a T-by-N matrix storing the predicted Y(0).
- eff
a T-by-N matrix storing the difference between actual outcome and predicted Y(0).
- res
residuals for observed values.
- eff.pre
difference between actual outcome and predicted Y(0) for observations of treated units under control.
- eff.pre.equiv
difference between actual outcome and predicted Y(0) for observations of treated units under control based on baseline (two-way fixed effects) model.
- pre.sd
by period residual standard deviation for estimated pre-treatment average treatment effects.
- att.avg
average treatment effect on the treated.
- att.avg.unit
by unit average treatment effect on the treated.
- time
term for switch-on treatment effect.
- count
count of each term for switch-on treatment effect.
- att
switch-on treatment effect.
- time.off
term for switch-off treatment effect.
- att.off
switch-off treatment effect.
- count.off
count of each term for switch-off treatment effect.
- att.placebo
average treatment effect for placebo period.
- att.carryover
average treatment effect for carryover period.
- eff.calendar
average treatment effect for each calendar period.
- eff.calendar.fit
loess fitted values of average treatment effect for each calendar period.
- N.calandar
number of treated observations at each calendar period.
- balance.avg.att
average treatment effect for the balance sample.
- balance.att
switch-on treatment effect for the balance sample.
- balance.time
term of switch-on treatment effect for the balance sample.
- balance.count
count of each term for switch-on treatment effect for the balance sample.
- balance.att.placebo
average treatment effect for placebo period of the balance sample.
- group.att
average treatment effect for different groups.
- group.output
a list saving the switch-on treatment effects for different groups.
- est.att.avg
inference for att.avg.
- est.att.avg.unit
inference for att.avg.unit.
- est.att
inference for att.on.
- est.att.off
inference for att.off.
- est.placebo
inference for att.placebo.
- est.carryover
inference for att.carryover.
- est.eff.calendar
inference for eff.calendar.
- est.eff.calendar.fit
inference for eff.calendar.fit.
- est.balance.att
inference for balance.att.
- est.balance.avg
inference for balance.avg.att.
- est.balance.placebo
inference for balance.att.placebo.
- est.beta
inference for beta.
- est.group.att
inference for group.att.
- est.group.output
inference for group.output.
- att.avg.boot
bootstrap results for att.avg.
- att.avg.unit.boot
bootstrap results for att.avg.unit.
- att.count.boot
bootstrap results for count.
- att.off.boot
bootstrap results for att.avg.off.
- att.off.count.boot
bootstrap results for count.off.
- att.placebo.boot
bootstrap results for att.placebo.
- att.carryover.boot
bootstrap results for att.carryover.
- balance.att.boot
bootstrap results for balance.att.
- att.bound
.- att.bound
equivalence confidence interval for equivalence test.
- att.off.bound
equivalence confidence interval for equivalence test for switch-off effect.
- beta.boot
bootstrap results for beta.
- test.out
goodness-of-fit test and equivalence test results for the pre-treatment fitting check.
- loo.test.out
leave-one-period-out goodness-of-fit test and equivalence test results for the pre-treatment fitting check.
- permute
permutation test results for sharp null hypothesis.
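To make the relationship between eff, time, att and count more concrete, the base R sketch below aggregates a toy T-by-N effect matrix by period relative to treatment onset. The toy matrices and the convention that relative period 1 is the first treated period (so 0 is the last pre-treatment period) are assumptions for illustration. This is not the package's internal code.

```r
# Toy T-by-N matrices: 6 periods, 3 units (values are illustrative only).
D <- cbind(c(0, 0, 0, 1, 1, 1),   # unit 1 is treated from period 4
           c(0, 0, 0, 0, 1, 1),   # unit 2 is treated from period 5
           c(0, 0, 0, 0, 0, 0))   # unit 3 is never treated
eff <- cbind(c(0.1, -0.1,  0.0, 1.0, 2.0, 3.0),
             c(0.0,  0.2, -0.2, 0.0, 1.2, 1.8),
             rep(0, 6))

# Period relative to treatment onset; NA for never-treated units.
rel_time <- matrix(NA_integer_, nrow(D), ncol(D))
for (j in seq_len(ncol(D))) {
  if (any(D[, j] == 1)) {
    onset <- which(D[, j] == 1)[1]
    rel_time[, j] <- seq_len(nrow(D)) - onset + 1L
  }
}

# Average the effects of ever-treated units within each relative period.
keep  <- !is.na(rel_time)
att   <- tapply(eff[keep], rel_time[keep], mean)    # analogous to att
count <- tapply(eff[keep], rel_time[keep], length)  # analogous to count
time  <- as.integer(names(att))                     # analogous to time
```

In this toy example the relative periods run from -3 to 3, and the entries of att at non-positive periods play the role of the pre-treatment averages that the placebo and equivalence tests examine.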
References
Jushan Bai. 2009. "Panel Data Models with Interactive Fixed Effects." Econometrica.
Yiqing Xu. 2017. "Generalized Synthetic Control Method: Causal Inference with Interactive Fixed Effects Models." Political Analysis.
Susan Athey, et al. 2021. "Matrix Completion Methods for Causal Panel Data Models." Journal of the American Statistical Association.
Licheng Liu, et al. 2022. "A Practical Guide to Counterfactual Estimators for Causal Inference with Time-Series Cross-Sectional Data." American Journal of Political Science.
For more details about the matrix completion method, see https://github.com/susanathey/MCPanel.
See also
plot.fect and print.fect