fect: Fixed Effects Counterfactual Estimators

Stata package for A Practical Guide to Counterfactual Estimators for Causal Inference with Time-Series Cross-Sectional Data

Authors: Licheng Liu (MIT); Ye Wang (NYU); Yiqing Xu(Stanford); Ziyi Liu(PKU)

Date: Jun. 10, 2020

Package: fect

Main Reference: A Practical Guide to Counterfactual Estimators for Causal Inference with Time-Series Cross-Sectional Data

Version: 0.1.1 Please report bugs!

Installation
Data
Estimation
Model Selection
Uncertainty Estimates
Wald Test
Equivalence Test
Placebo Test

Installation

We provide 2 methods for the installation of fect:

Development Version.

You can install the development version of the package by typing the following commands in your STATA console:

. cap ado uninstall fect

. net install fect, from(https://raw.githubusercontent.com/xuyiqing/fect_stata/master/) replace

Manual Installation.

Manual installation takes three simple steps:

Download the zip file from: https://github.com/xuyiqing/fect_stata
Unzip the file
Type the following commands in your STATA console:
. cap ado uninstall fect

. net install fect, all replace from(full_local_path)

Note

The package fect relies on the package reghdfe, users are recommended to update it to the latest version.

. ssc install reghdfe, replace

Two datasets are shipped with the fect package (these sample datasets will be copied into your current directory during installation). In this demo, we will primarily use simdata1. It follows a staggered adoption treatment assignement mechanism, which contains 100 treated units, 100 control units and 35 time periods. The treatment kicks in a staggered adoption fashion from period 21. This package can estimate the average treatment effect (ATT) of the treatment D on the outcome Y, only using information of the outcome, the treatment indicator and the covariates (X1 and X2) from observations under the control status. First we load the data.

. clear

. set more off

. sysuse simdata1
(Written by R.              )

Estimation

In the current version of fect, users can use five methods to make counterfactual predictions by specifying the method option: fe(fixed effect), ife(interactive fixed effects), mc(matrix completion), bspline(unit-specific bsplines) and polynomial(unit-specific time trends). Now we will illustrate the main grammar and options in fect.

Fixed Effects(FE)

In order to estimate the ATT, users should specify the outcome variable, the unit indicator, the time indicator, the treatment indicator and covariates in the command. As an example, the following command will estimate the average treatment effect(ATT) using the two-way fixed effects(FE) model. The force option (can be “none”, “unit”, “time”, and “two-way”) specifies the additive component(s) of the fixed effects included in the model. The default option is “two-way” (including both unit and time fixed effects). After the estimation, users can then access the stored results by calling e-class commands. The estimated average treatment effects are saved in e(ATT), the average treatment effects by period are saved in e(ATTs), and the estimated coefficients of covariates are saved in e(coefs). All of them are matrices.

. fect Y, treat(D) unit(id) time(time) cov(X1 X2) method("fe") force("two-way")
Balanced Panel Data

. mat list e(ATT)

e(ATT)[1,2]
          ATT          N
r1  3.4890777        900

. mat list e(ATTs)

e(ATTs)[47,3]
              s           n         ATT
 r1         -31          20  -.77109593
 r2         -30          20  -.60864002
 r3         -29          20  -.17312056
 r4         -28          40  -.46861663
 r5         -27          40  -.46439406
 r6         -26          40   .05438881
 r7         -25          60  -.52462173
 r8         -24          60  -.53337729
 r9         -23          60  -.24292614
r10         -22          80  -.53957587
r11         -21          80  -.65655398
r12         -20          80  -.49383137
r13         -19         100  -1.0453175
r14         -18         100  -.05875193
r15         -17         100  -.04025012
r16         -16         100  -.87856668
r17         -15         100  -.22867775
r18         -14         100  -.28897884
r19         -13         100  -.66388309
r20         -12         100  -.49370825
r21         -11         100   .05098252
r22         -10         100  -.06818308
r23          -9         100  -.02093183
r24          -8         100    .7874155
r25          -7         100   .33847243
r26          -6         100   .07501461
r27          -5         100   .81570309
r28          -4         100   .50591278
r29          -3         100   .76799464
r30          -2         100   1.5626798
r31          -1         100   1.1092701
r32           0         100   .56834823
r33           1         100   1.8688202
r34           2         100   2.1402876
r35           3         100   1.5997123
r36           4          80   3.5116136
r37           5          80   3.6261504
r38           6          80   2.1411843
r39           7          60   4.0951562
r40           8          60   4.4478188
r41           9          60   4.2820454
r42          10          40   4.0149078
r43          11          40   5.1074643
r44          12          40   5.8417196
r45          13          20   7.4508367
r46          14          20   8.5741386
r47          15          20   7.4203806

. mat list e(coefs)

e(coefs)[1,3]
         cons         X1         X2
r1  10.043207  .92796433  3.0523357

After the estimation, fect will draw a graph visualizing the estimated average treatment effects by period.

For the reference, the true population average effects in simdata1 rise from 1 to 3 from the first post-treatment period to the last. The bar plot at the bottom of the plot shows the number of treated units for each time period.

Interactive Fixed Effects(IFE)

In addition to the two-way fixed effects model, fect also supports the interactive fixed effects (IFE) model. For the IFE approach, users need to specify the number of factors(default to 1) using option r(). For example, the following command will estimate the average treatment effects using an IFE model with 2 factors.

. fect Y, treat(D) unit(id) time(time) cov(X1 X2) method("ife") force("two-way") r(2)
Balanced Panel Data

Users can access the results in the same way as in the FE model.

. mat list e(ATT)

e(ATT)[1,2]
          ATT          N
r1  1.1177092        900

. mat list e(ATTs)

e(ATTs)[47,3]
              s           n         ATT
 r1         -31          20  -.10835096
 r2         -30          20  -.06038439
 r3         -29          20   .35214481
 r4         -28          40  -.19475068
 r5         -27          40  -.16424079
 r6         -26          40   .27811673
 r7         -25          60   .15796195
 r8         -24          60  -.21596542
 r9         -23          60  -.00164069
r10         -22          80   .13636349
r11         -21          80  -.06286745
r12         -20          80  -.11484522
r13         -19         100  -.35158613
r14         -18         100   .26509574
r15         -17         100   .02230865
r16         -16         100  -.15006728
r17         -15         100   .18560499
r18         -14         100  -.03165312
r19         -13         100  -.22084346
r20         -12         100   .06744679
r21         -11         100   .21657231
r22         -10         100   .08356623
r23          -9         100  -.02173787
r24          -8         100   .08333752
r25          -7         100   .12237962
r26          -6         100  -.05976924
r27          -5         100  -.11873318
r28          -4         100   .00759421
r29          -3         100   .19577003
r30          -2         100  -.19017716
r31          -1         100    .0174254
r32           0         100  -.04931726
r33           1         100  -.04900107
r34           2         100   .52023405
r35           3         100    .3050181
r36           4          80   .76847726
r37           5          80   1.5267484
r38           6          80    .5630399
r39           7          60   .88419515
r40           8          60   1.5579557
r41           9          60   1.6127996
r42          10          40   1.1938372
r43          11          40   2.7152529
r44          12          40   2.1226153
r45          13          20   3.7801797
r46          14          20   4.2262712
r47          15          20   2.7478838

. mat list e(coefs)

e(coefs)[1,3]
         cons         X1         X2
r1  11.387863  .95892048  3.0127917

Note: When running on a large dataset, the IFE model may be time-consuming. One makeshift is to use a larger tolerance criterion for convergence (by setting the option tol() in the command), but it may yield a less accurate estimation.

Matrix Completion(MC)

For the MC method, users also need to specify a hyper-parameter (the penalty term) lambda(default to 1) in the command. For example, the following command will estimate the average treatment effect using a MC model with lambda equal to 0.004.

. fect Y, treat(D) unit(id) time(time) cov(X1 X2) method("mc") lambda(0.004)
Balanced Panel Data

. mat list e(ATT)

e(ATT)[1,2]
          ATT          N
r1  2.5668276        900

. mat list e(coefs)

e(coefs)[1,3]
         cons         X1         X2
r1  11.201599  .94348337  3.0173834

Model Selection

This package ships a cross-validation procedure to help determine the tuning parameter in ife, mc, bspline and polynomial models.The cross-validation procedure is ran for k folds (default to 10, can be set in the option kfold()) and the candidate tuning parameter corresponding to the minimal mean squared prediction error(mspe) will be selected. In each round, some observations are removed as the testing set to evaluate the prediction ability of the model with a certain tuning parameter. To alleviate the concern of the serial correlation within a unit, users can remove a consecutive number of observations from a unit as elements in the testing set in order to avoid over fitting caused by serial correlation. The consecutive number is specified in the option cvnobs() (default to 3). After the cross-validation, the mean squared prediction error for each hyper-parameter is saved in e(CV) and the minimum of it is saved as e(min_mspe).

The cross-validation procedure will be implemented in the following scenarios:

1, If users cannot determine which method to use, the cross-validation procedure with method set to “both” can automatically determine the method based on both methods’ out-of-sample predication performance. In this case, the algorithm selects the IFE method, which is the correct model specification. Note that FE is a special case of IFE when r=0. Users can set the maximum number of factors for searching in the ife model in option r() and the number of grids of lambda(s) in nlambda() (default to 10). For the mc model, a lambda normalized by the maximum singular value of the demeaned outcome will be computed and presented too.

. fect Y, treat(D) unit(id) time(time) cov(X1 X2) method("both")  r(4) nlambda(15)
Balanced Panel Data
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Cross Validation...
fe r=0 force=two-way mspe=8.538
ife r=1 force=two-way mspe=5.437
ife r=2 force=two-way mspe=4.772
ife r=3 force=two-way mspe=5.339
ife r=4 force=two-way mspe=6.146
mc: lambda=.0178 lambda.norm=1 mspe=8.538
mc: lambda=.0105 lambda.norm=.588 mspe=6.98
mc: lambda=.0062 lambda.norm=.346 mspe=5.693
mc: lambda=.0036 lambda.norm=.203 mspe=5.187
mc: lambda=.0021 lambda.norm=.119 mspe=5.272
mc: lambda=.0013 lambda.norm=.07 mspe=5.309
mc: lambda=.0007 lambda.norm=.041 mspe=5.329
mc: lambda=.0004 lambda.norm=.024 mspe=5.391
mc: lambda=.0003 lambda.norm=.014 mspe=5.605
mc: lambda=.0001 lambda.norm=.008 mspe=6.386
mc: lambda=.0001 lambda.norm=.005 mspe=8.408
mc: lambda=.0001 lambda.norm=.003 mspe=8.461
mc: lambda=0 lambda.norm=.002 mspe=8.493
mc: lambda=0 lambda.norm=.001 mspe=8.511
mc: lambda=0 lambda.norm=.001 mspe=8.523
choose fe/ife model with optimal r=2

. di e(min_mspe)
4.7723098

. matrix list e(CV)

e(CV)[5,2]
            r       mspe
r1          0  8.5383835
r2          1  5.4368229
r3          2  4.7723098
r4          3  5.3389702
r5          4   6.146081

2, When the option lambda has a numlist input and the option method is set to “mc”, the results will be compared across all lambdas.

. fect Y, treat(D) unit(id) time(time) cov(X1 X2) method("mc") lambda(0.001 0.002 0.003 0.004 0.005)
Balanced Panel Data
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Cross Validation...
mc: lambda=.001 lambda.norm=.056 mspe=5.315
mc: lambda=.002 lambda.norm=.112 mspe=5.283
mc: lambda=.003 lambda.norm=.168 mspe=5.208
mc: lambda=.004 lambda.norm=.224 mspe=5.202
mc: lambda=.005 lambda.norm=.28 mspe=5.374
optimal lambda=.004 in mc model

. matrix list e(CV)

e(CV)[5,2]
       lambda       mspe
r1       .001  5.3150454
r2       .002  5.2828522
r3       .003  5.2078257
r4       .004   5.202373
r5       .005  5.3743281

3, When the option cv is on, the mspe of the specified method will be calculated. For the ife method, models with number of factors ranging from 0 to r() will be compared. For the mc method, mspe for the specified lambda will be computed, no matter whether lambda() has a numlist input or not. Users can also restrict the testing set to observations only from treated units by setting the option cvtreat on.

. fect Y, treat(D) unit(id) time(time) cov(X1 X2) method("ife") r(4) cv cvtreat
Balanced Panel Data
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Cross Validation...
fe r=0 force=two-way mspe=8.616
ife r=1 force=two-way mspe=5.736
ife r=2 force=two-way mspe=4.905
ife r=3 force=two-way mspe=5.659
ife r=4 force=two-way mspe=6.661
optimal r=2 in fe/ife model

Uncertainty Estimates

The algorithm produces uncertainty estimates when se is on. One can use the non-parametric bootstrap procedure by setting vartype() to “bootstrap” (also the default setting). Note that it only works well when the treatment group is relatively large, e.g. larger than 40. The number of bootstrap runs is set by nboots(). Alternatively, users can obtain uncertainty estimates using the jackknife method by by setting vartype() to “jackknife”. The algorithm obtains standard errors by iteratively dropping one unit (the entire time-series) from the dataset. After the estimation, users can then access the stored results by calling e-class commands. The estimated average treatment effects are saved in e(ATT), the average treatment effects by period are saved in e(ATTs), and the estimated coefficients of covariates are saved in e(coefs).

. fect Y,  treat(D) unit(id) time(time) cov(X1 X2) method("ife") r(2) se nboots(100)
Balanced Panel Data
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Bootstrapping...
ATT Estimation: Already Bootstrapped 100 Times

. mat list e(ATT)

e(ATT)[1,6]
            ATT            N           sd  Lower_Bound  Upper_Bound       pvalue
r1    1.1177092          900    .25850742     .5651055    1.6380218            0

. mat list e(ATTs)

e(ATTs)[47,7]
                s             n           ATT        ATT_sd   ATT_p_value  ATT_Lower_~d  ATT_Upper_~d
 r1           -31            20    -.10835096      .4310421     .80152822     -.7895304     .85068578
 r2           -30            20    -.06038439     .50693184     .90518236    -1.0389884     .98678583
 r3           -29            20     .35214481     .41575682     .39699677    -.35386431     1.1824859
 r4           -28            40    -.19475068     .26625523     .46450824    -.78490198     .23869842
 r5           -27            40    -.16424079     .29858685     .58227795     -.8895551     .31997144
 r6           -26            40     .27811673     .24735154     .26085252     -.1542975     .85295516
 r7           -25            60     .15796195     .19663931     .42179668    -.21555533     .55887252
 r8           -24            60    -.21596542     .24899581     .38575268    -.62279195     .28883922
 r9           -23            60    -.00164069     .21150543     .99381071    -.45867634     .45035744
r10           -22            80     .13636349     .20719628     .51045024    -.27630085     .52443373
r11           -21            80    -.06286745     .20200366     .75563389    -.44826174     .38877547
r12           -20            80    -.11484522     .19158117     .54886627    -.53973246     .19411793
r13           -19           100    -.35158613     .15955622     .02755777    -.66962087     -.0519049
r14           -18           100     .26509574     .18690522      .1560906    -.12355641     .60600609
r15           -17           100     .02230865     .14717668      .8795203    -.28904003     .31616765
r16           -16           100    -.15006728     .17020826     .37795603    -.53622365     .17316614
r17           -15           100     .18560499     .19491062     .34096667    -.23731235     .53748679
r18           -14           100    -.03165312     .14598237     .82834208    -.31481344     .24974313
r19           -13           100    -.22084346     .18919691     .24310224    -.55068594     .14983009
r20           -12           100     .06744679     .16847959     .68891627    -.22389117     .40835366
r21           -11           100     .21657231     .15767087     .16957456    -.13480952     .48548201
r22           -10           100     .08356623     .16362417     .60954672     -.1950033     .45377755
r23            -9           100    -.02173787     .17451134     .90086854    -.31801969     .34963101
r24            -8           100     .08333752     .16451749     .61246556     -.2079104     .46332332
r25            -7           100     .12237962     .17575169     .48622772    -.28059134     .45818549
r26            -6           100    -.05976924     .14474356     .67965651    -.33350363     .24963236
r27            -5           100    -.11873318     .17456862     .49640831    -.45260999     .26205447
r28            -4           100     .00759421     .16938299     .96423918    -.30217379     .38218406
r29            -3           100     .19577003     .16366801     .23164152    -.08811902     .48325184
r30            -2           100    -.19017716     .14931282      .2027759     -.4272626     .08058163
r31            -1           100      .0174254      .1702151     .91846073    -.34297195     .36024907
r32             0           100    -.04931726     .16113946      .7595641    -.38931146     .20763522
r33             1           100    -.04900107     .31514415     .87643677    -.75633675     .50669175
r34             2           100     .52023405      .3510232      .1383269    -.17269914     1.2584279
r35             3           100      .3050181     .34021616     .36996332    -.37514228     .95727777
r36             4            80     .76847726     .33462676     .02164613        .15566     1.4568437
r37             5            80     1.5267484     .34181815     7.949e-06     .84480536     2.1678679
r38             6            80      .5630399     .29836237     .05914676     .00381862     1.1758828
r39             7            60     .88419515      .5170598     .08725769     -.0086214     2.0059652
r40             8            60     1.5579557     .52061015     .00276651      .3593998     2.6675785
r41             9            60     1.6127996       .465763     .00053478      .5999316     2.4206164
r42            10            40     1.1938372     .72642702     .10029251    -.08183594     2.7028506
r43            11            40     2.7152529      .5470615     6.929e-07     1.5546281     3.8903532
r44            12            40     2.1226153      .6129142     .00053389     .75084776      3.388536
r45            13            20     3.7801797     .96590805     .00009093      1.688713     5.3416557
r46            14            20     4.2262712      1.058465     .00006529     2.1043487      6.465354
r47            15            20     2.7478838     1.0413667     .00832176     .75889599     5.1415553

. mat list e(coefs)

e(coefs)[3,5]
             coef           sd      p_value  lower_bound  upper_bound
cons    11.387862    .29894611            0    10.780995    12.048532
  X1    .95892048    .02973494            0    .89947993    1.0079873
  X2    3.0127916    .02805474            0     2.955739    3.0678375

For better visuals, users can limit the period relative to the treatment in the plot to [-14,5]. Users can also change the label of the x-axis, the y-axis and the title by using the options xlabel(), ylabel() and title(), respectively.

. fect Y,  treat(D) unit(id) time(time) cov(X1 X2) method("ife") r(2) se preperiod(-14) offperiod(5) xlabel("s") ylabel("ATTs") vartype("jackknife")
Balanced Panel Data
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Jackknifing...

Wald Test

This package offer several tests for the accuracy and the robustness of the model. First, it offers a goodness-of-fit test (a variant of the Wald test) to gauge the presence of pre-treatment (differential) trends. Users can conduct this test by setting the option wald on. As a result, the p-value of the Wald test will be shown in the graph and be saved in e(wald_pvalue). A small p-value suggests that the test rejects the null hypothesis that the pre-treatment residuals averages over time are jointly close to zero. In the paper, however, we discuss why the Wald test is not as desirable as an equivalent test. Users can specify a test range in the option preperiod(). For example, preperiod(-4) means that we test pre-treatment trend of the last 5 periods prior(-4 to 0) to the treatment. All periods prior to the treatment will be gauged in the default setting.

. fect Y,  treat(D) unit(id) time(time) cov(X1 X2) se method("ife") r(2) preperiod(-14) offperiod(5) wald nboots(100)
Balanced Panel Data
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Bootstrapping...
ATT Estimation: Already Bootstrapped 100 Times
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Wald Testing...
Wald Testing: Already Simulated 100 Times
The p-value in wald test is .91

. di e(wald_pvalue)
.91

Equivalence Test

Users can evaluate whether the identification assumption is likely to be valid by conducting an equivalence test. It checks whether the 90% confidence intervals for estimated ATTs at pre-treatment period exceed a pre-specified range. While users can check the values of confidence intervals, fect give a visualization of the equivalence test. When the option equiTest is on, all of the pre-treatment residual average, the equivalence confidence intervals(0.36 times the residual standard deviation) and the minium bounds will be drawn. The test range is also specified in the option preperiod(). Below, we visualize the result of the equivalence test for each of the three estimators using our simulated data. These figures show that both the IFE and MC methods pass the equivalence test while the FE method does not.

FE model

. fect Y,  treat(D) unit(id) time(time) cov(X1 X2) se method("fe") preperiod(-25) offperiod(0) equiTest nboots(100)
Balanced Panel Data
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Bootstrapping...
ATT Estimation: Already Bootstrapped 100 Times
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Equivalence Test
Equivalence Test...Fail at s=-25
Equivalence Test...Fail at s=-24
Equivalence Test...Fail at s=-22
Equivalence Test...Fail at s=-21
Equivalence Test...Fail at s=-20
Equivalence Test...Fail at s=-19
Equivalence Test...Fail at s=-16
Equivalence Test...Fail at s=-8
Equivalence Test...Fail at s=-5
Equivalence Test...Fail at s=-3
Equivalence Test...Fail at s=-2
Equivalence Test...Fail at s=-1
Equivalence Test...Fail at s=0
Equivalence Test...Fail

IFE model

. fect Y,  treat(D) unit(id) time(time) cov(X1 X2) se method("ife") r(2) preperiod(-25) offperiod(0) equiTest nboots(100)
Balanced Panel Data
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Bootstrapping...
ATT Estimation: Already Bootstrapped 100 Times
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Equivalence Test
Equivalence Test...Pass

MC model

. fect Y,  treat(D) unit(id) time(time) cov(X1 X2) se method("mc") lambda(0.004) preperiod(-25) offperiod(0) equiTest nboots(100)
Balanced Panel Data
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Bootstrapping...
ATT Estimation: Already Bootstrapped 100 Times
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Equivalence Test
Equivalence Test...Pass

PlaceboTest

In addition to them, fect provides a placebo test to alleviate the concern of over-fitting in the pre-trend. Users can specify a range of pre-treatment periods as “placebo periods” in the option placeboperiod() to remove observations in the specified range for model fitting, and then test whether the estimated ATT in this range is significantly different from zero. By default, we set [-2, 0] as the placebo periods (placeboperiod(3)) and the plot will display the p-value of the placebo test. Below, we visualize the result of the placebo test for each of the three estimators using our simulated data. These figures show that only the IFE method pass the placebo test test.

FE model

. fect Y,  treat(D) unit(id) time(time) cov(X1 X2) se method("fe") placeboTest nboots(100)
Balanced Panel Data
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Placebo Test...
Bootstrapping...
ATT Estimation: Already Bootstrapped 100 Times
Placebo Test Fail

IFE model

. fect Y,  treat(D) unit(id) time(time) cov(X1 X2) se method("ife") r(2) placeboTest nboots(100)
Balanced Panel Data
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Placebo Test...
Bootstrapping...
ATT Estimation: Already Bootstrapped 100 Times
Placebo Test Pass

MC model

. fect Y,  treat(D) unit(id) time(time) cov(X1 X2) se method("mc") lambda(0.004) placeboTest nboots(100)
Balanced Panel Data
-------------------------------------------------------------------------------------------------------------------------------------------------------------
Placebo Test...
Bootstrapping...
ATT Estimation: Already Bootstrapped 100 Times
Placebo Test Fail

Note:

When running on huge dataset, the fect R version is faster than the Stata version and thus will be a better choice, this is because we use C++ and parallel computing in R to accelerate the computation.