1 Get Started
This chapter provides installation instructions and introduces the mortality dataset used in the tutorial.
1.1 Installation
To install fdid from CRAN, run the code chunk below:
install.packages("fdid")
The development version of the fdid package can be installed from GitHub:
# If not already installed
install.packages('devtools', repos = 'http://cran.us.r-project.org')
# Directly install
devtools::install_github('xuyiqing/fdid')
If you downloaded the source folder and want to install locally, first build a .tar.gz file and then install it:
# Run these steps in the "Terminal" tab of RStudio:
# (type the command after ':' in each step,
# replacing "path/to/" with the actual path)
# 1. Set directory: cd path/to
# 2. Build package: R CMD build fdid
# 3. Install package: R CMD INSTALL fdid_0.1.0.tar.gz
fdid depends on the following packages, which should be installed automatically when fdid is installed. You can also install them manually:
install_all <- function(packages) {
  installed_pkgs <- installed.packages()[, "Package"]
  for (pkg in packages) {
    if (!pkg %in% installed_pkgs) {
      install.packages(pkg)
    }
  }
}
packages <- c("estimatr", "dplyr", "tidyr", "rlang", "tidyselect", "RColorBrewer",
              "foreach", "doParallel", "ebal", "grf", "car",
              "sandwich")
install_all(packages)
1.2 Dataset
As in the empirical application of Xu, Zhao, and Ding (2026), we use the county-year panel from Cao, Xu, and Zhang (2022) to illustrate the workflow.
Cao, Xu, and Zhang (2022) examine the role of social capital in disaster relief during China’s Great Famine and find that the rise in the mortality rate during the famine years was significantly smaller in counties with higher clan density. In their research design, the event is China’s Great Famine (1958–1961), and the baseline factor \(G\) is social capital, measured by the density of genealogy books.
fdid ships this long-format panel as the dataset mortality, which works directly with fdid_prepare(), explained in Chapter 2. To load it, run the code chunk below:
The mortality dataset includes 16 variables. The variables are described below.
- Unit and period identifiers: provid identifies the province and countyid is the id of the county in this province; year indicates the period of the variables.
- Baseline factor in different measurement methods: pczupu is the number of genealogy books in a county; anyzupu is a binary variable which equals 1 if there’s any genealogy book in a county; zupu is a discrete variable to differentiate counties with some genealogies (zupu = 1) and counties with many genealogies (zupu = 2).
- Outcome: mortality is a county’s mortality in a year.
- Pre-event covariates:
  - avggrain: per capita grain production
  - nograin: ratio of non-farming land
  - urban: share of urban population
  - dis_bj: distance from Beijing (km)
  - dis_pc: distance from provincial capital (km)
  - rice: suitable for rice cultivation (indicator)
  - minority: share of ethnic minorities
  - edu: average years of education
  - lnpop: log county population
head(mortality)
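Before passing a panel to later steps, it can help to confirm that it really is in long format, meaning one row per county-year. The sketch below runs such checks in base R on a small synthetic data frame, here called toy. Its values are made up and are not taken from mortality; only the identifier names provid, countyid, and year are borrowed from the description above. The columns unit and famine are hypothetical helpers introduced for illustration and are not part of the fdid API.

```r
# Synthetic long-format panel: 2 counties in province 1, 1 county in province 2.
# Column names mirror the identifiers described above; the values are made up.
toy <- expand.grid(year = 1956:1962, countyid = 1:2, provid = 1:2)
toy <- toy[!(toy$provid == 2 & toy$countyid == 2), ]

# countyid is only unique within a province, so the unit key combines both ids.
toy$unit <- interaction(toy$provid, toy$countyid, drop = TRUE)

# Long format: each (provid, countyid, year) combination appears exactly once.
n_dup <- sum(duplicated(toy[, c("provid", "countyid", "year")]))

# Balanced panel: every unit is observed for the same number of periods.
periods_per_unit <- table(toy$unit)
is_balanced <- length(unique(as.vector(periods_per_unit))) == 1

# Hypothetical indicator for the event window (the famine years 1958-1961).
toy$famine <- toy$year %in% 1958:1961
```

With the package data loaded, the same checks could be run by replacing toy with mortality, assuming its identifier columns are named as listed above.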