This RMarkdown tutorial replicates the core analyses in Tsai and Xu (2018): “Outspoken Insiders: Political Connections and Citizen Participation in Authoritarian China”. The replication, conducted by Jinwen Wu, a predoctoral fellow at Stanford University, is guided by Professor Yiqing Xu. The tutorial summarizes the main data analyses from the article; please refer to the original paper for a comprehensive understanding of the ideas presented.

Click the Code button at the top right and select Show All Code to reveal all code used in this RMarkdown. Click Show in paragraphs to reveal the code used to generate a finding. The R code and data files used in this RMarkdown can be downloaded here. The original replication files can be downloaded from here.

Tsai and Xu (2018) argue that complaint-making is a key mechanism for sustaining the quality of governance in some authoritarian regimes. They ask who is more likely to make complaints in such societies. Drawing on two original surveys—6,000 urban residents (China Public Governance Survey 2013) and 2,000 rural villagers (China Rural Governance Survey 2008)—they find that regime insiders, those with close personal ties to officials, are more likely to complain to and about the government.

1 Conceptual Framework

According to the resource mobilization model (Verba, Schlozman, and Brady 1995), time, money, and civic skills are necessary for citizens to take political action. While these resources are important, they may not fully address the unique challenges in nondemocratic and transitional systems.

Tsai and Xu (2018) note additional hurdles such as barriers to access and information (Tarrow 1998; Khanna and Johnston 2007), as well as political risk and uncertainty (Lieberman, Posner, and Tsai 2014). Formal procedures are often unclear, and criticizing the government can be risky.

Under this framework, political connections are an additional resource that can operate through three mechanisms:

Access to Decision Makers: Personal connections can bridge the gap between citizens and an often opaque bureaucracy, making it easier to reach the right officials (Gold, Guthrie, and Wank 2002; Ledeneva 1998; Lust‐Okar 2006).
Information and “Know-How”: Connections can provide crucial knowledge about government policies, procedures, and how to effectively lodge complaints (Michelson 2006).
Protection Against Retribution: Ties to officials can serve as a form of protection or exert a “deterrent effect” against potential retaliation for voicing criticisms, reducing the perceived risks of participation (Shi 1997).

\[ \text{Participation}=f(\text{Resources}) = f\!\bigl(\text{time},\text{money},\text{skills},\underbrace{\text{connections}}_{\text{access + info + protection}}\bigr) \]

From this logic follow three hypotheses:

H1 (Proliferation): Insiders complain more often.
H2 (Dissatisfaction null): The gap is not driven by higher dissatisfaction among insiders.
H3 (Mechanisms): Information and access account for the gap.

2 Data & Measurement

To test their theory, Tsai and Xu (2018) draw on two unique survey datasets from China:

The China Public Governance Survey (CPGS) from 2013, which covered 145 urban districts with an effective sample size of 5,940 respondents.
The China Rural Governance Survey (CRGS) from 2008, which encompassed 101 villages and 1,971 rural respondents.

For urban residents, complaint-making behavior (the key outcome variable) includes raising questions, expressing dissatisfaction, or lodging complaints with the local government through various channels (e.g., visiting offices, calling hotlines). The survey of rural respondents asks whether they had raised questions with village authorities.

The primary “treatment” variable, political connections, is narrowly defined as kinship ties with individuals working in the administrative system. The survey questions for both urban and rural respondents are similar, asking whether they have relatives working in the government. (For the specific wording of the survey questions, please consult the original paper.)

The authors included socioeconomic characteristics (age, education, CCP membership, occupation, income) and regional fixed effects in logistic regression models to estimate the effect.
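To make the specification concrete, here is a minimal sketch of a logit with regional fixed effects, fitted on simulated data. Everything in it is an assumption for illustration: the variable names (`complain`, `connected`, `region`), the sample size, and the true effect of 0.6 on the log-odds scale are invented and do not come from the CPGS or CRGS.

```r
# Simulated illustration only: names and numbers are hypothetical,
# not taken from the surveys used in the paper.
set.seed(1)
n_region <- 20
n        <- 2000
region        <- sample(seq_len(n_region), n, replace = TRUE)
region_effect <- rnorm(n_region, sd = 0.5)       # unobserved regional heterogeneity
eduyr         <- round(runif(n, 0, 16))          # years of education
connected     <- rbinom(n, 1, plogis(-1.5 + 0.05 * eduyr))

# True log-odds effect of connections is set to 0.6 by construction
eta      <- -2 + 0.6 * connected + 0.05 * eduyr + region_effect[region]
complain <- rbinom(n, 1, plogis(eta))
sim      <- data.frame(complain, connected, eduyr, region = factor(region))

# Logit with region dummies acting as fixed effects
logit_fit <- glm(complain ~ connected + eduyr + region,
                 family = binomial(link = "logit"), data = sim)

b_connected <- unname(coef(logit_fit)["connected"])  # log-odds scale
odds_ratio  <- exp(b_connected)                      # odds-ratio scale
```

The coefficient on `connected` is on the log-odds scale, so exponentiating it gives an odds ratio. The replication below switches to a linear probability model, whose coefficients read directly as differences in probability.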

3 Replicating the Main Findings

3.1 Installing Packages

Several R packages are required for the data analysis and visualization. The code chunk below checks for all required packages and installs the missing ones.

Packages: “haven”, “dplyr”, “fixest”, “modelsummary”, “ggplot2”, “sensemakr”, “tidyr”, “estimatr”, “purrr”, “broom”, “patchwork”, “Matching”.

packages <- c("haven", "dplyr", "fixest", "modelsummary", "ggplot2", "sensemakr", "tidyr", "estimatr", "purrr", "broom", "patchwork", "Matching")


for (pkg in packages) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  library(pkg, character.only = TRUE)
}

We draw on two datasets to estimate how political connections affect complaint‐making and satisfaction in urban and rural China, to unpack mechanisms, and to conduct sensitivity checks.

insider_urban.dta contains urban‐resident survey responses used for baseline and controlled logistic fixed‐effects models.
insider_rural.dta contains rural‐resident survey responses for analogous models in the countryside.

Dataset	Data File	Role in the Analysis
Urban complaints	`insider_urban.dta`	Baseline & controlled logistic FE models for urban complaint‐making
Rural complaints	`insider_rural.dta`	Baseline & controlled logistic FE models for rural complaint‐making

# ——————————————————————————————————————————————
# Load all datasets
# ——————————————————————————————————————————————
urban         <- read_dta("insider_urban.dta")   %>% mutate(distrid = factor(distrid))
rural         <- read_dta("insider_rural.dta")   %>% mutate(v_id    = factor(v_id))

3.2 Connection Effect

Analyzing CPGS data, the authors find that urban residents’ complaints are mainly about government performance and public services (e.g., food safety, transportation, security, utilities).

par(mfrow = c(1,2))
issue_counts <- urban %>%
  filter(!is.na(issue)) %>%  # This line drops NA values
  count(issue) %>%
  mutate(issue_label = case_when(
    issue == 1 ~ "Public Security",
    issue == 2 ~ "Transportation",
    issue == 3 ~ "Utilities",
    issue == 4 ~ "Food and Drug Safety",
    issue == 5 ~ "Air Pollution",
    issue == 6 ~ "Community Environment",
    issue == 7 ~ "Public Health",
    issue == 8 ~ "Licenses, Permits, and Certificates",
    issue == 9 ~ "Right infringement",
    issue == 10 ~ "Misc.",
    TRUE ~ "Unknown"
  )) %>%
  mutate(issue_label = factor(issue_label, 
  levels = c("Food and Drug Safety", "Transportation", "Public Security", "Utilities", "Community Environment", "Right infringement", "Licenses, Permits, and Certificates", "Air Pollution", "Public Health", "Misc.")))

## ------------------------------------------------------------------
## (a) Issues reported
## ------------------------------------------------------------------
p1 <- issue_counts |>
  ggplot(aes(issue_label, n)) +
  geom_bar(stat = "identity", fill = "steelblue") +
  labs(x = NULL, y = "Number of Complaints",
       subtitle = "(a) Issues", caption = "") +
  scale_y_continuous(expand = expansion(mult = c(0, 0.1))) +
  theme_minimal() +
  theme(axis.text.x   = element_text(angle = 45, hjust = 1),
        plot.subtitle = element_text(hjust = .5))

## ------------------------------------------------------------------
## (b) Civic vs. private interests
## ------------------------------------------------------------------
concern_data <- urban %>%
  filter(!is.na(concernme), !is.na(govconn)) %>%
  mutate(concern_cat = case_when(
    concernme == 1 ~ "Only concerns me",
    concernme == 2 ~ "Concerns many people",
    concernme == 3 ~ "Concerns almost everybody"
  ),
  govconn_group = ifelse(govconn == 1, "Insider", "Outsider"))

# Calculate percentages
concern_summary <- concern_data %>%
  group_by(govconn_group, concern_cat) %>%
  summarise(count = n()) %>%
  group_by(govconn_group) %>%
  mutate(percentage = count / sum(count) * 100)

p2 <- concern_summary |>
  ggplot(aes(concern_cat, percentage, fill = govconn_group)) +
  geom_bar(stat = "identity", position = "dodge") +
  geom_text(aes(label = paste0(round(percentage), "%")),
            position = position_dodge(width = .9),
            vjust = -.5, size = 3) +
  labs(x = NULL, y = "Percentage",
       subtitle = "(b) Civic versus Private Interests",
       fill = NULL) +
  scale_fill_manual(values = c(Insider = "orange", Outsider = "steelblue")) +
  scale_y_continuous(limits = c(0, 50), expand = expansion(mult = c(0, 0.1))) +
  theme_minimal() +
  theme(axis.text.x   = element_text(angle = 45, hjust = 1),
        plot.subtitle = element_text(hjust = .5),
        legend.position = "top")

p1 + p2 + plot_layout(ncol = 2)

Replicating Figure 1 in the article. Note: In the right figure, the denominator is the total number of complaints reported by the respondents.

As shown by the bar chart above, insiders are less likely than outsiders to make complaints that concern only themselves (34% for insiders vs. 41% for outsiders) and more likely to raise complaints about issues that “concern almost everybody” (31% for insiders vs. 22% for outsiders).
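The percentages in panel (b) are computed within each group, so the insider shares and the outsider shares each sum to 100. The sketch below shows that normalization in base R on a toy table; the counts are invented for illustration and are not the survey counts.

```r
# Hypothetical toy data: the categories mirror panel (b), the counts are invented.
toy <- data.frame(
  group = c(rep("Insider", 10), rep("Outsider", 20)),
  scope = c(rep("Only concerns me", 3),
            rep("Concerns many people", 4),
            rep("Concerns almost everybody", 3),
            rep("Only concerns me", 8),
            rep("Concerns many people", 8),
            rep("Concerns almost everybody", 4))
)

counts <- table(toy$group, toy$scope)

# margin = 1 normalizes each row (group), so every group's shares sum to 100
pct <- prop.table(counts, margin = 1) * 100
```

Normalizing by row rather than by the grand total is what makes the two groups comparable even though outsiders greatly outnumber insiders.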

Table 1 of the original paper presents the results of the main logit regressions, showing the effect of political connections on urban residents’ likelihood of making complaints to the government. Table 2 repeats the logit analysis for the rural sample, separating complaints made to the village committee (Cols. 1–2) from those directed to fellow villagers (Cols. 3–4).

The code below runs a similar specification but, instead of a logit, estimates a linear probability model (OLS) with fixed effects, controls, and robust standard errors clustered at the community (urban district or village) level.

covar1    <- c("eduyr","ccp","govoff","age","age2","male","hukou")
covar2    <- c("eduyr","ccp","leader","age","age2","male")

# Urban: helper to build formulas
fmla_u <- function(dv, with_ctrl = FALSE) {
  rhs <- if (with_ctrl) c("govconn", covar1) else "govconn"
  as.formula(paste(dv, "~", paste(rhs, collapse = " + "), "| distrid"))
}
fit_u <- function(formula) {
  feols(
    formula,
    data    = urban,
    cluster = ~distrid
  )
}

# Rural: helper to build formulas
fmla_r <- function(dv, with_ctrl = FALSE) {
  rhs <- if (with_ctrl) c("govconn", covar2) else "govconn"
  as.formula(paste(dv, "~", paste(rhs, collapse = " + "), "| v_id"))
}
fit_r <- function(formula) {
  feols(
    formula,
    data    = rural,
    cluster = ~v_id
  )
}

## models
mods_u <- list(
  "To government"                           = fit_u(fmla_u("compgov")),
  "To government w controls"                = fit_u(fmla_u("compgov", TRUE)),
  "Through government offices"              = fit_u(fmla_u("comp_off")),
  "Through government offices w controls"   = fit_u(fmla_u("comp_off", TRUE))
)


mods_r <- list(
  "To village committee" = fit_r(fmla_r("compl")),
  "To village committee w controls" = fit_r(fmla_r("compl", TRUE)),
  "To fellow villagers" = fit_r(fmla_r("compl_vill")), 
  "To fellow villagers w controls" = fit_r(fmla_r("compl_vill", TRUE))
)


## clean names 
coef_map <- c(
  govconn = "Political connections",
  eduyr   = "Years of education",
  ccp     = "Communist party member",
  govoff  = "Government official",
  age     = "Age/10",
  age2    = "Age²/100",
  leader = "Village leader", 
  male    = "Male",
  hukou   = "Urban Hukou"
)


# create a dataframe for plotting coefficients of 'govconn'
coef_df_r <- bind_rows(
  lapply(names(mods_r), \(nm)
  tidy(mods_r[[nm]], conf.int = TRUE) |>
    filter(term == "govconn") |> # IV is govconn
    mutate(model = nm))
)


# collect govconn coefficients from each model (the controls-only filter is applied below)
grab_coef <- function(model_list, area_label){
  bind_rows(lapply(names(model_list), \(nm){
    tidy(model_list[[nm]], conf.int = TRUE) |>
      filter(term == "govconn") |>
      mutate(outcome = nm,
             area    = area_label)
  }))
}
coef_df <- bind_rows(
  grab_coef(mods_u, "Urban"),
  grab_coef(mods_r, "Rural")
) |>
  filter(grepl("w controls", outcome)) |>                 # keep only control specs
  mutate(outcome = sub(" w controls", "", outcome),       # shorten labels
         outcome = factor(outcome, levels = unique(outcome)))

coef_df <- coef_df |>
  mutate(area = factor(area, levels = c("Urban", "Rural")))

ggplot(coef_df, 
       aes(x = outcome,
           y = estimate,
           ymin = conf.low,
           ymax = conf.high)) +
  # Map linetype to area so the error‐bar shows up in the legend
  geom_errorbar(aes(linetype = area), width = 0.10, linewidth = 0.6) +
  geom_point(aes(shape = area), size = 3) +
  geom_text(aes(
      label = sprintf("%.2f", estimate),
      y = ifelse(estimate >= 0, conf.high + 0.01, conf.low - 0.01)
  ),
  size = 3, show.legend = FALSE) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  facet_wrap(~ area, nrow = 1, scales = "free_x") +
  
  # Define a single legend (no title) that combines shape and linetype
  scale_shape_manual(
    name   = NULL,
    values = c(Urban = 16, Rural = 17),
    breaks = c("Urban", "Rural")
  ) +
  scale_linetype_manual(
    name   = NULL,
    values = c(Urban = "solid", Rural = "solid"),
    breaks = c("Urban", "Rural")
  ) +
  
  labs(
    y     = "Effect of political connections",
    x     = NULL
  ) +
  theme_minimal(base_size = 12) +
  theme(
    panel.spacing = unit(1, "lines"),
    axis.text.x   = element_text(angle = 45, hjust = 1)
  )

Replicating results in Tables 1 and 2 in the article.

Across all models, political connections significantly increase complaint-making among respondents.
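For readers unfamiliar with the `| distrid` and `| v_id` syntax, the sketch below illustrates what absorbing a fixed effect does to the point estimate. It uses simulated data and base R only; the cluster structure, the true effect of 0.10, and all variable names are assumptions made for the example. It covers the coefficient only, not the cluster-robust standard errors.

```r
# Simulated illustration: names and parameters are hypothetical.
set.seed(2)
n_clusters <- 30
n          <- 1500
cl    <- sample(seq_len(n_clusters), n, replace = TRUE)
alpha <- rnorm(n_clusters)                      # unobserved cluster effect

# Treatment is more common in high-alpha clusters, so pooled OLS is confounded
d <- rbinom(n, 1, plogis(0.8 * alpha[cl]))
y <- 0.10 * d + 0.3 * alpha[cl] + rnorm(n, sd = 0.5)

# (1) Dummy-variable estimator: one indicator per cluster
b_dummy <- unname(coef(lm(y ~ d + factor(cl)))["d"])

# (2) Within estimator: demean outcome and treatment by cluster, then regress
y_w <- y - ave(y, cl)
d_w <- d - ave(d, cl)
b_within <- unname(coef(lm(y_w ~ d_w - 1))["d_w"])

# (3) Pooled OLS that ignores the cluster effect
b_pooled <- unname(coef(lm(y ~ d))["d"])
```

`b_dummy` and `b_within` agree up to floating-point error, which is the sense in which a fixed-effects regression compares respondents only within the same community. `b_pooled` differs because it also picks up differences between communities.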

3.3 Mechanisms

Building on the preceding regressions, Figure 2 explores the potential mechanisms through which political connections might lead to more complaining.

# ---------------------- single model -------------------------
fit_fe <- function(data, dv, ctrls, fe, clust, area, panel){
  rhs  <- paste(c("govconn", ctrls), collapse = " + ")
  frm  <- as.formula(paste0(dv, " ~ ", rhs, " | ", fe))
  
  mod  <- feols(frm, data = data, cluster = clust)
  
  res  <- tidy(mod, conf.int = TRUE) |>
    filter(term == "govconn")
  
  tibble(
    panel     = panel,
    area      = area,
    outcome   = dv,
    coef      = res$estimate,
    se        = res$std.error,
    CI_lower  = res$conf.low,
    CI_upper  = res$conf.high,
    N_clust   = length(unique(data[[clust]])),
  )
}

run_block <- function(dvs, data, ctrls, fe, clust, area, panel){
  map_dfr(dvs, ~ fit_fe(data, .x, ctrls, fe, clust, area, panel))
}

# ------------------- dependent variables ---------------------
urb_know <- c("govfile","mayor","email")
urb_acc  <- c("backdoor","dealgov","pullstring")
rur_know <- c("xiangzhang","governor","newspaper")
rur_acc  <- grep("^welcome", names(rural), value = TRUE)

# ------------------- run everything --------------------------
results <- bind_rows(
  run_block(urb_know, urban, covar1, "distrid", "distrid",
            "Urban", "Political knowledge"),
  run_block(urb_acc , urban, covar1, "distrid", "distrid",
            "Urban", "Access to authorities"),
  run_block(rur_know, rural, covar2, "v_id",    "v_id",
            "Rural", "Political knowledge"),
  run_block(rur_acc , rural, covar2, "v_id",    "v_id",
            "Rural", "Access to authorities")
)

keep   <- c("govfile","mayor","email",
            "xiangzhang","governor","newspaper",
            "backdoor","dealgov","pullstring",
            "welcome_cadre","welcome_town","welcome_coty")

mechanisms <- results |>
  filter(outcome %in% keep) |>
  # readable labels, panels, and display order
  mutate(
    outcome_lbl = recode(outcome,
      govfile        = "Access Gov.\nStatutes",
      mayor          = "Know\nMayor's Name",
      email          = "Use Email\nRegularly",
      xiangzhang     = "Know Township\nHead",
      governor       = "Know\nGovernor's Name",
      newspaper      = "Read\nNewspapers",
      backdoor       = "Use Conn.\nfor Pub. Serv.",
      dealgov        = "Deal with the\nGovernment",
      pullstring     = "Pull Strings\nfor Benefits",
      welcome_cadre  = "Cadres Welcome\nInputs",
      welcome_town   = "Township\nWelcomes Inputs",
      welcome_coty   = "County Welcomes\nInputs"
    ),
    panel = if_else(outcome %in% c("govfile","mayor","email",
                                   "xiangzhang","governor","newspaper"),
                    "Political knowledge",
                    "Access to authorities"),
    CI_lower = coef - 1.96 * se,
    CI_upper = coef + 1.96 * se,
    label_x  = coef + if_else(coef >= 0, 0.022, -0.022),
    hjust    = if_else(coef >= 0, 0, 1)
  ) |>
  # set plotting order exactly as in the article
  mutate(outcome_lbl = factor(outcome_lbl,
    levels = c("Access Gov.\nStatutes","Know\nMayor's Name",
               "Use Email\nRegularly","Know Township\nHead",
               "Know\nGovernor's Name","Read\nNewspapers",
               "Use Conn.\nfor Pub. Serv.","Deal with the\nGovernment",
               "Pull Strings\nfor Benefits","Cadres Welcome\nInputs",
               "Township\nWelcomes Inputs","County Welcomes\nInputs"))
  )

# 2. plot ----------------------------------------------------
# make sure the facet order is: 1) knowledge, 2) access
mechanisms <- mechanisms %>%
  mutate(
    panel = factor(panel, levels = c("Political knowledge", "Access to authorities")),
    # Reverse factor levels to control drawing order
    area = factor(area, levels = c("Rural", "Urban"))
  ) %>%
  arrange(panel, area, outcome_lbl) %>%  # Now Rural plots first (underneath)
  mutate(outcome_lbl = factor(outcome_lbl, levels = unique(outcome_lbl)))

# Plot with modified aesthetics
ggplot(mechanisms, aes(x = coef, y = outcome_lbl)) +
  geom_vline(xintercept = 0, colour = "grey70", linewidth = .3) +
  # Error bars with explicit linetype mapping
  geom_errorbarh(
    aes(xmin = CI_lower, xmax = CI_upper, linetype = area),
    height = .18, linewidth = .6, show.legend = TRUE
  ) +
  # Points with explicit shape mapping
  geom_point(
    aes(shape = area),
    size = 3
  ) +
  geom_text(
    aes(label = sprintf("%.2f", coef)),
    position = position_nudge(y = 0.25),
    size = 3, show.legend = FALSE
  ) +
  facet_wrap(~ panel, ncol = 1, scales = "free_y") +
  coord_cartesian(xlim = c(-.10, .30)) +
  # Manual scales with Urban as first element
  scale_linetype_manual(
    values = c("Urban" = "solid", "Rural" = "solid"),
    breaks = c("Urban", "Rural")  # This controls legend order
  ) +
  scale_shape_manual(
    values = c("Urban" = 16, "Rural" = 17),
    breaks = c("Urban", "Rural")
  ) +
  labs(
    x = "Effect of political connections",
    y = NULL,
    title = "Political Connections: Knowledge and Access"
  ) +
  theme_minimal(base_size = 12) +
  theme(
    panel.grid.major.y = element_blank(),
    axis.text.y = element_text(size = 9),
    strip.text = element_text(face = "bold", size = 13),
    legend.title = element_blank()
  )

Replicating Figure 2 in the article.

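The top panel, “Political knowledge”, includes measures such as accessing government statutes, knowing the name of the mayor, governor, or township head, using email regularly, and reading newspapers. The bottom panel, “Access to authorities”, covers using connections for public services, dealing with the government, pulling strings for benefits, and (in the rural sample) perceiving village cadres, township officials, and county officials as welcoming input. In both urban and rural samples, political connections are positively associated with these outcomes.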

4 Robustness Checks

4.1 Matching

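For robustness checks, the authors first use covariate matching and find that the results remain largely unchanged. They applied both exact matching and Mahalanobis-distance matching. For simplicity, this Markdown file replicates only the 1:5 non-exact matching (with bias correction) and estimates the ATT on the matched set. Note that the Match call below leaves the Weight argument at its default, which scales each covariate by the inverse of its variance; setting Weight = 2 would request the Mahalanobis distance metric explicitly.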

First, the code drops any observations with missing values in the outcome, treatment, or covariates. Next, the matching_analysis function performs matching without replacement to estimate the treatment effect, using the Match function with bias adjustment on each unit’s outcome (Y), treatment status (Tr), and covariates (X). After matching, it constructs a new dataset of matched pairs, assigns each pair a common identifier (matched_pair), and attaches weights to account for the matching algorithm’s sampling. Finally, it fits a weighted OLS regression of the outcome on the treatment with district- or village-level fixed effects and clustered standard errors, mirroring the fixed effects and clustering of the main specification behind Tables 1 and 2 but omitting the covariates.

# Drop NAs
vars_u <- c("compgov", "comp_off", "govconn", covar1)
urban_cc <- urban[complete.cases(urban[, vars_u]), ]

vars_r <- c("compl", "compl_vill", "govconn", covar2)
rural_cc <- rural[complete.cases(rural[, vars_r]), ]

matching_analysis <- function(data, Y, treat, covar, cluster_var) {
  # Ensure the treatment variable is binary (0/1)
  data[[treat]] <- as.numeric(data[[treat]])
  
  # Perform matching
  m.out <- Match(
    Y = data[[Y]],
    Tr = data[[treat]],
    X = data[, covar],
    estimand = "ATT",  # Average Treatment Effect on the Treated
    M = 5,
    replace = FALSE,   # No replacement for more conservative estimates
    ties = TRUE,
    BiasAdjust = TRUE
  )
  
  # Build the matched dataset (treated rows followed by their matched controls)
  matched_data <- data[c(m.out$index.treated, m.out$index.control), ]
  matched_data$matched_pair <- rep(1:length(m.out$index.treated), 2)
  matched_data$weights <- c(m.out$weights, m.out$weights)
  
  # Fit model with clustered SEs
  model <- feols(
    as.formula(paste0(Y, " ~ ", treat, " | ", cluster_var)),
    data = matched_data,
    weights = ~weights,
    cluster = cluster_var
  )
  
  # Extract results
  tidy_results <- tidy(model, conf.int = TRUE) |> 
    filter(term == treat)
   return(tidy_results)
}


# Urban models
mods_u_matched <- list(
  "To government" = matching_analysis(urban_cc, "compgov", "govconn", covar1, "distrid"),
  "Through government offices" = matching_analysis(urban_cc, "comp_off", "govconn", covar1, "distrid")
)

# Rural models
mods_r_matched <- list(
  "To village committee" = matching_analysis(rural_cc, "compl", "govconn", covar2, "v_id"),
  "To fellow villagers" = matching_analysis(rural_cc, "compl_vill", "govconn", covar2, "v_id")
)

# Function to prepare plotting data
prepare_plot_data <- function(model_list, area_label) {
  bind_rows(lapply(names(model_list), function(nm) {
    model_list[[nm]] |>
      mutate(outcome = nm,
             area = area_label)
  }))
}

# Combine urban and rural results
coef_df_matched <- bind_rows(
  prepare_plot_data(mods_u_matched, "Urban"),
  prepare_plot_data(mods_r_matched, "Rural")
) |>
  mutate(outcome = factor(outcome, levels = unique(outcome)),
         area = factor(area, levels = c("Urban", "Rural")))


ggplot(coef_df_matched, 
       aes(x = outcome,
           y = estimate,
           ymin = conf.low,
           ymax = conf.high)) +
  # Map linetype to area so the error‐bar shows up in the legend
  geom_errorbar(aes(linetype = area), width = 0.10, linewidth = 0.6) +
  geom_point(aes(shape = area), size = 3) +
  geom_text(aes(
      label = sprintf("%.2f", estimate),
      y = ifelse(estimate >= 0, conf.high + 0.01, conf.low - 0.01)
  ),
  size = 3, show.legend = FALSE) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  facet_wrap(~ area, nrow = 1, scales = "free_x") +
  
  # Define a single legend (no title) that combines shape and linetype
  scale_shape_manual(
    name   = NULL,
    values = c(Urban = 16, Rural = 17),
    breaks = c("Urban", "Rural")
  ) +
  scale_linetype_manual(
    name   = NULL,
    values = c(Urban = "solid", Rural = "solid"),
    breaks = c("Urban", "Rural")
  ) +
  
  labs(
    y     = "Effect of political connections (Matched Dataset)",
    x     = NULL
  ) +
  theme_minimal(base_size = 12) +
  theme(
    panel.spacing = unit(1, "lines"),
    axis.text.x   = element_text(angle = 45, hjust = 1)
  )

In urban areas, the estimates for “To government” and “Through government offices” remain nearly unchanged, but with tighter confidence intervals. In rural areas, the estimate for “To village committee” increases from 0.08 to 0.13, and “To fellow villagers” rises from 0.06 to 0.12.
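To show the idea behind distance-based matching without relying on the Matching package, the sketch below implements a deliberately simplified variant in base R: 1:1 nearest-neighbor matching on the Mahalanobis distance, with replacement and without bias adjustment. This is not the procedure used above (which is 1:5, without replacement, with bias adjustment); the data are simulated and all names and parameter values are assumptions.

```r
# Simulated illustration: a simplified 1:1 matcher, not the Match() procedure above.
set.seed(3)
n  <- 600
x1 <- rnorm(n)
x2 <- rnorm(n)
tr <- rbinom(n, 1, plogis(-1 + 0.8 * x1))        # treatment depends on x1
y  <- 0.10 * tr + 0.5 * x1 + 0.2 * x2 + rnorm(n, sd = 0.3)  # true effect = 0.10

X        <- cbind(x1, x2)
S        <- cov(X)                               # covariance used by the metric
treated  <- which(tr == 1)
controls <- which(tr == 0)

# For each treated unit, pick the control with the smallest Mahalanobis distance
match_idx <- vapply(treated, function(i) {
  d2 <- mahalanobis(X[controls, , drop = FALSE], center = X[i, ], cov = S)
  controls[which.min(d2)]
}, integer(1))

# ATT: average treated-minus-matched-control difference in outcomes
att_matched <- mean(y[treated] - y[match_idx])

# Naive comparison that ignores covariate imbalance
naive_diff <- mean(y[treated]) - mean(y[controls])
```

Because treated units have systematically higher `x1`, `naive_diff` overstates the effect, while `att_matched` compares each treated unit with a control that looks similar on both covariates.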

4.2 Sensitivity Analysis

The authors then conduct a sensitivity analysis to assess how vulnerable the estimated association between political connections (govconn) and complaint-making (compgov in the urban sample, compl in the rural sample) is to unobserved confounding. The left panel presents results for the rural sample, while the right panel displays results for the urban sample.

par(mfrow = c(1,2))
# --- sensitivity wrapper 
run_sensitivity <- function(data, Y, tr, Covars, id_var,
                            benchmarks = Covars) {
  
  # fixed‑effect formula: RHS | FE
  rhs  <- paste(c(tr, Covars), collapse = " + ")
  fmla <- as.formula(paste0(Y, " ~ ", rhs, " | ", id_var))
  
  # fit with cluster‑robust SEs
  fit <- feols(fmla, data = data, cluster = id_var)
  
  # sensemakr
  sm <- sensemakr(
    model                = fit,
    treatment            = tr,
    benchmark_covariates = benchmarks,
    kd                   = 1          # one‑covariate benchmarks
  )
  ovb_contour_plot(sm, sensitivity.of = "t-value")
}

run_sensitivity(
  data     = rural,
  Y        = "compl",
  tr       = "govconn",
  Covars   = covar2,
  id_var   = "v_id"
) 


run_sensitivity(
  data     = urban,
  Y        = "compgov",
  tr       = "govconn",
  Covars   = covar1,
  id_var   = "distrid"
)

Replicating Figure 3 in the article.

The red dashed line marks the tipping point: an unobserved confounder would need to reduce the t-value to this threshold to eliminate statistical significance at the 5% level. In both the urban and rural plots, all observed covariates fall well within the robust region, far from this line. Thus, only an implausibly strong confounder—one that accounts for substantially more variance in both the treatment and outcome than any observed variable—could overturn the estimated effect of political connections.
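Two summary statistics from the omitted-variable-bias framework behind sensemakr can be computed by hand from a coefficient's t-statistic and residual degrees of freedom: the partial R² of the treatment with the outcome, and the robustness value. The sketch below implements those closed-form expressions in base R. The helper names and the inputs `t_demo` and `dof_demo` are hypothetical; they are not estimates from this replication. The robustness value here refers to the point estimate (q = 1 means reducing it to zero), whereas the contour plots above are drawn for the t-value.

```r
# Partial R^2 of the treatment with the outcome, from its t-statistic
# and the residual degrees of freedom of the regression.
partial_r2 <- function(t_stat, dof) {
  t_stat^2 / (t_stat^2 + dof)
}

# Robustness value: the share of residual variance in BOTH treatment and
# outcome that an unobserved confounder must explain to reduce the point
# estimate by a fraction q (q = 1 brings it to zero).
robustness_value <- function(t_stat, dof, q = 1) {
  f_q <- q * abs(t_stat) / sqrt(dof)            # scaled partial Cohen's f
  0.5 * (sqrt(f_q^4 + 4 * f_q^2) - f_q^2)
}

# Hypothetical inputs for illustration only
t_demo   <- 4.0
dof_demo <- 1900

r2_demo <- partial_r2(t_demo, dof_demo)
rv_demo <- robustness_value(t_demo, dof_demo)
```

A larger robustness value means a stronger confounder is needed to overturn the estimate, which is the same logic the benchmark points in the contour plots convey graphically.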

5 Summary

Using survey data from both urban and rural China, Tsai and Xu (2018) demonstrate that individuals with political connections (insiders) are significantly more likely to complain about public services than outsiders, even though they are not more dissatisfied. This Markdown file replicates the key findings of the paper and explains how the main pieces of evidence contribute to this central argument: the types of complaints made (Figure 1), the statistical link between connections and complaining (Tables 1 and 2), and the mechanisms of knowledge and access (Figure 2).

In sum, it appears that, in the context of China, political connections empower participation by providing information and easing access, rather than simply reflecting grievance.

References

Gold, Thomas, Doug Guthrie, and David Wank, eds. 2002. Social Connections in China: Institutions, Culture, and the Changing Nature of Guanxi. Cambridge: Cambridge University Press.

Khanna, Jyoti, and Michael Johnston. 2007. “India’s middlemen: connecting by corrupting?” Crime Law and Social Change 48 (3-5): 151–68. https://doi.org/10.1007/s10611-007-9086-0.

Ledeneva, Alena V. 1998. Russia’s Economy of Favours: Blat, Networking and Informal Exchange. Cambridge: Cambridge University Press.

Lieberman, Evan, Daniel Posner, and Lily Tsai. 2014. “Does Information Lead to More Active Citizenship? Evidence from an Education Intervention in Rural Kenya.” World Development 60: 69–83.

Lust‐Okar, Ellen. 2006. “Elections Under Authoritarianism: Preliminary Lessons from Jordan.” Democratization 13: 456–71.

Michelson, Ethan. 2006. “The Practice of Law as an Obstacle to Justice: Chinese Lawyers at Work.” Law and Society Review 40 (1): 1–38.

Shi, Tianjian. 1997. Political Participation in Beijing. Cambridge, MA: Harvard University Press.

Tarrow, Sidney. 1998. Power in Movement. Cambridge: Cambridge University Press.

Tsai, Lily L., and Yiqing Xu. 2018. “Outspoken insiders: political connections and citizen participation in authoritarian China.” Political Behavior 40 (3): 629–57. https://doi.org/10.1007/s11109-017-9416-6.

Verba, Sidney, Kay Schlozman, and Henry Brady. 1995. Voice and Equality: Civic Voluntarism in American Politics. Cambridge, MA: Harvard University Press.

Replicating Tsai & Xu (2018)

Jinwen Wu

2025-05-28