Research Papers – Yiqing Xu

Working Papers

The Harmonic Synthetic Control Method

with Ziyi Liu.

Synthetic control methods can produce misleading counterfactual predictions when outcome series contain unit-specific stochastic trends, a common feature of nonstationary macroeconomic data. Existing remedies, such as pre-filtering or differencing, reduce spurious matching but may discard shared nonstationary variation that helps estimate donor weights. We propose Harmonic Synthetic Control (HSC), which replaces this binary choice with a soft allocation mechanism. HSC jointly estimates donor weights and a treated-unit-specific smooth residual component, then extrapolates this component into post-treatment periods using a time-series forecaster. A tuning parameter, selected by rolling-origin cross-validation, governs the division between donor matching and forecasting. As it varies, HSC continuously interpolates between synthetic control applied to differenced outcomes and synthetic control applied to raw outcomes with an intercept or trend. We provide a spectral interpretation showing how HSC downweights low-frequency residual components in donor matching and assigns them to the forecasting branch. A prediction-error decomposition separates weight-estimation distortion from residual-forecasting error. Monte Carlo exercises show that HSC adapts across regimes, performing well when stochastic trends are predominantly common or idiosyncratic, while estimators fixed to one regime can fail in the other.

arXiv

Interpretable Discriminative Text Representations via Agreement and Label Disentanglement

with Tong Wang and Leo Yang Yang.

Interpretable text representations should expose coordinates that are not only predictive, but also meaningful enough for independent auditors to apply. Existing discriminative representations often use anonymous embedding directions, while concept-bottleneck and LLM-assisted methods attach natural-language names to features without ensuring that those definitions are reproducible or distinct from the target label. We propose an operational criterion for interpretable discriminative text representations: each coordinate should satisfy conceptual clarity, measured by chance-adjusted agreement between independent annotators applying the feature definition, and label disentanglement, meaning the feature should not merely paraphrase the prediction target. We instantiate this criterion in LLM-assisted Feature Discovery (LFD), an iterative method that proposes lexical and semantic features from contrastive outcome-opposed text pairs, screens candidates using cross-LLM Cohen's κ, and selects features by residual held-out predictive gain. A stylized analysis connects the κ screen to a per-feature annotation-noise bound, formalizing agreement as a reliability check. Across ten text-classification tasks spanning seven corpora, LFD matches the predictive performance of a strong text bottleneck baseline while producing substantially clearer and less label-entangled features. Human audits with 232 raters show that LFD features achieve higher human--human and human--LLM agreement than baseline concepts, and raters consistently judge them as less label-leaking. These results suggest that agreement-tested, label-disentangled coordinates provide a practical auditability standard for interpretable text classification.

arXiv
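
The chance-adjusted agreement screen mentioned in the abstract above can be illustrated with a minimal sketch of Cohen's κ for two annotators. This is not the authors' LFD implementation: the function name, the toy labels, and the 0.5 cutoff are hypothetical choices made only for illustration.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-adjusted agreement between two annotators on the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Share of items on which the two annotators give the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance from each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Toy example: two annotators apply one binary feature definition to 10 texts.
annotator_1 = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
annotator_2 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
kappa = cohens_kappa(annotator_1, annotator_2)

# Hypothetical screening rule: keep the feature if agreement clears a cutoff.
KAPPA_CUTOFF = 0.5  # assumed value, not taken from the paper
keep_feature = kappa >= KAPPA_CUTOFF
```

Here the annotators agree on 8 of 10 items while chance agreement is 0.5, so κ is about 0.6 and the feature passes the assumed cutoff.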

Learning Preferences from Conjoint Data: A Structural Deep Learning Approach

with Avidit Acharya and Jens Hainmueller.

Conjoint experiments randomize multidimensional profiles, offering a powerful design for recovering structural preference parameters---including marginal rates of substitution, willingness to pay, and the distribution of preferences across a population. Yet the dominant approach in political science has focused on nonparametric causal estimands that do not leverage this potential. We propose a structural approach that embeds a deep neural network within a random utility logit model, allowing preference parameters to vary as a fully flexible function of respondent characteristics. The neural network addresses the concern that a parametric specification may not capture the true data generating process, while double/debiased machine learning provides valid inference on average preference parameters. We apply our method to three prominent conjoint studies and find rich preference heterogeneity masked by reduced-form averages: a near-zero gender effect coexists with 83% preferring female candidates, opposition to undemocratic behavior is near-universal but varies sharply in intensity, and progressive tax preferences cut across every partisan subgroup.

StatsClaw: An AI-Collaborative Workflow for Statistical Software Development

with Tianzhu Qin.

Translating statistical methods into reliable software is a persistent bottleneck in quantitative research. Existing AI code-generation tools produce code quickly but cannot guarantee faithful implementation---a critical requirement for statistical software. We introduce StatsClaw, a multi-agent architecture for Claude Code that enforces information barriers between code generation and validation. A planning agent produces independent specifications for implementation, simulation, and testing, dispatching them to separate agents that cannot see each other's instructions: the builder implements without knowing the ground-truth parameters, the simulator generates data without knowing the algorithm, and the tester validates using deterministic criteria. We describe the approach, demonstrate it end-to-end on a probit estimation package, and evaluate it across three applications to the authors' own R and Python packages. The results show that structured AI-assisted workflows can absorb the engineering overhead of the software lifecycle while preserving researcher control over every substantive methodological decision.

Scaling Reproducibility: An AI-Assisted Workflow for Large-Scale Replication and Reanalysis

with Leo Yang Yang.

Computational reproducibility is central to scientific credibility, yet verifying published results at scale remains costly. We develop an AI-assisted workflow for automated full-paper replication---retrieving materials, reconstructing environments, executing code, and matching outputs to point estimates reported in regression tables. We define a universe of all empirical and quantitative papers from the three top political science journals (2010--2025) and measure stated data availability using automated extraction. For a stratified sample of 384 studies, we apply the workflow to conduct full-paper replication, totaling 3,382 empirical models. We find that journal verification requirements, combined with data archiving mandates, drive reproducibility: the full-paper reproducibility rate rises from 29.6% before DA-RT adoption to 79.8% after, and conditional on accessible replication packages, 94.4% of papers are fully reproducible (237/251). As a secondary application, we apply standardized IV diagnostics to 92 studies (215 specifications), illustrating how automated execution enables systematic reanalysis across heterogeneous empirical settings.

arXiv
Demo

The Credibility Revolution in Political Science

with Carolina Torreblanca, William Dinneen, and Guy Grossman.

How has the credibility revolution reshaped political science? We address this question by using a large language model to classify 91,632 articles published between 2003 and 2023 across 174 political science journals, focusing on causal research designs, transparency practices, and citation patterns. Design-based studies---research strategies that explicitly state a research design and the assumptions required for causal identification---have become increasingly common, displacing regression-based analyses that rely primarily on modeling assumptions. Yet as of 2023, studies without an explicit identification strategy still constitute nearly 40% of empirical quantitative work. Within design-based research, survey experiments dominate, while field experiments and quasi-experimental approaches have grown more modestly. Transparency practices such as placebo tests and power analysis remain rare. Design-based studies are concentrated in top journals and among authors at highly ranked institutions, and enjoy a persistent citation premium. The credibility revolution has meaningfully reshaped the discipline, though unevenly and incompletely.

User Location Disclosure Fails to Deter Overseas Criticism but Amplifies Regional Divisions on Chinese Social Media

with Leo Yang Yang.

We examine the behavioral effects of a user location disclosure policy implemented by Sina Weibo, China’s largest microblogging platform, using a high-frequency dataset of uncensored user engagement—including tens of thousands of comments—on 165 prominent government and media accounts. Exploiting the platform’s abrupt rollout of IP-based location tags on April 28, 2022, we compare user behavior in comment sections before and after the policy change. Although the policy was publicly justified as a measure to curb misinformation and counter foreign influence, we find no decline in participation by overseas users. Instead, it significantly reduced domestic engagement with local issues outside users’ home provinces, particularly among critical comments. Evidence suggests this effect was not driven by generalized fear or concerns about credibility, but by a rise in regionally discriminatory replies that increased the social cost of cross-provincial engagement. Our findings indicate that identity disclosure tools can produce unintended consequences by activating existing social divisions in ways that reinforce state control without direct censorship.

arXiv

A Response to Recent Critiques of Hainmueller, Mummolo and Xu (2019) on Estimating Conditional Relationships

with Jens Hainmueller, Jiehan Liu, Ziyi Liu, and Jonathan Mummolo.

Simonsohn (2024a) and Simonsohn (2024b) critique Hainmueller, Mummolo and Xu (2019, HMX), arguing that failing to model nonlinear relationships between the treatment and moderator leads to biased marginal effect estimates and uncontrolled Type-I error rates. While these critiques highlight the issue of under-modeling nonlinearity in applied research, they are fundamentally flawed in several key ways. First, the causal estimand for interaction effects and the necessary identifying assumptions are not clearly defined in these critiques. Once properly stated, the critiques no longer hold. Second, the kernel estimator HMX proposes recovers the true causal effects in the scenarios presented in these recent critiques, which compared effects to the wrong benchmark, producing misleading conclusions. Third, while Generalized Additive Models (GAM) can be a useful exploratory tool (as acknowledged in HMX), they are not designed to estimate marginal effects, and better alternatives exist, particularly in the presence of additional covariates. Our response aims to clarify these misconceptions and provide updated recommendations for researchers studying interaction effects through the estimation of conditional marginal effects.

How Discrimination Increases Chinese Overseas Students’ Support for Authoritarian Rule

with Yingjie Fan, Jennifer Pan and Zijie Shao.

The cross-border flow of people for educational exchange in Western democracies is seen as a way to transfer democratic values to non-democratic regions of the world. What happens when students studying in the West encounter discrimination? Based on an experiment among hundreds of Chinese first-year undergraduates in the United States, we show that discrimination interferes with the transfer of democratic values. Chinese students who study in the United States are more predisposed to favor liberal democracy than their peers in China. However, anti-Chinese discrimination significantly reduces their belief that political reform is desirable for China and increases their support for authoritarian rule. These effects of discrimination are most pronounced among students who are more likely to reject Chinese nationalism. Encountering non-racist criticisms of the Chinese government does not increase support for authoritarianism. Our results are not explained by relative evaluations of US and Chinese government handling of COVID-19.

Media Coverage: Newsweek, Politico, Monkey Cage, South China Morning Post, The Economist

Pre-Print (SSRN)

Trajectory Balancing: A General Reweighting Approach to Causal Inference with Time-Series Cross-Sectional Data

with Chad Hazlett.

We introduce trajectory balancing, a general reweighting approach to causal inference with time-series cross-sectional (TSCS) data. We focus on settings in which one or more units are exposed to treatment at a given time, while a set of control units remain untreated throughout a time window of interest. First, we show that many commonly used TSCS methods imply an assumption that a unit's non-treatment potential outcomes in the post-treatment period are linear in that unit's pre-treatment outcomes as well as time-invariant covariates. Under this assumption, we introduce the mean balancing method that reweights the control units such that the averages of the pre-treatment outcomes and covariates are approximately equal between the treatment and (reweighted) control groups. Second, we relax the linearity assumption and propose the kernel balancing method that seeks an approximate balance on a kernel-based feature expansion of the pre-treatment outcomes and covariates. The resulting approach inherits the property of handling time-varying confounders as in synthetic control and latent factor models, but has the advantages of: (1) improving feasibility and stability with reduced user discretion compared to existing approaches; (2) accommodating both short and long pre-treatment time periods with many or few treated units; and (3) achieving balance on the high-order "trajectory" of pre-treatment outcomes rather than their simple average at each time period. We illustrate this method with simulations and two empirical examples.

Rotating to the Top: How Career Tracks Matter in the Chinese Communist Party

with Ruixue Jia.

This paper takes a novel perspective on the selection of leaders in the Chinese Communist Party (CCP) by focusing on career tracks of high-level CCP officials. Career tracks are defined as clusters of similar career trajectories with respect to both vertical and horizontal movements. We measure career tracks of full and alternate CCP Central Committee members from 1982 to 2017 using machine learning techniques and investigate their role in political selection. We find that the career tracks corresponding to high starting positions and frequent rotations are associated with significantly higher probabilities of obtaining top leadership positions in the CCP despite the influence of patronage networks. Moreover, when comparing the relative importance of career tracks and personal connections over time, we find suggestive evidence that the Chinese political system has become more personalistic in the past few years.

Pre-Print (SSRN)

Peer Reviewed Articles

A Practical Guide to Estimating Conditional Marginal Effects: Modern Approaches

with Jiehan Liu and Ziyi Liu. Forthcoming, Elements in Quantitative and Computational Methods for the Social Sciences, Cambridge University Press.

This Element offers a practical guide to estimating conditional marginal effects—how treatment effects vary with a moderating variable—using modern statistical methods. Commonly used approaches, such as linear interaction models, often suffer from unclarified estimands, limited overlap, and restrictive functional forms. This guide begins by clearly defining the estimand and presenting the main identification results. It then reviews and improves upon existing solutions, such as the semiparametric kernel estimator, and introduces robust estimation strategies, including augmented inverse propensity score weighting with Lasso selection (AIPW-Lasso) and double machine learning (DML) with modern algorithms. Each method is evaluated through simulations and empirical examples, with practical recommendations tailored to sample size and research context. All tools are implemented in the accompanying interflex package for R.

How Deceptive Online Networks Reached Millions in the US 2020 Elections

with Ruth Appel, Young Mie Kim, Jennifer Pan, and others. Nature Human Behaviour, 2026.

Deceptive online networks are coordinated efforts that use identity deception to pursue strategic political or financial goals. During the US 2020 elections, these networks reached at least 37 million Facebook and 3 million Instagram users, representing 15% and 2% of the platforms’ active US adult users, respectively. Only 3 networks out of 49—1 network with explicitly political aims and 2 that appeared to use politics as a lure for profit—were responsible for over 70% of users reached. Notably, accounts unaffiliated with the networks played an important role in facilitating this reach by resharing content the three networks produced. Deceptive networks, regardless of whether their goals were political or financial, reached users who were older, more conservative, more frequently exposed to content from untrustworthy sources, and spent more time on Facebook.

Post-Print

Factorial Difference-in-Differences

with Anqi Zhao and Peng Ding. Journal of the American Statistical Association, forthcoming.

We formulate factorial difference-in-differences (FDID), a research design that extends canonical difference-in-differences (DID) to settings in which an event affects all units. In many panel data applications, researchers exploit cross-sectional variation in a baseline factor alongside temporal variation in the event, but the corresponding estimand is often implicit and the justification for applying the DID estimator remains unclear. We frame FDID as a factorial design with two factors, the baseline factor G and the exposure level Z, and define effect modification and causal moderation as the associative and causal effects of G on the effect of Z, respectively. Under standard DID assumptions of no anticipation and parallel trends, the DID estimator identifies effect modification but not causal moderation. Identifying the latter requires an additional factorial parallel trends assumption, that is, mean independence between G and potential outcome trends. We extend the framework to conditionally valid assumptions and regression-based implementations, and further to repeated cross-sectional data and continuous G. We demonstrate the framework with an empirical application on the role of social capital in famine relief in China.
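
As a rough illustration of the contrast discussed in the abstract above, the sketch below computes the canonical DID estimator across two levels of a binary baseline factor G when every unit experiences the event. The function name and toy data are hypothetical, and the sketch omits covariates, inference, and the paper's extensions to continuous G. Per the abstract, this contrast identifies effect modification under no anticipation and parallel trends, while a causal moderation reading needs the additional factorial parallel trends assumption.

```python
from statistics import mean


def did_estimate(rows):
    """DID contrast from rows of (g, post, y), with g and post in {0, 1}.

    g    : baseline factor level of the unit
    post : 1 if the observation is after the event, 0 if before
    y    : observed outcome
    """
    cells = {(g, p): [] for g in (0, 1) for p in (0, 1)}
    for g, post, y in rows:
        cells[(g, post)].append(y)
    if any(not values for values in cells.values()):
        raise ValueError("each (g, post) cell needs at least one observation")
    # Before-after change within each level of the baseline factor.
    change_g1 = mean(cells[(1, 1)]) - mean(cells[(1, 0)])
    change_g0 = mean(cells[(0, 1)]) - mean(cells[(0, 0)])
    return change_g1 - change_g0


# Toy panel: all units are exposed to the event in the post period.
toy_rows = [
    (1, 0, 2.0), (1, 0, 4.0), (1, 1, 7.0), (1, 1, 9.0),
    (0, 0, 1.0), (0, 0, 3.0), (0, 1, 3.0), (0, 1, 5.0),
]
estimate = did_estimate(toy_rows)
```

In the toy data the outcome rises by 5 on average for G = 1 units and by 2 for G = 0 units, so the DID contrast equals 3.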

Gauging Preference Stability under Authoritarianism

with Jennifer Pan, Research & Politics, 12(4), December 2025.

Do people living under authoritarianism exhibit stable, constrained preferences? Autocrats have incentives to suppress the formation of stable preferences structured by underlying constraints as such preferences can empower challengers and limit policy choices. However, research in political psychology suggests that such preferences may emerge through internal cognitive processes regardless of external conditions. We address this question by conducting three surveys, including two longitudinal studies, in China, a theoretically important case. We find that preferences related to political institutions, economic policies, nationalistic policies, traditional social values, and ethnic policies exhibit relatively high levels of intertemporal stability over month-long and year-long periods, comparable to patterns observed in competitive electoral democracies. Moreover, individuals with higher levels of political knowledge and education exhibit more stable preferences. These findings suggest that, despite autocratic efforts to suppress stable and constrained preferences, such preferences can still take shape. We also offer practical recommendations for measuring preference configuration in authoritarian contexts.

Causal Panel Analysis under Parallel Trends: Lessons from A Large Reanalysis Study

with Albert Chiu, Xingchen Lan, and Ziyi Liu. American Political Science Review, Vol. 120, Iss. 1, February 2026, pp. 245–266.

Two-way fixed effects (TWFE) models are widely used in political science to establish causality, but recent methodological discussions highlight their limitations under heterogeneous treatment effects (HTE) and violations of the parallel trends (PT) assumption. This growing literature has introduced numerous new estimators and procedures, causing confusion among researchers about the reliability of existing results and best practices. To address these concerns, we replicated and reanalyzed 49 studies from leading journals using TWFE models for observational panel data with binary treatments. Using six HTE-robust estimators, diagnostic tests, and sensitivity analyses, we find: (i) HTE-robust estimators yield qualitatively similar but highly variable results; (ii) while a few studies show clear signs of PT violations, many lack evidence to support this assumption; and (iii) many studies are underpowered when accounting for HTE and potential PT violations. We emphasize the importance of strong research designs and rigorous validation of key identifying assumptions. (Please see the Erratum, which addresses a typesetting error in the published article.)

Decentralized Propaganda in the Era of Digital Media: The Massive Presence of the Chinese State on Douyin

with Yingdan Lu, Jennifer Pan and Xu Xu. American Journal of Political Science, forthcoming.

The rise of social media in the digital era poses unprecedented challenges to authoritarian regimes that aim to influence public attitudes and behaviors. In this paper, we argue that authoritarian regimes have adopted a decentralized approach to producing and disseminating propaganda on social media. In this model, tens of thousands of government workers and insiders are mobilized to produce and disseminate propaganda, and content flows in a multi-directional, rather than a top-down manner. We empirically demonstrate the existence of this new model in China by creating a novel dataset of over five million videos from over 18,000 regime-affiliated accounts on Douyin, the Chinese branding for TikTok. This paper supplements prevailing understandings of propaganda by showing theoretically and empirically how digital technologies are changing not only the content of propaganda, but also the way in which propaganda materials are produced and disseminated.

Disguised Repression: Targeting Opponents with Non-Political Crimes to Undermine Dissent

with Jennifer Pan and Xu Xu. The Journal of Politics, Vol. 88, No. 1, January 2026, pp. 282–298.

Why do authoritarian regimes charge political opponents with non-political crimes when they can levy charges directly related to opponents' political activism? We argue that doing so disguises political repression and undermines the moral authority of opponents, minimizing backlash and mobilization. To test this argument, we conduct a survey experiment, which shows that disguised repression decreases perceptions of dissidents' morality, decreases people's willingness to engage in dissent on behalf of the dissident, and increases support for repression of the dissident. We then assess the external validity of the argument by analyzing millions of Chinese social media posts made before and after a large crackdown of vocal government critics in China in 2013. We find that individuals with larger online followings are more likely to be charged with non-political crimes, and those charged with non-political crimes are less likely to receive public sympathy and support.

Does Legality Produce Political Legitimacy? An Experimental Approach

with Yiqin Fu and Taisu Zhang. The Journal of Legal Studies, Vol. 54, Iss. 2, June 2025, pp. 257-285.

This article studies whether pure legality, stripped of normative components that are central to the rule of law, can convey perceived legitimacy to governmental institutions and activity. Through a survey experiment conducted among urban Chinese residents, it examines whether such conveyance is possible under current sociopolitical conditions in which the party-state continues to invest in pure legality without imposing legal checks on the party leadership’s political power and without corresponding investment in substantive rights or freedoms. Among survey respondents, government investment in professional and consistent law enforcement conveys meaningful amounts of political legitimacy. In fact, it does so even when it supports government activity, such as censorship of online speech, that is freedom-depriving and socially controversial, and even when such investment does not necessarily enhance the external predictability of state behavior. However, the legitimacy-enhancing effects of pure legality are likely weaker than those of state investment in procedural justice.

How Much Should We Trust Instrumental Variable Estimates in Political Science? Practical Advice based on 67 Replicated Studies

with Apoorva Lal, Mac Lockhart, and Ziwen Zu. Political Analysis, Vol. 32, Iss. 4, October 2024, pp. 521-540.

Instrumental variable (IV) strategies are widely used in political science to establish causal relationships, but the identifying assumptions required by an IV design are demanding, and assessing their validity remains challenging. In this paper, we replicate 67 articles published in three top political science journals from 2010-2022 and identify several concerning patterns. First, researchers often overestimate the strength of their instruments due to non-i.i.d. error structures such as clustering. Second, the commonly used t-test for two-stage-least-squares (2SLS) estimates frequently underestimates uncertainties, resulting in uncontrolled Type-I errors in many studies. Third, in most replicated studies, 2SLS estimates are significantly larger than ordinary-least-squares estimates, with their ratio negatively correlated with instrument strength in studies with non-experimentally generated instruments, suggesting potential violations of unconfoundedness or the exclusion restriction. We provide a checklist and software to help researchers avoid these pitfalls and improve their practice.

Panel Data Visualization in R (panelView) and Stata (panelview)

with Hongyu Mou and Licheng Liu. Journal of Statistical Software, Vol. 107 (2023), Iss. 7, pp. 1-20.

We develop an R package panelView and a Stata package panelview for panel data visualization. They are designed to assist causal analysis with panel data and have three main functionalities: (1) they plot the treatment status and missing values in a panel dataset; (2) they visualize the temporal dynamics of the main variables of interest; and (3) they depict the bivariate relationships between a treatment variable and an outcome variable either by unit or in aggregate. These tools can help researchers better understand their panel datasets before conducting statistical analysis.

The Power of History: How A Victimization Narrative Shapes National Identity and Public Opinion in China

with Jiannan Zhao. Research & Politics, 10(2), April 2023.

We study the effect of a victimization narrative on national identity and public opinion in China experimentally. Previous research has suggested that governments can shape public opinion by guiding citizens' collective memories of historical events, but few studies have established a clear causal link. By conducting an online survey experiment among approximately 2,000 urban Chinese citizens, we examine the causal impact of historical narratives on political attitudes. We find that, compared to control conditions, a narrative focusing on China’s humiliating past in the late Qing significantly reinforces respondents' attachment to the victim side of the Chinese national identity, raises suspicion of the intention of foreign governments in international disputes, stimulates preference for more hawkish foreign policies, and strengthens support for China's current political system. These effects are particularly strong among respondents without a college degree.

A Practical Guide to Counterfactual Estimators for Causal Inference with Time-Series Cross-Sectional Data

with Licheng Liu and Ye Wang. American Journal of Political Science, Vol. 68, Iss. 1, January 2024, pp. 160–176.

This paper introduces a unified framework of counterfactual estimation for time-series cross-sectional data, which estimates the average treatment effect on the treated by directly imputing treated counterfactuals. Examples include the fixed effects counterfactual estimator, interactive fixed effects counterfactual estimator, and matrix completion estimator. These estimators provide more reliable causal estimates than conventional two-way fixed effects models when treatment effects are heterogeneous or unobserved time-varying confounders exist. Under this framework, we propose a new dynamic treatment effects plot, as well as several diagnostic tests, to help researchers gauge the validity of the identifying assumptions. We illustrate these methods with two political economy examples and develop an open-source package, fect, in both R and Stata to facilitate implementation.
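
The imputation idea behind the simplest estimator named above, the fixed effects counterfactual estimator, can be sketched as follows. This is a toy illustration and not the fect package: the function name and data are hypothetical, the additive model is fit by simple alternating averaging on untreated observations, and no covariates, diagnostics, or uncertainty estimates are included. It assumes every unit and every period has at least one untreated observation.

```python
def fe_counterfactual_att(y, d, n_iter=200):
    """ATT from imputing Y(0) = alpha_i + xi_t fitted on untreated cells only.

    y[i][t] : outcome of unit i in period t
    d[i][t] : 1 if unit i is under treatment in period t, else 0
    """
    n_units, n_periods = len(y), len(y[0])
    alpha = [0.0] * n_units      # unit fixed effects
    xi = [0.0] * n_periods       # period fixed effects
    for _ in range(n_iter):
        # Update unit effects using only untreated observations.
        for i in range(n_units):
            res = [y[i][t] - xi[t] for t in range(n_periods) if d[i][t] == 0]
            alpha[i] = sum(res) / len(res)
        # Update period effects using only untreated observations.
        for t in range(n_periods):
            res = [y[i][t] - alpha[i] for i in range(n_units) if d[i][t] == 0]
            xi[t] = sum(res) / len(res)
    # Impute treated counterfactuals and average the individual effects.
    effects = [
        y[i][t] - (alpha[i] + xi[t])
        for i in range(n_units)
        for t in range(n_periods)
        if d[i][t] == 1
    ]
    return sum(effects) / len(effects)


# Noise-free toy panel: 4 units, 4 periods; units 2 and 3 are treated from
# period 2 onward with a constant effect of 2.0.
true_effect = 2.0
d = [[1 if (i >= 2 and t >= 2) else 0 for t in range(4)] for i in range(4)]
y = [[float(i + t) + true_effect * d[i][t] for t in range(4)] for i in range(4)]
att = fe_counterfactual_att(y, d)
```

Because the toy outcomes follow the additive model exactly, the imputed counterfactuals are exact and the estimated ATT recovers the built-in effect of 2.0 up to numerical tolerance.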

Hierarchically Regularized Entropy Balancing

with Eddie Yang. Political Analysis, Vol. 31, Iss. 3, July 2023, pp. 457-464.

We introduce hierarchically regularized entropy balancing as an extension to entropy balancing, a reweighting method that adjusts weights for control group units to achieve covariate balance in observational studies with binary treatments. Our proposed extension expands the feature space by including higher-order terms (and interactions) of covariates and then achieves approximate balance on the expanded features. To prioritize balance on variables important for treatment assignment and prevent overfitting, the method imposes ridge penalties with a hierarchical structure on the higher-order terms. Compared with entropy balancing, this extension relaxes model dependency and improves robustness of causal estimates while avoiding optimization failure and highly concentrated weights. It prevents specification searches by minimizing user discretion in selecting features to balance on. Our method is also computationally more efficient than kernel balancing, a kernel-based covariate balancing method. We demonstrate its performance through simulations and an empirical example.
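
For orientation, the sketch below shows plain entropy balancing, the method that the abstract above extends, on a single covariate. It is not the hierarchically regularized version: there is no feature expansion and no ridge penalty. The function name and toy numbers are hypothetical. With uniform base weights and one moment constraint, the entropy balancing weights take an exponential tilting form, and the tilting parameter is found here by bisection.

```python
import math


def entropy_balance_1d(x_control, target_mean, tol=1e-10, max_iter=200):
    """Weights on control units so their weighted mean of x hits target_mean."""
    if not min(x_control) < target_mean < max(x_control):
        raise ValueError("target mean must lie strictly inside the control range")

    def weights(lam):
        # Exponential tilting; subtract the max exponent for numerical stability.
        exps = [lam * x for x in x_control]
        top = max(exps)
        raw = [math.exp(e - top) for e in exps]
        total = sum(raw)
        return [r / total for r in raw]

    def weighted_mean(lam):
        return sum(w * x for w, x in zip(weights(lam), x_control))

    # The weighted mean is increasing in lam, so bracket the root and bisect.
    lam_lo, lam_hi = -1.0, 1.0
    while weighted_mean(lam_lo) > target_mean:
        lam_lo *= 2.0
    while weighted_mean(lam_hi) < target_mean:
        lam_hi *= 2.0
    for _ in range(max_iter):
        mid = (lam_lo + lam_hi) / 2.0
        if weighted_mean(mid) < target_mean:
            lam_lo = mid
        else:
            lam_hi = mid
        if lam_hi - lam_lo < tol:
            break
    return weights((lam_lo + lam_hi) / 2.0)


# Toy example: control covariate values and an assumed treated-group mean.
x_control = [1.0, 2.0, 3.0, 4.0, 5.0]
treated_mean = 4.0
w = entropy_balance_1d(x_control, treated_mean)
balanced_mean = sum(wi * xi for wi, xi in zip(w, x_control))
```

The resulting weights are positive, sum to one, and shift the control mean from 3.0 to the treated mean of 4.0. The highly concentrated weights that can arise in harder cases are one of the problems the abstract says the hierarchical extension is designed to avoid.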

Bayesian Rule Set: A Quantitative Alternative to Qualitative Comparative Analysis

with Albert Chiu. The Journal of Politics, Vol. 85, Iss. 1, January 2023, pp. 280-295.

We introduce Bayesian Rule Set (BRS) as an alternative to Qualitative Comparative Analysis (QCA) when data are large and noisy. BRS is an interpretable machine learning algorithm that classifies observations using rule sets, which are conditions connected by logical operators, e.g., IF (condition A AND condition B) OR (condition C), THEN Y = TRUE. Like QCA, BRS is highly interpretable and capable of revealing complex nonlinear relationships in data. It also has several advantages over QCA: It is compatible with probabilistically generated data; it avoids overfitting and improves interpretability by making direct trade-offs between in-sample fitness and complexity; and it remains computationally efficient with many covariates. Our contributions are threefold: We modify the BRS algorithm to facilitate its usage in the social sciences, propose methods to quantify uncertainties of rule sets, and develop graphical tools for presenting rule sets. We illustrate these methods with two empirical examples from political science.
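To make the rule-set form concrete, the following minimal Python sketch evaluates a rule set written as an OR of AND-ed conditions. The names (`rule_set_predict`, `condition_A`, and so on) are illustrative assumptions, not part of any BRS software, and the sketch only applies a given rule set; it does not perform the Bayesian search that learns one from data.

```python
from typing import Dict, List

# A rule is a list of condition names that must all hold (logical AND).
# A rule set is a list of rules, any one of which suffices (logical OR).
Rule = List[str]
RuleSet = List[Rule]


def rule_set_predict(observation: Dict[str, bool], rule_set: RuleSet) -> bool:
    """Return True if the observation satisfies at least one rule."""
    return any(all(observation[cond] for cond in rule) for rule in rule_set)


# IF (condition A AND condition B) OR (condition C), THEN Y = TRUE
example_rule_set: RuleSet = [["condition_A", "condition_B"], ["condition_C"]]

obs_1 = {"condition_A": True, "condition_B": True, "condition_C": False}
obs_2 = {"condition_A": True, "condition_B": False, "condition_C": False}
obs_3 = {"condition_A": False, "condition_B": False, "condition_C": True}

for obs in (obs_1, obs_2, obs_3):
    print(rule_set_predict(obs, example_rule_set))
```

Here the first and third observations are classified as TRUE (through the first and second rule, respectively), while the second satisfies neither rule.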

How Government-Controlled Media Shifts Policy Attitudes through Framing

with Jennifer Pan and Zijie Shao. Political Science Research and Methods, Vol. 10, Iss. 2, April 2022, pp. 317-332.

Research shows that government-controlled media is an effective tool for authoritarian regimes to shape public opinion. Does government-controlled media remain effective when it is required to support changes in positions that autocrats take on issues? Existing theories do not provide a clear answer to this question, but we often observe authoritarian governments using government media to frame policies in new ways when significant changes in policy positions are required. By conducting an experiment that exposes respondents to government-controlled media, in the form of TV news segments, on issues where the regime substantially changed its policy positions, we find that by framing the same issue differently, government-controlled media moves respondents to adopt policy positions closer to the ones espoused by the regime regardless of individual predisposition. This result holds for domestic and foreign policy issues, for direct and composite measures of attitudes, and persists up to 48 hours after exposure.

A Bayesian Alternative to Synthetic Control for Comparative Case Studies

with Xun Pang and Licheng Liu. Political Analysis, Vol. 30, Iss. 2, April 2022, pp. 269-288.

This paper proposes a Bayesian alternative to the synthetic control method for comparative case studies with a single or multiple treated units. We adopt a Bayesian posterior predictive approach to Rubin's causal model, which allows researchers to make inferences about both individual and average treatment effects on treated observations based on the empirical posterior distributions of their counterfactuals. The prediction model we develop is a dynamic multilevel model with a latent factor term to correct biases induced by unit-specific time trends. It also considers heterogeneous and dynamic relationships between covariates and the outcome, thus improving precision of the causal estimates. To reduce model dependency, we adopt a Bayesian shrinkage method for model searching and factor selection. Monte Carlo exercises demonstrate that our method produces more precise causal estimates than existing approaches and achieves correct frequentist coverage rates even when sample sizes are small and rich heterogeneities are present in data. We illustrate the method with two empirical examples from political economy.

Public Sentiment on Chinese Social Media during the Emergence of COVID-19

with Yingdan Lu and Jennifer Pan. Journal of Quantitative Description: Digital Media, Vol. 1, 2021, pp. 1-31.

When COVID-19 first emerged in China, there was speculation that the outbreak would trigger public anger and weaken the Chinese regime. By analyzing millions of social media posts from Sina Weibo made between December 2019 and February 2020, we describe the content and sentiment of public, online discussions pertaining to COVID-19 in China. We find that discussions of COVID-19 became widespread on January 20, 2020, consisting primarily of personal reflections, opinion, updates, and appeals. We find that the largest bursts of discussion coincide with the January 23 lockdown of Wuhan and the February 7 death of Dr. Li Wenliang and contain simultaneous spikes of criticism and support targeting the Chinese government. Criticisms are directed at the government for perceived lack of action, incompetence, and wrongdoing—in particular, censoring information relevant to public welfare. Support is directed at the government for aggressive action and positive outcomes. As the crisis unfolds, the same events are interpreted differently by different people, with those who criticize focusing on the government’s shortcomings and those who praise focusing on the government’s actions.

How Much Should We Trust Estimates from Multiplicative Interaction Models? Simple Tools to Improve Empirical Practice

with Jens Hainmueller and Jonathan Mummolo. Political Analysis, Vol. 27, Iss. 2, April 2019, pp. 163-192.

Multiplicative interaction models are widely used in social science to examine whether the relationship between an outcome and an independent variable changes with a moderating variable. Current empirical practice tends to overlook two important problems. First, these models assume a linear interaction effect that changes at a constant rate with the moderator. Second, estimates of the conditional effects of the independent variable can be misleading if there is a lack of common support of the moderator. Replicating 46 interaction effects from 22 recent publications in five top political science journals, we find that these core assumptions often fail in practice, suggesting that a large portion of findings across all political science subfields based on interaction models are modeling artifacts or are at best highly model dependent. We propose a checklist of simple diagnostics to assess the validity of these assumptions and offer flexible estimation strategies that allow for nonlinear interaction effects and safeguard against excessive extrapolation. These statistical routines are available in both R and Stata. -- Awarded the 2020 Miller Prize for "the best work appearing in Political Analysis in 2019."
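For concreteness, the standard multiplicative interaction model and the conditional effect it implies can be written as follows. The notation (outcome Y, independent variable D, moderator X) is assumed here for illustration only.

```latex
Y = \mu + \alpha D + \eta X + \beta (D \cdot X) + \epsilon,
\qquad
\frac{\partial Y}{\partial D} = \alpha + \beta X .
```

Under this specification the effect of D is forced to change linearly in X at the constant rate β, which is the first of the two assumptions described above.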

Awakening Leviathan: the Effect of Democracy on State Capacity, 1960-2009

with Erik H. Wang. Research & Politics, 5(2), April-June 2018.

Although researchers have often considered democracy and state capacity to be key predictors of cross-national variations in human welfare, few have investigated the relationship between the two variables themselves. We argue that democratization may have a positive, causal effect on state capacity. Employing a time-series cross-national dataset from 1960 to 2009, we document that democratization leads to a substantial increase in state capacity in the long run. Our results prove robust to a rich set of potential confounders and alternative coding of key variables. To further address the problem of endogeneity, we use an instrumental variable strategy that exploits exogenous variations in regional democratic diffusions. We also provide suggestive evidence that democratization enhances state capacity through increasing political contestation. -- Awarded the 2015 Malcolm Jewell Award for the best graduate student paper presented at the SPSA annual meeting.

Outspoken Insiders: Political Connections and Citizen Participation in Authoritarian China

with Lily L. Tsai. Political Behavior, Vol. 40, Iss. 3, September 2018, pp. 629–657.

Given widespread perceptions of risk and uncertainty in nondemocratic systems and developing democracies, why do some citizens still take action and make complaints to authorities? The resource mobilization model identifies the importance of time, money, and civic skills as resources that are necessary for participation. In this paper we build on this model and argue that political connections – close personal ties to someone working in government – can also constitute a critical resource, especially in contexts with weak democratic institutions. Using data from both urban and rural China, we find that individuals with political connections are more likely to contact authorities with complaints about government public services, despite the fact that they do not have higher levels of dissatisfaction with public service provision. We conduct various robustness checks, including a sensitivity analysis, and show that this relationship is unlikely to be driven by an incorrect model specification or unobserved confounding variables.

China’s Ideological Spectrum

with Jennifer Pan. The Journal of Politics, Vol. 80, No. 1, January 2018, pp. 254-273.

The study of ideology in authoritarian regimes, that is, of how public preferences are configured and constrained, has received relatively little scholarly attention. Using data from a large-scale online survey, we study ideology in China. We find that public preferences are weakly constrained, and the configuration of preferences is multi-dimensional, but the latent traits of these dimensions are highly correlated. Those who prefer authoritarian rule are more likely to support nationalism, state intervention in the economy, and traditional social values; those who prefer democratic institutions and values are less likely to be nationalistic or support traditional social values but more likely to support market reforms. This latter set of preferences appears more in provinces with higher levels of development and among wealthier and better educated respondents. These findings suggest preferences are not simply split along a pro-regime or anti-regime cleavage, and indicate a possible link between China's economic reform and societal cleavages. -- SSRN Top Papers of 2015; Media Coverage: NYT, WSJ, FP, ChinaFile.

Incremental Democracy: The Policy Effects of the Partisan Composition of State Government

with Devin Caughey and Chris Warshaw. The Journal of Politics, Vol. 79, No. 4, October 2017, pp. 1342-1358.

How much does it matter which party controls the government? On one hand, campaign positions and roll-call records suggest that contemporary American parties are very ideologically polarized. On the other hand, the existing evidence that electing Democrats into office causes the adoption of more liberal policies is surprisingly weak. We bring clarity to this debate with the aid of a new measure of the policy liberalism of each state in each year from 1936 to 2014, using regression-discontinuity and dynamic panel analyses to estimate the policy effects of the partisan composition of state legislatures and governorships. We find that until the 1980s, partisan control of state government had negligible effects on the liberalism of state policies, but that since then partisan effects have grown markedly. Even today, however, the policy effects of partisan composition remain small relative to differences between states: less than one-tenth of the cross-sectional standard deviation of state policy liberalism. This suggests that campaign positions and roll-call records may overstate the policy effects of partisan selection relative to other factors, such as public opinion.

Why Do Authoritarian Regimes Allow Citizens to Voice Opinions Publicly?

with Jidong Chen. The Journal of Politics, Vol. 79, No. 3, July 2017, pp. 792-803.

Why would an authoritarian regime allow citizens to voice opinions publicly if the exchange of information among citizens spurs social instability as has been often alleged? In this paper, we develop a game theoretic model and show that an authoritarian regime can strengthen its rule by allowing citizens to communicate with each other publicly. From the government’s perspective, such communication has two interrelated functions. First, if public communication reveals a shared feeling of dissatisfaction towards government policies among the citizens, the government will detect the danger and improve policies accordingly. Second, and perhaps more interestingly, public communication disorganizes the citizens if they find themselves in disagreement over the policies. We show that the government allows public communication if and only if it perceives sufficient heterogeneity in preferences among the citizens. The model also illustrates that public communication could serve as a commitment device ensuring government responsiveness when it faces high dissatisfaction, which in turn makes the government better off than with private polling.

Generalized Synthetic Control Method: Causal Inference with Interactive Fixed Effects Models

Political Analysis, Vol. 25, Iss. 1, January 2017, pp. 57-76.

Difference-in-differences (DID) is commonly used for causal inference in time-series cross-sectional data. It requires the assumption that the average outcomes of treated and control units would have followed parallel paths in the absence of treatment. In this paper, I propose a method that not only relaxes this often-violated assumption, but also unifies the synthetic control method (Abadie, Diamond and Hainmueller 2010) with linear fixed effect models under a simple framework, of which DID is a special case. It imputes counterfactuals for each treated unit in post-treatment periods using control group information based on a linear interactive fixed effect model that incorporates unit-specific intercepts interacted with time-varying coefficients. This method has several advantages. First, it allows the treatment to be correlated with unobserved unit and time heterogeneities under reasonable modelling assumptions. Second, it generalizes the synthetic control method to the case of multiple treated units and variable treatment periods, and improves efficiency and interpretability. Third, with a built-in cross-validation procedure, it avoids specification searches and thus is transparent and easy to implement. An empirical example of Election Day Registration and voter turnout in the United States is provided. -- Awarded the 2014 John T. Williams Dissertation Prize for the best dissertation proposal in political methodology; the 2017 Political Analysis Editors' Choice; the 2018 Miller Prize for "the best work appearing in Political Analysis in 2017."
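As a sketch, in notation assumed here for illustration, a linear interactive fixed effects model of this kind can be written as

```latex
Y_{it} = \delta_{it} D_{it} + x_{it}'\beta + \lambda_i' f_t + \varepsilon_{it},
```

where D_it is the treatment indicator, δ_it the treatment effect, x_it observed covariates, f_t a vector of time-varying latent factors, and λ_i the unit-specific loadings. Setting f_t = (1, ξ_t)' and λ_i = (α_i, 1)' gives λ_i' f_t = α_i + ξ_t, the additive unit and time effects of the two-way fixed effects (DID) setup, which is the sense in which DID is a special case.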

Information Manipulation and Reform in Authoritarian Regimes

with Jidong Chen. Political Science Research and Methods, Vol. 5, Iss. 1, January 2017, pp. 163-178.

We develop a theory of how an authoritarian regime interactively uses information manipulation, such as propaganda or censorship, and policy improvement to maintain social stability. The government can depict the status quo policy as more popularly supported than it actually is, while at the same time pleasing citizens directly by enacting a costly reform. We show that the government's ability to make policy concessions reduces its incentive to manipulate information and improves its credibility. Anticipating a higher chance of policy concessions and less information manipulation, citizens are more likely to believe the government-provided information and support the regime. Our model provides an explanation for the puzzling fact that reform coexists with selective information disclosure in authoritarian countries like China.

Sources of Authoritarian Responsiveness: A Field Experiment in China

with Jidong Chen and Jennifer Pan. American Journal of Political Science, Vol. 60, Iss. 2, April 2016, pp. 383-400.

Scholars have established that authoritarian regimes exhibit responsiveness to citizens, but our knowledge of why autocrats respond remains limited. We theorize that responsiveness may stem from rules of the institutionalized party regime, citizen engagement, and a strategy of preferential treatment of a narrow group of supporters. We test the implications of our theory using an online experiment among 2,103 Chinese counties. At baseline, we find that approximately one third of county-level governments are responsive to citizen demands expressed online. Threats of collective action and threats of tattling to upper levels of government cause county governments to be considerably more responsive. However, while threats of collective action cause local officials to be more publicly responsive, threats of tattling do not have this effect. We also find that identifying as loyal, long-standing members of the Communist Party does not increase responsiveness. -- Awarded AJPS Best Paper 2016; AJPS Top Cited Articles Virtual Issue, 2018.

Informal Institutions, Collective Action, and Public Goods Expenditure in Rural China

with Yang Yao. American Political Science Review, Vol. 109, No. 2, May 2015, pp. 371-391.

Do informal institutions (rules and norms created and enforced by social groups) promote good local governance in environments of weak democratic or bureaucratic institutions? This question is difficult to answer because of challenges in defining and measuring informal institutions and identifying their causal effects. In the paper, we investigate the effect of lineage groups, one of the most important vehicles of informal institutions in rural China, on local public goods expenditure. Using a panel dataset of 220 Chinese villages from 1986 to 2005, we find that village leaders from the two largest family clans increased local public investment considerably. This association is stronger when the clans appeared to be more cohesive. We also find that clans helped local leaders overcome the collective action problem of financing public goods, but there is little evidence suggesting that they held local leaders accountable.

ebalance: A Stata Package for Entropy Balancing

with Jens Hainmueller. Journal of Statistical Software, Vol. 54, Iss. 7, August 2013.

The Stata package ebalance implements entropy balancing, a multivariate reweighting method described in Hainmueller (2011) that allows users to reweight a dataset such that the covariate distributions in the reweighted data satisfy a set of specified moment conditions. This can be useful to create balanced samples in observational studies with a binary treatment where the control group data can be reweighted to match the covariate moments in the treatment group. Entropy balancing can also be used to reweight a survey sample to known characteristics from a target population.
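The reweighting idea can be sketched in a few lines of Python for the simplest case: one covariate, one moment condition (the mean), and uniform base weights. Under those assumptions the entropy-balancing weights take the exponential-tilting form w_i ∝ exp(λ x_i), and λ can be found by Newton's method. The function name `entropy_balance_1d` and the numbers are hypothetical, and this is not the ebalance package's algorithm or interface.

```python
import math


def entropy_balance_1d(x_control, target_mean, max_iter=100, tol=1e-10):
    """Weights on control units that sum to 1 and match a target mean.

    Assumes uniform base weights and a single mean constraint, so the
    solution has the form w_i proportional to exp(lam * x_i).
    """
    if not min(x_control) < target_mean < max(x_control):
        raise ValueError("target mean must lie strictly inside the control range")

    lam = 0.0
    for _ in range(max_iter):
        # Normalized exponential weights (shifted by the max for stability).
        z = [lam * x for x in x_control]
        z_max = max(z)
        w = [math.exp(v - z_max) for v in z]
        total = sum(w)
        w = [v / total for v in w]

        # Gap between the weighted control mean and the target mean.
        mean = sum(wi * xi for wi, xi in zip(w, x_control))
        gap = mean - target_mean
        if abs(gap) < tol:
            return w

        # Newton step: the derivative of the weighted mean with respect
        # to lam is the weighted variance.
        var = sum(wi * (xi - mean) ** 2 for wi, xi in zip(w, x_control))
        lam -= gap / var

    raise RuntimeError("entropy balancing did not converge")


# Made-up example: reweight five control units so that their weighted
# covariate mean equals a treated-group mean of 3.8.
x_control = [1.0, 2.0, 3.0, 4.0, 5.0]
weights = entropy_balance_1d(x_control, target_mean=3.8)
weighted_mean = sum(w * x for w, x in zip(weights, x_control))
print([round(w, 4) for w in weights], round(weighted_mean, 6))
```

Control units with larger covariate values receive more weight here because the target mean lies above the unweighted control mean of 3.0. With several covariates, the same idea applies with a vector of multipliers in place of the scalar λ.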

A Practical Guide to Estimating Conditional Marginal Effects: Modern Approaches

with Jiehan Liu and Ziyi Liu. Forthcoming, Elements in Quantitative and Computational Methods for the Social Sciences, Cambridge University Press.

Factorial Difference-in-Differences

with Anqi Zhao and Peng Ding. Journal of the American Statistical Association, forthcoming.

Causal Panel Analysis under Parallel Trends: Lessons from A Large Reanalysis Study

with Albert Chiu, Xingchen Lan, and Ziyi Liu. American Political Science Review, Vol. 120, Iss. 1, February 2026, pp. 245–266.

How Much Should We Trust Instrumental Variable Estimates in Political Science? Practical Advice based on 67 Replicated Studies

with Apoorva Lal, Mac Lockhart, and Ziwen Zu. Political Analysis, Vol. 32, Iss. 4, October 2024, pp. 521-540.

Panel Data Visualization in R (panelView) and Stata (panelview)

with Hongyu Mou and Licheng Liu. Journal of Statistical Software, Vol. 107 (2023), Iss. 7, pp. 1-20.

A Practical Guide to Counterfactual Estimators for Causal Inference with Time-Series Cross-Sectional Data

with Licheng Liu and Ye Wang. American Journal of Political Science, Vol. 68, Iss. 1, January 2024, pp. 160–176.

Gauging Preference Stability under Authoritarianism

with Jennifer Pan. Research & Politics, 12(4), December 2025.

Decentralized Propaganda in the Era of Digital Media: The Massive Presence of the Chinese State on Douyin

with Yingdan Lu, Jennifer Pan and Xu Xu. American Journal of Political Science, forthcoming.

Disguised Repression: Targeting Opponents with Non-Political Crimes to Undermine Dissent

with Jennifer Pan and Xu Xu. The Journal of Politics, Vol. 88, No. 1, January 2026, pp. 282–298.

Does Legality Produce Political Legitimacy? An Experimental Approach

with Yiqin Fu and Taisu Zhang. The Journal of Legal Studies, Vol. 54, Iss. 2, June 2025, pp. 257-285.

The Power of History: How A Victimization Narrative Shapes National Identity and Public Opinion in China

with Jiannan Zhao. Research & Politics, 10(2), April 2023.

Clans and Calamity: How Social Capital Saved Lives during China’s Great Famine

with Jiarui Cao and Chuanchuan Zhang. Journal of Development Economics, Volume 157, June 2022.

This paper examines the role of social capital, embedded in kinship-based clans, in disaster relief during China's Great Famine (1958-1961). Using a county-year panel and a difference-in-differences strategy, we find that the rise in the mortality rate during the famine years is significantly less in counties with a higher clan density. Analysis using a nationally representative household survey corroborates this finding. Investigation of potential mechanisms suggests that social capital's impact on famine may have operated through enabling collective action against excessive government procurement. These results provide evidence that societal forces can ameliorate damages caused by faulty government policies in times of crisis.

How Deceptive Online Networks Reached Millions in the US 2020 Elections

with Ruth Appel, Young Mie Kim, Jennifer Pan, and others. Nature Human Behaviour, 2026.

Other Publications

Comparing Experimental and Nonexperimental Methods: What Lessons Have We Learned Four Decades After LaLonde (1986)?

with Guido Imbens. Journal of Economic Perspectives, Vol. 39, No. 4, pp. 173-202, Fall 2025.

In 1986, Robert LaLonde published an article comparing nonexperimental estimates to experimental benchmarks (LaLonde 1986). He concluded that the nonexperimental methods at the time could not systematically replicate experimental benchmarks, casting doubt on their credibility. Following LaLonde's critical assessment, there have been significant methodological advances and practical changes, including (i) an emphasis on the unconfoundedness assumption separated from functional form considerations, (ii) a focus on the importance of overlap in covariate distributions, (iii) the introduction of propensity score-based methods leading to doubly robust estimators, (iv) methods for estimating and exploiting treatment effect heterogeneity, and (v) a greater emphasis on validation exercises to bolster research credibility. To demonstrate the practical lessons from these advances, we reexamine the LaLonde data. We show that modern methods, when applied in contexts with sufficient covariate overlap, yield robust estimates for the adjusted differences between the treatment and control groups. However, this does not imply that these estimates are causally interpretable. To assess their credibility, validation exercises (such as placebo tests) are essential, whereas goodness-of-fit tests alone are inadequate. Our findings highlight the importance of closely examining the assignment process, carefully inspecting overlap, and conducting validation exercises when analyzing causal effects with nonexperimental data.

Causal Inference with Time-Series Cross-Sectional Data: A Reflection

Chapter 30, Oxford Handbook of Engaged Methodological Pluralism in Political Science (Vol 1)

This chapter surveys new developments in causal inference using time-series cross-sectional (TSCS) data. I start by clarifying two identification regimes for TSCS analysis: one under the strict exogeneity assumption and one under the sequential ignorability assumption. I then review the three methods most commonly used by political scientists: the difference-in-differences approach, two-way fixed effects models, and the synthetic control method. For each method, I examine its assumptions, explain its pros and cons, and discuss its extensions. I then introduce several new methods under strict exogeneity or sequential ignorability, including the factor-augmented approach, panel matching, and marginal structural models. I conclude by pointing to several directions for future research.

Introduction to the Virtual Issue: Panel Data Analysis and Regression Discontinuity

Political Analysis.

Over the past ten years, the “causal inference revolution” has dramatically changed the landscape of political science and social sciences in general. Two of the most commonly used groups of methods for causal inference with observational data are (1) methods related to panel data or time-series-cross-section (TSCS) data and (2) regression discontinuity (RD) designs. Articles in this virtual issue represent the efforts of political methodologists either to develop more reliable and versatile approaches to panel data analysis and RD designs or to better understand the advantages and disadvantages of existing common practices in recent years.