Yiqing Xu

Assistant Professor at
Department of Political Science
Stanford University

Welcome!

I am an Assistant Professor in the Department of Political Science at Stanford University. I work in political methodology and comparative politics, with a focus on China.

During the 2024-25 academic year, I am a W. Glenn Campbell and Rita Ricardo-Campbell National Fellow at the Hoover Institution, Stanford University.

I received a PhD in Political Science from the Massachusetts Institute of Technology (MIT) in 2016, an MA in Economics from the National School of Development (NSD) at Peking University in 2010, and a BA in Economics from Fudan University in 2007. I taught at the University of California, San Diego (UCSD) from July 2016 to September 2019.

I am an associate director of the Stanford Causal Science Center (SC2) and a faculty affiliate of the Stanford Center on China’s Economy and Institutions (SCCEI), the Stanford Center for Open and REproducible Science (CORES), the Stanford King Center on Global Development, the Stanford Center for East Asian Studies (CEAS), and the 21st Century China Center (21CCC) at UCSD.

My work has appeared in American Political Science Review, American Journal of Political Science, The Journal of Politics, Political Analysis, Political Science Research and Methods, and Journal of Development Economics, among other peer-reviewed journals.

I have won several professional awards:

- In 2024, I received the Emerging Scholar Award from the Society for Political Methodology and an honorable mention for the Becky Morton and Tom Carsey Excellence in Mentoring Award. Also in 2024, interflex, a set of R and Stata packages I developed with my team, won the Society’s Best Statistical Software Award.
- In 2018 and 2020, my work won the Miller Prize for the best work appearing in Political Analysis in the previous year.
- In 2017, my paper was named a Political Analysis Editors’ Choice.
- In 2016, a paper I coauthored won the annual Best Article Award from American Journal of Political Science.
- In 2014, I was awarded the John T. Williams Dissertation Prize by the Society for Political Methodology.

You can reach me via email: yiqingxu [at] stanford.edu.

Recent Articles.

A Practical Guide to Estimating Conditional Marginal Effects: Modern Approaches with Jiehan Liu and Ziyi Liu. Prepared for Elements in Quantitative and Computational Methods for the Social Sciences, Cambridge University Press.

This Element offers a practical guide to estimating conditional marginal effects—how treatment effects vary with a moderating variable—using modern statistical methods. Commonly used approaches, such as linear interaction models, often suffer from unclarified estimands, limited overlap, and restrictive functional forms. This guide begins by clearly defining the estimand and presenting the main identification results. It then reviews and improves upon existing solutions, such as the semiparametric kernel estimator, and introduces robust estimation strategies, including augmented inverse propensity score weighting with Lasso selection (AIPW-Lasso) and double machine learning (DML) with modern algorithms. Each method is evaluated through simulations and empirical examples, with practical recommendations tailored to sample size and research context. All tools are implemented in the accompanying interflex package for R.

Causal Panel Analysis under Parallel Trends: Lessons from A Large Reanalysis Study with Albert Chiu, Xingchen Lan, and Ziyi Liu. American Political Science Review, forthcoming.

Two-way fixed effects (TWFE) models are widely used in political science to establish causality, but recent methodological discussions highlight their limitations under heterogeneous treatment effects (HTE) and violations of the parallel trends (PT) assumption. This growing literature has introduced numerous new estimators and procedures, causing confusion among researchers about the reliability of existing results and best practices. To address these concerns, we replicated and reanalyzed 49 studies from leading journals using TWFE models for observational panel data with binary treatments. Using six HTE-robust estimators, diagnostic tests, and sensitivity analyses, we find: (i) HTE-robust estimators yield qualitatively similar but highly variable results; (ii) while a few studies show clear signs of PT violations, many lack evidence to support this assumption; and (iii) many studies are underpowered when accounting for HTE and potential PT violations. We emphasize the importance of strong research designs and rigorous validation of key identifying assumptions.
- Pre-Print
- arXiv
- SSRN
- Slides
- Tutorial
- Appendix (SM-A)
- Markdown Files (SM-B)

Decentralized Propaganda in the Era of Digital Media: The Massive Presence of the Chinese State on Douyin with Yingdan Lu, Jennifer Pan and Xu Xu. American Journal of Political Science, forthcoming.

The rise of social media in the digital era poses unprecedented challenges to authoritarian regimes that aim to influence public attitudes and behaviors. In this paper, we argue that authoritarian regimes have adopted a decentralized approach to producing and disseminating propaganda on social media. In this model, tens of thousands of government workers and insiders are mobilized to produce and disseminate propaganda, and content flows in a multi-directional, rather than a top-down manner. We empirically demonstrate the existence of this new model in China by creating a novel dataset of over five million videos from over 18,000 regime-affiliated accounts on Douyin, the Chinese branding for TikTok. This paper supplements prevailing understandings of propaganda by showing theoretically and empirically how digital technologies are changing not only the content of propaganda, but also the way in which propaganda materials are produced and disseminated.
- Pre-Print (SSRN)
- RMarkdown

Factorial Difference-in-Differences with Anqi Zhao and Peng Ding. Revise & Resubmit, Journal of the American Statistical Association.

In many social science applications, researchers use the difference-in-differences (DID) estimator to establish causal relationships, exploiting cross-sectional variation in a baseline factor and temporal variation in exposure to an event that presumably may affect all units. This approach, which we term factorial DID (FDID), differs from canonical DID in that it lacks a clean control group unexposed to the event after the event occurs. In this paper, we clarify FDID as a research design in terms of its data structure, feasible estimands, and identifying assumptions that allow the DID estimator to recover these estimands. We frame FDID as a factorial design with two factors: the baseline factor, denoted by G, and the exposure level to the event, denoted by Z, and define the effect modification and causal interaction as the associative and causal effects of G on the effect of Z, respectively. We show that under the canonical no anticipation and parallel trends assumptions, the DID estimator identifies only the effect modification of G in FDID, and propose an additional factorial parallel trends assumption to identify the causal interaction. Moreover, we show that the canonical DID research design can be reframed as a special case of the FDID research design with an additional exclusion restriction assumption, thereby reconciling the two approaches. We extend this framework to allow conditionally valid parallel trends assumptions and multiple time periods, and clarify assumptions required to justify regression analysis under FDID. We illustrate these findings with empirical examples from economics and political science, and provide recommendations for improving practice and interpretation under FDID.
- arXiv
- SSRN
- OCIS Talk
- Slides
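
To make the DID estimator in this abstract concrete, here is a minimal Python sketch, assuming a simple two-period setting with a binary baseline factor G. The function name, the data layout, and the toy numbers are illustrative assumptions and are not taken from the paper or its replication materials.

```python
from statistics import mean


def did_estimate(rows):
    """Difference-in-differences from a list of (g, post, y) records.

    g    : 1 or 0, level of the baseline factor G (hypothetical coding)
    post : 1 if observed after the event, 0 if before
    y    : outcome
    """
    def cell_mean(g, post):
        values = [y for (gi, pi, y) in rows if gi == g and pi == post]
        if not values:
            raise ValueError(f"no observations for g={g}, post={post}")
        return mean(values)

    # Before-after change within each level of G
    change_g1 = cell_mean(1, 1) - cell_mean(1, 0)
    change_g0 = cell_mean(0, 1) - cell_mean(0, 0)
    # DID: how much more the outcome changed for G = 1 than for G = 0
    return change_g1 - change_g0


# Toy data (made up): every unit is exposed to the event after it occurs
toy_rows = [
    (1, 0, 2.0), (1, 0, 4.0), (1, 1, 7.0), (1, 1, 9.0),
    (0, 0, 1.0), (0, 0, 3.0), (0, 1, 3.0), (0, 1, 5.0),
]
print(did_estimate(toy_rows))  # 3.0
```

On the toy data the before-after change is 5.0 for G = 1 and 2.0 for G = 0, so the sketch prints 3.0. Per the abstract, under the canonical no anticipation and parallel trends assumptions this quantity identifies only the effect modification of G; reading it as a causal interaction requires the additional factorial parallel trends assumption.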

LaLonde (1986) after Nearly Four Decades: Lessons Learned with Guido Imbens. Forthcoming, Journal of Economic Perspectives.

In 1986, Robert LaLonde published an article that compared nonexperimental estimates to experimental benchmarks (LaLonde 1986). He concluded that the nonexperimental methods at the time could not systematically replicate experimental benchmarks, casting doubt on the credibility of these methods. Following LaLonde’s critical assessment, there have been significant methodological advances and practical changes, including (i) an emphasis on estimators based on unconfoundedness, (ii) a focus on the importance of overlap in covariate distributions, (iii) the introduction of propensity score-based methods leading to doubly robust estimators, (iv) a greater emphasis on validation exercises to bolster research credibility, and (v) methods for estimating and exploiting treatment effect heterogeneity. To demonstrate the practical lessons from these advances, we reexamine the LaLonde data and the Imbens-Rubin-Sacerdote lottery data. We show that modern methods, when applied in contexts with significant covariate overlap, yield robust estimates for the adjusted differences between the treatment and control groups. However, this does not mean that these estimates are valid. To assess their credibility, validation exercises (such as placebo tests) are essential, whereas goodness of fit tests alone are inadequate. Our findings highlight the importance of closely examining the assignment process, carefully inspecting overlap, and conducting validation exercises when analyzing causal effects with nonexperimental data.

Disguised Repression: Targeting Opponents with Non-Political Crimes to Undermine Dissent with Jennifer Pan and Xu Xu. The Journal of Politics, forthcoming.

Why do authoritarian regimes charge political opponents with non-political crimes when they can levy charges directly related to opponents’ political activism? We argue that doing so disguises political repression and undermines the moral authority of opponents, minimizing backlash and mobilization. To test this argument, we conduct a survey experiment, which shows that disguised repression decreases perceptions of dissidents’ morality, decreases people’s willingness to engage in dissent on behalf of the dissident, and increases support for repression of the dissident. We then assess the external validity of the argument by analyzing millions of Chinese social media posts made before and after a large crackdown of vocal government critics in China in 2013. We find that individuals with larger online followings are more likely to be charged with non-political crimes, and those charged with non-political crimes are less likely to receive public sympathy and support.

How Much Should We Trust Instrumental Variable Estimates in Political Science? Practical Advice based on 67 Replicated Studies with Apoorva Lal, Mac Lockhart, and Ziwen Zu. Political Analysis, Vol. 32, Iss. 4, October 2024, pp. 521-540.

Instrumental variable (IV) strategies are widely used in political science to establish causal relationships, but the identifying assumptions required by an IV design are demanding, and assessing their validity remains challenging. In this paper, we replicate 67 articles published in three top political science journals from 2010-2022 and identify several concerning patterns. First, researchers often overestimate the strength of their instruments due to non-i.i.d. error structures such as clustering. Second, the commonly used t-test for two-stage-least-squares (2SLS) estimates frequently underestimates uncertainties, resulting in uncontrolled Type-I errors in many studies. Third, in most replicated studies, 2SLS estimates are significantly larger than ordinary-least-squares estimates, with their ratio negatively correlated with instrument strength in studies with non-experimentally generated instruments, suggesting potential violations of unconfoundedness or the exclusion restriction. We provide a checklist and software to help researchers avoid these pitfalls and improve their practice.
- Open Access
- Erratum
- Slides
- SM
- RMarkdown
- arXiv
- Dataverse
- Replication Files (118M)
- R Package
- Stata Code
- Stata Tutorial
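
The comparison between 2SLS and OLS estimates mentioned in this abstract can be sketched in a few lines of Python. This is only an illustration for the simplest case (one instrument, one endogenous regressor, no covariates), where the 2SLS estimate reduces to cov(z, y) / cov(z, x). The function names and toy data are assumptions made for this example; they are not the ivDiag API and are not drawn from the replicated studies.

```python
from statistics import mean


def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    a_bar, b_bar = mean(a), mean(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / (len(a) - 1)


def ols_slope(x, y):
    """OLS slope from regressing y on x (with an intercept)."""
    return covariance(x, y) / covariance(x, x)


def iv_slope(z, x, y):
    """Just-identified IV (2SLS) slope using a single instrument z."""
    first_stage_cov = covariance(z, x)
    if first_stage_cov == 0:
        raise ValueError("instrument is uncorrelated with the regressor")
    return covariance(z, y) / first_stage_cov


# Toy data (made up): z shifts x, and x is correlated with an unobserved
# error term that also enters y, so OLS and IV disagree.
z = [0, 0, 0, 0, 1, 1, 1, 1]
x = [1, 2, 1, 2, 2, 3, 2, 3]
y = [2.5, 3.5, 2.5, 3.5, 4.5, 5.5, 4.5, 5.5]

ols = ols_slope(x, y)    # approximately 1.5
iv = iv_slope(z, x, y)   # approximately 2.0
print(ols, iv, iv / ols)
```

A large gap between the two estimates, as in this toy example, is not by itself informative about which one is right. The abstract's point is that when the ratio grows as the instrument gets weaker, a violation of unconfoundedness or the exclusion restriction becomes a plausible explanation.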

See All Papers

Software.

ivDiag: Estimation and Diagnostics for IV Designs

ivDiag is a toolkit for estimation, diagnostics, and visualization with instrumental variable designs.

hbal: Hierarchically Regularized Entropy Balancing

hbal addresses the shortcomings of entropy balancing by hierarchically regularizing higher-order moment constraints of observed covariates.

- R
- Paper

fect: Fixed Effect Counterfactual Estimators

Counterfactual estimators for panel data with binary treatments address the weighting problem of fixed effects models and can potentally relax strict exogeneity.

tjbal: Trajectory Balancing

Using panel data with binary treatments, trajectory balancing draws causal inference by balancing on kernelized features from pretreatment periods.

- R
- Paper

interflex: Flexible Interaction Models

interflex conducts diagnostic tests and offers flexible estimation strategies for nonlinear interaction effects. It accommodates both continuous and discrete outcomes.

panelView: Visualizing Panel Data

panelView visualizes the treatment and missing-value status of observations in a panel dataset and plots variables of interest in a time-series fashion.

See All Software

Teaching.

Short Course on Causal Inference with Panel Data

This workshop series gives an overview of newly emerged causal inference methods using panel data (with dichotomous treatments). We start our discussion with a review of the difference-in-differences (DiD) method and conventional two-way fixed effects (2WFE) models. We then discuss the drawbacks of 2WFE models from a design-based perspective and clarify the two main identification regimes: one under the strict exogeneity (SE) assumption (or its variants) and one under the sequential ignorability (SI) assumption. In Lecture 2, we review the synthetic control method and discuss its extensions. In Lecture 3, we introduce the factor-augmented approach, including panel factor models, matrix completion methods, and Bayesian latent factor models. In Lecture 4, we take a different route and discuss matching and reweighting methods to achieve causal inference goals with panel data under the SE or SI assumptions. We also discuss hybrid methods that enjoy doubly robust properties.

Lecture 1. Difference-in-Differences and Fixed Effects Models
Lecture 2. Synthetic Control and Extensions
Lecture 3. Factor-Augmented Methods
Lecture 4. Matching/Balancing and Hybrid Methods
- YouTube
- Bilibili

POLI 450A. Political Methodology I

This is the first course in a four-course sequence on quantitative political methodology at Stanford Political Science. Political methodology is a growing subfield of political science that deals with the development and application of statistical methods to problems in political science and public policy. The subsequent courses in the sequence are 450B, 450C, and 450D. By the end of the sequence, students will be capable of understanding and confidently applying a variety of statistical methods and research designs that are essential for political science and public policy research.

This first course provides a graduate-level introduction to regression models, along with the basic principles of probability and statistics which are essential for understanding how regression works. Regression models are routinely used in political science, policy research, and other disciplines in social science. The principles learned in this course also provide a foundation for the general understanding of quantitative political methodology. If you ever want to collect quantitative data, analyze data, critically read an article that presents a data analysis, or think about the relationship between theory and the real world, then this course will be helpful for you.

You can only learn statistics by doing statistics. In recognition of this fact, the homework for this course will be extensive. In addition to the lectures and weekly homework assignments, there will be required and optional readings to enhance your understanding of the materials. You will find it helpful to read these not only once, but multiple times (before, during, and after the corresponding homework).
- Syllabus

POLI 150A. Data Science for Politics

Overview. Data science is quickly changing the way we understand and engage in politics, how we implement policy, and how organizations across the world make decisions. In this course, we will learn the fundamental tools of data science and apply them to a wide range of political and policy-oriented questions. How do we predict presidential elections? How can we guess who wrote each of the Federalist Papers? Do countries become less democratic when leaders are assassinated? These are just a few of the questions we will work on in the course.

Learning Goals. The course has three basic learning goals for students. At the end of this course, students should:
1. Be comfortable using basic features of the R programming language.
2. Be able to combine political data with statistical concepts to answer political questions.
3. Know how to create visual depictions of statistical patterns in data.
Learning Approach. Statistical and programming concepts do not lend themselves to the traditional lecture format, and in general, experimental research on teaching methods shows that combining active learning with lectures outperforms traditional lecturing. We will teach each concept in lectures using applied examples that encourage active learning. Lectures will be broken up into small modules; first, I will explain a concept, and then we will write code to implement the concept in practice. Students are asked to bring their laptops to class so that we can actively code during lectures. This will help students “learn by doing” and it will ensure that the transition from lecture to problem sets is smooth.
- Syllabus

See All Teaching
