Before applying the structural estimator to real data, this chapter runs the full workflow on a synthetic conjoint experiment with known ground truth. Because we control the data-generating process, every quantity the package produces can be checked against its analytically correct value.
2.1 Data-generating process
The package ships simdata, a synthetic conjoint with \(M = 1{,}000\) respondents, \(T = 6\) tasks per respondent, and two profiles per task (12,000 rows). Three binary attributes \(x_1, x_2, x_3\) are drawn independently as Bernoulli(0.5). Two continuous moderators \(z_1, z_2 \sim N(0,1)\) determine each respondent’s preference vector:
## True beta matrix stored as attribute
beta_true <- attr(simdata, "beta_true")
dim(beta_true)
[1] 1000 3
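For readers who want to see the shape of such a design without the package, the following is a minimal base-R sketch of a data-generating process with the same dimensions. The coefficients inside `beta` are hypothetical placeholders chosen only to match the qualitative description in this chapter (the first two preferences vary with the moderators, the third is constant at 0.3). They are not the values used to build simdata, and the column names are assumptions.

```r
set.seed(42)
M <- 1000; n_tasks <- 6; n_profiles <- 2

# Respondent-level moderators z1, z2 ~ N(0, 1)
Z <- cbind(z1 = rnorm(M), z2 = rnorm(M))

# Hypothetical preference map beta(Z); NOT the coefficients behind simdata
beta <- cbind(x1 = 0.5 + 0.4 * Z[, "z1"],
              x2 = -0.8 + 0.3 * Z[, "z2"],
              x3 = rep(0.3, M))

# One row per respondent x task x profile (profile varies fastest)
sim <- expand.grid(profile = seq_len(n_profiles),
                   task = seq_len(n_tasks),
                   id = seq_len(M))

# Three binary attributes, independent Bernoulli(0.5)
X <- matrix(rbinom(nrow(sim) * 3, 1, 0.5), ncol = 3,
            dimnames = list(NULL, c("x1", "x2", "x3")))
sim <- cbind(sim, X)

# Deterministic utility of each profile under the respondent's own beta
sim$utility <- rowSums(X * beta[sim$id, ])

# Logit choice on the utility difference within each task
is_a <- sim$profile == 1
pick_a <- rbinom(sum(is_a), 1,
                 plogis(sim$utility[is_a] - sim$utility[!is_a]))
sim$choice <- NA_integer_
sim$choice[is_a] <- pick_a
sim$choice[!is_a] <- 1L - pick_a

dim(sim)
```

Exactly one profile is chosen in every task, and the row count is 1,000 × 6 × 2 = 12,000, matching the design described above.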
Note: Why does scfit() require at least one Z column?
The DML estimator is defined in terms of a nuisance function \(\hat\beta(\mathbf Z)\) that maps respondent-level covariates into preference vectors. Without any moderators the mapping is trivial and the DML correction degenerates to a plain plug-in estimator. Users who truly want a homogeneous model should fit a standard glm(choice ~ deltaX, family = binomial) instead.
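The homogeneous alternative can be sketched in base R. Here deltaX is taken to be the attribute difference between the two profiles in a task, and choice indicates whether the first profile was chosen. Both the variable layout and the simulated coefficients are assumptions made for illustration.

```r
set.seed(1)
n <- 6000  # hypothetical number of choice tasks

# Attribute differences (profile A minus profile B), each in {-1, 0, 1}
deltaX <- matrix(rbinom(n * 3, 1, 0.5) - rbinom(n * 3, 1, 0.5), ncol = 3,
                 dimnames = list(NULL, c("x1", "x2", "x3")))

# Hypothetical homogeneous coefficients shared by all respondents
beta_pop <- c(0.5, -0.8, 0.3)

# 1 if profile A is chosen, with logit choice probabilities
choice <- rbinom(n, 1, plogis(drop(deltaX %*% beta_pop)))

# Plain binomial GLM, as suggested in the note above
fit_glm <- glm(choice ~ deltaX, family = binomial)
round(coef(fit_glm), 2)
```

The slope estimates should land close to `beta_pop`, and the intercept close to zero, since neither profile position is favored in this simulation.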
2.2 Fitting
With 1,000 respondents and 6 tasks each, the estimator has enough signal to recover both population-level and individual-level parameters cleanly.
Check two things: (1) summary(fit_sim) reports the DML/iid SE ratio. Values close to one mean the clustering correction is small, that is, the DNN’s per-respondent \(\hat\beta\) carries little additional within-respondent information beyond the population mean. (2) plot(fit_sim, "loss_trace") shows per-fold training loss curves, which should visibly level off before the last epoch. If they are still descending, increase n_epochs.
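To build intuition for why a ratio near one signals a small clustering correction, the sketch below compares a respondent-clustered standard error with an iid standard error for a simple sample mean. This is a generic illustration of the idea, not the formula that summary() uses. The function name `se_ratio` and the simulated scores are hypothetical.

```r
# Ratio of a cluster-robust SE to an iid SE for the mean of y
# (no small-sample corrections; illustration only)
se_ratio <- function(y, cluster) {
  n <- length(y)
  resid <- y - mean(y)
  se_iid <- sqrt(sum(resid^2)) / n
  # Sum residuals within each cluster before squaring
  se_cl <- sqrt(sum(tapply(resid, cluster, sum)^2)) / n
  unname(se_cl / se_iid)
}

set.seed(7)
id <- rep(1:1000, each = 12)                     # 12 rows per respondent
y_indep <- rnorm(length(id))                     # no respondent effect
y_corr <- rnorm(1000)[id] + rnorm(length(id))    # shared respondent effect

c(independent = se_ratio(y_indep, id),
  correlated = se_ratio(y_corr, id))
```

With independent rows the ratio stays near one. With a shared respondent-level component it rises well above one, which is the pattern to look for in the fitted model's summary.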
Tip: Performance tips
Set device = "cuda" on a GPU machine; bit-exact determinism is lost but wall-clock time drops substantially.
Use parallel = TRUE with n_cores = parallelly::availableCores() on CPU for large datasets — the package guarantees bit-exact identity between parallel and sequential runs.
Prefer hidden = "auto" as a starting point and tune only when clearly needed.
plot(fit_sim, "loss_trace")
All folds converge to a similar loss level, confirming that 200 epochs is sufficient for this dataset.
2.3 Population-average recovery
The DML point estimates \(\hat\theta\) should approximate the true population means.
High correlations for \(x_1\) and \(x_2\) confirm that the DNN recovers the individual-level mapping \(\beta(Z)\). The correlation for \(x_3\) is low because \(\beta_3\) is constant — there is no heterogeneity to recover.
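A recovery table of this kind can be computed in a few lines. The sketch below uses a simulated stand-in for the fitted coefficients (truth plus noise) with hypothetical slopes; `beta_hat` is an assumed name, not an object returned by the package. When a true column is exactly constant, the Pearson correlation is undefined, so the sketch reports NA for it and adds RMSE, which stays informative in that case.

```r
set.seed(3)
M <- 1000
z <- cbind(z1 = rnorm(M), z2 = rnorm(M))

# Hypothetical truth: x1 and x2 vary with Z, x3 is constant
beta_true <- cbind(x1 = 0.5 + 0.4 * z[, "z1"],
                   x2 = -0.8 + 0.3 * z[, "z2"],
                   x3 = rep(0.3, M))

# Stand-in for fitted values: truth plus estimation noise
beta_hat <- beta_true + matrix(rnorm(M * 3, sd = 0.1), M, 3)

recovery <- t(sapply(colnames(beta_true), function(k) {
  truth <- beta_true[, k]
  est <- beta_hat[, k]
  c(cor = if (diff(range(truth)) > 0) cor(truth, est) else NA_real_,
    rmse = sqrt(mean((est - truth)^2)))
}))
round(recovery, 3)
```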
The point cloud tracks the 45-degree line, confirming recovery of both direction and magnitude. The beta ridgelines below show the full distribution:
plot(fit_sim, "beta_ridgelines")
The ridgeline for \(x_1\) is wide and centered above zero (heterogeneous, positive); \(x_2\) is wide and centered below zero (heterogeneous, negative); \(x_3\) is narrow and centered at 0.3 (homogeneous).
2.5 Direction and intensity
The decomposition should show: \(x_1\) has strong positive direction, \(x_2\) has strong negative direction, \(x_3\) has moderate positive direction.
The diverging bar chart makes the asymmetry vivid:
plot_fraction(fit_sim)
2.7 Heterogeneity test
This is the most important validation. The test should flag \(x_1\) and \(x_2\) as significant (their \(\beta\) varies with \(Z\)) and \(x_3\) as non-significant (constant across respondents).
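A simplified stand-in for this kind of check is to regress each attribute's individual-level coefficient on the moderators and test the slopes jointly. The sketch below does that with an F-test on simulated coefficients. It is not the package's test, and `beta_hat`, the slope values, and the noise level are all hypothetical.

```r
set.seed(5)
M <- 1000
Z <- data.frame(z1 = rnorm(M), z2 = rnorm(M))

# Simulated individual-level coefficients: x1, x2 depend on Z; x3 does not
beta_hat <- data.frame(
  x1 = 0.5 + 0.4 * Z$z1 + rnorm(M, sd = 0.1),
  x2 = -0.8 + 0.3 * Z$z2 + rnorm(M, sd = 0.1),
  x3 = 0.3 + rnorm(M, sd = 0.1)
)

# Joint F-test of "no dependence on z1 or z2", one attribute at a time
p_joint <- sapply(beta_hat, function(b) {
  f <- summary(lm(b ~ z1 + z2, data = Z))$fstatistic
  unname(pf(f["value"], f["numdf"], f["dendf"], lower.tail = FALSE))
})
signif(p_joint, 3)
```

The p-values for x1 and x2 should be essentially zero, while the p-value for x3 behaves like a draw from a uniform distribution because its coefficient is unrelated to the moderators.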
The heterogeneity bar chart confirms this visually — \(x_1\) and \(x_2\) are dark blue with diamond markers (significant), \(x_3\) is pale (homogeneous):
plot_hetero(fit_sim)
2.8 Subgroup analysis
By construction, respondents with \(z_1 > 0\) have \(\beta_1 > 0.5\) while those with \(z_1 < 0\) have \(\beta_1 < 0.5\).
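The split can be illustrated with a small simulation. The slope on \(z_1\) below is a hypothetical value. Only the sign of the relationship and the 0.5 threshold come from the text.

```r
set.seed(9)
M <- 1000
z1 <- rnorm(M)

# Hypothetical preference map: beta_1 rises with z1 and equals 0.5 at z1 = 0
beta1 <- 0.5 + 0.4 * z1

# Split respondents at z1 = 0 and compare subgroup means
group <- ifelse(z1 > 0, "z1 > 0", "z1 <= 0")
tapply(beta1, group, mean)
```

Every respondent in the upper group has a coefficient above 0.5 and every respondent in the lower group has one below it, so the subgroup means fall on opposite sides of 0.5.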
theta_struct <- coef(fit_sim)
theta_logit <- coef(bl)
df_comp <- data.frame(
  dummy = rep(names(theta_struct), 2),
  estimate = c(theta_struct, theta_logit),
  model = rep(c("Structural (DML)", "Logit"), each = length(theta_struct))
)
ggplot(df_comp, aes(x = estimate, y = dummy, color = model)) +
  geom_vline(xintercept = 0, linetype = "dashed", color = "gray50") +
  geom_point(position = position_dodge(0.4), size = 3) +
  scale_color_manual(values = c("Structural (DML)" = "#E41A1C", "Logit" = "#2166AC")) +
  labs(x = expression(hat(theta)), y = NULL, color = NULL,
       title = "Structural vs baseline logit") +
  theme_minimal(base_size = 12) +
  theme(legend.position = "bottom")
Both estimators agree on population averages — the structural model’s advantage is the per-respondent \(\hat\beta(\mathbf Z_i)\) that enables all the heterogeneity analyses above.
2.9.2 Average marginal effects (probability scale)
sc_quantity: decisiveness
estimate = 0.6362 se = 0.003721 95% CI = [0.6289, 0.6435]
Profile A (\(\beta_1 + \beta_3 \approx 0.8\)) vs profile B (\(\beta_2 \approx -0.8\)) gives \(\Delta V \approx 1.6\), so most respondents are decisive.
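The arithmetic behind this utility gap can be checked directly. The sketch below uses the approximate population means implied by the text (\(\beta_1 \approx 0.5\), \(\beta_2 \approx -0.8\), \(\beta_3 = 0.3\)) and reports the logit choice probability for profile A at the average respondent. It does not reproduce the decisiveness estimate itself, whose definition is not restated here.

```r
# Approximate population-average preferences, as implied by the text
beta_bar <- c(x1 = 0.5, x2 = -0.8, x3 = 0.3)

# Profile A switches on x1 and x3; profile B switches on x2
profile_a <- c(x1 = 1, x2 = 0, x3 = 1)
profile_b <- c(x1 = 0, x2 = 1, x3 = 0)

# Utility difference and the implied logit probability of choosing A
delta_v <- sum((profile_a - profile_b) * beta_bar)
p_a <- plogis(delta_v)
c(delta_v = delta_v, p_choose_a = p_a)
```

The utility gap is 1.6 and the implied probability of choosing profile A is about 0.83, so an average respondent is far from indifferent between the two profiles.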
2.9.5 Preference inequality
sc_inequality(fit_sim, measure ="variance")
sc_quantity: inequality_variance
estimate: data.frame with 3 rows
dummy_name measure value
x1 variance 0.06049
x2 variance 0.03019
x3 variance 0.03780
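By construction, \(\beta_1\) and \(\beta_2\) depend on \(Z\) while \(\beta_3 = 0.3\) is constant, so the true variance of \(\beta_3\) is zero. In the output above, \(x_1\) has the largest estimated variance (0.060), but the value for \(x_3\) (0.038) is not near zero and exceeds that of \(x_2\) (0.030). The variance of \(\hat\beta\) includes estimation noise as well as true heterogeneity, so this measure alone does not separate homogeneous from heterogeneous attributes here; the heterogeneity test above is the sharper check.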
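When reading a variance-based inequality measure, it helps to remember that the sample variance of estimated coefficients combines true heterogeneity with estimation noise. The sketch below shows this on simulated data; the slopes and the noise level are hypothetical, and `beta_hat` is a stand-in rather than package output.

```r
set.seed(13)
M <- 1000
z <- cbind(z1 = rnorm(M), z2 = rnorm(M))

# Hypothetical truth: x3 is constant, so its true variance is zero
beta_true <- cbind(x1 = 0.5 + 0.2 * z[, "z1"],
                   x2 = -0.8 + 0.15 * z[, "z2"],
                   x3 = rep(0.3, M))

# Stand-in for estimates: truth plus independent noise with sd 0.1
beta_hat <- beta_true + matrix(rnorm(M * 3, sd = 0.1), M, 3)

# Per-attribute variance of the true and the estimated coefficients
rbind(true = apply(beta_true, 2, var),
      estimated = apply(beta_hat, 2, var))
```

The estimated variance for the constant attribute is close to the noise variance (0.01 in this simulation) rather than zero, which is why a nonzero value for a homogeneous attribute is not by itself evidence of heterogeneity.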
2.10 Summary
Every check aligns with the ground truth:
Population-average \(\hat\theta\) recovers signs and magnitudes.
Individual-level \(\hat\beta(\mathbf Z_i)\) correlates strongly with the true preference vector for heterogeneous attributes.
Heterogeneity test correctly flags \(x_1\), \(x_2\) (Z-dependent) as heterogeneous and \(x_3\) (constant) as homogeneous.
Subgroup analysis recovers the known \(z_1\)-driven split.
Baseline logit agrees on population averages while lacking heterogeneity decomposition.
Advanced quantities (AME, surplus, welfare, decisiveness, inequality) all produce results consistent with the DGP.
With these checks in hand, we proceed to real-data applications.