MSM Identification and Recovery in tidyILD

Why this vignette exists

This vignette documents the identification assumptions behind the marginal structural model (MSM) workflow with inverse probability weighting (IPW), and shows how to run the causal recovery harness that was added for regression testing and simulation-based checks.

Identification assumptions

In this workflow, a causal interpretation of weighted outcome contrasts depends on four assumptions:

  1. Sequential exchangeability: every confounder of treatment assignment at each occasion \(t\) is captured in the history set used for IPTW.
  2. Positivity / overlap: treatment probabilities are bounded away from 0 and 1 in every relevant covariate-history stratum.
  3. Consistency: the outcome observed under the observed treatment history equals the potential outcome under that same history.
  4. Correct weight models: the treatment and censoring models are correctly specified.
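
To make the positivity and weight-model assumptions concrete, the following base-R sketch builds stabilized inverse probability of treatment weights by hand and inspects them. It is an illustration only, not tidyILD's implementation: the simulated data, the column names (`stress`, `trt`, `sw`), and the single-covariate treatment model are all assumptions made for the example.

```r
# Illustrative sketch in base R; none of this is tidyILD code.
set.seed(1)
n_id <- 50
n_obs <- 6
d <- data.frame(
  id = rep(seq_len(n_id), each = n_obs),
  time = rep(seq_len(n_obs), times = n_id)
)
d$stress <- rnorm(nrow(d))
# Treatment depends on the time-varying confounder (assumed logistic form)
d$trt <- rbinom(nrow(d), 1, plogis(0.8 * d$stress))

# Denominator model: P(trt | history); numerator model: marginal P(trt)
den_fit <- glm(trt ~ stress, family = binomial(), data = d)
num_fit <- glm(trt ~ 1, family = binomial(), data = d)
p_den <- fitted(den_fit)
p_num <- fitted(num_fit)

# Probability of the treatment actually received at each occasion
pr_den <- ifelse(d$trt == 1, p_den, 1 - p_den)
pr_num <- ifelse(d$trt == 1, p_num, 1 - p_num)

# Stabilized weight: cumulative product of occasion-level ratios within person
d$sw <- ave(pr_num / pr_den, d$id, FUN = cumprod)

# Positivity check: fitted probabilities should stay away from 0 and 1
range(p_den)
# Weight check: a long right tail signals poor overlap
summary(d$sw)
# Kish effective sample size: how much information survives weighting
ess <- sum(d$sw)^2 / sum(d$sw^2)
ess
```

Fitted probabilities close to 0 or 1, a heavy right tail in the weights, or an effective sample size far below the number of rows would each suggest that positivity or the weight model deserves a closer look.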

Use diagnostics to stress-test these assumptions:

Estimand-first + history-builder workflow (v1)

library(tidyILD)

d <- ild_msm_simulate_scenario(n_id = 100, n_obs_per = 12, true_ate = 0.5, seed = 101)
d <- ild_center(d, y)

hist_spec <- ild_msm_history_spec(vars = c("stress", "trt"), lags = 1:2)
d <- ild_build_msm_history(d, hist_spec)

estimand <- ild_msm_estimand(type = "ate", regime = "static", treatment = "trt")

fit_obj <- ild_msm_fit(
  estimand = estimand,
  data = d,
  outcome_formula = y ~ y_bp + y_wp + stress + trt + (1 | id),
  history = ~ stress_lag1 + trt_lag1,
  predictors_censor = "stress",
  inference = "bootstrap",
  n_boot = 200,
  strict_inference = FALSE
)

fit_obj
fit_obj$inference$status
fit_obj$inference$reason
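
For intuition about what bootstrap inference involves with repeated-measures data, here is a minimal base-R sketch of a person-level (cluster) bootstrap with a percentile interval. Whether `ild_msm_fit()` resamples in this way is an assumption; the simulated data, the `estimate()` helper, and the plain `lm()` estimator are hypothetical stand-ins for the weighted outcome model.

```r
# Illustrative person-level bootstrap in base R; not tidyILD internals.
set.seed(7)
n_id <- 40
n_obs <- 5
d <- data.frame(id = rep(seq_len(n_id), each = n_obs))
d$trt <- rbinom(nrow(d), 1, 0.5)
# Outcome with a person-level random intercept and a treatment effect of 0.5
d$y <- 0.5 * d$trt + rep(rnorm(n_id), each = n_obs) + rnorm(nrow(d))

# Hypothetical estimator: treatment coefficient from a simple linear model
estimate <- function(dat) unname(coef(lm(y ~ trt, data = dat))["trt"])

# Resample whole persons so that within-person dependence is preserved
rows_by_id <- split(seq_len(nrow(d)), d$id)
n_boot <- 200
boot_est <- vapply(seq_len(n_boot), function(b) {
  ids <- sample(names(rows_by_id), replace = TRUE)
  estimate(d[unlist(rows_by_id[ids], use.names = FALSE), ])
}, numeric(1))

# Percentile 95% interval from the bootstrap distribution
ci <- quantile(boot_est, c(0.025, 0.975))
ci
```

Resampling persons rather than rows keeps each person's occasions together, which is the usual reason to prefer a cluster bootstrap for intensive longitudinal data.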

Recovery harness

rec <- ild_msm_recovery(
  n_sim = 100,
  n_id = 120,
  n_obs_per = 12,
  true_ate = 0.5,
  n_boot = 200,
  inference = "bootstrap",
  seed = 1001,
  censoring = TRUE
)

rec$summary
rec$summary_by_scenario
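
As a point of reference for reading recovery output, the sketch below shows how bias, RMSE, and interval coverage are conventionally computed from simulation replicates. The replicate-level estimates are simulated directly here, and the column names (`bias`, `rmse`, `coverage`) are hypothetical; the actual layout of `rec$summary` may differ.

```r
# Illustrative recovery metrics in base R; column names are assumptions.
set.seed(42)
true_ate <- 0.5
n_sim <- 200

# Hypothetical replicate-level results: point estimate and standard error
est <- rnorm(n_sim, mean = true_ate, sd = 0.1)
se <- rep(0.1, n_sim)

# Wald 95% interval for each replicate
lo <- est - qnorm(0.975) * se
hi <- est + qnorm(0.975) * se

recovery <- data.frame(
  bias = mean(est) - true_ate,                     # average error
  rmse = sqrt(mean((est - true_ate)^2)),           # typical error size
  coverage = mean(lo <= true_ate & true_ate <= hi) # share of intervals containing the truth
)
recovery
```

A well-behaved estimator should show bias near zero and coverage near the nominal level; systematic departures in a particular scenario point to the assumption that scenario was designed to stress.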

Scenario-grid validation (positivity stress and treatment-model misspecification):

grid <- tibble::tibble(
  scenario_id = c("baseline", "positivity_stress", "misspecified_treatment"),
  positivity_stress = c(1, 1.8, 1),
  misspec_treatment_model = c(FALSE, FALSE, TRUE)
)

rec_grid <- ild_msm_recovery(
  n_sim = 50,
  n_id = 120,
  n_obs_per = 12,
  true_ate = 0.5,
  n_boot = 200,
  inference = "bootstrap",
  scenario_grid = grid,
  seed = 1101
)

rec_grid$summary_by_scenario

Interpretation:

Inference caveats and strict mode

Notes on v1 scope
