method = "mc" treated missing (id, time)
panel cells and NA outcomes as observed zeros instead of
excluding them from the observation mask used by the Soft-Impute solver.
Missing cells are now masked out the same way as treated post-adoption
cells. This changes numerical results for mc fits
on unbalanced panels or panels with NA
outcomes.

method = "tasc"’s EM loop already handles missing
outcome cells (Kalman smoother plus per-unit M-steps), but its
initialisation (svd(), the treated-unit loading OLS, and
var()) failed on panels with missing cells. Initial values
are now computed from a column-mean-imputed matrix; the EM loop itself
still runs on the true, unimputed data.

Malformed panels previously produced confusing or misleading errors:
for example, a 0-row data frame (from an upstream filtering bug) was
misclassified as staggered adoption and failed with an unrelated
“predictors not supported” error, and missing (id, time)
cells surfaced as a raw eig_sym(): decomposition failed
from the C++ layer. panel_to_matrices() (shared by all six
estimators) and scm_design() now validate and error clearly
on:

NA unit or time identifiers;
NA or negative treatment indicator values;
missing (id, time) cells or non-finite outcomes, for
the estimators that require a fully observed panel (scm,
sdid, gsc, si; mc
and tasc handle missing data by design).

scm_fit() also now rejects non-numeric
(factor/character) outcome or treatment
columns and non-integer treatment values, instead of silently coercing
them (as.integer(factor(...)) returns level codes, not the
original values). “All cohort-level fits failed” errors (SCM/SDID/GSC/SI
staggered paths) now point back to the preceding per-cohort warnings for
diagnosis.
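
The coercion pitfall behind the new factor/character check can be reproduced in base R. This is a minimal sketch, not package code; the vector `treat` is a hypothetical treatment column stored as a factor.

```r
# Hypothetical treatment indicator that arrived as a factor,
# for example after reading a CSV with stringsAsFactors = TRUE.
treat <- factor(c("0", "0", "1", "1", "0"))

# Direct coercion returns the internal level codes (1 and 2),
# so every unit would look "treated" to a check such as x > 0.
codes <- as.integer(treat)

# Going through character first recovers the original 0/1 values.
values <- as.integer(as.character(treat))

print(codes)   # 1 1 2 2 1
print(values)  # 0 0 1 1 0
```

Converting such columns explicitly before calling scm_fit() avoids the new error.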

The simplex QP solvers (solve_simplex_qp() /
solve_simplex_qp_lr()), used by every
scm_fit(method = "scm") fit and by the
method = "sdid" unit weights, now use FISTA with adaptive
restart (O’Donoghue & Candès 2015, gradient scheme). The first-order
solver’s iteration count grows with the condition number of
Q = X0'VX0, which is large for the collinear pre-treatment
outcome panels typical of synthetic control; resetting the momentum term
when it works against the gradient removes this slowdown. Each inner
solve converges to the same optimum (verified against an exact QP solver
to within 1e-10), so the returned weights are
unchanged (to within ~1e-5). For v_selection = "oos"
and penalised (lambda_pen) fits on ill-conditioned panels,
the non-convex outer V search may now settle on a different local
optimum, shifting results slightly; both the previous and the new
solutions are valid SCM fits with the same objective. This can
change numerical results for v_selection = "oos" and
penalised fits on poorly conditioned data.

panel_to_matrices() (and therefore every
scm_fit() method) and scm_design() now error
on duplicate (id, time) entries instead of silently keeping
a single arbitrary row. A balanced panel requires each unit-time cell to
be unique; duplicates were previously overwritten by the last
matrix-index assignment, dropping data without warning. The error
reports the number of offending rows and the first duplicated unit and
time.

plot() now renders
Date/POSIXct time axes correctly. The time
vector was previously coerced with as.numeric(), so dates appeared as
days-since-epoch numbers (e.g. 16000).
Date/POSIXct values are now
passed through unchanged so ggplot2 selects the appropriate date scale;
only character/factor time values are coerced
to numeric.

v_selection = "oos" (outcomes-only case) previously fit
candidate W(V) on the full pre-treatment outcome matrix and
restricted only the MSPE evaluation to the validation window, allowing
the V optimiser to fit the validation period indirectly (a data leak
relative to Abadie (2021) S.3.2). The new
.scm_oos_outcomes() implements the correct train/validation
split: candidate W(V) are fitted on training-half outcomes
only, V* minimises validation-half MSPE, and
W* is refit with V* on the outcomes of the
last floor(T_pre/2) pre-treatment periods. For OOS fits,
v_weights now has floor(T_pre/2) entries and a
new v_rows field records which periods they refer to.
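
The window bookkeeping described above can be sketched in base R. This is an illustration only: `oos_split()` is a hypothetical helper, not a package function, and the placement of the training window at the start of the pre-treatment period (with the remainder used for validation) is an assumption, since the text fixes only the refit window.

```r
# Hypothetical helper, not part of the package: index bookkeeping only.
oos_split <- function(T_pre) {
  stopifnot(T_pre >= 2)
  h <- floor(T_pre / 2)
  list(
    train      = seq_len(h),               # assumed: first h pre-treatment periods
    validation = seq(h + 1, T_pre),        # assumed: the remaining periods
    refit      = seq(T_pre - h + 1, T_pre) # last h periods, as stated in the text
  )
}

s <- oos_split(9)
print(s$train)       # 1 2 3 4
print(s$validation)  # 5 6 7 8 9
print(s$refit)       # 6 7 8 9

# The refit window has floor(T_pre / 2) rows, matching the stated
# length of v_weights for OOS fits.
print(length(s$refit) == floor(9 / 2))  # TRUE
```
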
This changes numerical results for
v_selection = "oos".

scale_predictors (default TRUE): predictor
rows supplied via predictors = are now divided by their
standard deviation across all units before optimisation, matching the
Synth reference implementation (Abadie, Diamond & Hainmueller 2011,
JSS). predictor_table continues to report values on the
original scale. This changes numerical results for SCM fits with
user-supplied predictors unless
scale_predictors = FALSE.

placebo_in_time(): in-time placebo (backdating) test
for sharp SCM fits (Abadie, Diamond & Hainmueller 2015; Abadie &
Vives-i-Bastida 2022).

loo_donors(): leave-one-out donor robustness check with
the predictor weights V held fixed (Abadie, Diamond & Hainmueller
2015, footnote 20).

build_predictor_matrices() now errors with an
informative message if a pred() time window produces
missing or non-finite predictor values.

Conformal inference (conformal_inference()): permutation-based p-values and
confidence intervals following Chernozhukov, Wüthrich & Zhu (2021).
Works with sharp fits across all supported estimation methods
(scm, sdid, gsc, mc,
si). The counterfactual proxy is re-estimated under the
null on all T periods (essential for finite-sample validity per
CWZ S.2.2), and p-values are obtained via moving-block (cyclic-shift)
permutation of the estimated residuals. Confidence intervals are
constructed by test inversion over a user-supplied or automatically
chosen grid. Returns a coresynth_inference subclass
compatible with tidy() and glance().

panel_to_matrices(): fill loop replaced by vectorised
match() + matrix-index assignment; removes an O(n × (T +
N)) bottleneck in the shared data-prep path.

tasc.cpp: safe_inv_sympd() helper added so
the Kalman filter degrades to pinv instead of aborting when
the innovation covariance is not numerically PD.

%||% null-coalescing helper centralised in
utils.R; duplicate definitions in broom.R and
plot.R removed.

check_sharp_adoption() (unused internal function)
removed.

First public release.

pred(), out-of-sample V selection
(v_selection = "oos"), donor filtering
(donor_mspe_threshold), penalised SCM
(lambda_pen), and staggered adoption. Inference: MSPE ratio
permutation test via mspe_ratio_pval().

covariates =), sharp and staggered adoption. Inference:
sdid_inference() with placebo / bootstrap / jackknife /
jackknife_global.

gsc_boot()) and non-parametric
(gsc_inference()).

si_inference() with
bootstrap / jackknife / jackknife_global.

scm_design() with base / weakly_targeted / unit_level
variants, blank-period permutation test, and split-conformal confidence
intervals.

scm_fit(outcome ~ treatment | unit + time, data, method = ...)
entry point for all methods.

panel_to_tensor() for multi-arm SI data
preparation.

broom integration: tidy(),
glance(), augment() for all methods and
inference objects.

plot.coresynth(): trend, gap, and weights plots via
ggplot2.

export_json(): JSON export for reproducibility.

All core optimisations implemented in C++ via RcppArmadillo: 50–70x
faster than the Synth package for typical panel sizes
(N_co ≤ 30). src/inference.cpp placebo loops parallelised
with OpenMP.