nlmixr2save focuses on two related problems:
- Saving nlmixr2 results in a format that stays readable outside of .rds.
- Reusing expensive fits or simulations when nothing important has changed.
saveFit() writes a saved fit as a collection of files
and then optionally zips them together. In practice, most of the saved
object is reconstructed from:
- .R files for model definitions and objects that can be recreated as source.
- .csv files for tabular fit components and datasets.
That means the saved fit is largely inspectable outside of R, and it
is not tied to the binary serialization format used by a specific
version of nlmixr2 or rxode2.
library(nlmixr2est)
library(nlmixr2data)
library(nlmixr2save)
fit <- nlmixr2(one.cmt, theo_sd, est = "focei")
saveFit(fit, "fit")
restored_fit <- loadFit("fit")

When loadFit("fit") runs, it recreates the object by
sourcing the generated .R files (for fit, this would be
fit.R) and reading the generated .csv files back in.
This is the main protection against a saved fit becoming unreadable simply because an internal serialization format changes.
For deterministic estimation methods, the restored object includes
the full saved fit, including the original model, fit results, and
origData.
This is especially useful for long-running estimation jobs because the saved fit can be rebuilt without repeating the estimation itself.
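The text-based round trip that saveFit() and loadFit() perform can be imitated in base R. The sketch below is only an illustration of the idea, not the file layout or code nlmixr2save actually uses: save_as_text() and load_from_text() are hypothetical helpers, the file names are made up, and a small list stands in for a fit.

```r
# Toy illustration only: hypothetical helpers, not the nlmixr2save implementation.
save_as_text <- function(obj, name, dir = tempdir()) {
  r_file   <- file.path(dir, paste0(name, ".R"))
  csv_file <- file.path(dir, paste0(name, "-data.csv"))
  # The tabular component is written as plain .csv
  write.csv(obj$data, csv_file, row.names = FALSE)
  # Everything else is written as R source that recreates it
  dput(obj[setdiff(names(obj), "data")], file = r_file)
  invisible(c(r = r_file, csv = csv_file))
}

load_from_text <- function(name, dir = tempdir()) {
  # Source the .R file, then read the .csv back in
  obj <- source(file.path(dir, paste0(name, ".R")), local = TRUE)$value
  obj$data <- read.csv(file.path(dir, paste0(name, "-data.csv")))
  obj
}

toy <- list(
  method = "focei",
  theta  = c(ka = 0.45, cl = 1.01),
  data   = data.frame(ID = c(1L, 1L, 2L), TIME = c(0, 1.5, 0), DV = c(0.74, 2.84, 0))
)

files    <- save_as_text(toy, "toy")
restored <- load_from_text("toy")
```

Both generated files can be opened in any text editor, which is the property the saved fit bundle relies on.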
saveFit() can also be run and cached automatically with a new
operator, :=, which is the quickest way to save
nlmixr2 fits (and other items). For example:
For nlmixr2 fits, the cache key is based on:
- table$keep columns.

This has two important consequences.
If the new dataset only changes columns that are not used for
estimation, nlmixr2save restores the existing fit instead
of rerunning the estimation.
fit := nlmixr2(one.cmt, theo_sd, est = "focei")
theo_sd_extra <- theo_sd
theo_sd_extra$.ignored <- "notes"
fit := nlmixr2(one.cmt, theo_sd_extra, est = "focei")

The expensive estimation is skipped, but the restored object still
updates fit$origData to the new dataset. In other words,
the cached fit is reused only when the meaningful estimation inputs
match, while the saved object still tracks the latest original dataset
you supplied.
If you change a column that matters to the fit, such as
DV, time, dosing information, a covariate, or a
table$keep column, the cache key changes and the estimation
is run again.
This is the intended safety boundary: harmless dataset changes are absorbed, but real estimation changes invalidate the cache.
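The boundary between harmless and meaningful dataset changes can be sketched in base R. This is an assumption-laden toy, not the key nlmixr2save computes: cache_key() is a hypothetical helper, the list of used columns is supplied by hand, and an MD5 checksum of a temporary .csv stands in for whatever hashing the package performs.

```r
# Toy illustration: hypothetical key function, not the nlmixr2save algorithm.
cache_key <- function(data, used_cols) {
  # Keep only the columns that matter for estimation
  used <- data[, intersect(used_cols, names(data)), drop = FALSE]
  tmp <- tempfile(fileext = ".csv")
  on.exit(unlink(tmp))
  write.csv(used, tmp, row.names = FALSE)
  unname(tools::md5sum(tmp))
}

dat <- data.frame(ID = c(1, 1, 2), TIME = c(0, 1, 0),
                  DV = c(0, 2.5, 0), AMT = c(100, 0, 100))
used_cols <- c("ID", "TIME", "DV", "AMT")  # assumed to be the estimation inputs

key_original <- cache_key(dat, used_cols)

# Adding a column that is not used for estimation leaves the key unchanged
dat_notes <- dat
dat_notes$.ignored <- "notes"
key_notes <- cache_key(dat_notes, used_cols)

# Changing an observation changes the key
dat_changed <- dat
dat_changed$DV[2] <- 3.1
key_changed <- cache_key(dat_changed, used_cols)

identical(key_original, key_notes)    # TRUE
identical(key_original, key_changed)  # FALSE
```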
:= for long-running fits and simulations

The := operator caches the result under the object name
on the left-hand side. If the saved result matches the current call, it
restores the cached object instead of rerunning the call.
fit := nlmixr2(one.cmt, theo_sd, est = "focei") # creates fit.zip
# Same call: restore from cache
fit := nlmixr2(one.cmt, theo_sd, est = "focei")
# Different estimation method: rerun
fit := nlmixr2(one.cmt, theo_sd, est = "saem") # overwrites fit.zip

For deterministic nlmixr2 fits, the cached form is the
text-and-csv-based fit bundle described above. For other functions, the
cached form is usually an .rds file.
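The restore-or-rerun behavior can be pictured with a small base R operator. The sketch is a toy under stated assumptions: `%cache%` is a hypothetical stand-in for :=, it compares only the text of the call (not the values of its inputs), it always writes an .rds file, and it writes to tempdir() rather than the working directory.

```r
# Toy illustration: `%cache%` is a hypothetical stand-in for `:=`,
# not the nlmixr2save implementation.
`%cache%` <- function(lhs, rhs) {
  name      <- deparse(substitute(lhs))  # object name on the left-hand side
  call_text <- paste(deparse(substitute(rhs)), collapse = "\n")
  file      <- file.path(tempdir(), paste0(name, ".rds"))
  if (file.exists(file)) {
    saved <- readRDS(file)
    if (identical(saved$call, call_text)) {
      # Same call as last time: restore without evaluating `rhs`
      assign(name, saved$value, envir = parent.frame())
      return(invisible(saved$value))
    }
  }
  value <- rhs  # forces the lazily evaluated call
  saveRDS(list(call = call_text, value = value), file)
  assign(name, value, envir = parent.frame())
  invisible(value)
}

unlink(file.path(tempdir(), "res.rds"))  # start from an empty cache

n_runs <- 0
slow_square <- function(x) {
  n_runs <<- n_runs + 1  # count how often the expensive call really runs
  x^2
}

res %cache% slow_square(4)  # runs and writes res.rds
res %cache% slow_square(4)  # same call: restored, n_runs stays at 1
res %cache% slow_square(5)  # different call: reruns and overwrites res.rds
```

Because R evaluates function arguments lazily, the right-hand side is only executed when the cached call does not match.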
Some calculations depend on the random-number stream. For those,
:= stores both the result and random-state metadata. When
the result is restored, the seed is advanced to the same post-run state
so downstream code sees the same random stream it would have seen if the
expensive call had actually run.
This is what makes := useful for long-running
simulations and stochastic estimation methods: you can restore the
result without silently changing the reproducibility of the rest of the
script.
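The seed handling can be illustrated with base R alone. The helper below, run_or_restore(), is hypothetical and keeps its cache in memory; it is a sketch of the idea of storing the random state before and after a run, not the metadata format nlmixr2save uses.

```r
# Toy illustration: hypothetical helper, not the nlmixr2save implementation.
run_or_restore <- function(fun, cache) {
  seed_before <- get(".Random.seed", envir = globalenv())
  if (!is.null(cache$value) && identical(cache$seed_before, seed_before)) {
    # Same starting state: skip the work and jump to the post-run state
    assign(".Random.seed", cache$seed_after, envir = globalenv())
    return(cache)
  }
  value <- fun()
  list(value       = value,
       seed_before = seed_before,
       seed_after  = get(".Random.seed", envir = globalenv()))
}

simulate <- function() mean(rnorm(1000))  # stands in for an expensive simulation

set.seed(42)
cache <- run_or_restore(simulate, list())  # actually runs
after_real_run <- runif(1)

set.seed(42)
cache2 <- run_or_restore(simulate, cache)  # restored from the cache
after_restore <- runif(1)

identical(cache$value, cache2$value)      # TRUE
identical(after_real_run, after_restore)  # TRUE
```

The draw made after the restore matches the draw made after the real run, so downstream code is unaffected by whether the expensive call was executed or restored.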
nlmixr2save into your package

There are two common integration paths.
If your package returns standard nlmixr2 fit objects
through a deterministic estimation method, users can already assign
the call with := and get zip-based save/restore behavior automatically.
If your package provides a stochastic workflow, use the seed-aware
path instead. For plain simulation functions, register the function name
with saveFitRandom(). For nlmixr2 estimation
methods, mark the estimator as random so := knows to use
the seed-aware cache path.
The companion vignette
vignette("register-simulation-functions", package = "nlmixr2save")
shows both patterns.
nlmixr2save is intentionally conservative, and a few
limitations are worth keeping in mind:
- The saved fit is primarily .R and .csv, but not exclusively. Some fit components still fall back to .rds when they cannot be safely recreated as text.
- Cache reuse only works when := sees the expensive call directly. Wrapping the call inside something like suppressMessages(nlmixr2(...)) forces the call to run before caching can intercept it.
- If dataset simplification cannot be computed, caching falls back to hashing the full dataset. That is safe, but it can cause more reruns than strictly necessary.
- Seed-aware restores require the same starting random state. If the seed is different, the cached stochastic result is discarded and rerun.
- The cache files are named from the left-hand-side object name and written in the current working directory, so project-level file management still matters.