Title: Tidy Intensive Longitudinal Data Analysis
Version: 0.2.0
Author: Alex Litovchenko [aut, cre]
Maintainer: Alex Litovchenko <al4877@columbia.edu>
Description: An opinionated, tidyverse-native toolkit for intensive longitudinal data (ILD). Encodes time structure, enforces within-between decomposition, provides spacing-aware lags, and integrates diagnostics and visualization. Use ild_prepare(), ild_center(), ild_lag(), and related functions for a unified pipeline from raw EMA/diary data to interpretable models.
License: MIT + file LICENSE
Encoding: UTF-8
Depends: R (≥ 4.0.0)
Imports: tibble, dplyr, lubridate, rlang, lme4, nlme, ggplot2
Suggests: testthat (≥ 3.0.0), roxygen2, knitr, broom.mixed
VignetteBuilder: knitr
Collate: 'package.R' 'ild-class.R' 'utils.R' 'ild_prepare.R' 'ild_summary.R' 'ild_center.R' 'ild_decomposition.R' 'ild_lag.R' 'ild_spacing_class.R' 'ild_spacing.R' 'ild_design_check.R' 'ild_missing_pattern.R' 'ild_missing_bias.R' 'ild_check_lags.R' 'ild_crosslag.R' 'ild_acf.R' 'ild_align.R' 'ild_lme.R' 'ild_person_model.R' 'ild_model_tidiers.R' 'ild_diagnostics.R' 'ild_manifest.R' 'ild_plot.R' 'ild_circadian.R' 'ild_simulate.R' 'ild_power.R' 'data.R' 'broom.R'
RoxygenNote: 7.3.3
NeedsCompilation: no
Packaged: 2026-03-05 16:10:17 UTC; alexanderlitovchenko
Repository: CRAN
Date/Publication: 2026-03-05 16:30:02 UTC

tidyILD: Tidy Intensive Longitudinal Data Analysis

Description

tidyILD is for intensive longitudinal data (ILD), e.g. ecological momentary assessment (EMA) or diary studies. It provides a tidy pipeline from raw data to mixed-effects models with explicit time structure, within-between decomposition, spacing-aware lags, and diagnostics. Use it when you have repeated measures per person over time and want consistent handling of time, gaps, centering, and residual correlation (AR1/CAR1).

Details

All ILD structure ('.ild_*' columns and 'ild_*' metadata) is created only by ild_prepare (via the internal constructor). Downstream functions expect data prepared with ild_prepare(). For the full workflow and applications, see the vignettes.

Getting started

A minimal workflow: simulate or load data, prepare with ild_prepare, inspect with ild_summary, apply ild_center and ild_lag, fit with ild_lme, then ild_diagnostics or ild_plot. See the examples below.

Function index by topic

Setup and validation

ild_prepare, as_ild, is_ild, validate_ild, ild_meta

Summaries and inspection

ild_summary, ild_spacing_class, ild_spacing, ild_design_check, ild_missing_pattern, ild_missing_bias, ild_plot (types: trajectory, gaps, missingness)

Within-person and lags

ild_center, ild_center_plot, ild_decomposition, ild_lag, ild_check_lags, ild_crosslag, ild_align

Modeling

ild_lme, ild_person_model

Diagnostics and visualization

ild_acf, ild_diagnostics, ild_plot (types: fitted, residual_acf), ild_heatmap, ild_spaghetti, ild_circadian

Reproducibility

ild_manifest, ild_bundle

Utilities and data

ild_simulate, ild_power, ema_example

Person-level

ild_person_model, ild_person_distribution

Model tidiers

augment_ild_model, tidy_ild_model; tidy.ild_lme, augment.ild_lme (broom.mixed, see broom_ild_lme)

Vignettes

browseVignettes("tidyILD") lists all vignettes. Key entries:

Key concepts

Author(s)

Alex Litovchenko al4877@columbia.edu

See Also

browseVignettes and vignette(package = "tidyILD") for vignettes. Core entry points: ild_prepare, ild_lme. Related packages: lme4, nlme (model backends), broom.mixed (tidiers).

Examples

library(tidyILD)
d <- ild_simulate(n_id = 10, n_obs_per = 12, irregular = TRUE, seed = 42)
x <- ild_prepare(d, id = "id", time = "time", gap_threshold = 7200)
ild_summary(x)
x <- ild_center(x, y)
x <- ild_lag(x, y, mode = "gap_aware", max_gap = 7200)
fit <- ild_lme(y ~ 1, data = x, ar1 = TRUE, correlation_class = "CAR1")
ild_diagnostics(fit, data = x)
ild_plot(fit, type = "fitted")


Coerce to ILD object

Description

If the object already has the required '.ild_*' columns and attributes, validates and returns it (with tidyild_df and ild_tbl class if missing). Otherwise errors.

Usage

as_ild(x)

Arguments

x

A data frame or tibble that may already be ILD-shaped.

Value

An ILD tibble with class tidyild_df and ild_tbl.


Augment an ILD model fit with fitted values and residuals

Description

Returns a tibble with one row per observation: .ild_id, .ild_time, the response variable (column name from the model formula, e.g. y), .fitted, and .resid. This structure is used internally by [ild_diagnostics()] and [ild_plot()]. Requires attr(fit, "ild_data"); refit with [ild_lme()] if missing. Random effects predictions can be added later.

Usage

augment_ild_model(fit, ...)

Arguments

fit

A fitted model from [ild_lme()] (must have attr(fit, "ild_data")).

...

Unused.

Value

A tibble with columns .ild_id, .ild_time, the response (name from formula), .fitted, .resid.


Tidy and augment ild_lme fits with broom.mixed

Description

These S3 methods delegate to [broom.mixed::tidy()] and [broom.mixed::augment()] on the underlying model object so that ild_lme fits work in tidy workflows. Package broom.mixed must be attached (e.g. library(broom.mixed)).

Usage

tidy.ild_lme(x, ...)

augment.ild_lme(x, ...)

Arguments

x

A fitted model from [ild_lme()].

...

Passed to broom.mixed::tidy() or broom.mixed::augment().

Value

Same as the corresponding broom.mixed method.


Example EMA-style intensive longitudinal dataset

Description

A small simulated dataset with 10 persons and 14 observations per person, irregular timing, and two variables (mood, stress). For use in examples and vignettes. Use [ild_prepare()] to convert to an ILD object.

Format

A data frame with 140 rows and 4 columns:

id

Person identifier (1–10).

time

POSIXct timestamp (irregular within person).

mood

Simulated mood score.

stress

Simulated stress score.

Source

Simulated with a fixed seed (12345) for reproducibility.


Autocorrelation function for ILD variables or model residuals

Description

Computes ACF on a variable in ILD data or on residuals from an [ild_lme()] fit. Use this to check whether AR1 is appropriate before fitting models. ACF is computed over the ordered observation sequence (pooled or within person); it does not adjust for irregular time gaps.

Usage

ild_acf(x, ..., by_id = FALSE)

Arguments

x

Either an ILD object (see [is_ild()]) or a fitted model from [ild_lme()].

...

When x is ILD data, the variable(s) to compute ACF on (tidy-select; one variable). Ignored when x is a fit.

by_id

Logical. If TRUE, also return per-person ACF in acf_by_id (default FALSE).

Value

A list with acf: a tibble with columns lag and acf (pooled). If by_id = TRUE, acf_by_id is a named list of tibbles (one per person).

Examples

d <- ild_simulate(n_id = 5, n_obs_per = 10, seed = 1)
x <- ild_prepare(d, id = "id", time = "time")
ild_acf(x, "y")
fit <- ild_lme(y ~ 1 + (1 | id), data = x, ar1 = FALSE, warn_no_ar1 = FALSE)
ild_acf(fit)

Align a secondary stream to primary ILD within a time window

Description

For each row in the primary ILD, finds observations in the secondary data set (same id, time within window before the primary time) and attaches an aggregated value (e.g. mean, median, or closest). Use when combining self-report with wearables or other streams that have different timestamps.

Usage

ild_align(
  primary,
  secondary,
  value_var,
  window,
  time_secondary = "time",
  fun = c("mean", "median", "closest")
)

Arguments

primary

An ILD object (see [is_ild()]); the stream to keep as rows.

secondary

A data frame with id and time columns and the value variable(s) to align.

value_var

Character. Name of the column in secondary to align (e.g. "heart_rate").

window

Numeric or lubridate duration. Time window (same units as .ild_time_num, e.g. seconds for POSIXct). Only secondary observations with time in (primary_time - window, primary_time] are used.

time_secondary

Character. Name of the time column in secondary (default "time").

fun

Character. Aggregation for values in window: "mean", "median", or "closest" (most recent in window).

Value

The primary data with a new column <value_var>_aligned (numeric; NA where no secondary obs in window).

Examples

prim <- ild_prepare(
  data.frame(
    id = rep(1:2, each = 3),
    time = as.POSIXct(rep(c(0, 3600, 7200), 2), origin = "1970-01-01"),
    y = rnorm(6)
  ),
  id = "id", time = "time"
)
sec <- data.frame(
  id = rep(1:2, each = 4),
  time = as.POSIXct(rep(c(0, 1800, 3600, 5400), 2), origin = "1970-01-01"),
  heart_rate = 60 + rnorm(8, 0, 5)
)
ild_align(prim, sec, "heart_rate", window = 3600, fun = "mean")

Bundle a result with a reproducibility manifest

Description

Combines a result (e.g. a fit from [ild_lme()] or output from [ild_diagnostics()]) with a manifest and optional label for one-shot saving. Typical use: saveRDS(ild_bundle(fit, label = "model_ar1"), "run.rds"). You can build a manifest with [ild_manifest()] and pass scenario (e.g. from [ild_summary()]) and seed before bundling.

Usage

ild_bundle(result, manifest = NULL, label = NULL)

Arguments

result

Any object (e.g. fitted model, diagnostics list).

manifest

List. Reproducibility manifest from [ild_manifest()]. If NULL, [ild_manifest()] is called with default arguments.

label

Optional character. Short label for the run (e.g. "model_ar1" or "diagnostics").

Value

A list with elements result, manifest, label, suitable for [saveRDS()].

Examples

dat <- ild_prepare(ild_simulate(seed = 1), "id", "time")
fit <- ild_lme(y ~ 1 + (1 | id), dat, ar1 = FALSE, warn_no_ar1 = FALSE)
b <- ild_bundle(fit, label = "ar1")
names(b)
b <- ild_bundle(fit, manifest = ild_manifest(seed = 1, scenario = list(n_obs = 50)), label = "run1")

Within-person and between-person decomposition (centering)

Description

For each selected variable, computes the person mean (between-person component) and the within-person deviation (variable minus person mean). Use '*_wp' at level-1 and '*_bp' at level-2 or in cross-level interactions to avoid ecological fallacy and conflation bias. Selected variables must be numeric.

Usage

ild_center(
  x,
  ...,
  type = c("person_mean", "grand_mean"),
  naming = c("suffix", "prefix")
)

Arguments

x

An ILD object (see [is_ild()]).

...

Variables to center (tidy-select). Unquoted names or a single character vector of column names. Must be numeric.

type

Character. '"person_mean"' (default) for person-mean centering (x_bp, x_wp); '"grand_mean"' for grand-mean centering (x_gm, x_wp_gm).

naming

Character. '"suffix"' (default): new columns var_bp, var_wp; "prefix": new columns BP_var, WP_var.

Value

The same ILD tibble with additional columns. ILD attributes are preserved.


Standalone WP/BP centering plot

Description

Shows within-person deviation and between-person (person mean) distribution for selected variable(s). Uses the same plot as [ild_decomposition(..., plot = TRUE)]. Useful when you only want the visualization without the variance table.

Usage

ild_center_plot(x, ...)

Arguments

x

An ILD object (see [is_ild()]).

...

Variables to plot (tidy-select). Must be numeric. Only the first is plotted.

Value

A ggplot object (WP vs BP density overlay for the first selected variable).

Examples

d <- ild_simulate(n_id = 10, n_obs_per = 8, seed = 1)
x <- ild_prepare(d, id = "id", time = "time")
ild_center_plot(x, y)

Check lag variable validity (gap-aware)

Description

Given an ILD object and lag variable names, reports how many lagged values are valid vs invalid (NA because the time distance to the lagged row exceeded a threshold). Useful to audit lag columns before modeling without re-specifying max_gap.

Usage

ild_check_lags(x, lag_vars = NULL, max_gap = NULL)

Arguments

x

An ILD object (see [is_ild()]) that contains lag columns (e.g. from [ild_lag()] with mode = "gap_aware").

lag_vars

Character vector of lag column names (e.g. "y_lag1"). If NULL, attempts to detect columns ending in _lag{n} or _lag_window.

max_gap

Numeric. Threshold used to define invalid (same units as .ild_time_num). If NULL, uses ild_meta(x)$ild_gap_threshold.

Value

A data frame with one row per lag variable: var, lag (parsed lag order or "window"), n_valid, n_invalid, n_first, n_total, pct_valid, pct_invalid.


Time-of-day pattern plot for ILD (circadian-style)

Description

Plots a variable by hour of day (or time-of-day) when .ild_time is POSIXct. Useful for EMA (e.g. mood or activity by hour). Does not add columns to the ILD object; hour is derived internally for plotting.

Usage

ild_circadian(x, var, type = c("boxplot", "line"))

Arguments

x

An ILD object (see [is_ild()]) with .ild_time as POSIXct.

var

Character or symbol. Variable to plot (e.g. mood, activity).

type

Character. "boxplot" (default) or "line" (mean per hour with optional SE).

Value

A ggplot object (variable by hour of day).

Examples

d <- ild_simulate(n_id = 5, n_obs_per = 12, seed = 1)
x <- ild_prepare(d, id = "id", time = "time")
ild_circadian(x, y)

Cross-lag model: lag predictor then fit outcome ~ lag

Description

One-call pipeline: [ild_lag()] the predictor, [ild_check_lags()] to audit, then [ild_lme()] to fit outcome on the lagged predictor. Returns the fit, the lag-term coefficient (estimate, CI, p), and lag validity check.

Usage

ild_crosslag(
  data,
  outcome,
  predictor,
  lag = 1L,
  mode = c("gap_aware", "index", "time_window"),
  max_gap = NULL,
  ar1 = FALSE,
  include_diagnostics = FALSE,
  ...
)

Arguments

data

An ILD object (see [is_ild()]).

outcome

Character or symbol. Name of the outcome variable (e.g. "mood").

predictor

Character or symbol. Name of the predictor to lag (e.g. "stress").

lag

Integer. Lag order (default 1).

mode

Character. Passed to [ild_lag()]: "index", "gap_aware" (default), or "time_window".

max_gap

Numeric or NULL. Passed to [ild_lag()] when mode = "gap_aware".

ar1

Logical. If TRUE, fit with [ild_lme()] using AR1/CAR1 (default FALSE).

include_diagnostics

Logical. If TRUE, attach [ild_diagnostics()] to the return (default FALSE).

...

Passed to [ild_lme()].

Value

A list: fit (fitted model), lag_term (one-row tibble from [tidy_ild_model()] for the lag variable), lag_check (tibble from [ild_check_lags()]), data (ILD with lag column), and optionally diagnostics.

Examples

d <- ild_simulate(n_id = 10, n_obs_per = 8, seed = 1)
x <- ild_prepare(d, id = "id", time = "time")
out <- ild_crosslag(x, "y", "y", lag = 1, ar1 = FALSE, warn_no_ar1 = FALSE)
out$lag_term
out$lag_check

Within-person and between-person variance decomposition

Description

Reports WP and BP variance and their ratio for selected variables. Use as a diagnostic and teaching tool: large ratio suggests within-person variance dominates; small ratio suggests between-person differences dominate. Helps avoid conflating WP and BP effects in modeling.

Usage

ild_decomposition(x, ..., plot = FALSE)

Arguments

x

An ILD object (see [is_ild()]).

...

Variables to decompose (tidy-select). Must be numeric.

plot

Logical. If TRUE, return a list with table and plot (WP vs BP density overlay). Default FALSE.

Value

A tibble with columns variable, wp_var, bp_var, ratio (wp_var / bp_var; Inf if bp_var is 0). If plot = TRUE, a list with table and plot (ggplot).

Examples

d <- ild_simulate(n_id = 10, n_obs_per = 8, seed = 1)
x <- ild_prepare(d, id = "id", time = "time")
ild_decomposition(x, y)

ILD design diagnostics: spacing, WP/BP, missingness, and recommendations

Description

Aggregates [ild_summary()], [ild_spacing()], [ild_spacing_class()], and optionally [ild_decomposition()] and [ild_missing_pattern()] into one design summary. Use before modeling to see spacing class, correlation recommendation, within- vs between-person variance, and missingness.

Usage

ild_design_check(x, vars = NULL)

Arguments

x

An ILD object (see [is_ild()]).

vars

Optional character vector of variable names for decomposition and missingness. If NULL, only spacing and summary are computed; wp_bp and missingness will be NULL.

Value

A list of class ild_design_check: summary (from ild_summary), spacing_class (regular-ish / irregular-ish), spacing (from ild_spacing), recommendation (AR1/CAR1 text), wp_bp (decomposition tibble or NULL), missingness (list with summary tibble and pct_na overall, or NULL). Use print() for a human-readable summary.

Examples

d <- ild_simulate(n_id = 10, n_obs_per = 8, irregular = TRUE, seed = 1)
x <- ild_prepare(d, id = "id", time = "time", gap_threshold = 7200)
ild_design_check(x, vars = "y")

Residual diagnostics for an ILD model

Description

Computes residual ACF (by person and/or pooled), residual vs fitted, residual vs time, and optional Q-Q. Use 'type' to request only specific diagnostics. For 'ild_lme' models with 'ar1 = TRUE', the estimated AR/CAR parameter is reported when 'type' includes '"residual_acf"'.

Usage

ild_diagnostics(
  object,
  data = NULL,
  type = c("residual_acf", "residual_time", "qq"),
  by_id = TRUE,
  ...
)

Arguments

object

A fitted model from [ild_lme()] (or an object with 'residuals()', and optional 'fitted()'; if not 'ild_lme', pass 'data' with '.ild_id' and '.ild_time_num' or '.ild_seq').

data

Optional. ILD data (required if 'object' is not from [ild_lme()]).

type

Character vector. Which diagnostics to compute: '"residual_acf"', '"residual_time"' (residuals vs time and vs fitted), '"qq"'. Default is all three.

by_id

Logical. If 'TRUE', compute ACF within each person (default 'TRUE').

...

Unused.

Details

The return value follows a stable schema: 'meta' (engine, ar1, id/time columns, n_obs, n_id), 'data$residuals' (tibble with '.ild_id', '.ild_time', response, '.resid', '.fitted'), and 'stats' (e.g. 'acf', 'ar1_param'). Plots are not stored in the object; use [plot_ild_diagnostics()] to generate them from a diagnostics object. The column .resid is always filled; .fitted is filled when it can be computed without refitting, otherwise it is NA (same for both engines and all type values).

Residual ACF is computed over the ordered observation sequence within person; it does not adjust for irregular time gaps.

Value

A list of class 'ild_diagnostics' with: 'meta' (engine, ar1, id_col, time_col, n_obs, n_id, type, by_id), 'data' (list with 'residuals' = tibble of .ild_id, .ild_time, response (name from formula), .resid, .fitted; data$residuals always exists, .resid is always filled, .fitted is returned when it can be computed without refitting, otherwise NA), 'stats' (list with 'acf' = list(pooled = tibble, by_id = list) when requested, 'ar1_param' = numeric or NULL for lme). Use [plot_ild_diagnostics()] for plots.

Examples

x <- ild_prepare(ild_simulate(n_id = 3, n_obs_per = 6, seed = 1), id = "id", time = "time")
fit <- ild_lme(y ~ 1 + (1 | id), data = x, ar1 = FALSE, warn_no_ar1 = FALSE)
diag <- ild_diagnostics(fit, type = c("residual_acf", "qq"))
plot_ild_diagnostics(diag)

ILD heatmap (alias for ild_plot with type = "heatmap")

Description

Person x time heatmap of a variable. See [ild_plot()].

Usage

ild_heatmap(x, var = NULL, ...)

Arguments

x

ILD object or fitted model (for heatmap, data are taken from ild_data if model).

var

Variable to plot. If NULL, single data column is used.

...

Passed to [ild_plot()] (e.g. id_var, time_var).

Value

A ggplot object.


Spacing-aware lag within person

Description

Computes lagged values within each person. Use this instead of [dplyr::lag()], which assumes equal spacing and no gaps and is unsafe for irregular ILD.

Usage

ild_lag(
  x,
  ...,
  n = 1L,
  mode = c("index", "gap_aware", "time_window"),
  max_gap = NULL,
  window = NULL,
  resolution = c("closest_prior", "last_in_window", "mean_in_window")
)

Arguments

x

An ILD object (see [is_ild()]).

...

Variables to lag (tidy-select). Unquoted names or selection.

n

Integer. Lag order (default 1 = previous observation).

mode

Character. "index": row-based lag. "gap_aware": same but NA when interval exceeds max_gap. "time_window": value from (time - window, time] with resolution.

max_gap

Numeric. For gap_aware only. Same units as .ild_time_num. If NULL, uses ild_meta(x)$ild_gap_threshold (metadata-driven default).

window

Numeric or lubridate duration. For time_window only: time window width. Numeric is in same units as .ild_time_num (e.g. seconds for POSIXct). You can pass a lubridate period/duration (e.g. lubridate::hours(2)); it is converted to seconds for POSIXct.

resolution

Character. For time_window: "closest_prior" (default: most recent observation in window), "last_in_window", or "mean_in_window".

Value

The same ILD tibble with new lag columns. ILD attributes preserved.


Fit a linear mixed-effects model to ILD

Description

When 'ar1 = FALSE', fits with [lme4::lmer()] (no residual correlation). When 'ar1 = TRUE', fits with [nlme::lme()] using a residual correlation structure: CAR1 (continuous-time) by default for irregular spacing, or AR1 when spacing is regular-ish. Use [ild_spacing_class()] to inform the choice; override with 'correlation_class'.

Usage

ild_lme(
  formula,
  data,
  ar1 = FALSE,
  correlation_class = c("auto", "AR1", "CAR1"),
  random = ~1 | .ild_id,
  warn_no_ar1 = TRUE,
  warn_uncentered = TRUE,
  ...
)

Arguments

formula

Fixed-effects formula. For 'ar1 = TRUE', must be fixed-only (e.g. 'y ~ x'); random structure is set to '~ 1 | .ild_id' internally. For 'ar1 = FALSE', formula may include random effects (e.g. 'y ~ x + (1|id)').

data

An ILD object (see [is_ild()]).

ar1

Logical. If 'TRUE', fit with nlme and residual AR1/CAR1 correlation; if 'FALSE', fit with lme4 (no residual correlation).

correlation_class

Character. '"auto"' (default) uses [ild_spacing_class()] to choose CAR1 (irregular-ish) or AR1 (regular-ish). Use '"CAR1"' or '"AR1"' to override.

random

For 'ar1 = TRUE', the random effects formula (default '~ 1 | .ild_id'). Must use '.ild_id' as grouping for correlation to match.

warn_no_ar1

If 'TRUE' (default), warn when 'ar1 = FALSE' that temporal autocorrelation is not modeled.

warn_uncentered

If 'TRUE' (default), warn when a predictor in the formula varies both within and between persons but is not decomposed (no _wp/_bp); suggests using [ild_center()].

...

Passed to [lme4::lmer()] or [nlme::lme()].

Value

A fitted model object (class 'lmerMod' or 'lme') with attribute 'ild_data' (the ILD data) and 'ild_ar1' (logical). When 'ar1 = TRUE', the returned object has class 'ild_lme' prepended and attribute 'ild_random_resolved' (the formula actually passed to nlme, e.g. '~ 1 | M2ID'). See [ild_diagnostics()] and [ild_plot()].

Examples

# lme4 path: formula includes random effects
set.seed(1)
dat <- ild_simulate(n_id = 5, n_obs_per = 6, seed = 1)
dat <- ild_prepare(dat, id = "id", time = "time")
dat <- ild_center(dat, y)
fit_lmer <- ild_lme(y ~ y_bp + y_wp + (1 | id), data = dat,
                    ar1 = FALSE, warn_no_ar1 = FALSE)
# nlme path (may not converge on all platforms; see ?nlme::lme)
## Not run: 
fit_lme <- ild_lme(y ~ y_bp + y_wp, data = dat,
                   random = ~ 1 | id, ar1 = TRUE)

## End(Not run)


Create a reproducibility manifest

Description

Captures timestamp, optional seed, optional scenario fingerprint, session info, and optional git SHA for use when saving or serializing results (e.g. after [ild_lme()] or [ild_diagnostics()]). The return value is a serializable list suitable for [saveRDS()] or [ild_bundle()].

Usage

ild_manifest(
  seed = NULL,
  scenario = NULL,
  include_session = TRUE,
  include_git = FALSE,
  git_path = "."
)

Arguments

seed

Optional integer. Seed used for the run (e.g. from [ild_simulate()] or set before fitting). Not captured automatically; pass explicitly if you want it in the manifest.

scenario

Optional. Named list or character string describing the run (e.g. formula, n_obs, n_id, ar1). Build from [ild_summary()] or a short list when calling after [ild_lme()] / [ild_diagnostics()].

include_session

Logical. If 'TRUE' (default), include [utils::sessionInfo()] in the manifest. Set to 'FALSE' to reduce size.

include_git

Logical. If 'TRUE', attempt to record the current git commit SHA from git_path. Default 'FALSE'.

git_path

Character. Path to the repository root (default "."). Used only when include_git = TRUE.

Value

A list with elements timestamp (POSIXct), seed (integer or NULL), scenario (as provided or NULL), session_info (list from sessionInfo() or NULL), git_sha (length-1 character or NA). All elements are serializable.

Examples

m <- ild_manifest()
names(m)
m <- ild_manifest(seed = 42, scenario = list(n_obs = 100, formula = "y ~ x"))
m$seed
m$scenario

Get ILD metadata attributes

Description

Returns the metadata attributes set by [ild_prepare()]: user-facing id/time column names, gap threshold, n_units, n_obs, and spacing (descriptive stats only).

Usage

ild_meta(x)

Arguments

x

An ILD object (see [is_ild()]).

Value

A named list of metadata (ild_id, ild_time, ild_gap_threshold, ild_n_units, ild_n_obs, ild_spacing). ild_spacing includes overall stats and may contain by_id, a tibble of per-person spacing stats.


Test whether missingness is associated with a predictor (informative missingness)

Description

Fits a logistic model of missingness (binary: is the outcome NA?) on a predictor variable. Use as a diagnostic: if the predictor is significant, missingness may be informative and results could be biased. This function does not correct for missingness; it flags the assumption for sensitivity analyses.

Usage

ild_missing_bias(x, outcome_var, predictor_var, random = FALSE)

Arguments

x

An ILD object (see [is_ild()]).

outcome_var

Character. Name of the variable with missingness (e.g. "mood").

predictor_var

Character. Name of the suspected predictor of missingness (e.g. "stress").

random

Logical. If TRUE, fit a mixed-effects logistic model is_missing ~ predictor + (1 | id) via lme4::glmer; if FALSE (default), fit glm(is_missing ~ predictor, family = binomial).

Value

A list with predictor (name), estimate, std_error, p_value, and message (short note about informative missingness).

Examples

set.seed(1)
d <- ild_simulate(n_id = 20, n_obs_per = 10, seed = 1)
d$stress <- rnorm(nrow(d))
d$mood <- d$y
d$mood[sample(nrow(d), 30)] <- NA  # some missing
x <- ild_prepare(d, id = "id", time = "time")
ild_missing_bias(x, "mood", "stress")

Summarize missingness pattern in ILD

Description

Returns a tabular summary of missingness by person and/or by variable, plus an optional heatmap plot. Complements [ild_summary()] and supports checking data before modeling. When vars = NULL, all non-internal data columns are used (observation presence across variables).

Usage

ild_missing_pattern(x, vars = NULL, max_ids = NULL, seed = NULL)

Arguments

x

An ILD object (see [is_ild()]).

vars

Optional character vector of variable names to summarize. If NULL, all non-.ild_* data columns are used.

max_ids

Optional integer. If set, subset to this many persons (sampled) before computing by_id, summary, and plot to handle large N.

seed

Optional integer. Seed for sampling when max_ids is set.

Value

A list with: summary (tibble: one row per var, columns var, n_obs, n_na, pct_na), plot (ggplot2 object for missingness heatmap), by_id, overall, n_complete, vars.


Plot distribution of person-level estimates from ild_person_model

Description

Draws a histogram or density of the selected term's estimates across persons. Useful to visualize heterogeneity (e.g. distribution of slopes or intercepts).

Usage

ild_person_distribution(
  person_fit,
  term = NULL,
  type = c("histogram", "density")
)

Arguments

person_fit

Tibble returned by [ild_person_model()] (columns term, estimate, etc.).

term

Character. Which term to plot (e.g. "(Intercept)" or a covariate name). If NULL, the first term in the table is used.

type

Character. "histogram" (default) or "density".

Value

A ggplot object.


Fit a model separately per person (N-of-1 / idiographic)

Description

Splits the ILD by person and fits the same formula (e.g. lm) within each. Returns a tibble of person-level estimates for teaching, N-of-1 analysis, or inspecting heterogeneity. Use [ild_person_distribution()] to visualize the distribution of estimates across persons.

Usage

ild_person_model(formula, data, method = c("lm"), min_obs = 2L)

Arguments

formula

A formula (e.g. y ~ x). Used for each person's lm.

data

An ILD object (see [is_ild()]).

method

Character. Currently only "lm" (default).

min_obs

Integer. Minimum observations per person to fit (default 2). Persons with fewer are omitted or get NA rows.

Value

A tibble with columns .ild_id (or the id column name from metadata), term, estimate, std_error, p_value, and optionally sigma, n_obs. One row per person per term (long format).

Examples

d <- ild_simulate(n_id = 5, n_obs_per = 8, seed = 1)
x <- ild_prepare(d, id = "id", time = "time")
pm <- ild_person_model(y ~ 1, x)
ild_person_distribution(pm, term = "(Intercept)")

ILD-specific plots

Description

Produces trajectory (spaghetti), heatmap, gaps, and (if a fitted model is provided) fitted vs actual and residual ACF. Works for both lmerMod and lme (ild_lme with ar1 = TRUE).

Usage

ild_plot(
  x,
  type = c("trajectory", "heatmap", "gaps", "missingness", "fitted", "fitted_vs_actual",
    "residual_acf"),
  var = NULL,
  id_var = ".ild_id",
  time_var = c(".ild_time_num", ".ild_seq"),
  max_ids = 20L,
  seed = 42L,
  ...
)

Arguments

x

An ILD tibble or a fitted [ild_lme()] model.

type

Character (or vector). One or more of: '"trajectory"', '"heatmap"', '"gaps"', '"missingness"', '"fitted"' or '"fitted_vs_actual"' (requires fitted model), '"residual_acf"' (requires fitted model; ACF is over observation sequence, not adjusted for irregular time gaps). If length > 1, returns a named list of ggplots.

var

For 'trajectory' or 'heatmap', the variable to plot (optional; if missing and only one non-.ild_* column exists, it is used).

id_var

For trajectory, variable used for grouping (default '.ild_id').

time_var

For trajectory/gaps, x-axis: '.ild_time_num' or '.ild_seq'.

max_ids

For trajectory, max number of persons to plot (sampled if larger; default 20). Set to 'Inf' to plot all.

seed

Integer. Seed for sampling ids when 'max_ids' is set (default 42).

...

Unused.

Value

A single ggplot when 'length(type) == 1', or a named list of ggplots when 'length(type) > 1'.

Examples

x <- ild_prepare(ild_simulate(n_id = 3, n_obs_per = 6, seed = 1), id = "id", time = "time")
fit <- ild_lme(y ~ 1 + (1 | id), data = x, ar1 = FALSE, warn_no_ar1 = FALSE)
ild_plot(fit, type = "fitted_vs_actual")
ild_plot(fit, type = c("fitted_vs_actual", "residual_acf"))

Simulation-based power analysis for a fixed effect in ILD models

Description

Estimates empirical power by repeatedly simulating data with a known effect (via [ild_simulate()] plus one added predictor), fitting with [ild_lme()], and counting the proportion of runs where the target term is significant (Wald p < alpha). The workflow (simulate, fit, reject/retain) mirrors simulation-based power in packages like mixpower; ild_power() is focused on ILD and ild_lme(). For multi-parameter grids, LRT, or general LMMs, consider mixpower.

Usage

ild_power(
  formula,
  n_sim = 500L,
  n_id,
  n_obs_per,
  effect_size,
  test_term = NULL,
  alpha = 0.05,
  ar1 = FALSE,
  seed = 42L,
  return_sims = FALSE,
  verbose = TRUE,
  ...
)

Arguments

formula

Fixed-effects formula including the predictor to power for and random effects, e.g. y ~ x + (1 | id). For ar1 = TRUE, use a fixed-only formula (random passed internally).

n_sim

Integer. Number of simulation replications (default 500).

n_id

Integer. Number of persons per replication.

n_obs_per

Integer. Observations per person per replication.

effect_size

Numeric. True coefficient for test_term in the DGP.

test_term

Character or NULL. Which fixed-effect term to test. If NULL, taken as the first non-intercept fixed-effect term from the model (inferred from the formula).

alpha

Numeric. Significance level for rejection (default 0.05).

ar1

Logical. If TRUE, fit with nlme and residual AR1/CAR1 (default FALSE).

seed

Integer. Base random seed; replication i uses seed + i.

return_sims

Logical. If TRUE, include a tibble of per-run estimate, std_error, p_value, rejected in the result (default FALSE).

verbose

Logical. If TRUE, message progress (default TRUE).

...

Passed to [ild_simulate()] (e.g. irregular, wp_effect, bp_effect) and to [ild_lme()].

Details

The data-generating process adds one predictor (name from test_term) as standard normal and adds effect_size * predictor to the outcome on top of the base [ild_simulate()] DGP (id, time, y). No change to ild_simulate() is required.

For ar1 = FALSE (lmer), the lme4 backend does not report p-values; inference for the test term uses a Wald z-approximation (estimate / SE) so that power is still computed. For ar1 = TRUE (nlme), p-values come from the model summary.

Value

A list: power (proportion of converged runs with p < alpha), n_sim, n_reject, n_converged, n_failed, alpha, test_term. If return_sims = TRUE, also sim_results (tibble of per-run results).

Examples

set.seed(42)
res <- ild_power(
  formula = y ~ x + (1 | id),
  n_sim = 25L,
  n_id = 15L,
  n_obs_per = 10L,
  effect_size = 0.3,
  seed = 42L,
  verbose = FALSE
)
res$power
res$n_reject

Prepare a data frame as an ILD (intensive longitudinal data) object

Description

Validates and encodes longitudinal structure: parses time, sorts by id and time, handles duplicate timestamps, and adds internal columns ('.ild_*') and metadata. All downstream functions assume the result of 'ild_prepare()'.

Usage

ild_prepare(
  data,
  id,
  time,
  gap_threshold = Inf,
  duplicate_handling = c("first", "last", "error", "collapse"),
  collapse_fn = NULL
)

Arguments

data

A data frame or tibble with at least an id and a time column.

id

Character. Name of the subject/unit identifier column.

time

Character. Name of the time column (Date, POSIXct, or numeric).

gap_threshold

Numeric. Time distance above which an interval is flagged as a gap ('.ild_gap' TRUE). Same units as the numeric time (e.g. seconds if time is POSIXct). Use 'Inf' to disable gap flagging.

duplicate_handling

Character. How to handle duplicate timestamps within the same id: '"first"' (keep first), '"last"' (keep last), '"error"' (stop with an error), '"collapse"' (aggregate with collapse_fn).

collapse_fn

Named list of functions, one per variable to collapse. Used only when duplicate_handling = "collapse". E.g. list(x = mean, y = function(z) z[1]). Variables not in collapse_fn keep their first value within the duplicate group.

Value

An ILD tibble with '.ild_*' columns and metadata attributes. Spacing metadata (see [ild_meta()]) includes overall stats and a by_id tibble of per-person spacing stats (median_dt, iqr_dt, n_intervals, pct_gap). Use [ild_summary()] to inspect and check gap flags before modeling.


Simulate simple ILD for examples, tests, and power analysis

Description

Generates a tibble with id, time, and outcome y. Optionally uses AR(1) within-person correlation and configurable WP/BP variance. Use [ild_prepare()] after to get a proper ILD object.

Usage

ild_simulate(
  n_id = 5L,
  n_obs_per = 10L,
  n_time = NULL,
  irregular = FALSE,
  ar1 = NULL,
  wp_effect = 0.5,
  bp_effect = 1,
  seed = 42L
)

Arguments

n_id

Integer. Number of persons (default 5).

n_obs_per

Integer. Observations per person (default 10).

n_time

Integer. Alias for n_obs_per (observations per person). If provided, overrides n_obs_per.

irregular

Logical. If TRUE, add random jitter to time (default FALSE).

ar1

Numeric or NULL. If numeric, within-person AR(1) correlation (e.g. 0.4). If NULL or 0, no AR (default NULL).

wp_effect

Numeric. Scale (SD) of within-person innovation (default 0.5).

bp_effect

Numeric. Scale (SD) of between-person random intercept (default 1).

seed

Integer. Random seed for reproducibility (default 42).

Value

A data frame with columns id, time (POSIXct), and y.

Examples

d <- ild_simulate(n_id = 3, n_obs_per = 5, seed = 1)
x <- ild_prepare(d, id = "id", time = "time")
d2 <- ild_simulate(n_id = 100, n_time = 50, ar1 = 0.4, wp_effect = 0.6,
  bp_effect = 0.3, irregular = TRUE, seed = 1)

Spacing diagnostics and correlation-structure recommendation

Description

Reports observation intervals in human-friendly units (e.g. hours) and recommends AR1 vs CAR1 for use in [ild_lme()]. Surfaces the same logic that ild_lme(..., ar1 = TRUE) uses internally so users can see why a correlation structure was chosen.

Usage

ild_spacing(x, gap_large_hours = 12)

Arguments

x

An ILD object (see [is_ild()]).

gap_large_hours

Numeric. Intervals (in hours) above which to count as "large gaps" for large_gaps_pct (default 12). Ignored if time is not in seconds (e.g. numeric day indices).

Value

A list with median_interval (hours), iqr (hours), large_gaps_pct (percent of intervals > gap_large_hours), coefficient_of_variation, recommendation (character: use CAR1 or AR1), and spacing_class (regular-ish or irregular-ish).

Examples

d <- ild_simulate(n_id = 5, n_obs_per = 10, irregular = TRUE, seed = 1)
x <- ild_prepare(d, id = "id", time = "time", gap_threshold = 7200)
ild_spacing(x)

Classify spacing as regular-ish vs irregular-ish

Description

Returns a simple classification for use in documentation or when choosing correlation structure (e.g. AR1 vs CAR1 in [ild_lme()]). The rule is documented and overridable via arguments. Does not change core ILD behavior.

Usage

ild_spacing_class(x, cv_threshold = 0.2, pct_gap_threshold = 10)

Arguments

x

An ILD object (see [is_ild()]).

cv_threshold

Numeric. Coefficient of variation of within-person intervals above which spacing is "irregular-ish" (default 0.2).

pct_gap_threshold

Numeric. Percent of intervals flagged as gaps above which spacing is "irregular-ish" (default 10).

Value

Character: '"regular-ish"' or '"irregular-ish"'.


ILD spaghetti / person trajectories (alias for ild_plot with type = "trajectory")

Description

Line plot of variable over time, one line per person. See [ild_plot()].

Usage

ild_spaghetti(x, var = NULL, ...)

Arguments

x

ILD object or fitted model.

var

Variable to plot. If NULL, single data column is used.

...

Passed to [ild_plot()] (e.g. max_ids, seed, id_var, time_var).

Value

A ggplot object.


One-shot summary of an ILD object

Description

Reports number of persons, number of observations, time range, descriptive spacing (median/IQR of intervals, percent gaps), and duplicate info. Uses [ild_meta()] and '.ild_*' columns only. No hard "regular"/"irregular" label; use [ild_spacing_class()] for that.

Usage

ild_summary(x)

Arguments

x

An ILD object (see [is_ild()]).

Value

A list with elements: summary (one-row tibble with n_id, n_obs, time_min, time_max, prop_gap, median_dt_sec, iqr_dt_sec), n_units, n_obs, time_range, spacing, n_gaps, pct_gap. The summary tibble is the primary contract for programmatic use.


Check if an object is a valid ILD tibble

Description

Returns TRUE if the object has all required '.ild_*' columns and 'ild_*' metadata attributes (as set by [ild_prepare()]).

Usage

is_ild(x)

Arguments

x

Any object.

Value

Logical.


Plot diagnostics from an ild_diagnostics object

Description

Generates ggplot objects for the requested diagnostic types. Plots are not stored in the diagnostics object; call this function to create them.

Usage

plot_ild_diagnostics(diag, type = NULL)

Arguments

diag

An object returned by [ild_diagnostics()].

type

Character vector. Which plots to build (default: the types stored in diag$meta$type).

Value

A named list of ggplot objects (e.g. residual_acf, residuals_vs_fitted, residuals_vs_time, qq).


Tidy fixed effects from an ILD model fit

Description

Returns a tibble of fixed-effect estimates with consistent columns for both lmer and lme engines: term, estimate, std_error, ci_low, ci_high, p_value. With object = TRUE, returns an object of class tidyild_model (meta + table) for use with print.tidyild_model.

Usage

tidy_ild_model(fit, conf_level = 0.95, object = FALSE, ...)

Arguments

fit

A fitted model from [ild_lme()] (lmerMod or lme).

conf_level

Numeric. Confidence level for intervals (default 0.95).

object

Logical. If TRUE, return a list with meta and table and class tidyild_model for polished printing (default FALSE).

...

Unused.

Value

A tibble, or when object = TRUE a list of class tidyild_model.


Validate an ILD object and error if invalid

Description

Checks presence and types of '.ild_*' columns and 'ild_*' attributes. Errors with a clear message if anything is missing or invalid. Calls ild_normalize_internal() so legacy objects get attr(x, "tidyILD") and class tidyild_df.

Usage

validate_ild(x)

Arguments

x

Object to validate (expected to be an ILD tibble).

Value

Invisibly returns x if valid.

mirror server hosted at Truenetwork, Russian Federation.