shewhartr

Lifecycle: experimental

shewhartr is a tidyverse-native toolkit for Statistical Process Control (SPC). It implements the classical Shewhart chart family — variables (I-MR, Xbar-R, Xbar-S) and attributes (p, np, c, u) — alongside a flagship regression-based control chart for processes with trend, where stationarity is too strong an assumption to make.

The package is built around a small set of design choices:

Installation

# Development version
remotes::install_github("castlaboratory/shewhartr")

A 30-second tour

library(shewhartr)
library(ggplot2)

# Classical I-MR chart on a 100-observation series with a small drift
fit <- shewhart_i_mr(bottle_fill, value = ml, index = observation)

print(fit)
autoplot(fit)
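
For readers who want to see what an I-MR chart computes, here is a minimal base-R sketch of the textbook limits. It is not the package's implementation: the simulated series, the variable names, and the choice of the moving-range estimator (sigma estimated as the mean moving range divided by d2 = 1.128) are assumptions made for illustration.

```r
# Simulated fill volumes; a stand-in for any numeric series (assumed data)
set.seed(1)
x <- rnorm(100, mean = 500, sd = 2)

# Moving ranges of span 2: absolute differences between neighbours
mr <- abs(diff(x))
mr_bar <- mean(mr)

# Individuals chart: 3-sigma limits with sigma estimated as mr_bar / d2,
# where d2 = 1.128 for ranges of two observations
sigma_hat <- mr_bar / 1.128
i_limits <- c(lcl = mean(x) - 3 * sigma_hat,
              cl  = mean(x),
              ucl = mean(x) + 3 * sigma_hat)

# Moving-range chart: D3 = 0 and D4 = 3.267 for ranges of two observations
mr_limits <- c(lcl = 0, cl = mr_bar, ucl = 3.267 * mr_bar)

# Indices of individual values outside the 3-sigma limits
out_of_control <- which(x < i_limits[["lcl"]] | x > i_limits[["ucl"]])
```

The moving range is used instead of the sample standard deviation because it is less inflated by slow drifts or isolated shifts in the series.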

When to use which chart

Data type                                         Chart
Individual measurements (no rational subgroup)    shewhart_i_mr()
Subgroups of size 2-10                            shewhart_xbar_r()
Subgroups of size > 10 or unequal n               shewhart_xbar_s()
Proportion of nonconforming                       shewhart_p()
Number of nonconforming, constant n               shewhart_np()
Defect counts, constant inspection size           shewhart_c()
Defect counts, variable inspection size           shewhart_u()
Process with trend (drift, growth, decay)         shewhart_regression()
Small persistent shifts (memory-based)            shewhart_ewma(), shewhart_cusum()
Several correlated variables monitored jointly    shewhart_hotelling()
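
To make one row of the table concrete, the sketch below computes textbook p-chart limits in base R for subgroups of unequal size. The data and variable names are invented for illustration, and nothing here is taken from shewhart_p() itself.

```r
# Hypothetical inspection data: items inspected and nonconforming items
n <- c(120, 100, 150, 130, 110)
d <- c(6, 4, 9, 5, 20)

p <- d / n                      # observed proportion per subgroup
p_bar <- sum(d) / sum(n)        # pooled proportion, used as the centre line

# Binomial standard error varies with subgroup size, so the limits do too
se <- sqrt(p_bar * (1 - p_bar) / n)
lcl <- pmax(0, p_bar - 3 * se)  # a proportion cannot fall below 0
ucl <- pmin(1, p_bar + 3 * se)  # or rise above 1

# Subgroups whose proportion falls outside their own limits
flagged <- which(p < lcl | p > ucl)
```

With these numbers only the fifth subgroup (20 nonconforming out of 110) falls outside its limits.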

Multi-phase regression chart

The flagship chart for trended processes splits the series into phases when a runs rule fires, fits a local model in each, and flags points that depart from the local trend. The example below uses the COVID-19 mortality series for Recife (cvd_recife) with the original analysis settings from Ferraz et al. (2020):

fit <- shewhart_regression(
  cvd_recife,
  value       = new_deaths,
  index       = .t,
  model       = "loglog",
  phase_rule  = "we_seven_same",
  rules       = c("nelson_1_beyond_3s", "we_seven_same"),
  lower_bound = 0
)

length(fit$fits)              # number of phases detected
nrow(fit$violations)          # individual flagged observations
autoplot(fit)

Figure: Regression chart on cvd_recife showing nine phases with locally fitted limits.

Each shaded band is one phase, the solid line is the local regression centre, the dashed lines are the phase’s 3-sigma limits, and the firebrick points are the days flagged by the rule set as departing from the local trend.
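
The idea of a local trend with limits and a runs rule can be sketched in base R for a single phase. This is an illustration under stated assumptions, not the package's algorithm: it assumes that "loglog" means regressing log(value) on log(index), that limits sit 3 residual standard errors from the fitted line, and that "seven same" means seven or more consecutive points on one side of the centre. The simulated series and all variable names are hypothetical.

```r
# Simulated positive series with a power-law trend (assumed data)
set.seed(42)
day <- 1:60
y <- exp(0.5 + 0.8 * log(day) + rnorm(60, sd = 0.15))

# Local trend fitted on the log-log scale
trend_fit <- lm(log(y) ~ log(day))
centre <- fitted(trend_fit)
res <- residuals(trend_fit)
sigma <- summary(trend_fit)$sigma   # residual standard error

# 3-sigma limits on the log scale, back-transformed to the original scale
ucl <- exp(centre + 3 * sigma)
lcl <- pmax(0, exp(centre - 3 * sigma))   # floor at 0, as for count data

# Rule 1: points more than 3 sigma away from the local trend
beyond_3s <- which(abs(res) > 3 * sigma)

# Runs rule: seven or more consecutive residuals with the same sign
runs <- rle(sign(res))
run_end <- cumsum(runs$lengths)
run_start <- run_end - runs$lengths + 1
long_runs <- data.frame(start = run_start, end = run_end)[runs$lengths >= 7, ]
```

In a multi-phase scheme, each row of long_runs would be a candidate for starting a new phase and refitting the model from that point on.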

Phase I vs Phase II

A Shewhart chart serves two different purposes that are easy to conflate. Phase I is retrospective: take historical data, identify out-of-control points, eliminate assignable causes, and arrive at trustworthy estimates of the process mean and variability. Phase II is prospective: take those estimated limits and apply them to new data, signalling alarms when something departs from the established baseline. The package draws this line in code:

# Phase I: estimate limits from a clean baseline
calib <- calibrate(historical_data, value = y,
                   chart = "i_mr", trim_outliers = TRUE)

# Phase II: apply the limits to new data
alarms <- monitor(new_observations, calib)
alarms$violations
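
The two phases can also be written out by hand in base R, which makes the separation explicit. This sketch assumes an individuals chart with moving-range limits and an iterative trimming loop; whether trim_outliers = TRUE works this way is not stated in this README. The data and the helper estimate_limits() are hypothetical.

```r
# Hypothetical baseline with one contaminated observation (13.0)
historical <- c(10.1, 9.8, 10.3, 9.9, 10.0, 10.2, 9.7, 10.1,
                9.9, 10.0, 13.0, 10.2, 9.8, 10.1, 9.9)
new_obs <- c(10.2, 9.7, 11.4, 10.1)

# Individuals-chart limits from the mean moving range (d2 = 1.128)
estimate_limits <- function(x) {
  sigma_hat <- mean(abs(diff(x))) / 1.128
  c(lcl = mean(x) - 3 * sigma_hat, cl = mean(x), ucl = mean(x) + 3 * sigma_hat)
}

# Phase I: drop points outside the limits and re-estimate until none remain
baseline <- historical
repeat {
  lim <- estimate_limits(baseline)
  keep <- baseline >= lim[["lcl"]] & baseline <= lim[["ucl"]]
  if (all(keep)) break
  baseline <- baseline[keep]
}

# Phase II: the limits are frozen and only applied to new observations
alarms <- which(new_obs < lim[["lcl"]] | new_obs > lim[["ucl"]])
```

Here the first pass removes 13.0, the second pass finds nothing further to trim, and the third new observation (11.4) raises the only alarm. The new data never feed back into the limits, which is what distinguishes monitoring from calibration.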

Architecture

Figure: shewhartr architecture, with tidy data in, an S3 chart object in the middle, and print/plot/broom/monitor out.

Every chart constructor — variables (shewhart_i_mr, shewhart_xbar_r, shewhart_xbar_s), attributes (shewhart_p, shewhart_np, shewhart_c, shewhart_u), regression (shewhart_regression), memory-based (shewhart_ewma, shewhart_cusum), and multivariate (shewhart_hotelling) — returns a shewhart_chart S3 object with a uniform layout. The same object then feeds into:

Documentation

The website hosts:

Citation

If you use shewhartr in academic work, please cite:

Leite, A., Vasconcelos, H., Ospina, R., & Ferraz, C. (2025). shewhartr: Statistical Process Control with Tidyverse-Native Workflows. R package version 1.0.0. https://castlaboratory.github.io/shewhartr/

References
