CharAnalysis: Diagnostic and analytical tools for peak detection and fire-history interpretations using high-resolution sediment-charcoal records

Philip Higuera

2026-04-27

Overview

CharAnalysis reconstructs local fire histories from lake-sediment charcoal records. Charcoal preserved in lake sediments is a direct proxy for past fire activity: individual fire events near a lake deposit pulses of charcoal that appear as peaks above a slowly varying background signal. CharAnalysis formalises this peak-detection logic as a reproducible, quantitative workflow.

This R package (v2.0.0) is a direct translation of CharAnalysis v2.0 (MATLAB), validated against reference outputs on four benchmark datasets spanning a range of record lengths, sampling resolutions, ecosystems, and analysis configurations. Analytical methods are described in Higuera et al. (2009).

The full workflow proceeds in five steps:

  1. Pretreatment — Resample the raw charcoal series to equal time steps and compute charcoal accumulation rate (CHAR, pieces cm-2 yr-1).
  2. Smoothing — Estimate low-frequency trends (Cbackground) using lowess or a moving-window statistic.
  3. Peak decomposition — Compute the high-frequency residual Cpeak = Cinterp − Cbackground (or ratio Cinterp / Cbackground).
  4. Thresholding — Define a noise threshold and flag Cpeak values that exceed it as candidate fire events. Thresholds can be global (one distribution fitted to the full record) or local (sliding-window distribution).
  5. Screening and metrics — Apply a minimum-count significance test, then compute fire-return intervals (FRIs), Weibull statistics, smoothed fire frequency, signal-to-noise index (SNI), and goodness-of-fit.

Installation

# Install from GitHub (requires devtools)
devtools::install_github("phiguera/CharAnalysis",
                         subdir = "CharAnalysis_2_0_R")

# Suggested packages for figures
install.packages(c("ggplot2", "patchwork", "ggtext"))

Worked example: Code Lake

Code Lake (CO; Colorado, USA) is the primary validation dataset for CharAnalysis. The record spans approximately 7,300 calibrated years BP and is analysed here using a local Gaussian mixture model (GMM) threshold — the recommended default configuration.

Input files

CharAnalysis reads two CSV files:

Both files are included in the package’s inst/extdata/ directory and can be located with system.file():

params_file <- system.file("extdata", "CO_charParams.csv",
                           package = "CharAnalysis")

For this vignette we reference the files directly from the repository:

params_file <- "path/to/CO_charParams.csv"   # adjust to your local path

Running the full pipeline

A single call to CharAnalysis() runs all five analytical steps and returns a named list of results.

library(CharAnalysis)

out <- CharAnalysis(params_file)

The progress messages show each step completing in sequence:

(1) Reading input file...
      ...done.
(1b) Validating input parameters...
(2) Pretreating charcoal data...
      ...done.
(3) Smoothing resampled CHAR to estimate low-frequency trends
    and calculating peak CHAR...
      ...done.
(4) Defining possible thresholds for peak identification...
      ...done.
(5) Identifying peaks and applying minimum-count screening...
      ...done.
(6) Post-processing: fire-return intervals, Weibull statistics...
      ...done.
(7) Pipeline complete.  Call char_write_results(...) to save output CSV.

Exploring the output

CharAnalysis() returns a named list. The most commonly used elements are:

names(out)
#> [1] "charcoal"      "pretreatment"  "smoothing"     "peak_analysis"
#> [5] "results"       "site"          "gap_in"        "char_thresh"
#> [9] "post"          "char_results"

The charcoal object

out$charcoal holds all time-series outputs, both raw and processed:

# Inspect the first few rows of key time series
head(data.frame(
  age_BP    = out$charcoal$ybpI,          # interpolated age (yr BP)
  CHAR      = out$charcoal$accI,          # C_interpolated  (pieces cm-2 yr-1)
  C_bkg     = out$charcoal$accIS,         # C_background
  C_peak    = out$charcoal$peak,          # C_peak (residuals)
  peaks     = out$charcoal$charPeaks[, 4] # final-threshold peak flags (0/1)
))

The char_thresh object

out$char_thresh holds threshold values, SNI, and goodness-of-fit results:

# Threshold at the final percentile (column 4 = threshValues[4])
range(out$char_thresh$pos[, 4], na.rm = TRUE)

# Signal-to-noise index (SNI): values > 3 indicate a strong signal
summary(out$char_thresh$SNI)

Post-processing metrics

out$post holds fire-history summary metrics:

# Fire-return intervals (FRIs) and mean FRI
cat("Number of FRIs:", length(out$post$FRI), "\n")
cat("Mean FRI:", round(mean(out$post$FRI), 1), "yr\n")

# Per-zone Weibull statistics (zone 1)
fri_z1 <- out$post$FRI_params_zone[1, ]
cat(sprintf(
  "Zone 1 — nFRI: %d  mFRI: %.1f yr  WBLb: %.1f  WBLc: %.2f\n",
  fri_z1[1], fri_z1[2], fri_z1[5], fri_z1[8]
))

The char_results matrix

out$char_results is the complete N × 33 output matrix, with columns matching the MATLAB charResults CSV exactly:

dim(out$char_results)
#> [1] 500  33

# Total number of fire events identified
sum(out$charcoal$charPeaks[, 4], na.rm = TRUE)
#> [1] 39

Output figures

CharAnalysis provides five publication-quality ggplot2 figures. All are produced by char_plot_all(), or individually by their dedicated functions.

Figure 3 — Cinterp, Cbackground, and Cpeak

char_plot_peaks(out)

The top panel shows resampled CHAR (black bars) with the smoothed Cbackground trend (grey line). The bottom panel shows Cpeak with the positive and negative threshold lines (red), identified fire events (+ symbols), and peaks that failed the minimum-count screen (grey dots).

Figure 5 — Cumulative peaks through time

char_plot_cumulative(out)

The slope of the cumulative curve at any point reflects the instantaneous fire frequency. Changes in slope indicate periods of higher or lower fire activity.

Figure 6 — FRI distributions by zone

char_plot_fri(out)

Each panel shows a histogram of fire-return intervals within one stratigraphic zone (20-yr bins, normalised to proportions). A two-parameter Weibull probability density function is overlaid where the Kolmogorov-Smirnov goodness-of-fit test passes (p > 0.10 for n < 30; p > 0.05 for n ≥ 30). Weibull scale (b) and shape (c) parameters, mean FRI, and sample size are annotated.

Figure 7 — Continuous fire history

char_plot_fire_history(out)

Three stacked panels show (from top to bottom): peak magnitude (integrated Cpeak above threshold per fire event, pieces cm-2 peak-1); fire-return intervals through time as filled squares with the smoothed mean FRI curve (black line) and bootstrapped 95% CI ribbon (grey); and lowess-smoothed fire frequency (fires per 1000 yr).

Figure 8 — Between-zone CHAR comparisons

char_plot_zones(out)

The left panel shows empirical cumulative distribution functions of raw CHAR within each zone, with pairwise two-sample Kolmogorov-Smirnov test p-values annotated. The right panel shows box plots (10th, 25th, 50th, 75th, 90th percentiles) of raw CHAR by zone.

Saving all figures to PDF

out_dir is a required argument when save = TRUE; the example below writes to a temporary directory so it can be run on any system, but you would normally substitute a path of your choosing (for example, "Results" inside your project folder).

char_plot_all(out, save = TRUE, out_dir = tempdir())
# Saves to tempdir():
#   CO_03_CHAR_analysis.pdf
#   CO_05_cumulative_peaks.pdf
#   CO_06_FRI_distributions.pdf
#   CO_07_continuous_fire_hx.pdf
#   CO_08_zone_comparisons.pdf

Note: Figures 9 (threshold sensitivity detail) and 10 (multi-site comparisons) from the MATLAB v2.0 interface are not implemented in this R package. All core analytical outputs are available through Figures 1–8 above.


Writing results to CSV

char_write_results() writes the 33-column output matrix to a CSV file whose column names and format match the MATLAB charResults output exactly. out_dir is required (no default); substitute a path of your choosing for the temporary directory used here.

char_write_results(out$char_results,
                 site    = out$site,
                 out_dir = tempdir())
# Writes: <tempdir>/CO_charResults.csv

The output CSV contains one row per interpolated time step and 33 columns covering all analytical outputs from Cinterp through to per-zone Weibull confidence intervals. Column headers match the MATLAB reference file exactly to facilitate direct numerical comparison.


Key parameters

The parameter file (*_charParams.csv) controls all aspects of the analysis. The most commonly adjusted parameters are:

Parameter Default Description
yrInterp 15 Resampling resolution (yr). Set to 0 for automatic (median raw resolution).
yr 500 Smoothing window width (yr) for Cbackground estimation.
threshType 2 Threshold type: 1 = global, 2 = local (sliding window).
threshMethod 3 Noise distribution: 2 = Gaussian, 3 = Gaussian mixture model.
threshValues 0.95, 0.99, 0.999, 0.99 Percentile thresholds; the final value defines the working threshold.
minCountP 0.05 Alpha level for the minimum-count significance screen.
peakFrequ 1000 Window width (yr) for smoothed fire frequency and FRI statistics.
zoneDiv record bounds Zone boundaries (yr BP) for stratified FRI and Weibull analysis.

Comparison with MATLAB v2.0

CharAnalysis v2.0.0 (R) is a direct translation of CharAnalysis v2.0 (MATLAB). Outputs are quantitatively equivalent on all validated reference datasets. The table below summarises results across the four validation datasets; full details are in inst/z_Validation_report_R_vs_MATLAB_V_2.0.md.

Dataset Site Smoothing charBkg max|diff| Peaks R v2.0.0 Peaks MATLAB v2.0
CO Code Lake, AK Method 1 (lowess) ~5 × 10-6 39 48
CH10 Chickaree Lake, CO Method 2 (robust lowess) 0.267 59 50
SI17 Silver Lake, CO Method 2 (robust lowess) 0.130 25 31
RA07 Raven Lake, AK Method 2 (robust lowess) < 0.001 15 17

Two sources of numerical difference are documented:

1. Robust lowess background (smoothing method 2) — MATLAB’s Curve Fitting Toolbox smooth(..., 'rlowess') and the R char_lowess() port produce slightly different Cbackground series. For gap-free records (RA07) the difference is negligible (< 0.001). For records with NaN gaps (CH10) the difference is larger (≤ 0.267) because the two implementations handle gap positions differently inside the bisquare robustness iteration. Smoothing method 1 (plain lowess) is not affected and agrees to within floating-point noise on all datasets.

2. Gaussian mixture model (GMM) peak counts — The R package ports the MATLAB GaussianMixture.m EM algorithm directly. Floating-point arithmetic accumulates differently across languages during EM iterations, causing the two implementations to reach slightly different threshold values in some local windows. Peak counts differ by 10–20% across datasets, with the direction varying (R sometimes higher, sometimes lower). All threshold and peak differences are downstream consequences of this single source; interpolation and peak-magnitude outputs agree to within numerical precision.

Weibull confidence intervals — Bootstrap CIs use random resampling and will differ between R and MATLAB runs regardless of any other differences. Weibull point estimates (scale b, shape c) agree within a few percent on datasets where peak counts are similar.


Citation

If you use CharAnalysis in published research, please cite Higuera et al. (2009), the first study to apply the core analytical tools implemented in the program. If you used CharAnalysis v2.0 (MATLAB or R v2.0.0) specifically, please also cite the software:

Higuera, P.E., L.B. Brubaker, P.M. Anderson, F.S. Hu, and T.A. Brown. 2009. Vegetation mediated the impacts of postglacial climate change on fire regimes in the south-central Brooks Range, Alaska. Ecological Monographs 79:201–219. https://doi.org/10.1890/07-2019.1

Higuera, P.E. 2026. CharAnalysis: Diagnostic and analytical tools for peak analysis in sediment-charcoal records (Version 2.0). Zenodo. https://doi.org/10.5281/zenodo.19304064


References

Higuera, P.E., L.B. Brubaker, P.M. Anderson, F.S. Hu, and T.A. Brown. 2009. Vegetation mediated the impacts of postglacial climate change on fire regimes in the south-central Brooks Range, Alaska. Ecological Monographs 79:201–219. https://doi.org/10.1890/07-2019.1

mirror server hosted at Truenetwork, Russian Federation.