bifrost bifrost hex sticker

R-CMD-check Codecov test coverage CRAN status License: GPL (≥ 2) Lifecycle: maturing

Branch-level Inference Framework for Recognizing Optimal Shifts in Traits

bifrost performs branch-level inference of multi-regime, multivariate trait evolution on a phylogeny using penalized-likelihood multivariate GLS fits. The current version searches for evolutionary model shifts under a multi-rate Brownian Motion (BMM) model with proportional regime VCV scaling, operating directly in trait space (e.g., no PCA), and is designed for high-dimensional datasets (p > n) and large trees (> 1000 tips). The method will work with fossil tip-dated trees, and will accept most forms of multivariate comparative data (e.g., GPA aligned morphometric coordinates, linear dimensions, and others). The next major release will enable usage of the multivariate scalar Ornstein–Uhlenbeck process.


Overview


Key features


📄 Vignette 1: Getting Started with bifrost

Installation (development version)

# install.packages("remotes")
remotes::install_github("jakeberv/bifrost")

Windows users:
Install Rtools for your R version and ensure it is added to your system PATH.

macOS users:
You may need to install XQuartz to build or run packages that depend on certain graphical or system libraries.


Quick start

library(bifrost)
library(ape)
library(phytools)
library(mvMORPH)

set.seed(1)

# Simulate a tree
tr <- pbtree(n = 50, scale = 1)

# Paint a single global baseline state "0" (single regime)
base <- phytools::paintBranches(
  tr,
  edge      = unique(tr$edge[, 2]),
  state     = "0",
  anc.state = "0"
)

# Simulate multivariate traits under a single-regime BM1 model (no shifts)
sigma <- diag(0.1, 2)  # 2×2 variance–covariance matrix for two traits
theta <- c(0, 0)       # ancestral means for the two traits

sim <- mvSIM(
  tree  = base,
  nsim  = 1,
  model = "BM1",
  param = list(
    ntraits = 2,
    sigma   = sigma,
    theta   = theta
  )
)

# mvSIM returns either a matrix or a list of matrices depending on mvMORPH version
X <- if (is.list(sim)) sim[[1]] else sim
rownames(X) <- base$tip.label

# Run bifrost's greedy search for shifts
res <- searchOptimalConfiguration(
  baseline_tree              = base,
  trait_data                 = X,
  formula                    = "trait_data ~ 1",
  min_descendant_tips        = 10,
  num_cores                  = 1,
  shift_acceptance_threshold = 20,  # conservative GIC threshold
  IC                         = "GIC",
  plot                       = FALSE,
  store_model_fit_history    = FALSE,
  verbose                    = FALSE # set TRUE for progress messages 
)

# For this single-regime BM1 simulation, we typically expect no inferred shifts:
res$shift_nodes_no_uncertainty     # typically integer(0)
res$optimal_ic - res$baseline_ic   # typically close to 0

str(res$VCVs)  # regime-specific VCVs (here just the baseline regime "0")

Data requirements

Core workflow

flowchart TD

subgraph S[Setup]
  A([Start]) --> B[Init and validate]
  B --> C[Paint baseline state 0]
  C --> D[Generate one-shift candidates]
end

subgraph CS[Baseline and scoring]
  D --> E[Fit baseline mvgls]
  E --> F[Baseline IC GIC or BIC]
  F --> G[Score candidates in parallel]
  G --> H[Compute delta IC and sort]
end

subgraph GS[Greedy search]
  H --> I[Init best tree and IC]
  I --> J{More candidates?}

  J -- Yes --> K[Add shift]
  K --> L[Fit shifted model]
  L --> M[Delta IC best minus new]
  M --> N{Delta IC >= threshold?}

  N -- Yes --> O[Accept update best]
  N -- No --> P[Reject keep best]

  O --> Q[Record status]
  P --> Q

  Q --> J
end

subgraph PP[Post processing]
  J -- No --> U[Finalize best model]
  U --> V{Compute IC weights?}

  V -- No --> V0[Skip weights]
  V -- Yes --> W{Any shifts?}
  W -- No --> V0

  W -- Yes --> X{Weights parallel?}
  X -- Yes --> X1[Drop-one refits parallel]
  X -- No --> X2[Drop-one refits serial]

  X1 --> Y[Compute weights aicw and ER]
  X2 --> Y
end

subgraph OUT[Output]
  V0 --> Z[Assemble result]
  Y --> Z
  Z --> ZA[Extract regime VCVs]
  ZA --> ZB([Return])
end

Primary functions

Helper functions (not exported)


Outputs

searchOptimalConfiguration() returns a comprehensive list containing:


Performance and scalability

Enable parallel processing using the future package:

library(future)
plan(multisession)   # or multicore on Linux/macOS

Reproducibility


Additional note

Though bifrost was initially developed as a framework for inferring macroevolutionary regime shifts in multivariate trait data, it can also be applied to perform multivariate phylogenetic generalized least squares (pGLS) analyses with factors or continuous predictors (e.g., cbind(trait1, trait2, ...) ~ predictor, or "trait_data[, 1:5] ~ trait_data[, 6]" when working directly with a matrix). In this context, bifrost identifies branch-specific rate variation under a multi-rate Brownian Motion model and fits the pGLS conditional on the resulting residual (phylogenetic) covariance structure, so estimated effect sizes and uncertainties account for “hidden” rate variation not explained by the predictors. This is conceptually similar to hidden-state approaches (e.g., Boyko et al. 2023), except that here the regimes influence variance and evolutionary rate rather than introducing regime-specific means. This use case is an active area of ongoing methodological development.

Citation

If you use bifrost, please cite the references returned by:

citation("bifrost")

Contributing

Bug reports, feature requests, and pull requests are welcome. Please open an issue at https://github.com/jakeberv/bifrost/issues.


License

This project is released under the GPL >= 2 License. See the LICENSE file for details.


Acknowledgements and dependencies

bifrost builds on substantial work from mvMORPH, phytools, ape, future, and future.apply. The greedy search algorithm is adapted from Mitov et al 2019 and Smith et al 2023. See the DESCRIPTION file for complete dependency and version information.

The name of our R package is inspired by the Bifröst, the rainbow bridge of Norse mythology that connects Earth (Midgard) and Asgard within the cosmic structure of Yggdrasil, the Tree of Life, echoing how our framework links observable data to hidden evolutionary shifts across the history of life.

Development of the bifrost R package was supported by the Oxford Research Software Engineering Group, with support from Schmidt Sciences, LLC. and the Michigan Institute for Data Science and AI in Society.

Schmidt Sciences logo Oxford RSE logo

mirror server hosted at Truenetwork, Russian Federation.