DPI

🛸 The Directed Prediction Index (DPI).

The Directed Prediction Index (DPI) is a simulation-based method for quantifying the relative endogeneity of outcome versus predictor variables in multiple linear regression models.


Author

Han-Wu-Shuang (Bruce) Bao 包寒吴霜

📬 baohws@foxmail.com

📋 psychbruce.github.io

Citation

Installation

## Method 1: Install from CRAN
install.packages("DPI")

## Method 2: Install from GitHub
install.packages("devtools")
devtools::install_github("psychbruce/DPI", force=TRUE)

Computation Details

\[ \begin{aligned} \text{DPI}_{X \rightarrow Y} & = t^2 \cdot \Delta R^2 \\ & = t_{\beta_{XY|Covs}}^2 \cdot (R_{Y \sim X + Covs}^2 - R_{X \sim Y + Covs}^2) \\ & = t_{partial.r_{XY|Covs}}^2 \cdot (R_{Y \sim X + Covs}^2 - R_{X \sim Y + Covs}^2) \end{aligned} \]

In econometrics and broader social sciences, an exogenous variable is assumed to have a unidirectional (causal or quasi-causal) influence on an endogenous variable (\(ExoVar \rightarrow EndoVar\)). By quantifying the relative endogeneity of outcome versus predictor variables in multiple linear regression models, the DPI can suggest a more plausible direction of influence (e.g., \(\text{DPI}_{X \rightarrow Y} > 0 \text{: } X \rightarrow Y\)) after controlling for a sufficient number of potential confounding variables.

  1. It uses \(\Delta R_{Y vs. X}^2\) to test whether \(Y\) (outcome), compared to \(X\) (predictor), can be more strongly predicted by \(m\) observable control variables (included in a regression model) and \(k\) unobservable random covariates (specified by the k.cov argument; see the DPI() function). A higher \(R^2\) indicates relatively higher dependence (i.e., relatively higher endogeneity) in a given variable set.
  2. It also uses \(t_{partial.r}^2\) to penalize insignificant partial correlation (\(r_{partial}\), with equivalent \(t\) test as \(\beta_{partial}\)) between \(Y\) and \(X\), while ignoring the sign (\(\pm\)) of this correlation. A higher \(t^2\) (equivalent to \(F\) test value when \(df = 1\)) indicates a more robust (less spurious) partial relationship when controlling for other variables.
  3. Simulation samples with k.cov random covariates are generated to test the statistical significance of DPI.
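
To make the formula concrete, here is a minimal base-R sketch of the point estimate \(t^2 \cdot \Delta R^2\) on toy data. It is an illustration under stated assumptions, not the package's implementation: the data, the variable names (X, Y, C1), and the object name dpi_sketch are hypothetical, and the sketch leaves out both the k.cov random covariates and the simulation-based significance test.

```r
# Illustrative sketch only (NOT the DPI package's implementation):
# computes t^2 * Delta R^2 with base R on simulated toy data,
# without random covariates (k.cov) and without the significance test.
set.seed(1)
n  <- 200
C1 <- rnorm(n)                         # one observed covariate (hypothetical)
X  <- 0.5 * C1 + rnorm(n)              # predictor
Y  <- 0.4 * X + 0.3 * C1 + rnorm(n)    # outcome

fit_y <- lm(Y ~ X + C1)                # Y ~ X + Covs
fit_x <- lm(X ~ Y + C1)                # X ~ Y + Covs

# t statistic of the partial coefficient of X in the model for Y
t_xy <- coef(summary(fit_y))["X", "t value"]

# The same t statistic is obtained for Y in the model for X, because
# both are the t test of the partial correlation r_{XY|Covs}
t_yx <- coef(summary(fit_x))["Y", "t value"]

# Equivalent computation from the partial correlation itself
r_partial <- cor(resid(lm(Y ~ C1)), resid(lm(X ~ C1)))
df_resid  <- df.residual(fit_y)        # n - 3 in this toy model
t_from_r  <- r_partial * sqrt(df_resid / (1 - r_partial^2))

# Delta R^2: how much better Y is predicted than X by the same variable set
delta_r2 <- summary(fit_y)$r.squared - summary(fit_x)$r.squared

# Point estimate of DPI for the direction X -> Y
dpi_sketch <- t_xy^2 * delta_r2
print(c(t_squared = t_xy^2, delta_R2 = delta_r2, DPI = dpi_sketch))
```

Because \(t^2\) is never negative, the sign of the result is determined entirely by \(\Delta R^2\), while \(t^2\) scales it by the strength of the partial relationship between \(X\) and \(Y\).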
