A study of gully erosion in rural southeast Nigeria documented the type of adjustment made by community members in response to erosion hazards, and asked whether adjustment type could distinguish individually-motivated from community-motivated responses.1
Four adjustment types were recorded: (1) Use of Ridges, (2) Shifting Habitation, (3) Relocation, and (4) Intensified Cultivation. The hypothesis is that Ridges and Shifting Habitation, which depend on collective action, predict Community motivation, whereas Relocation and Intensified Cultivation, which are undertaken individually, predict Individual motivation. Optimal Data Analysis (UniODA) tests whether adjustment type discriminates motivation and quantifies the strength of the association.
Motivation (0 = Individual, 1 = Community) is the class variable; adjustment type (1-4) is the attribute. The published cell frequencies are expanded directly into observation-level vectors, so no external data file is required.
library(oda)
# Cross-classification: rows = adjustment type, cols = motivation.
#                   Indiv (0)  Comm (1)  total
# Ridges      (1)       85       173      258
# Shifting    (2)       65       170      235
# Relocation  (3)      172        10      182
# Intensified (4)       45         0       45
#                      367       353      720
motivation <- c(rep(0L,  85), rep(1L, 173),   # adjustment = 1
                rep(0L,  65), rep(1L, 170),   # adjustment = 2
                rep(0L, 172), rep(1L,  10),   # adjustment = 3
                rep(0L,  45), rep(1L,   0))   # adjustment = 4
adjustment <- c(rep(1L, 258), rep(2L, 235),
                rep(3L, 182), rep(4L,  45))
table(adjustment, motivation,
      dnn = c("Adjustment (1=Ridges,2=Shifting,3=Relocation,4=Intensified)",
              "Motivation (0=Individual, 1=Community)"))
#>                                                            Motivation (0=Individual, 1=Community)
#> Adjustment (1=Ridges,2=Shifting,3=Relocation,4=Intensified)   0   1
#>                                                           1  85 173
#>                                                           2  65 170
#>                                                           3 172  10
#>                                                           4  45   0
Adjustment type is a four-category nominal variable. ODA searches all possible binary partitions of the four categories and selects the partition that maximises the effect strength for sensitivity (ESS). No a priori direction is supplied; the search is nondirectional (Hypothesis: NONDIRECTIONAL in MegaODA output). Leave-one-out (LOO) jackknife validity analysis is included.
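The exhaustive search described above can be sketched in base R. This is an illustration only, not the `oda` package's implementation: the names `counts`, `ess_for_rule`, and `rules` are hypothetical, and the ESS formula (mean per-class accuracy rescaled so that 50% maps to 0 and 100% maps to 100) is assumed because it is consistent with the Mean PAC and ESS values reported in the output.

```r
# Published cell counts: rows = adjustment type (1-4), cols = motivation (0, 1).
counts <- matrix(c( 85, 173,
                    65, 170,
                   172,  10,
                    45,   0),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(adjustment = 1:4, motivation = c("0", "1")))

# ESS (%) for a rule given as a vector of predicted classes, one per category.
# Hypothetical helper; assumes ESS = 100 * (mean PAC - 0.5) / 0.5.
ess_for_rule <- function(pred, counts) {
  pac0 <- sum(counts[pred == 0, "0"]) / sum(counts[, "0"])  # class-0 accuracy
  pac1 <- sum(counts[pred == 1, "1"]) / sum(counts[, "1"])  # class-1 accuracy
  100 * (mean(c(pac0, pac1)) - 0.5) / 0.5
}

# All 2^4 assignments of categories to classes, minus the two trivial ones
# that send every category to the same class (14 candidate rules remain).
rules <- as.matrix(expand.grid(rep(list(0:1), 4)))
rules <- rules[rowSums(rules) %in% 1:3, , drop = FALSE]
colnames(rules) <- paste0("cat", 1:4)

# Score every candidate rule and keep the one with the largest ESS.
ess  <- apply(rules, 1, ess_for_rule, counts = counts)
best <- rules[which.max(ess), ]

best                # predicted class for categories 1-4
round(max(ess), 2)  # ESS (%) of the best rule
```

With only four categories the search space is tiny, so brute-force enumeration is enough for a sketch like this; the package may use a different search strategy.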
print(fit)
#>
#> ODA (binary) attr_type=categorical priors=TRUE n=720
#>
#> Rule: {4, 3} --> 0 | {1, 2} --> 1
#>
#> CLASS n PAC
#> 0 367 59.1%
#> 1 353 97.2%
#>
#> Mean PAC: 78.15% ESS: 56.30% p(MC): < .001
#>
#> -- LOO --
#> CLASS n PAC
#> 0 367 59.1%
#> 1 353 97.2%
#>
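#> LOO ESS: 56.30% p(LOO): < .001
ODA’s nondirectional search identified the optimal binary partition: Relocation (3) and Intensified Cultivation (4) predict Individual motivation (0), and Use of Ridges (1) and Shifting Habitation (2) predict Community motivation (1).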
This recovered mapping is substantively consistent with the adjustment/motivation hypothesis: adjustments requiring collective action (Ridges, Shifting) predict Community motivation; adjustments undertaken individually (Relocation, Intensified Cultivation) predict Individual motivation.
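As a cross-check on the reported accuracy figures, the rule can be applied to the published cell counts by hand. The sketch below uses only base R arithmetic; the variable names are illustrative, and the ESS formula is an assumption chosen because it is consistent with the Mean PAC and ESS values in the output.

```r
# Cell counts from the published cross-classification, by motivation.
indiv <- c(ridges = 85,  shifting = 65,  relocation = 172, intensified = 45)
comm  <- c(ridges = 173, shifting = 170, relocation = 10,  intensified = 0)

# Rule: Relocation and Intensified predict Individual (0);
#       Ridges and Shifting predict Community (1).
to_indiv <- c("relocation", "intensified")
to_comm  <- c("ridges", "shifting")

tn <- sum(indiv[to_indiv])  # Individual, predicted Individual
fp <- sum(indiv[to_comm])   # Individual, predicted Community
fn <- sum(comm[to_indiv])   # Community, predicted Individual
tp <- sum(comm[to_comm])    # Community, predicted Community

pac_indiv <- tn / (tn + fp)                     # accuracy within class 0
pac_comm  <- tp / (tp + fn)                     # accuracy within class 1
mean_pac  <- mean(c(pac_indiv, pac_comm))
ess       <- (mean_pac - 0.5) / 0.5             # assumed ESS definition

round(100 * c(pac_indiv = pac_indiv, pac_comm = pac_comm,
              mean_pac = mean_pac, ess = ess), 2)
```

The four counts computed here are the cells that the confusion matrix is built from.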
# Confusion matrix: actual motivation (rows) x predicted motivation (cols)
conf_mat <- matrix(
  c(fit$confusion$TN, fit$confusion$FP,
    fit$confusion$FN, fit$confusion$TP),
  nrow = 2L, byrow = TRUE,
  dimnames = list(Actual    = c("Indiv(0)", "Comm(1)"),
                  Predicted = c("Indiv(0)", "Comm(1)"))
)
print(conf_mat)
#>           Predicted
#> Actual     Indiv(0) Comm(1)
#>   Indiv(0)      217     150
#>   Comm(1)        10     343
summary(fit)
#>
#> ODA Summary (binary) status=valid n=720
#> attr_type=categorical priors=TRUE weights=FALSE
#> Rule: {4, 3} --> 0 | {1, 2} --> 1
#>
#> -- Train --
#> Mean PAC (wt): 78.15% ESS: 56.30%
#> Sensitivity: 0.972 Specificity: 0.591
#> p(MC): < .001 [MC permutation, two-tailed]
#> -- LOO --
#> CLASS n PAC
#> 0 367 59.1%
#> 1 353 97.2%
#> LOO ESS: 56.30%
#> LOO Mean PAC: 78.15%
#> p(LOO): < .001 [Fisher exact (2x2), one-tailed]
# Predictive value: accuracy when the model makes a prediction into each class
pv_indiv <- fit$confusion$TN / (fit$confusion$TN + fit$confusion$FN)
pv_comm <- fit$confusion$TP / (fit$confusion$TP + fit$confusion$FP)
cat("PV Individual (0):", round(pv_indiv * 100, 1), "%\n")
#> PV Individual (0): 95.6 %
cat("PV Community (1):", round(pv_comm * 100, 1), "%\n")
#> PV Community (1): 69.6 %
The MC p-value and LOO result are shown in the summary output above. The p(MC) shown is expected to be very small (MegaODA reports p = 0.000000 at 25000 iterations); the 500-iteration CRAN run may show p = 0.0. Use mc_iter = 25000L for publication results.
Fixture parity. The training rule, confusion matrix, and ESS are verified against MegaODA.exe output in the package test suite (tests/testthat/test-fixture-vignettes.R, Example 2).
MC p-value calibration. The MC p shown here reflects mc_iter = 500L for CRAN build speed. MegaODA reports p = 0.000000 (exact zero) at 25000 iterations; with 500 iterations a near-zero p will still be reported accurately (STOP fires early). Use the canonical run with mc_iter = 25000L (chunk fit-canonical, eval=FALSE) for publication-quality results. Training ESS and confusion matrix are unaffected by mc_iter.
Nondirectional search. No direction_map is supplied. ODA evaluates all possible binary partitions of the four adjustment categories and selects the one that maximises ESS. The MC permutation test is nondirectional: each permutation selects the best partition for the permuted labels. This matches the MegaODA.exe gold run (Hypothesis: NONDIRECTIONAL).
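The nondirectional permutation logic can be illustrated with a minimal base R sketch. It is not the package's or MegaODA's algorithm (there is no early stopping, for instance), and `best_ess`, `rules`, `n_iter`, and the seed are hypothetical names and values chosen for the example. The key point it shows is that the partition search is repeated inside every permutation, so the null distribution is for the best achievable ESS rather than for one fixed rule.

```r
set.seed(1)  # arbitrary seed, only to make the sketch reproducible

# Observation-level vectors rebuilt from the published cell counts.
motivation <- rep(c(0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L),
                  times = c(85, 173, 65, 170, 172, 10, 45, 0))
adjustment <- rep(1:4, times = c(258, 235, 182, 45))

# The 14 non-trivial assignments of the four categories to the two classes.
rules <- as.matrix(expand.grid(rep(list(0:1), 4)))
rules <- rules[rowSums(rules) %in% 1:3, , drop = FALSE]

# Best ESS (%) over all candidate rules for a given labelling y.
# Assumes ESS = 100 * (mean PAC - 0.5) / 0.5.
best_ess <- function(y, x, rules) {
  counts <- table(factor(x, levels = 1:4), factor(y, levels = 0:1))
  n0 <- sum(counts[, 1])
  n1 <- sum(counts[, 2])
  ess <- apply(rules, 1, function(pred) {
    pac0 <- sum(counts[pred == 0, 1]) / n0
    pac1 <- sum(counts[pred == 1, 2]) / n1
    100 * (mean(c(pac0, pac1)) - 0.5) / 0.5
  })
  max(ess)
}

observed <- best_ess(motivation, adjustment, rules)

# Shuffle the class labels, re-run the full search each time, and count how
# often the permuted optimum reaches the observed optimum.
n_iter   <- 500L
permuted <- replicate(n_iter, best_ess(sample(motivation), adjustment, rules))
p_mc     <- mean(permuted >= observed)

c(observed_ess = round(observed, 2), p_mc = p_mc)
```

For an association this strong, the estimate can be expected to sit at or near zero at 500 iterations, which is why more iterations are recommended for a publishable p-value.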
Optional constrained analysis. A researcher with an a priori hypothesis specifying exactly which categories predict which class can supply direction_map = c("1"=1L, "2"=1L, "3"=0L, "4"=0L) for a fixed-partition directional analysis (MPE Chapter 4 Phase 6C). For this dataset the two analyses yield identical ESS and confusion because the a priori mapping happens to be the global optimum; they differ in MC interpretation (directional vs. nondirectional p-value).
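To make the directional case concrete, the sketch below holds the a priori mapping fixed and permutes only the class labels. It is a base R illustration under stated assumptions, not a call into the package: here `direction_map` is just a named vector holding the mapping quoted in the text, and `ess_fixed`, the iteration count, and the seed are hypothetical.

```r
set.seed(1)  # arbitrary seed, only to make the sketch reproducible

# Observation-level vectors rebuilt from the published cell counts.
motivation <- rep(c(0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L),
                  times = c(85, 173, 65, 170, 172, 10, 45, 0))
adjustment <- rep(1:4, times = c(258, 235, 182, 45))

# A priori mapping from the text, used here as a plain lookup vector.
direction_map <- c("1" = 1L, "2" = 1L, "3" = 0L, "4" = 0L)
predicted     <- unname(direction_map[as.character(adjustment)])

# ESS (%) of the fixed rule for a given labelling y.
# Assumes ESS = 100 * (mean PAC - 0.5) / 0.5.
ess_fixed <- function(y, pred) {
  pac0 <- mean(pred[y == 0L] == 0L)  # accuracy within class 0
  pac1 <- mean(pred[y == 1L] == 1L)  # accuracy within class 1
  100 * (mean(c(pac0, pac1)) - 0.5) / 0.5
}

observed <- ess_fixed(motivation, predicted)

# Unlike the nondirectional test, no search is repeated per permutation:
# the same rule is scored against each shuffled labelling.
permuted <- replicate(500L, ess_fixed(sample(motivation), predicted))
p_dir    <- mean(permuted >= observed)

c(observed_ess = round(observed, 2), p_dir = p_dir)
```

Because the rule is fixed, the permuted ESS values are centred on zero rather than on the best-of-14 optimum, which is the sense in which the directional and nondirectional p-values answer different questions.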
Okuh D, Osumgborogwu IE (2019). Adjustments to hazards of gully erosion in rural southeast Nigeria: A case of Amacha communities. Applied Ecology and Environmental Sciences, 7, 11-20.
Yarnold, P.R., & Soltysik, R.C. (2005). Optimal Data Analysis: A Guidebook with Software for Windows. Washington, D.C.: APA Books.