This vignette for package groupedHyperframe documents the creation of groupedHyperframe objects, the batch processes defined for a groupedHyperframe, and aggregations over a multi-level grouping structure.

Package groupedHyperframe requires the development versions of the spatstat family of packages.
devtools::install_github('spatstat/spatstat'); packageDate('spatstat')
devtools::install_github('spatstat/spatstat.data'); packageDate('spatstat.data')
devtools::install_github('spatstat/spatstat.explore'); packageDate('spatstat.explore')
devtools::install_github('spatstat/spatstat.geom'); packageDate('spatstat.geom')
devtools::install_github('spatstat/spatstat.linnet'); packageDate('spatstat.linnet')
devtools::install_github('spatstat/spatstat.model'); packageDate('spatstat.model')
devtools::install_github('spatstat/spatstat.random'); packageDate('spatstat.random')
devtools::install_github('spatstat/spatstat.sparse'); packageDate('spatstat.sparse')
devtools::install_github('spatstat/spatstat.univar'); packageDate('spatstat.univar')
devtools::install_github('spatstat/spatstat.utils'); packageDate('spatstat.utils')
Examples in this vignette require that the search path has
Users should remove the argument mc.cores = 1L from all examples and use the default option, which engages all CPU cores on the current host under macOS. The authors are forced to set mc.cores = 1L in this vignette in order to pass CRAN’s submission check.
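As a minimal sketch of what mc.cores controls, the following uses only package parallel, which ships with R. The toy function slow_square() and the variable use_core are hypothetical and not part of groupedHyperframe. Forking is unavailable on Windows, where parallel::mclapply() only accepts mc.cores = 1L, so the sketch falls back to one core there.

```r
library(parallel)

# Hypothetical toy task standing in for one per-image computation
slow_square = function(i) {
  i^2
}

# Number of CPU cores detected on the current host (may be NA on some platforms)
n_core = detectCores()

# Serial evaluation, as forced throughout this vignette
res_serial = mclapply(1:4, FUN = slow_square, mc.cores = 1L)

# Forked evaluation on all detected cores; fall back to one core on Windows
use_core = if (.Platform$OS.type == 'windows' || is.na(n_core)) 1L else n_core
res_parallel = mclapply(1:4, FUN = slow_square, mc.cores = use_core)

# Both strategies return the same list, only the wall-clock time differs
identical(res_serial, res_parallel)
```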
A development version of package groupedHyperframe is hosted on GitHub.
Term / Abbreviation | Description | Reference |
---|---|---|
attr | Attributes | base::attr; base::attributes |
CRAN, R | The Comprehensive R Archive Network | https://cran.r-project.org |
data.frame | Data frame | base::data.frame |
formula | Formula | stats::formula |
fv, fv.object | Function value table | spatstat.explore::fv.object |
groupedData | Grouped data frame | nlme::groupedData |
hypercolumn | Column of hyper data frame | spatstat.geom::hyperframe |
hyperframe | Hyper data frame | spatstat.geom::hyperframe |
inherits | Class inheritance | base::inherits |
kerndens | Kernel density | stats::density.default()$y |
matrix | Matrix | base::matrix |
mc.cores | Number of CPU cores to use | parallel::mclapply, parallel::detectCores |
multitype | Multitype object | spatstat.geom::is.multitype |
ppp, ppp.object | (Marked) point pattern | spatstat.geom::ppp.object |
~ g1/.../gm | Nested grouping structure | nlme::groupedData; nlme::lme |
quantile | Quantile | stats::quantile |
S3 | R’s simplest object oriented system | https://adv-r.hadley.nz/s3.html |
search | Search path | base::search |
Surv | Survival object | survival::Surv |
trapz, cumtrapz | (Cumulative) trapezoidal integration | pracma::trapz; pracma::cumtrapz; https://en.wikipedia.org/wiki/Trapezoidal_rule |
groupedHyperframe Class

The S3 class groupedHyperframe inherits from the hyperframe class, in the same way that the groupedData class inherits from the data.frame class.
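The analogy with groupedData can be inspected directly. The sketch below assumes package nlme is installed (it is a recommended package distributed with R) and uses its Orthodont data set; it does not touch groupedHyperframe itself.

```r
library(nlme)

# Orthodont is a groupedData object shipped with package nlme
class(Orthodont)

# A groupedData object is still a data.frame
inherits(Orthodont, what = 'data.frame')

# The grouping structure is carried along as a formula
getGroupsFormula(Orthodont)
```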
A groupedHyperframe object has, in addition to what a hyperframe object has, the attribute attr(., 'group'), a formula that specifies the grouping structure.

groupedHyperframe with ppp-hypercolumn

Function grouped_ppp() creates a groupedHyperframe with one-and-only-one ppp-hypercolumn. Multiple ppp-hypercolumns will not be supported in the foreseeable future, as we would need to check for name clashes in $marks across the multiple ppp-hypercolumns, which is too much trouble.
In the following example, the argument formula specifies the numeric mark hladr and the multitype mark phenotype on the left-hand side; the columns OS, gender and age before the | separator on the right-hand side; and the grouping structure, image_id nested in patient_id, after the | separator on the right-hand side.
(s = grouped_ppp(formula = hladr + phenotype ~ OS + gender + age | patient_id/image_id,
  data = wrobel_lung, mc.cores = 1L))
#>
#> Grouped Hyperframe: ~patient_id/image_id
#>
#> 25 image_id nested in
#> 5 patient_id
#>
#> OS gender age patient_id image_id ppp.
#> 1 3488+ F 85 #01 0-889-121 [40864,18015].im3 (ppp)
#> 2 3488+ F 85 #01 0-889-121 [42689,19214].im3 (ppp)
#> 3 3488+ F 85 #01 0-889-121 [42806,16718].im3 (ppp)
#> 4 3488+ F 85 #01 0-889-121 [44311,17766].im3 (ppp)
#> 5 3488+ F 85 #01 0-889-121 [45366,16647].im3 (ppp)
#> 6 1605 M 66 #02 1-037-393 [56576,16907].im3 (ppp)
#> 7 1605 M 66 #02 1-037-393 [56583,15235].im3 (ppp)
#> 8 1605 M 66 #02 1-037-393 [57130,16082].im3 (ppp)
#> 9 1605 M 66 #02 1-037-393 [57396,17896].im3 (ppp)
#> 10 1605 M 66 #02 1-037-393 [57403,16934].im3 (ppp)
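The three parts of such a formula can be pulled apart with base R alone. The sketch below only illustrates how R parses the formula; it is not necessarily how grouped_ppp() processes it internally, and the names f, lhs and rhs are introduced here for illustration.

```r
# Creating a formula does not evaluate the symbols in it
f = hladr + phenotype ~ OS + gender + age | patient_id/image_id

lhs = f[[2L]]  # hladr + phenotype
rhs = f[[3L]]  # OS + gender + age | patient_id/image_id

# The top-level call on the right-hand side is the `|` separator
identical(rhs[[1L]], as.name('|'))

all.vars(lhs)        # marks
all.vars(rhs[[2L]])  # columns before the `|` separator
all.vars(rhs[[3L]])  # grouping variables, outermost level first
```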
Function grouped_ppp() has parameter coords, which specifies the column names of the \(x\)- and \(y\)-coordinates in the input data. The default coords = ~ x + y indicates the use of data$x and data$y for the \(x\)- and \(y\)-coordinates, respectively. Users may use coords = FALSE for data without \(x\)- and \(y\)-coordinates. In this case, the coordinates are filled with randomly generated numbers, and the returned groupedHyperframe has a pseudo.ppp-hypercolumn.
(s_a = grouped_ppp(Ki67 ~ Surv(recfreesurv_mon, recurrence) + race + age | patientID/tissueID,
data = Ki67, coords = FALSE, mc.cores = 1L))
#>
#> Grouped Hyperframe: ~patientID/tissueID
#>
#> 207 tissueID nested in
#> 200 patientID
#>
#> recfreesurv_mon recurrence race age patientID tissueID ppp.
#> 1 100 0 White 66 PT00037 TJUe_I17 (pseudo.ppp)
#> 2 22 1 Black 42 PT00039 TJUe_G17 (pseudo.ppp)
#> 3 99 0 White 60 PT00040 TJUe_F17 (pseudo.ppp)
#> 4 99 0 White 53 PT00042 TJUe_D17 (pseudo.ppp)
#> 5 112 1 White 52 PT00054 TJUe_J18 (pseudo.ppp)
#> 6 12 1 Black 51 PT00059 TJUe_N17 (pseudo.ppp)
#> 7 64 0 Asian 50 PT00062 TJUe_J17 (pseudo.ppp)
#> 8 56 0 White 37 PT00068 TJUe_F19 (pseudo.ppp)
#> 9 79 0 White 68 PT00082 TJUe_P19 (pseudo.ppp)
#> 10 26 1 Black 55 PT00084 TJUe_O19 (pseudo.ppp)
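To make the role of the coordinate columns concrete, here is a minimal sketch of turning x, y and one mark column into a single point pattern with spatstat.geom::ppp(). The data frame toy and the hand-picked rectangular window are assumptions for illustration; the sketch does not claim to reproduce what grouped_ppp() does internally for each image.

```r
library(spatstat.geom)

# Hypothetical toy data: one image with four cells
toy = data.frame(
  x = c(0.5, 1.2, 2.8, 3.1),
  y = c(1.0, 2.5, 0.3, 2.2),
  hladr = c(0.12, 0.40, 0.33, 0.08)
)

# Assumption: a rectangular observation window chosen by hand
w = owin(xrange = c(0, 4), yrange = c(0, 3))

# Point pattern with a numeric mark
X = ppp(x = toy$x, y = toy$y, window = w, marks = toy$hladr)

is.ppp(X)     # a ppp object
npoints(X)    # one point per row of toy
is.marked(X)  # the numeric mark is attached
```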
ppp-Hypercolumn

In this section, we outline the batch processes for spatial point pattern analyses applicable to the ppp-hypercolumn of a hyperframe.
Note that these spatial point pattern analyses should not be applied to a pseudo.ppp-hypercolumn, as its \(x\)- and \(y\)-coordinates are randomly generated pseudo numbers.
Batch processes that add a fv-hypercolumn to the input hyperframe include
Function | Workhorse | Applicable To |
---|---|---|
Emark_() | spatstat.explore::Emark | numeric marks (e.g., hladr) in ppp-hypercolumn |
Vmark_() | spatstat.explore::Vmark | numeric marks |
markcorr_() | spatstat.explore::markcorr | numeric marks |
markvario_() | spatstat.explore::markvario | numeric marks |
Gcross_() | spatstat.explore::Gcross | multitype marks (e.g., phenotype) |
Kcross_() | spatstat.explore::Kcross | multitype marks |
Jcross_() | spatstat.explore::Jcross | multitype marks |
Batch processes that add a numeric-hypercolumn to the input hyperframe include
Function | Workhorse | Applicable To |
---|---|---|
nncross_() | spatstat.geom::nncross.ppp(., what = 'dist') | multitype marks (e.g., phenotype) |
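For orientation, the sketch below calls two of the workhorse functions listed above on single point patterns, which is what a batch process would repeat for every row of the ppp-hypercolumn. It assumes packages spatstat.explore, spatstat.geom and spatstat.data are installed, and uses the spruces and amacrine data sets from spatstat.data rather than the data of this vignette.

```r
library(spatstat.geom)
library(spatstat.explore)

# spruces: a point pattern with a numeric mark (tree diameter)
e = Emark(spatstat.data::spruces)
inherits(e, what = 'fv')  # the workhorse returns a function value table

# amacrine: a multitype point pattern with types 'off' and 'on'
am = split(spatstat.data::amacrine)

# Distance from each 'on' point to its nearest 'off' point
d = nncross(am$on, am$off, what = 'dist')
length(d) == npoints(am$on)  # one distance per 'on' point
```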
The following example shows that multiple batch processes may be applied to a hyperframe (or groupedHyperframe) in a pipeline (|>).
r = seq.int(from = 0, to = 250, by = 10)
out = s |>
  Emark_(r = r, correction = 'best', mc.cores = 1L) |> # slow
  # Vmark_(r = r, correction = 'best', mc.cores = 1L) |> # slow
  # markcorr_(r = r, correction = 'best', mc.cores = 1L) |> # slow
  # markvario_(r = r, correction = 'best', mc.cores = 1L) |> # slow
  Gcross_(i = 'CK+.CD8-', j = 'CK-.CD8+', r = r, correction = 'best', mc.cores = 1L) |> # fast
  # Kcross_(i = 'CK+.CD8-', j = 'CK-.CD8+', r = r, correction = 'best', mc.cores = 1L) |> # fast
  nncross_(i = 'CK+.CD8-', j = 'CK-.CD8+', correction = 'best', mc.cores = 1L) # fast
#>
The returned hyperframe (or groupedHyperframe) has fv-hypercolumn hladr.E, created by function Emark_() on numeric mark hladr; fv-hypercolumn phenotype.G, created by function Gcross_() on multitype mark phenotype; and numeric-hypercolumn phenotype.nncross, created by function nncross_() on multitype mark phenotype.
out
#>
#> Grouped Hyperframe: ~patient_id/image_id
#>
#> 25 image_id nested in
#> 5 patient_id
#>
#> OS gender age patient_id image_id ppp. hladr.E phenotype.G
#> 1 3488+ F 85 #01 0-889-121 [40864,18015].im3 (ppp) (fv) (fv)
#> 2 3488+ F 85 #01 0-889-121 [42689,19214].im3 (ppp) (fv) (fv)
#> 3 3488+ F 85 #01 0-889-121 [42806,16718].im3 (ppp) (fv) (fv)
#> 4 3488+ F 85 #01 0-889-121 [44311,17766].im3 (ppp) (fv) (fv)
#> 5 3488+ F 85 #01 0-889-121 [45366,16647].im3 (ppp) (fv) (fv)
#> 6 1605 M 66 #02 1-037-393 [56576,16907].im3 (ppp) (fv) (fv)
#> 7 1605 M 66 #02 1-037-393 [56583,15235].im3 (ppp) (fv) (fv)
#> 8 1605 M 66 #02 1-037-393 [57130,16082].im3 (ppp) (fv) (fv)
#> 9 1605 M 66 #02 1-037-393 [57396,17896].im3 (ppp) (fv) (fv)
#> 10 1605 M 66 #02 1-037-393 [57403,16934].im3 (ppp) (fv) (fv)
#> phenotype.nncross
#> 1 (numeric)
#> 2 (numeric)
#> 3 (numeric)
#> 4 (numeric)
#> 5 (numeric)
#> 6 (numeric)
#> 7 (numeric)
#> 8 (numeric)
#> 9 (numeric)
#> 10 (numeric)
When a nested grouping structure ~g1/g2/.../gm is present, we may aggregate over the fv-hypercolumn(s), the numeric-hypercolumn(s), and the numeric marks in the ppp-hypercolumn by any one of the grouping levels ~g1, ~g2, …, or ~gm. If the lowest grouping level ~gm is specified, then no aggregation is performed.
The aggregation functions aggregate_fv(), aggregate_quantile() and aggregate_kerndens() return a data.frame instead of a hyperframe. This is because the aggregated results are stored in matrix-columns, which the hyperframe class does not support.
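A matrix-column is a plain base R feature of data.frame. The following minimal sketch, with a hypothetical data frame d0, shows how a matrix with one row per patient is stored as a single column.

```r
# A data.frame with one row per (hypothetical) patient
d0 = data.frame(patient_id = c('#01', '#02'))

# Assigning a matrix with a matching number of rows creates a matrix-column
d0$value = matrix(1:6, nrow = 2L, ncol = 3L)

ncol(d0)       # the matrix counts as a single column
dim(d0$value)  # rows = patients, columns = values per patient
```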
fv-hypercolumn(s)

Function aggregate_fv() aggregates the fv-hypercolumn(s) (see also spatstat.explore::plot.fv). In the following example, we have matrix-column hladr.E.value, the aggregated function value from fv-hypercolumn hladr.E; matrix-column phenotype.G.value, the aggregated function value from fv-hypercolumn phenotype.G; matrix-column hladr.E.cumtrapz, the aggregated cumulative trapezoidal area from fv-hypercolumn hladr.E; and matrix-column phenotype.G.cumtrapz, the aggregated cumulative trapezoidal area from fv-hypercolumn phenotype.G.
afv = out |>
  aggregate_fv(by = ~ patient_id, f_aggr_ = 'mean', mc.cores = 1L)
#> Column(s) 'image_id' removed; as they are not identical per aggregation-group
nrow(afv) # number of patients
#> [1] 5
names(afv)
#> [1] "OS" "gender" "age"
#> [4] "patient_id" "hladr.E.value" "hladr.E.cumtrapz"
#> [7] "phenotype.G.value" "phenotype.G.cumtrapz"
dim(afv$hladr.E.cumtrapz) # N(patient) by length(r)
#> [1] 5 25
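The shape of hladr.E.cumtrapz (patients by length(r)) can be illustrated with base R. The sketch below is an assumption about the general idea, not the package's code: the helper cumtrapz0() is a hand-written cumulative trapezoidal rule standing in for pracma::cumtrapz, and val, patient, area and agg are hypothetical toy objects.

```r
# Hand-written cumulative trapezoidal rule (a stand-in, not the package's code)
cumtrapz0 = function(x, y) {
  c(0, cumsum(diff(x) * (y[-1L] + y[-length(y)]) / 2))
}

r = seq.int(from = 0, to = 4, by = 1)

# Hypothetical function values for three images; rows = images, columns = r
val = rbind(r, 2 * r, r^2)
patient = c('#01', '#01', '#02')

# Cumulative area under each image's curve
area = t(apply(val, MARGIN = 1L, FUN = cumtrapz0, x = r))

# Average the cumulative areas of the images within each patient
agg = t(vapply(split(seq_along(patient), patient), FUN = function(i) {
  colMeans(area[i, , drop = FALSE])
}, FUN.VALUE = numeric(length(r))))

dim(agg)  # N(patient) by length(r)
```

Averaging is used here because the example above passes f_aggr_ = 'mean'.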
numeric-hypercolumn(s) and numeric mark(s) in ppp-hypercolumn

Function aggregate_quantile() aggregates the numeric-hypercolumn(s) and the numeric mark(s) in the ppp-hypercolumn. In the following example, we have matrix-column phenotype.nncross.quantile, the aggregated quantiles of numeric-hypercolumn phenotype.nncross; and matrix-column hladr.quantile, the aggregated quantiles of numeric mark hladr in the ppp-hypercolumn.
q = out |>
  aggregate_quantile(by = ~ patient_id, probs = seq.int(from = 0, to = 1, by = .1), mc.cores = 1L)
#> Column(s) 'image_id' removed; as they are not identical per aggregation-group
nrow(q)
#> [1] 5
names(q)
#> [1] "OS" "gender"
#> [3] "age" "patient_id"
#> [5] "phenotype.nncross.quantile" "hladr.quantile"
dim(q$phenotype.nncross.quantile)
#> [1] 5 11
dim(q$hladr.quantile)
#> [1] 5 11
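The 11 columns correspond to the 11 probabilities in probs. The base R sketch below reproduces that shape on hypothetical data. It assumes that values from all images of a patient are pooled before taking quantiles; the vignette does not state how aggregate_quantile() combines images, so treat this as one plausible reading only.

```r
set.seed(1)

# Hypothetical nearest-neighbour distances for three images from two patients
dist_by_image = list(runif(20), runif(30), runif(25))
patient = c('#01', '#01', '#02')

probs = seq.int(from = 0, to = 1, by = .1)

# Assumption: pool the values of all images within a patient, then take quantiles
pooled = lapply(split(dist_by_image, patient), FUN = unlist)
q0 = t(vapply(pooled, FUN = quantile, probs = probs,
              FUN.VALUE = numeric(length(probs))))

dim(q0)  # N(patient) by length(probs)
```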
Function aggregate_kerndens() aggregates the numeric-hypercolumn(s) and the numeric mark(s) in the ppp-hypercolumn. In the following example, we have matrix-column phenotype.nncross.kerndens, the aggregated kernel density of numeric-hypercolumn phenotype.nncross; and matrix-column hladr.kerndens, the aggregated kernel density of numeric mark hladr in the ppp-hypercolumn.
(mdist = out$phenotype.nncross |> unlist() |> max())
#> [1] 354.2968
d = out |>
  aggregate_kerndens(by = ~ patient_id, from = 0, to = mdist, mc.cores = 1L)
#> Column(s) 'image_id' removed; as they are not identical per aggregation-group
nrow(d)
#> [1] 5
names(d)
#> [1] "OS" "gender"
#> [3] "age" "patient_id"
#> [5] "phenotype.nncross.kerndens" "hladr.kerndens"
dim(d$phenotype.nncross.kerndens)
#> [1] 5 512
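The 512 columns match the default number of grid points of stats::density.default(), whose $y component is what the glossary calls kerndens. The sketch below, on hypothetical distances x, shows how from and to fix the evaluation grid; that a shared grid is what makes rows comparable across patients is an interpretation, not a statement from the package documentation.

```r
set.seed(1)

# Hypothetical nearest-neighbour distances pooled for one patient
x = rexp(100, rate = 1/50)
mdist0 = max(x)

# Kernel density evaluated on a fixed grid from 0 to the maximum distance
kd = density(x, from = 0, to = mdist0)

length(kd$y)  # default n = 512 grid points
range(kd$x)   # the grid spans 0 to mdist0
```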