In this example we will show a quick example of a species Richness model using intSDM. We consider 7 randomly selected species of Vascular plants distibuted across the Netherlands. These species were obtained from two different datasets: Dutch vegetation database, which we treat as structured detection/non-detection data, and iNaturalist, which we treat as unstructured presence-only data.
To begin the workflow, we use the function startWorkflow
and specify the names of the species for which we want to obtain
estimates for. To construct a richness model, we specify
Richness = TRUE
, which will construct a multi-species model
rather than create models for each species independently. This richness
model will include an iid random effect unique for each
species, a species-level environmental effect, and a species-level
random effect. In order to speed up the estimation procedure, we will
assume that the hyperparameters are the same for each of the
species-level random effects.
Rich <- startWorkflow(Species = c("Lolium perenne L.","Rubus caesius L.",
"Rosa spinosissima L.",
"Poa trivialis L.",
"Galium verum L.",
"Tanacetum vulgare L.",
"Viola tricolor L.",
"Epilobium L."),
Projection = '+proj=utm +zone=32 +ellps=WGS84 +datum=WGS84 +units=km +no_defs',
Save = FALSE, Richness = TRUE,
saveOptions = list(projectName = 'Richness'))
Next, we need to obtain a boundary object of the Netherlands and simplify it to assist with the estimation.
Ned <- giscoR::gisco_get_countries(country = 'Netherlands', resolution = 60)
Ned <- st_cast(st_as_sf(Ned), 'POLYGON')
Ned <- Ned[which.max(st_area(Ned)),]
Ned <- rmapshaper::ms_simplify(Ned, keep = 0.5)
Rich$addArea(Ned)
We add the data in directly from GBIF using the
.$addGBIF
function. There are no recorded absences for the
Dutch vegetation database for our selected species. Therefore
we generate absences using generateAbsences = TRUE
, which
will pool all the observations for all species within the dataset
together, and generate absences where that species was not found.
Rich$addGBIF(datasetName = 'DVD',
datasetKey = '740df67d-5663-41a2-9d12-33ec33876c47',
datasetType = 'PA', generateAbsences = TRUE)
Rich$addGBIF(datasetName = 'iNat',
datasetKey = '50c9509d-22c7-4a22-a47d-8c48425ef4a7')
To estimate the spatial effect, we need to create a triangulated mesh. We create a fine mesh, however you may speed up the analysis by making the mesh coarser.
We furthermore specify a second spatial effect for the unstructured data to account for the bias sampling. In addition, we specify penalizing complexity priors for all of the random effects (species-level random effect, bias random effect and the precision parameter of the iid intercept term) to reduce overfitting of the model.
Rich$specifySpatial(prior.range = c(0.2,0.1),
prior.sigma = c(2, 0.1))
Rich$biasFields('iNat', prior.range = c(0.1, 0.1),
prior.sigma = c(0.2, 0.1))
Rich$specifyPriors(priorIntercept = list(prior = 'pc.prec', param = c(0.02, 0.01)))
We assume one covariate in the model (average temperature) which we
download using .$addCovariates
. We then use the function
.$modelFormula
to specify the formula of the process model,
which in this case includes both the linear and quadratic effect of the
covariate.
Rich$addCovariates(worldClim = c('tavg'), res = 10, Function = scale)
Rich$modelFormula(covariateFormula = ~ tavg + I(tavg^2))
Estimating richness requires selecting one of the intercepts in the model for which to scale the intensity function. This dataset needs to be one where the sampling bias is assumed to be minimal, and the sampling area for each sampling location is known and available.
Specifying the intercept used to scale the intenstiy function may be
done using predictionIntercept in the Richness
options. We may then specify the sampling size of the sampling locations
using samplingSize, however given that these values are not
constant for our dataset, we include it as an offset using the
Offset argument in the ISDM options.
Rich$modelOptions(ISDM = list(Offset = 'sampleSizeValue'),
Richness = list(predictionIntercept = 'DVD'))
Finally, we select our output as richness maps, and estimate the model.
And provide maps of the predictions along with estimated measures of uncertainty.
ggplot() + gg(RichModel$Richness$Richness, aes(col = q0.025))
ggplot() + gg(RichModel$Richness$Richness, aes(col = q0.5))
ggplot() + gg(RichModel$Richness$Richness, aes(col = q0.975))
The species-level probabilities of occurrence are also found within this return object. We may thus plot how these vary across space for each species. For example, we may plot the probabilities for Galium verumL. and Tanacetum vulgare L.