Step 2. Smoothing point process data

The second step of spatiotemporal causal inference is smoothing outcomes. We use the get_smoothed_outcome() function for this step.

1 Smoothing outcomes

The get_smoothed_outcome() function has two key arguments. The first is data_interest, which is a column of the hyperframe that we generated in the first step. The second is method, which specifies the smoothing method (either "mclust" or "abram").

1.1 Method 1: Gaussian mixture model (method = mclust)

The Gaussian mixture model is the simpler and faster of the two smoothing methods. It performs model-based clustering of the points, estimates a variance that is common to all clusters, and uses that value as the variance of an isotropic smoothing kernel. Its key advantage is computational speed: because our function fits the EII model (spherical clusters of equal volume), which has a single variance parameter, it is much faster than adaptive smoothing (method = "abram").
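The idea can be illustrated outside geocausal with the mclust and spatstat packages. The following is only a rough sketch on simulated points, not the internal code of get_smoothed_outcome(): the simulated data, the range G = 2:5, and the object names are assumptions made for illustration.

```r
library(spatstat.geom)     # ppp(), square()
library(spatstat.explore)  # density method for point patterns
library(mclust)            # Mclust()

set.seed(1)

# Simulated points (an assumption for illustration): two Gaussian clusters,
# keeping only the points that fall inside the unit square
x <- c(rnorm(150, 0.3, 0.05), rnorm(150, 0.7, 0.08))
y <- c(rnorm(150, 0.3, 0.05), rnorm(150, 0.7, 0.08))
inside <- x > 0 & x < 1 & y > 0 & y < 1
pts <- ppp(x[inside], y[inside], window = square(1))

# Fit a Gaussian mixture with the EII covariance structure
# (spherical clusters that share one variance)
fit <- Mclust(cbind(pts$x, pts$y), G = 2:5, modelNames = "EII",
              verbose = FALSE)

# Under EII the common variance is a single number
common_var <- fit$parameters$variance$sigmasq

# Use its square root as the standard deviation of an isotropic
# Gaussian kernel for fixed-bandwidth smoothing
smoothed <- density(pts, sigma = sqrt(common_var))
```

Here smoothed is a pixel image, and the same bandwidth is applied everywhere in the window.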

To make this method even faster, users can specify initialization = TRUE. The function then uses only a small portion of the data to obtain the common variance. For example, to use 5% of the data for initialization, users can run the following code.

# Smoothing
smooth_allout <- get_smoothed_outcome(data_interest = dat_hfr$all_outcome,
                                      method = "mclust", initialization = TRUE,
                                      sampling = 0.05)
#> Fitting the Gaussian mixture model
#> Smoothing ppps

# Save the output
dat_hfr$smooth_allout <- smooth_allout

Because smoothing is computationally demanding, we strongly recommend that users save the output rather than run the process multiple times. The output is a list of pixel images, one for each time frame, each representing the smoothed outcomes of that time frame.
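Besides storing the result in the hyperframe, one possible way to avoid recomputation across R sessions is to write the list of pixel images to disk with base R's saveRDS() and read it back with readRDS(). The sketch below uses a small stand-in list (smooth_demo) and a temporary file path, both of which are assumptions for illustration; in practice the object would be the output of get_smoothed_outcome().

```r
library(spatstat.geom)  # im(), is.im()

set.seed(1)

# Stand-in for the smoothing output: a list of three small pixel images
# (hypothetical data, used only to keep the example self-contained)
smooth_demo <- lapply(1:3, function(i) im(matrix(runif(16), 4, 4)))

# Write the object to disk once ...
out_file <- tempfile(fileext = ".rds")
saveRDS(smooth_demo, out_file)

# ... and reload it in a later session instead of smoothing again
restored <- readRDS(out_file)
```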

1.2 Method 2: Abramson’s adaptive smoothing (method = abram)

The Gaussian mixture model is a fixed-bandwidth smoothing approach. It can therefore be problematic when the true intensity varies substantially across regions, because a single bandwidth over-smooths the point process data in some regions and under-smooths it in others. To account for this variation in intensity, we need adaptive smoothing techniques.

For adaptive smoothing, get_smoothed_outcome() follows Abramson (1982) to estimate the target densities. The geocausal package assumes that the bandwidth is inversely proportional to the square root of the target density, so dense regions are smoothed less and sparse regions are smoothed more. Users can employ the adaptive-bandwidth approach by setting method = "abram".
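To see what the square-root rule does in practice, the sketch below computes Abramson bandwidths directly with spatstat's bw.abram() and compares the resulting adaptive estimate with a fixed-bandwidth one. It is an illustration on simulated points, not the internal code of get_smoothed_outcome(); the simulated data and the global bandwidth h0 = 0.1 are assumptions.

```r
library(spatstat.geom)     # ppp(), square(), npoints()
library(spatstat.explore)  # bw.abram(), densityAdaptiveKernel()

set.seed(1)

# Simulated points (an assumption for illustration): one dense cluster
# on top of a sparse uniform background in the unit square
x <- c(rnorm(300, 0.3, 0.04), runif(100))
y <- c(rnorm(300, 0.3, 0.04), runif(100))
inside <- x > 0 & x < 1 & y > 0 & y < 1
pts <- ppp(x[inside], y[inside], window = square(1))

# Abramson bandwidths: one value per data point, proportional to the
# inverse square root of a pilot density estimate at that point
bw_adaptive <- bw.abram(pts, h0 = 0.1, at = "points")

# Adaptive kernel estimate that uses the point-specific bandwidths
adaptive <- densityAdaptiveKernel(pts, bw = bw_adaptive)

# Fixed-bandwidth estimate with the same global bandwidth, for comparison
fixed <- density(pts, sigma = 0.1)
```

Plotting adaptive next to fixed typically shows a sharper peak around the cluster and a smoother background in the adaptive estimate.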