The Inclusion of Claim Covariates in the Generation of SynthETIC Claims

This vignette illustrates how including covariates can influence the severity of claims generated with the SynthETIC package. The distributional assumptions shown here are consistent with the package's default assumptions (an Auto Liability portfolio). Covariates are applied as a minor adjustment to the modelled claim sizes after Step 2: Claim size, discussed in the SynthETIC-demo vignette.

In particular, with this demo we will construct:

Description          R Object
Covariate Inputs     covariate_obj = various factors, their levels and relativities for covariate frequency and claim severity
Covariate Outputs    covariates_data_obj = dataset of assigned covariates for each claim
S_adj, claim size    claim_size_w_cov[[i]] = claim size for all claims that occurred in period i after adjustment for covariates

Reference

To cite this package in publications, please use:

citation("SynthETIC")

SynthETIC Set Up

We set up the package-wide global parameters demonstrated in the SynthETIC-demo vignette (which can be accessed via vignette("SynthETIC-demo", package = "SynthETIC") or the online documentation) and perform modelling Steps 1 and 2 to generate the claim frequency and claim sizes under the default assumptions. Note that changing the assumptions for Steps 1 and 2 does not affect how covariates are implemented.

library(SynthETIC)
set.seed(20200131)

set_parameters(ref_claim = 200000, time_unit = 1/4)
ref_claim <- return_parameters()[1]
time_unit <- return_parameters()[2]

years <- 10
I <- years / time_unit
E <- c(rep(12000, I)) # effective annual exposure rates
lambda <- c(rep(0.03, I))

# Modelling Steps 1-2
n_vector <- claim_frequency(I = I, E = E, freq = lambda)
occurrence_times <- claim_occurrence(frequency_vector = n_vector)
claim_sizes <- claim_size(frequency_vector = n_vector)

Applying Covariates

To apply simulated covariates to SynthETIC claim sizes, a covariates object is used in conjunction with the claim_size_adj() function to both simulate covariate combinations and adjust the claim sizes. The example covariates object below includes relativities for

  1. the legal representation of the claim
  2. the injury severity score
  3. the age of the claimant.

test_covariates_obj <- SynthETIC::test_covariates_obj
print(test_covariates_obj)
#> $factors
#> $factors$`Legal Representation`
#> [1] "Y" "N"
#> 
#> $factors$`Injury Severity`
#> [1] "1" "2" "3" "4" "5" "6"
#> 
#> $factors$`Age of Claimant`
#> [1] "0-15"    "15-30"   "30-50"   "50-65"   "over 65"
#> 
#> 
#> $relativity_freq
#>                factor_i             factor_j level_ik level_jl relativity
#> 1  Legal Representation Legal Representation        Y        Y      1.000
#> 2  Legal Representation Legal Representation        N        N      1.000
#> 3  Legal Representation      Injury Severity        Y        1      0.950
#> 4  Legal Representation      Injury Severity        Y        2      1.000
#> 5  Legal Representation      Injury Severity        Y        3      1.000
#> 6  Legal Representation      Injury Severity        Y        4      1.000
#> 7  Legal Representation      Injury Severity        Y        5      1.000
#> 8  Legal Representation      Injury Severity        Y        6      1.000
#> 9  Legal Representation      Injury Severity        N        1      0.050
#> 10 Legal Representation      Injury Severity        N        2      0.000
#> 11 Legal Representation      Injury Severity        N        3      0.000
#> 12 Legal Representation      Injury Severity        N        4      0.000
#> 13 Legal Representation      Injury Severity        N        5      0.000
#> 14 Legal Representation      Injury Severity        N        6      0.000
#> 15 Legal Representation      Age of Claimant        Y     0-15      1.000
#> 16 Legal Representation      Age of Claimant        Y    15-30      1.000
#> 17 Legal Representation      Age of Claimant        Y    30-50      1.000
#> 18 Legal Representation      Age of Claimant        Y    50-65      1.000
#> 19 Legal Representation      Age of Claimant        Y  over 65      1.000
#> 20 Legal Representation      Age of Claimant        N     0-15      1.000
#> 21 Legal Representation      Age of Claimant        N    15-30      1.000
#> 22 Legal Representation      Age of Claimant        N    30-50      1.000
#> 23 Legal Representation      Age of Claimant        N    50-65      1.000
#> 24 Legal Representation      Age of Claimant        N  over 65      1.000
#> 25      Injury Severity      Injury Severity        1        1      0.530
#> 26      Injury Severity      Injury Severity        2        2      0.300
#> 27      Injury Severity      Injury Severity        3        3      0.100
#> 28      Injury Severity      Injury Severity        4        4      0.050
#> 29      Injury Severity      Injury Severity        5        5      0.010
#> 30      Injury Severity      Injury Severity        6        6      0.010
#> 31      Injury Severity      Age of Claimant        1     0-15      1.000
#> 32      Injury Severity      Age of Claimant        1    15-30      1.000
#> 33      Injury Severity      Age of Claimant        1    30-50      1.000
#> 34      Injury Severity      Age of Claimant        1    50-65      1.000
#> 35      Injury Severity      Age of Claimant        1  over 65      1.000
#> 36      Injury Severity      Age of Claimant        2     0-15      1.000
#> 37      Injury Severity      Age of Claimant        2    15-30      1.000
#> 38      Injury Severity      Age of Claimant        2    30-50      1.000
#> 39      Injury Severity      Age of Claimant        2    50-65      1.000
#> 40      Injury Severity      Age of Claimant        2  over 65      1.000
#> 41      Injury Severity      Age of Claimant        3     0-15      1.000
#> 42      Injury Severity      Age of Claimant        3    15-30      1.000
#> 43      Injury Severity      Age of Claimant        3    30-50      1.000
#> 44      Injury Severity      Age of Claimant        3    50-65      1.000
#> 45      Injury Severity      Age of Claimant        3  over 65      1.000
#> 46      Injury Severity      Age of Claimant        4     0-15      1.000
#> 47      Injury Severity      Age of Claimant        4    15-30      1.000
#> 48      Injury Severity      Age of Claimant        4    30-50      1.000
#> 49      Injury Severity      Age of Claimant        4    50-65      1.000
#> 50      Injury Severity      Age of Claimant        4  over 65      1.000
#> 51      Injury Severity      Age of Claimant        5     0-15      1.000
#> 52      Injury Severity      Age of Claimant        5    15-30      1.000
#> 53      Injury Severity      Age of Claimant        5    30-50      1.000
#> 54      Injury Severity      Age of Claimant        5    50-65      1.000
#> 55      Injury Severity      Age of Claimant        5  over 65      1.000
#> 56      Injury Severity      Age of Claimant        6     0-15      1.000
#> 57      Injury Severity      Age of Claimant        6    15-30      1.000
#> 58      Injury Severity      Age of Claimant        6    30-50      1.000
#> 59      Injury Severity      Age of Claimant        6    50-65      1.000
#> 60      Injury Severity      Age of Claimant        6  over 65      1.000
#> 61      Age of Claimant      Age of Claimant     0-15     0-15      0.183
#> 62      Age of Claimant      Age of Claimant    15-30    15-30      0.192
#> 63      Age of Claimant      Age of Claimant    30-50    30-50      0.274
#> 64      Age of Claimant      Age of Claimant    50-65    50-65      0.180
#> 65      Age of Claimant      Age of Claimant  over 65  over 65      0.171
#> 
#> $relativity_sev
#>                factor_i             factor_j level_ik level_jl relativity
#> 1  Legal Representation Legal Representation        Y        Y       2.00
#> 2  Legal Representation Legal Representation        N        N       1.00
#> 3  Legal Representation      Injury Severity        Y        1       1.00
#> 4  Legal Representation      Injury Severity        Y        2       1.00
#> 5  Legal Representation      Injury Severity        Y        3       1.00
#> 6  Legal Representation      Injury Severity        Y        4       1.00
#> 7  Legal Representation      Injury Severity        Y        5       1.00
#> 8  Legal Representation      Injury Severity        Y        6       1.00
#> 9  Legal Representation      Injury Severity        N        1       1.00
#> 10 Legal Representation      Injury Severity        N        2       1.00
#> 11 Legal Representation      Injury Severity        N        3       1.00
#> 12 Legal Representation      Injury Severity        N        4       1.00
#> 13 Legal Representation      Injury Severity        N        5       1.00
#> 14 Legal Representation      Injury Severity        N        6       1.00
#> 15 Legal Representation      Age of Claimant        Y     0-15       1.00
#> 16 Legal Representation      Age of Claimant        Y    15-30       1.00
#> 17 Legal Representation      Age of Claimant        Y    30-50       1.00
#> 18 Legal Representation      Age of Claimant        Y    50-65       1.00
#> 19 Legal Representation      Age of Claimant        Y  over 65       1.00
#> 20 Legal Representation      Age of Claimant        N     0-15       1.00
#> 21 Legal Representation      Age of Claimant        N    15-30       1.00
#> 22 Legal Representation      Age of Claimant        N    30-50       1.00
#> 23 Legal Representation      Age of Claimant        N    50-65       1.00
#> 24 Legal Representation      Age of Claimant        N  over 65       1.00
#> 25      Injury Severity      Injury Severity        1        1       0.60
#> 26      Injury Severity      Injury Severity        2        2       1.20
#> 27      Injury Severity      Injury Severity        3        3       2.50
#> 28      Injury Severity      Injury Severity        4        4       5.00
#> 29      Injury Severity      Injury Severity        5        5       8.00
#> 30      Injury Severity      Injury Severity        6        6       0.40
#> 31      Injury Severity      Age of Claimant        1     0-15       1.00
#> 32      Injury Severity      Age of Claimant        1    15-30       1.00
#> 33      Injury Severity      Age of Claimant        1    30-50       1.00
#> 34      Injury Severity      Age of Claimant        1    50-65       1.00
#> 35      Injury Severity      Age of Claimant        1  over 65       1.00
#> 36      Injury Severity      Age of Claimant        2     0-15       1.00
#> 37      Injury Severity      Age of Claimant        2    15-30       1.00
#> 38      Injury Severity      Age of Claimant        2    30-50       1.00
#> 39      Injury Severity      Age of Claimant        2    50-65       1.00
#> 40      Injury Severity      Age of Claimant        2  over 65       1.00
#> 41      Injury Severity      Age of Claimant        3     0-15       1.00
#> 42      Injury Severity      Age of Claimant        3    15-30       1.00
#> 43      Injury Severity      Age of Claimant        3    30-50       1.00
#> 44      Injury Severity      Age of Claimant        3    50-65       1.00
#> 45      Injury Severity      Age of Claimant        3  over 65       1.00
#> 46      Injury Severity      Age of Claimant        4     0-15       1.00
#> 47      Injury Severity      Age of Claimant        4    15-30       1.00
#> 48      Injury Severity      Age of Claimant        4    30-50       1.00
#> 49      Injury Severity      Age of Claimant        4    50-65       0.97
#> 50      Injury Severity      Age of Claimant        4  over 65       0.95
#> 51      Injury Severity      Age of Claimant        5     0-15       1.00
#> 52      Injury Severity      Age of Claimant        5    15-30       1.00
#> 53      Injury Severity      Age of Claimant        5    30-50       1.00
#> 54      Injury Severity      Age of Claimant        5    50-65       0.95
#> 55      Injury Severity      Age of Claimant        5  over 65       0.90
#> 56      Injury Severity      Age of Claimant        6     0-15       1.00
#> 57      Injury Severity      Age of Claimant        6    15-30       1.00
#> 58      Injury Severity      Age of Claimant        6    30-50       1.00
#> 59      Injury Severity      Age of Claimant        6    50-65       1.00
#> 60      Injury Severity      Age of Claimant        6  over 65       1.00
#> 61      Age of Claimant      Age of Claimant     0-15     0-15       1.25
#> 62      Age of Claimant      Age of Claimant    15-30    15-30       1.15
#> 63      Age of Claimant      Age of Claimant    30-50    30-50       1.00
#> 64      Age of Claimant      Age of Claimant    50-65    50-65       0.85
#> 65      Age of Claimant      Age of Claimant  over 65  over 65       0.70
#> 
#> attr(,"class")
#> [1] "covariates"
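
As a reading aid for the relativity tables printed above, the following base R sketch combines the rows that apply to a single claim. It assumes that the combined relativity is the product of the main-effect rows (where factor_i equals factor_j) and the pairwise interaction rows; this is an assumption made for illustration, not a statement about the package internals. The helper combined_relativity() is hypothetical and is not part of SynthETIC, and the small table holds only the four severity rows needed for one example claim.

```r
# Hypothetical helper (not a SynthETIC function): multiply every relativity
# row whose two levels both match the levels observed for one claim.
combined_relativity <- function(relativity, claim_levels) {
    is_match <- mapply(
        function(fi, fj, lik, ljl) {
            claim_levels[[fi]] == lik && claim_levels[[fj]] == ljl
        },
        relativity$factor_i, relativity$factor_j,
        relativity$level_ik, relativity$level_jl
    )
    prod(relativity$relativity[unname(is_match)])
}

# Cut-down copy of $relativity_sev: only the rows relevant to the claim below
sev <- data.frame(
    factor_i = c("Legal Representation", "Injury Severity",
                 "Age of Claimant", "Injury Severity"),
    factor_j = c("Legal Representation", "Injury Severity",
                 "Age of Claimant", "Age of Claimant"),
    level_ik = c("Y", "5", "over 65", "5"),
    level_jl = c("Y", "5", "over 65", "over 65"),
    relativity = c(2.00, 8.00, 0.70, 0.90),
    stringsAsFactors = FALSE
)

# Example claim: legally represented, injury severity 5, claimant over 65
claim_levels <- list(
    "Legal Representation" = "Y",
    "Injury Severity" = "5",
    "Age of Claimant" = "over 65"
)

# Under the product assumption: 2.00 * 8.00 * 0.70 * 0.90
sev_rel <- combined_relativity(sev, claim_levels)
print(sev_rel)
```

Under this reading, the only non-unit interaction in the example is the Injury Severity 5 by Age of Claimant over 65 row, which scales the combined relativity down by 10%.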

The claim_size_adj() function simulates the covariate levels for each claim and then adjusts the claim sizes according to the relativities defined above. The covariate levels for each claim can be accessed through the covariates_data$data element of the function output.

claim_size_covariates <- claim_size_adj(test_covariates_obj, claim_sizes)
covariates_data_obj <- claim_size_covariates$covariates_data
head(data.frame(covariates_data_obj$data))
#>   Legal.Representation Injury.Severity Age.of.Claimant
#> 1                    Y               1           30-50
#> 2                    Y               3         over 65
#> 3                    Y               1           50-65
#> 4                    Y               2            0-15
#> 5                    Y               3           50-65
#> 6                    Y               2           30-50

The adjusted claim sizes are stored in the claim_size_adj element of the function output.

claim_size_w_cov <- claim_size_covariates$claim_size_adj
claim_size_w_cov[[1]]
#>  [1] 3.805351e+05 3.037256e+05 1.275308e+04 6.052859e+01 2.463426e+04
#>  [6] 6.604369e+05 4.650192e+03 2.047635e+03 4.036059e+04 3.794076e+03
#> [11] 4.813102e+04 2.378047e+04 2.412222e+04 4.700084e+03 7.025452e+05
#> [16] 5.408519e+05 1.353937e+03 1.532105e+05 2.651208e+03 6.530273e+05
#> [21] 3.114636e+03 2.352368e+05 1.289837e+04 5.427162e+05 3.954064e+03
#> [26] 3.182096e+04 2.129965e+05 9.924880e+04 2.312969e+04 1.239915e+05
#> [31] 3.151633e+04 3.443674e+04 5.795648e+04 1.046714e+06 7.586708e+04
#> [36] 4.516002e+05 2.164585e+02 7.783931e+04 1.492351e+05 1.667352e+04
#> [41] 1.860845e+04 2.543685e+04 2.063098e+04 4.893023e+03 3.935485e+05
#> [46] 1.698506e+05 2.625343e+04 1.804647e+04 1.140675e+04 7.235828e+04
#> [51] 5.549151e+04 3.061901e+05 1.901484e+06 1.092938e+06 3.668653e+03
#> [56] 6.238804e+05 1.412806e+03 6.383520e+04 1.503513e+03 1.895553e+04
#> [61] 2.252536e+04 1.181424e+05 6.570533e+04 3.257809e+05 2.396608e+04
#> [66] 5.434085e+04 3.191966e+05 4.451927e+03 2.838118e+04 3.466763e+04
#> [71] 1.007316e+05 1.420651e+05 4.898013e+04 3.874513e+04 1.816631e+05
#> [76] 5.609491e+04 4.233783e+05 4.055192e+05 5.348106e+05 9.282343e+04
#> [81] 5.501306e+04 6.280307e+05 7.692956e+04 1.325974e+04 9.718303e+04
#> [86] 1.232198e+03 3.458512e+03 4.966152e+05 6.014225e+04 3.178223e+05

Modelling Steps 3-5

As with Steps 1-2, Steps 3 onwards do not require any specific adjustment to accommodate covariates. Guidance on implementing these modelling steps can be found in the SynthETIC-demo vignette. The example below shows that including covariates primarily affects claim sizes, and therefore any subsequent modelling steps that depend on claim size. Note that the number of claims (n_vector) and the times at which they occur (occurrence_times) are unaffected by covariates.

generate_claims_dataset <- function(claim_size_list) {
    
    # SynthETIC Steps 3-5
    notidel <- claim_notification(n_vector, claim_size_list)
    setldel <- claim_closure(n_vector, claim_size_list)
    no_payments <- claim_payment_no(n_vector, claim_size_list)
    
    claim_dataset <- generate_claim_dataset(
      frequency_vector = n_vector,
      occurrence_list = occurrence_times,
      claim_size_list = claim_size_list,
      notification_list = notidel,
      settlement_list = setldel,
      no_payments_list = no_payments
    )
    
    claim_dataset
}

claim_dataset <- generate_claims_dataset(claim_size_list = claim_sizes)
claim_dataset_w_cov <- generate_claims_dataset(claim_size_list = claim_size_w_cov)

head(claim_dataset)
#>   claim_no occurrence_period occurrence_time   claim_size  notidel   setldel
#> 1        1                 1       0.6238351 783769.11073 1.900709 17.043275
#> 2        2                 1       0.1206679 214480.60483 1.609819  7.881951
#> 3        3                 1       0.2220436  30902.21786 3.278830  8.141655
#> 4        4                 1       0.4538309     49.86708 6.079014  0.511246
#> 5        5                 1       0.5910992  14326.01244 2.379051  2.488673
#> 6        6                 1       0.9524492 680134.40835 1.048755 17.254912
#>   no_payment
#> 1         12
#> 2         12
#> 3          4
#> 4          2
#> 5          3
#> 6          5
head(claim_dataset_w_cov)
#>   claim_no occurrence_period occurrence_time   claim_size    notidel   setldel
#> 1        1                 1       0.6238351 380535.10561 0.17572511 23.635467
#> 2        2                 1       0.1206679 303725.60695 0.07161892 21.407192
#> 3        3                 1       0.2220436  12753.08224 4.47553717  1.347083
#> 4        4                 1       0.4538309     60.52859 0.81164292  1.088407
#> 5        5                 1       0.5910992  24634.26408 2.57036623  5.007377
#> 6        6                 1       0.9524492 660436.89492 0.31729424 12.376230
#>   no_payment
#> 1          5
#> 2          5
#> 3          2
#> 4          1
#> 5          4
#> 6         11
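
The claim_size columns printed above can be checked against the severity relativities by hand. The sketch below copies the first six original and adjusted claim sizes from the output, multiplies out the severity relativities for each claim's simulated covariate levels (taken from the printed $relativity_sev table), and computes the scaling factor left over. This is only a consistency check on the printed numbers; it assumes a multiplicative adjustment and does not reproduce the package's internal calculation.

```r
# claim_size columns copied from head(claim_dataset) and head(claim_dataset_w_cov)
original <- c(783769.11073, 214480.60483, 30902.21786,
              49.86708, 14326.01244, 680134.40835)
adjusted <- c(380535.10561, 303725.60695, 12753.08224,
              60.52859, 24634.26408, 660436.89492)

# Severity relativities for each claim's simulated levels:
# legal representation * injury severity * age of claimant.
# All pairwise interaction relativities for these six claims are 1.00.
sev_relativity <- c(
    2 * 0.6 * 1.00,  # Y, 1, 30-50
    2 * 2.5 * 0.70,  # Y, 3, over 65
    2 * 0.6 * 0.85,  # Y, 1, 50-65
    2 * 1.2 * 1.25,  # Y, 2, 0-15
    2 * 2.5 * 0.85,  # Y, 3, 50-65
    2 * 1.2 * 1.00   # Y, 2, 30-50
)

# Factor remaining after removing the covariate relativity from each ratio
implied_scale <- adjusted / (original * sev_relativity)
print(round(implied_scale, 4))
```

The implied factor is essentially the same for all six claims, which is consistent with each claim size being multiplied by its combined severity relativity and then by one common normalising constant. How that constant is chosen is not shown in this output.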

Appendix 1: Using Different Sets of Covariates

This section shows the impact of using a set of covariates different from the default set supplied with the SynthETIC package.

The included framework allows a user to easily construct any set of covariates required for simulation and/or analysis. This gives the user flexibility in choosing both the number of factors in the set of covariates and the number of levels within each factor.

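The example below compares the original claim sizes with the claim sizes adjusted using the default covariates and with those adjusted using a new set of covariates (vehicle type and business use).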

factors_tmp <- list(
    "Vehicle Type" = c("Passenger", "Light Commerical", "Medium Goods", "Heavy Goods"),
    "Business Use" = c("Y", "N")
)

relativity_freq_tmp <- relativity_template(factors_tmp)
relativity_sev_tmp <- relativity_template(factors_tmp)

# Default Values
relativity_freq_tmp$relativity <- c(
    5, 1.5, 0.35, 0.25,
    1, 4,
    1, 0.6,
    0.35, 0.01,
    0.25, 0,
    2.5, 5
)

relativity_sev_tmp$relativity <- c(
    0.25, 0.75, 1, 3,
    1, 1,
    1, 1,
    1, 1,
    1, 1,
    1.3, 1
)

test_covariates_obj_veh <- covariates(factors_tmp)
test_covariates_obj_veh <- set.covariates_relativity(
    covariates = test_covariates_obj_veh, 
    relativity = relativity_freq_tmp, 
    freq_sev = "freq"
)
test_covariates_obj_veh <- set.covariates_relativity(
    covariates = test_covariates_obj_veh, 
    relativity = relativity_sev_tmp, 
    freq_sev = "sev"
)

claim_size_covariates_veh <- claim_size_adj(test_covariates_obj_veh, claim_sizes)

# Compare the same claims before and after each covariate adjustment
data.frame(
    Claim_Size = head(round(claim_sizes[[1]])),
    Claim_Size_Original_Covariates = head(round(claim_size_covariates$claim_size_adj[[1]])),
    Claim_Size_New_Covariates = head(round(claim_size_covariates_veh$claim_size_adj[[1]]))
)
#>   Claim_Size Claim_Size_Original_Covariates Claim_Size_New_Covariates
#> 1     783769                         380535                    650712
#> 2     214481                         303726                    178069
#> 3      30902                          12753                     25656
#> 4         50                             61                        41
#> 5      14326                          24634                     11894
#> 6     680134                         660437                    564671

# Covariate Levels
head(claim_size_covariates$covariates_data$data)
#>   Legal Representation Injury Severity Age of Claimant
#> 1                    Y               1           30-50
#> 2                    Y               3         over 65
#> 3                    Y               1           50-65
#> 4                    Y               2            0-15
#> 5                    Y               3           50-65
#> 6                    Y               2           30-50
head(claim_size_covariates_veh$covariates_data$data)
#>   Vehicle Type Business Use
#> 1    Passenger            N
#> 2    Passenger            N
#> 3    Passenger            N
#> 4    Passenger            N
#> 5    Passenger            N
#> 6    Passenger            N
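
The relativity vectors in this appendix are filled positionally, so it helps to see which factor and level pair each of the 14 values lands on. The base R sketch below rebuilds the row layout without calling the package. It assumes that relativity_template() orders its rows in the same way as the printed test_covariates_obj tables earlier in this vignette: for each factor, its own levels first, then its interactions with each later factor, with level_ik varying slowest. The function layout_rows() is a hypothetical stand-in and not a SynthETIC function.

```r
factors_tmp <- list(
    "Vehicle Type" = c("Passenger", "Light Commerical", "Medium Goods", "Heavy Goods"),
    "Business Use" = c("Y", "N")
)

# Hypothetical stand-in for the template layout (assumed row order, see above)
layout_rows <- function(factors) {
    rows <- list()
    for (i in seq_along(factors)) {
        for (j in i:length(factors)) {
            if (i == j) {
                # Main-effect rows: one row per level of factor i
                grid <- data.frame(level_ik = factors[[i]], level_jl = factors[[i]],
                                   stringsAsFactors = FALSE)
            } else {
                # Interaction rows: level_jl varies fastest, level_ik slowest
                grid <- expand.grid(level_jl = factors[[j]], level_ik = factors[[i]],
                                    stringsAsFactors = FALSE)[, c("level_ik", "level_jl")]
            }
            rows[[length(rows) + 1]] <- data.frame(
                factor_i = names(factors)[i], factor_j = names(factors)[j],
                grid, stringsAsFactors = FALSE
            )
        }
    }
    out <- do.call(rbind, rows)
    rownames(out) <- NULL
    out
}

layout <- layout_rows(factors_tmp)

# Frequency relativities from this appendix, attached in the assumed order
layout$relativity_freq <- c(5, 1.5, 0.35, 0.25,
                            1, 4, 1, 0.6, 0.35, 0.01, 0.25, 0,
                            2.5, 5)
print(layout)
```

Under this assumed ordering, the zero in the frequency vector falls on the Heavy Goods with Business Use "N" combination, which would make that combination impossible to simulate.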

Appendix 2: Applying Known Covariate Values

To apply specific covariate values for each claim occurrence, we can use the parameter covariates_id when constructing the covariates_data object. This maps each claim to a corresponding known covariate value from a dataset and applies the relevant severity relativities. Note that in this case the frequency relativities are not used, as no simulation of covariate values is performed.

In the example below, we have a known dataset of covariates whose rows can be mapped to each of the claim sizes. We can use the row indices to map each set of covariates to its associated claim. In this case, the first 50 claims are related to the last 50 rows in the covariates dataset in reverse order, and claims 51 to 100 are related to the first 50 rows in the covariates dataset.

claim_sizes_known <- list(c(
    rexp(n = 100, rate = 1.5)
))

known_covariates_dataset <- data.frame(
    "Vehicle Type" = rep(rep(c("Passenger", "Light Commerical"), each = 25), times = 2),
    "Business Use" = c(rep("N", times = 50), rep("Y", times = 50))
)
colnames(known_covariates_dataset) <- c("Vehicle Type", "Business Use")

covariates_data_veh <- covariates_data(
    test_covariates_obj_veh, 
    data = known_covariates_dataset, 
    covariates_id = list(c(100:51, 1:50))
)

claim_sizes_adj_tmp <- claim_size_adj.fit(
    covariates_data = covariates_data_veh,
    claim_size = claim_sizes_known
)

head(claim_sizes_adj_tmp[[1]])
#> [1] 1.23909867 0.41583558 0.21873095 2.08471717 0.23570391 0.04547377
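
The index mapping passed to covariates_id above can be pictured with plain data frame indexing. The base R sketch below rebuilds the same known dataset (with syntactic column names chosen for illustration) and looks up the row assigned to each claim, following the description in this appendix that claim k takes the covariates in row covariates_id[k]. It does not call SynthETIC and is only meant to show which covariate values each claim ends up with.

```r
# Same known covariates as above; column names here are illustrative
known <- data.frame(
    vehicle_type = rep(rep(c("Passenger", "Light Commerical"), each = 25), times = 2),
    business_use = c(rep("N", times = 50), rep("Y", times = 50)),
    stringsAsFactors = FALSE
)

# Claims 1-50 take rows 100 down to 51; claims 51-100 take rows 1 to 50
ids <- c(100:51, 1:50)

# Row k of `mapped` holds the covariates assigned to claim k
mapped <- known[ids, ]
rownames(mapped) <- NULL

print(mapped[c(1, 50, 51, 100), ])
print(table(mapped$business_use[1:50]))
```

With this mapping the first 50 claims all carry Business Use "Y" and the last 50 carry "N", the reverse of the row order in the known dataset.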
