TriLLIEM

Simulating triad data

Simulating data in is all performed through the function. For example, we can simulate \(1,000\) case-triads from a population under HWE with a minor allele frequency of \(30\%\), \(S = 1.2\) and no other genetic factors by using the commands:

set.seed(123) # setting random seed for reproducibility
matdat <- 
  simulateData(
    nCases = 1000,
    maf = 0.3,
    mtCoef = c(1,1,1),
    S = c(1, 1.2, 1.2^2)
  )

More complex simulations can also be run. For example, consider the following conditions:

\(1,000\) case-triads
\(1,000\) control-triads
\(30\%\) minor allele frequency
\((C_1, C_2, C_4) = c(0.5,0.5,0.5)\)
\(S=1.2\)
\(I_{\textrm{M}} = 1.4\)
\(30\%\) environmental exposure frequency in the population
\(V = 1.5\) with maternal imprinting by environment interactions

We can obtain the appropriate simulated data by running:

impdat <-
  simulateData(
    nCases = 2000,
    nControl = 1000,
    maf = 0.3,
    S = c(1, 1.2, 1.2^2),
    V = c(1, 1.5, 1.5^2),
    mtCoef = c(0.5, 0.5, 0.5),
    Im = 1.4,
    propE = 0.3,
    Einteraction = "Im"
  )

Analyzing triad data

Analyzing triad data sets is performed through the function. This function takes a vector of characters for effects to include in the model, . It also allows users to specify the mating type model to use, and whether to use the stratified or non-stratified approach for environmental interactions (non-stratified by default). With the first simulated data set above, an appropriate model would use an HWE mating type model with maternal effects only, hence we would run the command:

matres <-
  TriLLIEM(
    mtmodel = "HWE",
    effects = c("M"),
    dat = matdat
  )

The function returns an object of class , which inherits from the object in R. This means users can apply many of the usual functions for interpreting results from the function. To interpret , we would run:

matres |> summary() |> coef()
#>               Estimate Std. Error    z value     Pr(>|z|)
#> (Intercept)  5.3642380 0.05373145  99.834238 0.000000e+00
#> HWgeno      -0.8640207 0.04895991 -17.647514 1.063359e-69
#> M            0.2140671 0.06793993   3.150829 1.628078e-03

The “Estimate” column for the \(\textrm{M}\) row represents our point estimate for \(s\), which we can exponentiate to obtain the equivalent risk parameter, \(S\). Based on these results, the model estimates \(\hat{S} = 1.24\) with a p-value of \(0.00163\). Since the data were simulated for a population with \(S = 1.2\), this level of significance is expected.

Likewise, for the data, we would use an MaS mating type model, with \(\textrm{S}, I_{\textrm{M}}, \textrm{E}:_{\textrm{M}}\) as genetic effects in the hybrid model.

impres <-
  TriLLIEM(
    mtmodel = "MaS",
    effects = c("M", "Im", "E:Im"),
    dat = impdat,
    includeE = TRUE,
    includeD = TRUE
  )
impres |> summary() |> coef()
#>                        Estimate Std. Error   z value      Pr(>|z|)
#> as.factor(mt_MaS)1    5.1815446 0.05797453 89.376223  0.000000e+00
#> as.factor(mt_MaS)2    4.6935934 0.05439332 86.289887  0.000000e+00
#> as.factor(mt_MaS)3    3.5599675 0.07715230 46.142077  0.000000e+00
#> as.factor(mt_MaS)4    4.0453588 0.08915495 45.374471  0.000000e+00
#> as.factor(mt_MaS)5    2.6738561 0.15164319 17.632550  1.385743e-69
#> as.factor(mt_MaS)6    3.3945750 0.06546387 51.854173  0.000000e+00
#> as.factor(mt_MaS)7    2.8988230 0.09880007 29.340295 3.177895e-189
#> as.factor(mt_MaS)8    1.5246915 0.17644651  8.641098  5.567555e-18
#> as.factor(mt_MaS)9    1.5892301 0.22489512  7.066539  1.588461e-12
#> D                     0.4640526 0.05622893  8.252916  1.545763e-16
#> M                     0.3672891 0.07842121  4.683543  2.819584e-06
#> Im                    0.1618288 0.08715645  1.856762  6.334499e-02
#> as.factor(mt_MaS)1:E -0.8489310 0.10406645 -8.157587  3.417852e-16
#> as.factor(mt_MaS)2:E -0.8654505 0.09740543 -8.885033  6.390154e-19
#> as.factor(mt_MaS)3:E -0.8737671 0.13462755 -6.490255  8.569108e-11
#> as.factor(mt_MaS)4:E -1.2086593 0.18531904 -6.522046  6.935466e-11
#> as.factor(mt_MaS)5:E -1.1784657 0.24725042 -4.766284  1.876550e-06
#> as.factor(mt_MaS)6:E -0.8734544 0.11353488 -7.693269  1.434227e-14
#> as.factor(mt_MaS)7:E -0.8466810 0.17213757 -4.918630  8.715215e-07
#> as.factor(mt_MaS)8:E -0.5633836 0.25337619 -2.223506  2.618167e-02
#> as.factor(mt_MaS)9:E -0.6885467 0.34166090 -2.015293  4.387399e-02
#> E:D                  -0.1251749 0.09541586 -1.311888  1.895579e-01
#> E:Im                  0.4695854 0.14410745  3.258578  1.119720e-03

From the results for genetic effects, the model estimates \(\hat{\textrm{S}} = 1.19, \hat{I}_{\textrm{M}} = 1.33, \hat{V} = 1.50\) with respective p-values \(0.0594, 0.0199, 0.0269\). Note that these results use the non-stratified approach. For results equivalent to the stratified approach that are equivalent to and , we use:

impres_strat <-
  TriLLIEM(
    mtmodel = "MaS",
    effects = c("M", "Im", "E:Im"),
    dat = impdat,
    includeE = TRUE,
    Estrat = TRUE,
    includeD = TRUE
  )
impres_strat |> summary() |> coef()
#>                         Estimate Std. Error    z value      Pr(>|z|)
#> as.factor(mt_MaS)1    5.17813918 0.05884060 88.0028254  0.000000e+00
#> as.factor(mt_MaS)2    4.69018804 0.05531549 84.7897764  0.000000e+00
#> as.factor(mt_MaS)3    3.56477327 0.07826018 45.5502808  0.000000e+00
#> as.factor(mt_MaS)4    4.04195341 0.08972054 45.0504790  0.000000e+00
#> as.factor(mt_MaS)5    2.68961659 0.15770371 17.0548717  3.215783e-65
#> as.factor(mt_MaS)6    3.39938078 0.06676599 50.9148586  0.000000e+00
#> as.factor(mt_MaS)7    2.90362882 0.09966760 29.1331257 1.366810e-186
#> as.factor(mt_MaS)8    1.54045202 0.18168152  8.4788590  2.274131e-17
#> as.factor(mt_MaS)9    1.60499054 0.22902537  7.0079159  2.418941e-12
#> D                     0.46959315 0.05844191  8.0352124  9.341694e-16
#> M                     0.34960137 0.09211854  3.7951248  1.475692e-04
#> Im                    0.17181963 0.09069743  1.8944267  5.816839e-02
#> as.factor(mt_MaS)1:E -0.83749037 0.10893675 -7.6878592  1.496173e-14
#> as.factor(mt_MaS)2:E -0.85400985 0.10259243 -8.3242966  8.483097e-17
#> as.factor(mt_MaS)3:E -0.88909912 0.14177636 -6.2711382  3.584184e-10
#> as.factor(mt_MaS)4:E -1.19721868 0.18809715 -6.3648955  1.954228e-10
#> as.factor(mt_MaS)5:E -1.23643459 0.29819412 -4.1464084  3.377313e-05
#> as.factor(mt_MaS)6:E -0.88878638 0.12192674 -7.2895117  3.110804e-13
#> as.factor(mt_MaS)7:E -0.86201293 0.17778442 -4.8486415  1.243098e-06
#> as.factor(mt_MaS)8:E -0.62135250 0.30329269 -2.0486894  4.049250e-02
#> as.factor(mt_MaS)9:E -0.74651565 0.38015672 -1.9637050  4.956431e-02
#> E:M                   0.06223857 0.17560562  0.3544224  7.230223e-01
#> E:Im                  0.43398750 0.17368206  2.4987468  1.246333e-02
#> E:D                  -0.14451609 0.11018032 -1.3116325  1.896442e-01

This model (which adds a \(\textrm{E}:\text{M}\) effect due to stratification), has \(\hat{S} = 1.18\), \(\hat{I}_\textrm{M} = 1.34\), and \(\hat{V} = 1.46\), with respective p-values \(0.0135, 0.0237, 0.0121\). Note that for comparison with and , we can only compare the \(\hat{V}\) estimates, since the stratified approach does not produce equivalent estimates for the main effects.

Likelihood ratio tests

allows users to compute log-likelihoods and perform likelihood ratio tests with the anova() function, similar to typical glm’s. For example, we can construct two MS models for example_dat4R, one with effects \(\textrm{C}, \textrm{M}\), and another that is nested with effects \(\textrm{C}, \textrm{M}, I_{\textrm{M}}\).

model_1 <- TriLLIEM(dat = example_dat4R, effects = c("C", "M"))
model_2 <- TriLLIEM(dat = example_dat4R, effects = c("C", "M", "Im"))

Then to compare the two models using the likelihood ratio test, we simply run:

anova(model_1, model_2)
#> Analysis of Deviance Table
#> 
#> Factors included in model: C, M
#> 
#> Response: count 
#> 
#> Terms added sequentially (first to last)
#> 
#> 
#>   Resid. Df Resid. Dev Df Deviance Pr(>Chi)  
#> 1         6     6.9853                       
#> 2         5     1.0541  1   5.9313  0.01487 *
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

TriLLIEM

Initial setup

Simulating triad data

Analyzing triad data

Likelihood ratio tests