phoenics is an R package designed to
perform a differential analysis for metabolomics datasets, while
accounting for metabolomic pathway information. More precisely,
phoenics performs a test at pathway level based on
metabolite quantifications and information on pathway metabolite
composition. The automatic query of metabolic pathways is implemented in
the package. See Guilmineau et al.,
2025.
library("phoenics")The dataset MTBLS422 contains data obtained from the
study Choo et al., 2017. Raw data have been
made available on MetaboLights (with the id MTBLS422 https://www.ebi.ac.uk/metabolights/editor/MTBLS422).
Metabolite quantifications were obtained based on the raw signal using
ASICS
package and are provided in the object quantif included in
the dataset.
data("MTBLS422")
ls()## [1] "design"   "pathways" "quantif"head(quantif)##                              1071        1151 1231        1281        1371
## Glycerol              0.000000000 0.000000000    0 0.000000000 0.000000000
## D-Glucose-6-Phosphate 0.002339491 0.001251202    0 0.001958586 0.002652924
## D-Galactose           0.001628629 0.003448431    0 0.001535844 0.002693243
## D-Sorbitol            0.000000000 0.000000000    0 0.000000000 0.000000000
## Myo-Inositol          0.001190331 0.001929542    0 0.001325843 0.001760645
## D-GlucuronicAcid      0.001495646 0.003459446    0 0.001377114 0.001855756
##                              1531         1681         1751        2081
## Glycerol              0.001403664 0.0000000000 0.0008294523 0.000000000
## D-Glucose-6-Phosphate 0.002877861 0.0035782998 0.0025688724 0.002273193
## D-Galactose           0.001952493 0.0037286043 0.0000000000 0.001240772
## D-Sorbitol            0.001066357 0.0009889747 0.0012056168 0.000000000
## Myo-Inositol          0.001430719 0.0028295101 0.0011248417 0.002085898
## D-GlucuronicAcid      0.001791299 0.0007645118 0.0000000000 0.001816359
##                              2161        2311
## Glycerol              0.000000000 0.000000000
## D-Glucose-6-Phosphate 0.001106493 0.002437845
## D-Galactose           0.002072136 0.001392737
## D-Sorbitol            0.000000000 0.000000000
## Myo-Inositol          0.001382456 0.001893727
## D-GlucuronicAcid      0.002410063 0.001502470They can be transformed into a data format suitable for
phoenics with:
quantif_f <- from_ASICS_to_PHOENICS(quantif)
head(quantif_f)##           C00116      C00092      C00984      C00794      C00137      C00191
## 1071 0.000000000 0.002339491 0.001628629 0.000000000 0.001190331 0.001495646
## 1151 0.000000000 0.001251202 0.003448431 0.000000000 0.001929542 0.003459446
## 1231 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
## 1281 0.000000000 0.001958586 0.001535844 0.000000000 0.001325843 0.001377114
## 1371 0.000000000 0.002652924 0.002693243 0.000000000 0.001760645 0.001855756
## 1531 0.001403664 0.002877861 0.001952493 0.001066357 0.001430719 0.001791299
##            C00864       C01697       C00029       C00003
## 1071 0.0005098139 0.0000000000 0.0005008620 0.0000000000
## 1151 0.0007075218 0.0005573846 0.0000000000 0.0000000000
## 1231 0.0000000000 0.0000000000 0.0000000000 0.0009024399
## 1281 0.0004129427 0.0005796955 0.0004431261 0.0000000000
## 1371 0.0007317685 0.0005080792 0.0000000000 0.0000000000
## 1531 0.0002865527 0.0000000000 0.0006293917 0.0000000000In addition, pathway information was obtained using KEGGREST
package and is included in pathways:
head(pathways)##    metabolite_code   metabolite_name pathway_code         pathway_name
## 24          C00029       UDP-glucose     mmu00052 Galactose metabolism
## 25          C00116          Glycerol     mmu00052 Galactose metabolism
## 26          C00137      myo-Inositol     mmu00052 Galactose metabolism
## 27          C00794        D-Sorbitol     mmu00052 Galactose metabolism
## 28          C00984 alpha-D-Galactose     mmu00052 Galactose metabolism
## 29          C01697        Galactitol     mmu00052 Galactose metabolismThe user can also query this information using the function
pathway_search:
pathways <- pathway_search(metab = colnames(quantif), organism = "mmu")phoenics testThe test implemented in phoenics is a mixed model with
fixed and random effects. The current study design is provided in
design:
head(design)##           Id Age Treatment Mouse
## 1071 T1_G1R1 5.5   Control  G1R1
## 1151 T1_G1R2 5.5   Control  G1R2
## 1231 T2_G1R2 7.5   Control  G1R2
## 1281 T2_G1R1 7.5   Control  G1R1
## 1371 T1_G3R2 5.5 Vanco-Imi  G3R2
## 1531 T3_G1R1 9.0   Control  G1R1The Age and Treatment are thus used as
fixed effects and the Mouse is used as a random effect:
out_test <- test_pathway(quantif_f, design, pathways, 
                         fixed = c("Age", "Treatment"), random = "Mouse", 
                         npc = 2, model = "blmer")
out_test## phoenics method based on PCA
## A list of pathways which contains for each pathway: 
## $pathway_name: Pathway name 
## $pathway_code: Pathway code 
## $metabolites: Metabolites in the pathway 
## $PCA: Factor analysis results 
## $model: Model results 
## $test_pathway: Pathway test resultsResults are organized as a list where each element of the list corresponds to a tested pathway:
names(out_test)## [1] "Galactose metabolism"             "Inositol phosphate metabolism"   
## [3] "Vitamin digestion and absorption"The function extract can be used to extract results on a
given pathway:
res_1 <- extract(out_test, "Galactose metabolism")Fixed effect \(p\)-values and full
model are provided, respectively, in entries test_pathway
and model of this object.
res_1[[1]]$test_pathway##   Fixed_effect      pval
## 1          Age 0.2145888
## 2    Treatment 0.3831807res_1[[1]]$model## $mmu00052_PC1
## Cov prior  : Mouse ~ wishart(df = 3.5, scale = Inf, posterior.scale = cov, common.scale = TRUE)
## Prior dev  : -0.6945
## 
## Linear mixed model fit by REML ['blmerMod']
## Formula: mmu00052_PC1 ~ Age + Treatment + (1 | Mouse)
##    Data: score_path
## REML criterion at convergence: 40.441
## Random effects:
##  Groups   Name        Std.Dev.
##  Mouse    (Intercept) 2.054   
##  Residual             1.629   
## Number of obs: 11, groups:  Mouse, 4
## Fixed Effects:
##        (Intercept)                 Age  TreatmentVanco-Imi  
##             2.0008             -0.3463              1.1450  
## 
## $mmu00052_PC2
## Cov prior  : Mouse ~ wishart(df = 3.5, scale = Inf, posterior.scale = cov, common.scale = TRUE)
## Prior dev  : -1.996
## 
## Linear mixed model fit by REML ['blmerMod']
## Formula: mmu00052_PC2 ~ Age + Treatment + (1 | Mouse)
##    Data: score_path
## REML criterion at convergence: 36.1712
## Random effects:
##  Groups   Name        Std.Dev.
##  Mouse    (Intercept) 2.209   
##  Residual             1.136   
## Number of obs: 11, groups:  Mouse, 4
## Fixed Effects:
##        (Intercept)                 Age  TreatmentVanco-Imi  
##            -3.1797              0.3949              0.5002In addition, adjusted \(p\)-values
(BH method) across pathways can be directly obtained
with the function adjust_pval:
adjust_pval(out_test)##                       pathway_name pathway_code Fixed_effect        pval
## 1             Galactose metabolism     mmu00052          Age 0.214588838
## 3    Inositol phosphate metabolism     mmu00562          Age 0.037233449
## 5 Vitamin digestion and absorption     mmu04977          Age 0.009257926
## 2             Galactose metabolism     mmu00052    Treatment 0.383180685
## 4    Inositol phosphate metabolism     mmu00562    Treatment 0.485524008
## 6 Vitamin digestion and absorption     mmu04977    Treatment 0.419680043
##   adjusted_pval
## 1    0.21458884
## 3    0.07446690
## 5    0.02777378
## 2    1.00000000
## 4    1.00000000
## 6    1.00000000PCA results can be explored using the entry PCA:
res_1[[1]]$PCA## **Results for the Principal Component Analysis (PCA)**
## The analysis was performed on 11 individuals, described by 10 variables
## *The results are available in the following objects:
## 
##    name                description                                          
## 1  "$eig"              "eigenvalues"                                        
## 2  "$var"              "results for the variables"                          
## 3  "$var$coord"        "coord. for the variables"                           
## 4  "$var$cor"          "correlations variables - dimensions"                
## 5  "$var$cos2"         "cos2 for the variables"                             
## 6  "$var$contrib"      "contributions of the variables"                     
## 7  "$ind"              "results for the individuals"                        
## 8  "$ind$coord"        "coord. for the individuals"                         
## 9  "$ind$cos2"         "cos2 for the individuals"                           
## 10 "$ind$contrib"      "contributions of the individuals"                   
## 11 "$quali.sup"        "results for the supplementary categorical variables"
## 12 "$quali.sup$coord"  "coord. for the supplementary categories"            
## 13 "$quali.sup$v.test" "v-test of the supplementary categories"             
## 14 "$call"             "summary statistics"                                 
## 15 "$call$centre"      "mean of the variables"                              
## 16 "$call$ecart.type"  "standard error of the variables"                    
## 17 "$call$row.w"       "weights for the individuals"                        
## 18 "$call$col.w"       "weights for the variables"In particular, phoenics implements some plots that can
be directly obtained using
plot(out_test, pathway_id = "Galactose metabolism", plot = "var")plot(out_test, pathway_id = "Galactose metabolism", plot = "ind", 
     habillage = "Age")plot(out_test, pathway_id = "Galactose metabolism", plot = "eig")plot(out_test, pathway_id = "Galactose metabolism", plot = "var")plot(out_test, pathway_id = "Galactose metabolism", plot = "group")phoenics main test functionA MFA step, more suited to can be used in place of the PCA. In this case, the levels first fixed effect are used to split the effects into different tables which are matched by the levels of the first random effect (the typical case is when the first fixed effect contains information on time points in a repeated measurement study where samples are provided by the first random effect).
out_test <- test_pathway(quantif_f, design, pathways, 
                         fixed = c("Age", "Treatment"), random = "Mouse", 
                         npc = 2, model = "blmer", analysis = "MFA")## refitting model(s) with ML (instead of REML)
## refitting model(s) with ML (instead of REML)
## refitting model(s) with ML (instead of REML)
## refitting model(s) with ML (instead of REML)
## refitting model(s) with ML (instead of REML)
## refitting model(s) with ML (instead of REML)
## refitting model(s) with ML (instead of REML)
## refitting model(s) with ML (instead of REML)
## refitting model(s) with ML (instead of REML)
## refitting model(s) with ML (instead of REML)
## refitting model(s) with ML (instead of REML)
## refitting model(s) with ML (instead of REML)out_test## phoenics method based on MFA
## A list of pathways which contains for each pathway: 
## $pathway_name: Pathway name 
## $pathway_code: Pathway code 
## $metabolites: Metabolites in the pathway 
## $PCA: Factor analysis results 
## $model: Model results 
## $test_pathway: Pathway test resultsThe option pathways = "auto" can also be used to
automatically query KEGGREST instead of performing the two step analysis
relying on test_pathway:
out_test3 <- test_pathway(quantif, design, pathways = "auto", 
                          fixed = c("Age", "Treatment"), random = "Mouse", 
                          npc = 2, model = "blmer", organism = "mmu")
out_test3[1] Guilmineau, C., Tremblay-Franco, M., Vialaneix, N., & Servien, R. (2025). phoenics: a novel statistical approach for longitudinal metabolomic pathway analysis. BMC Bioinformatics, 26: 105. DOI: 10.1186/s12859-025-06118-z.
[2] Choo J. M., Kanno T., Zain N. M. M., Leong L. E. X., Abell G. C. J., Keeble J. E., Bruce K. D., Mason A. J., Rogers G. B. (2017). Divergent relationships between fecal microbiota and metabolome following distinct antibiotic-induced disruptions. mSphere, 2(1). DOI: 10.1128/msphere.00005-17.
[3] Benjamini Y, Hochberg Y (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society, Series B, 57(1): 289–300. DOI: 10.1111/j.2517-6161.1995.tb02031.x
sessionInfo()## R version 4.4.0 (2024-04-24 ucrt)
## Platform: x86_64-w64-mingw32/x64
## Running under: Windows 11 x64 (build 26100)
## 
## Matrix products: default
## 
## 
## locale:
## [1] LC_COLLATE=C                   LC_CTYPE=French_France.utf8   
## [3] LC_MONETARY=French_France.utf8 LC_NUMERIC=C                  
## [5] LC_TIME=French_France.utf8    
## 
## time zone: Europe/Paris
## tzcode source: internal
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] phoenics_0.6
## 
## loaded via a namespace (and not attached):
##  [1] gtable_0.3.6         xfun_0.52            bslib_0.9.0         
##  [4] ggplot2_3.5.2        htmlwidgets_1.6.4    rstatix_0.7.2       
##  [7] ggrepel_0.9.6        lattice_0.22-6       vctrs_0.6.5         
## [10] tools_4.4.0          Rdpack_2.6.4         generics_0.1.4      
## [13] sandwich_3.1-1       tibble_3.3.0         cluster_2.1.6       
## [16] pkgconfig_2.0.3      Matrix_1.7-3         RColorBrewer_1.1-3  
## [19] scatterplot3d_0.3-44 lifecycle_1.0.4      factoextra_1.0.7    
## [22] compiler_4.4.0       farver_2.1.2         leaps_3.2           
## [25] codetools_0.2-20     carData_3.0-5        htmltools_0.5.8.1   
## [28] sass_0.4.10          yaml_2.3.10          Formula_1.2-5       
## [31] car_3.1-3            pillar_1.10.2        ggpubr_0.6.0        
## [34] nloptr_2.2.1         FactoMineR_2.11      jquerylib_0.1.4     
## [37] tidyr_1.3.1          MASS_7.3-65          flashClust_1.01-2   
## [40] DT_0.33              cachem_1.1.0         reformulas_0.4.1    
## [43] abind_1.4-8          boot_1.3-30          multcomp_1.4-28     
## [46] nlme_3.1-168         tidyselect_1.2.1     digest_0.6.37       
## [49] mvtnorm_1.3-3        dplyr_1.1.4          purrr_1.0.4         
## [52] labeling_0.4.3       splines_4.4.0        fastmap_1.2.0       
## [55] grid_4.4.0           cli_3.6.5            magrittr_2.0.3      
## [58] survival_3.8-3       broom_1.0.8          TH.data_1.1-3       
## [61] withr_3.0.2          backports_1.5.0      scales_1.4.0        
## [64] estimability_1.5.1   rmarkdown_2.29       emmeans_1.11.1      
## [67] lme4_1.1-37          blme_1.0-6           ggsignif_0.6.4      
## [70] zoo_1.8-14           coda_0.19-4.1        evaluate_1.0.4      
## [73] knitr_1.50           rbibutils_2.3        rlang_1.1.6         
## [76] Rcpp_1.0.14          xtable_1.8-4         glue_1.8.0          
## [79] rstudioapi_0.17.1    minqa_1.2.8          jsonlite_2.0.0      
## [82] R6_2.6.1             multcompView_0.1-10