PEIMAN2-vignette

Payman Nickchi, Mohieddin Jafari

2026-06-16

1 Introduction

Annotation enrichment analysis increases the chance of identifying relevant biological features in a list of genes or proteins. The post-translational modification enrichment, integration, and matching analysis (PEIMAN v1) software was introduced to provide a systematic framework for identifying the more probable and enriched post-translational modification (PTM) terms in a list of proteins obtained from high-throughput technologies (Nickchi et al. 2015). PEIMAN maps a large list of proteins to PTM keywords and tests their statistical significance with a hypergeometric test. It follows the most traditional form of enrichment analysis: it takes a list of proteins selected by the user and searches for enriched PTM terms one by one. This strategy is called Singular Enrichment Analysis (SEA). Although this is a promising approach for identifying biological features, the quality of the list selected by the researcher can affect the final results.
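As a rough illustration of the hypergeometric test mentioned above, the following base R sketch computes an over-representation p-value for a single PTM term. The counts are copied from the first row of the SEA results table shown later in this vignette. The variable names are illustrative, and the tail convention used here, P(X >= k), is an assumption, so the value is not guaranteed to match the package output.

```r
# Counts for one PTM term (N6-(pyridoxal phosphate)lysine in the SEA example)
population_size <- 20431  # 'Population' column
term_in_pop     <- 53     # 'FreqinPopulation' column
sample_size     <- 97     # 'Sample' column
term_in_sample  <- 5      # 'FreqinSample' column

# P(X >= term_in_sample) when drawing sample_size proteins without replacement
# (assumed convention; the package may use a different tail)
p_over <- phyper(term_in_sample - 1,
                 m = term_in_pop,
                 n = population_size - term_in_pop,
                 k = sample_size,
                 lower.tail = FALSE)
p_over
```

A small p-value here means that seeing this many annotated proteins in the list would be unlikely if the list were a random draw from the population.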

To avoid this problem, we extend our enrichment framework to a wider class of enrichment analysis called Gene Set Enrichment Analysis, or GSEA (Subramanian et al. 2005). The underlying idea of GSEA is very similar to SEA. Instead of applying a cutoff (on either p-value or fold change in gene expression) to the input genes obtained from microarray experiments, a ‘no-cutoff’ strategy is used. The immediate benefit of this approach is that it reduces gene selection bias and allows genes with small changes in expression level to participate in the final analysis. The maximum value of the running score profile for the ranked genes in each enrichment category is then calculated and compared with random scores obtained from permutation. See Subramanian et al. (2005) for more details. This framework can be extended to enrichment analysis of proteins. Inspired by the GSEA idea, we introduce an R package for Protein Set Enrichment Analysis (PSEA).
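The running-score idea can be sketched in a few lines of base R. This is a simplified illustration in the spirit of Subramanian et al. (2005), not the PEIMAN2 implementation: the function names, the toy protein IDs, and the choice to permute set membership are all assumptions made for the example. It also takes the largest absolute deviation of the running score as the enrichment score, which is one common convention.

```r
# Running-sum statistic for one protein set over a ranked list.
# scores: named numeric vector (names are IDs); set: IDs in the category;
# p: weighting exponent (p = 0 gives the unweighted Kolmogorov-Smirnov form).
running_score <- function(scores, set, p = 1) {
  scores <- sort(scores, decreasing = TRUE)
  hit <- names(scores) %in% set
  stopifnot(any(hit), !all(hit))
  w <- abs(scores)^p
  # Step up at hits (weighted by score), step down by a constant at misses
  step <- ifelse(hit, w / sum(w[hit]), -1 / sum(!hit))
  cumsum(step)
}

# Enrichment score: running-score value with the largest absolute deviation
enrichment_score <- function(rs) rs[which.max(abs(rs))]

# Toy ranked list of ten hypothetical proteins and one hypothetical set
scores <- setNames(10:1, paste0("p", 1:10))
set <- c("p1", "p2", "p4")

rs <- running_score(scores, set)
es <- enrichment_score(rs)

# Permutation null: enrichment scores of random sets of the same size
set.seed(1)
null_es <- replicate(1000, {
  enrichment_score(running_score(scores, sample(names(scores), length(set))))
})
p_value <- mean(abs(null_es) >= abs(es))
```

Because every protein in the ranked list contributes a step, no cutoff on the scores is needed; the set members only have to cluster near one end of the ranking to produce a large score.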

The database in the PEIMAN package is updated monthly according to changes in UniProt. The package can be used to perform singular enrichment analysis (SEA) and visualize the results. PEIMAN can also be used to match and integrate the results of two SEA analyses (for the same species) by visualizing their common PTM terms. To correct for biases in SEA, we implement protein set enrichment analysis (PSEA) as a new tool for the computational community. Researchers can use this package to run PSEA and visualize the results.

Figure 1: Our suggested workflow for PTM-centric proteomics using PEIMAN software v2.0

2 Example data

We consider two example datasets to demonstrate the features of our package.

  1. exmplData1: We use the first example dataset for singular enrichment analysis. This dataset contains two lists of human proteins randomly selected from UniProt. The first list contains 45 proteins and the second list contains 97 proteins. Both lists belong to Homo sapiens (Human). Note: Only the first six proteins in each list are shown below.
First list:
P31946
P62258
Q04917
P61981
P31947
P27348
Second list:
P17174
Q9NY61
P00505
Q96GS6
Q5VST6
Q6PCB6
  2. exmplData2: We use the second dataset to perform protein set enrichment analysis (PSEA). The dataset is described in Gholizadeh et al. (2021).
beatAML dataset samples
UniProtAC Score
P47819 579.6287
P20428 129.7175
P62982 2139.2700
P0CG51 2139.2700
P62986 2139.2700
Q63429 2139.2700
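To make the expected shape of a scored protein list concrete, the sketch below rebuilds the six rows shown above as a data frame and runs a few sanity checks before ranking. The object names toy and ranked are illustrative; the checks are a suggestion and are not part of PEIMAN2.

```r
# First six rows of exmplData2, copied from the table above
toy <- data.frame(
  UniProtAC = c("P47819", "P20428", "P62982", "P0CG51", "P62986", "Q63429"),
  Score     = c(579.6287, 129.7175, 2139.2700, 2139.2700, 2139.2700, 2139.2700),
  stringsAsFactors = FALSE
)

# Basic sanity checks: correct types, no duplicated IDs, no missing scores
stopifnot(is.character(toy$UniProtAC), is.numeric(toy$Score))
stopifnot(!anyDuplicated(toy$UniProtAC), !anyNA(toy$Score))

# Rank proteins from highest to lowest score
ranked <- toy[order(toy$Score, decreasing = TRUE), ]
ranked$UniProtAC
```

Note that four of the six proteins share the same score, so their relative order in the ranking is arbitrary.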

3 Singular Enrichment analysis (SEA)

In this section, we introduce the functions for singular enrichment analysis (SEA) in the PEIMAN2 package. The functions are divided into two parts: functions for enrichment and functions for plotting. We use exmplData1 in this part.

3.1 Enrichment

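The runEnrichment() function runs singular enrichment analysis on one list of proteins. The call used in this vignette passes the following inputs:

  1. protein: a character vector of UniProt accession codes.
  2. os.name: the exact taxonomy name of the species, as defined by UniProt.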

The taxonomy name of the species must be provided; for example, for a list of human proteins we pass os.name as ‘Homo sapiens (Human)’. The list of taxonomy names is available on the UniProt website. We also include a helper function, getTaxonomyName, to help find the exact taxonomy name; it is described later in this section.

The following lines of code illustrate the steps to run SEA on exmplData1. We pass pl1 (a character vector of UniProt accession codes) to the runEnrichment function and save the results in enrich1.

# Load PEIMAN2 package
library(PEIMAN2)

# Extract dataset and assign a variable name to it
pl1 <- exmplData1$pl1

# Run SEA on the list
enrich1 <- runEnrichment(protein = pl1, os.name = 'Homo sapiens (Human)')

The function returns a dataframe with the following columns:

PTM | FreqinPopulation | FreqinSample | Sample | Population | pvalue | corrected pvalue | AC
N6-(pyridoxal phosphate)lysine | 53 | 5 | 97 | 20431 | 2e-07 | 7e-06 | Q96QU6; Q4AC99; Q8N5Z0; Q8NHS2; P17174
Isoglutamyl cysteine thioester (Cys-Gln) | 7 | 2 | 97 | 20431 | 4e-06 | 7e-05 | P01023; A8K2U0
Glycoprotein | 4726 | 41 | 97 | 20431 | 8e-06 | 1e-04 | P08195; P08908; P28222; P28221; P28566; P30939; P28223; P41595; P28335; P46098; O95264; Q70Z44; A5X5Y0; Q13639; P47898; P34969; P21589; P02763; P19652; P20848; P01009; P04217; P08697; P02750; P01023; A8K2U0; U3KPV4; Q9NPC4; Q9UNA3; P05067; P30542; P29274; P29275; P0DMS8; P22760; Q15758; P01011; P54619; Q9UGJ0; Q9UGI9; Q13131
Thioester bond | 11 | 2 | 97 | 20431 | 2e-05 | 2e-04 | P01023; A8K2U0
S-cysteinyl cysteine | 3 | 1 | 97 | 20431 | 7e-05 | 5e-04 | P01009
Disulfide bond | 3885 | 33 | 97 | 20431 | 1e-04 | 9e-04 | P08195; P08908; P28222; P28221; P28566; P30939; P28223; P41595; P28335; P46098; O95264; Q8WXA8; A5X5Y0; Q13639; P47898; P50406; P34969; P21589; P05408; P02763; P19652; P04217; P08697; P02750; P01023; A8K2U0; P05067; P30542; P29274; P29275; P0DMS8; Q9NS82; P22760

Note: As mentioned above, os.name is the exact taxonomy name of the species you are working with. The name must match the UniProt definition exactly. To make it easier to find this name, you can pass your list of UniProt accession codes to the getTaxonomyName function as follows. The result is the exact taxonomy name that you need to pass to runEnrichment. In the following example, the exact taxonomy name is printed:

getTaxonomyName(x = exmplData1$pl1)
#> [1] "Please use os.name = `Homo sapiens (Human)`"

Similarly, we can run SEA for the second list of proteins:

# Extract dataset and assign a variable name to it
pl2 <- exmplData1$pl2

# Run SEA on the list
enrich2 <- runEnrichment(protein = pl2, os.name = 'Homo sapiens (Human)')

PTM | FreqinPopulation | FreqinSample | Sample | Population | pvalue | corrected pvalue | AC
Glycoprotein | 4726 | 25 | 45 | 20431 | 6e-07 | 1e-05 | O95477; Q9BZC7; Q99758; P78363; Q8WWZ7; Q8N139; Q8IZY2; O94911; Q8IUA7; Q86UK0; Q2M3G0; Q9NP58; O95342; Q09428; O60706; P33897; Q9UBJ2; P28288; Q9UNQ0; Q9H172; Q9H222; Q9H221; Q8N2K0; Q0P651; Q96J66
N6-(pyridoxal phosphate)lysine | 53 | 2 | 45 | 20431 | 2e-04 | 2e-03 | P17174; P00505
S-glutathionyl cysteine | 10 | 1 | 45 | 20431 | 2e-04 | 2e-03 | Q9NRK6
Glutathionylation | 13 | 1 | 45 | 20431 | 4e-04 | 2e-03 | Q9NRK6
N5-methylglutamine | 27 | 1 | 45 | 20431 | 2e-03 | 7e-03 | Q9BZC7
3’-nitrotyrosine | 46 | 1 | 45 | 20431 | 5e-03 | 2e-02 | P00505

3.2 Plotting SEA results

The plotEnrichment function can be used to visualize the singular enrichment analysis of one set of proteins, or to match and integrate the results for two sets of proteins. For more about matching and integration, see Nickchi et al. (2015). We start by plotting the results for the first list.

plotEnrichment(x = enrich1, sig.level = 0.05)
#> Warning: Using `size` aesthetic for lines was deprecated in ggplot2 3.4.0.
#> ℹ Please use `linewidth` instead.
#> ℹ The deprecated feature was likely used in the PEIMAN2 package.
#>   Please report the issue to the authors.
#> This warning is displayed once per session.
#> Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
#> generated.

The result is a lollipop plot that presents the “Relative frequency” of each of the “PTM keywords” along with its corrected p-value on a log scale. Note that only significant PTMs are shown. The default significance level is 5 percent. One can also visualize and match the results of two enrichment analyses. For example, we can see the integrated results of enrich1 and enrich2 with the following line of code:

plotEnrichment(x = enrich1, y = enrich2, sig.level = 0.05)

The plot presents the ‘Relative frequency’ of the PTM terms common to the two enriched lists (x and y). The coloring shows the corrected p-value on a log scale. By default, a significance level of 5 percent is used to filter the results. This can be changed with the sig.level parameter.
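The matching step can be pictured as an intersection of the significant PTM terms from the two result tables. The sketch below illustrates this with base R on toy data frames whose values are copied from the two SEA tables above. The column name corrected_pvalue, the <= comparison, and the use of merge are assumptions made for illustration; this is not how plotEnrichment is necessarily implemented.

```r
# Toy stand-ins for two SEA result tables (values copied from the tables above);
# the column name corrected_pvalue is hypothetical
res1 <- data.frame(
  PTM = c("N6-(pyridoxal phosphate)lysine", "Glycoprotein", "Disulfide bond"),
  corrected_pvalue = c(7e-06, 1e-04, 9e-04),
  stringsAsFactors = FALSE
)
res2 <- data.frame(
  PTM = c("Glycoprotein", "N6-(pyridoxal phosphate)lysine", "3'-nitrotyrosine"),
  corrected_pvalue = c(1e-05, 2e-03, 2e-02),
  stringsAsFactors = FALSE
)

# Keep terms that pass the significance level in each list
sig_level <- 0.05
sig1 <- res1[res1$corrected_pvalue <= sig_level, ]
sig2 <- res2[res2$corrected_pvalue <= sig_level, ]

# Inner join on the PTM term keeps only terms present in both lists
common <- merge(sig1, sig2, by = "PTM", suffixes = c(".x", ".y"))
common$PTM
```

Terms that are significant in only one of the lists, such as Disulfide bond in this toy example, drop out of the joined table.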

4 Protein set enrichment analysis (PSEA)

In this section, we introduce the functions for protein set enrichment analysis (PSEA). The functions are divided into two parts: functions for PSEA and functions for plotting the results. We use exmplData2 in this part.

4.1 PSEA

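To run protein set enrichment analysis (PSEA), use the runPSEA function. The call used in this vignette passes the following inputs:

  1. protein: a data frame of UniProt accession codes and their scores (here exmplData2).
  2. os.name: the exact taxonomy name of the species, as defined by UniProt.
  3. nperm: the number of permutations (1000 in this example).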

psea_res <- runPSEA(protein = exmplData2, os.name = 'Rattus norvegicus (Rat)', nperm = 1000)
#> Warning: `rerun()` was deprecated in purrr 1.0.0.
#> ℹ Please use `map()` instead.
#>   # Previously
#>   rerun(1000, psea(x = protein, y = pro.pathway, p = pexponent, perm = TRUE))
#> 
#>   # Now
#>   map(1:1000, ~ psea(x = protein, y = pro.pathway, p = pexponent, perm = TRUE))
#> ℹ The deprecated feature was likely used in the PEIMAN2 package.
#>   Please report the issue to the authors.
#> This warning is displayed once per session.
#> Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
#> generated.

The result is a list with six elements. The first element is the most important: a data frame of protein set enrichment analysis (PSEA) results. Each row corresponds to a post-translational modification (PTM) term, with the following columns:

knitr::kable(psea_res[[1]], format = 'html')
PTM pval pvaladj FreqinPopulation FreqinSample ES NES nMoreExtreme size Enrichment AC leadingEdge
ADP-ribosylglycine 0e+00 0e+00 4 4 0.7707317 1.5633001 288 4 Over presented P62986; P62982; P0CG51; Q63429 P62982; P0CG51; P62986; Q63429
Acetylation 0e+00 0e+00 1787 125 0.7455919 1.1772832 17 125 Over presented P0C1X8; P11030; P60711; P63259; Q63028; Q62847; Q62848; Q9WUC4; P31399; P29419; P21571; P15999; D3ZAF6; Q9JJW3; O08839; P0DP29; P0DP30; P0DP31; P18418; P26772; P63039; B0K020; P08081; P08082; P45592; Q91ZN1; P11240; Q63768; P10715; P62898; Q9JHL4; Q7M0E3; P62628; Q07266; P84060; P62870; P15429; P07323; P60841; P56571; B0BN94; P55053; P55051; P07483; Q62658; Q32PX7; Q99PF5; Q5XI73; Q63228; P62994; P01946; P02091; P11517; P62959; P82995; P34058; P27321; Q5XI72; P50411; Q6AXU6; Q5BK20; P11980; Q99MZ8; Q792I0; Q66HF9; P15205; Q5M7W5; P30009; P02688; B0BN72; P30904; O35763; P62775; Q05982; Q71UE8; Q9JJ19; P13084; Q01205; P08461; Q920Q0; O88767; P04785; P31044; O55012; P10111; Q6J4I0; Q9R063; Q9EPC6; P02625; Q63475; P51583; Q68A21; P02401; P62982; P62859; Q6RJR6; Q9JK11; Q63945; B0BN85; P07632; Q66HL2; P28042; O35814; P13668; P37377; Q62880; P19332; P68370; Q6P9V9; Q6AYZ1; Q68FR8; Q5XIF6; Q6PEC1; P11232; P62076; P62078; Q9WV97; P48500; P04692; P58775; Q63610; P09495; Q7M767; Q9Z1A5; P63045 P62628; P31044; P37377; P45592; P11030; P02625; P29419; P62775; P21571; O88767; P31399; P02688; P08082; P62898; P63045; P62076; P11232; O35814; Q9WUC4; Q62658; Q63228; P07632; Q5XI73; B0K020; P08081; P62959
Cysteine sulfinic acid (-SO2H) 0e+00 0e+00 1 1 0.9423077 329.9497487 67 1 Over presented O88767 O88767
L-cysteine coenzyme A disulfide 0e+00 0e+00 1 1 -0.5817308 -31.0717581 413 1 Under presented Q05982 P31044
N-acetylaspartate 0e+00 0e+00 1 1 -0.9615385 29.6844912 46 1 Over presented P60711 P31044
N-acetylglutamate 0e+00 0e+00 1 1 -0.9663462 -33.5333998 40 1 Under presented P63259 P31044
N6-acetyllysine 0e+00 0e+00 1010 75 0.7129375 1.1304132 75 75 Over presented P11030; Q62848; Q9WUC4; P31399; P29419; P21571; P15999; D3ZAF6; Q9JJW3; P0DP29; P0DP30; P0DP31; P18418; P26772; P63039; B0K020; P08081; P08082; P45592; P11240; P62898; Q9JHL4; Q7M0E3; P07323; P56571; Q62658; Q99PF5; Q5XI73; P62994; P01946; P62959; P82995; P34058; P27321; Q6AXU6; Q5BK20; P11980; Q99MZ8; P30009; P02688; B0BN72; P30904; O35763; P62775; Q05982; Q71UE8; P13084; Q01205; P08461; O88767; P04785; P10111; Q9R063; Q63475; P51583; Q68A21; P02401; P62982; Q9JK11; Q63945; P07632; Q66HL2; P28042; O35814; P13668; P19332; P68370; Q6P9V9; Q6AYZ1; Q68FR8; Q5XIF6; P11232; P48500; P09495; Q9Z1A5 P45592; P11030; P29419; P62775; P21571; O88767; P31399; P02688; P08082; P62898; P11232; O35814; Q9WUC4; Q62658; P07632; Q5XI73; B0K020; P08081; P62959
Phosphoprotein 0e+00 0e+00 4142 171 0.5995932 0.9304239 745 171 Over presented P0C1X8; P11030; Q63028; Q62847; O08838; Q99068; Q05140; Q62848; Q9WUC4; P29419; P21571; P15999; D3ZAF6; Q05175; O08839; O88778; P0DP29; P0DP30; P0DP31; O35783; O35397; P26772; P63039; P08081; P08082; P10354; P45592; Q91ZN1; P11240; P84087; Q5U2U2; Q63768; Q6AY72; P11951; P10715; P62898; Q9JHL4; Q9QXU8; Q7M0E3; Q62950; P47942; Q07266; P84060; Q9WTP0; P62870; P15429; P07323; P60841; Q9Z1Z3; Q5RJL0; B0BN94; P55053; P07483; Q9JIX3; Q62658; Q32PX7; Q99PF5; Q920R4; Q5XI73; P47819; Q63228; P62994; P01946; P02091; P11517; P62959; Q9Z2X5; P82995; P34058; P27321; Q5XI72; Q68FR3; P50411; Q6AXU6; Q5BK20; P07335; P11980; Q99MZ8; Q66HF9; P34926; P15205; Q5M7W5; Q63560; P30009; P02688; B0BN72; Q5FVH7; Q4KM98; Q6XVN8; Q62625; O35763; Q9EPH2; P15146; P62775; P20428; Q05982; P69682; P97603; P07936; Q9JJ19; P13084; Q63083; Q9JI85; Q01205; P08461; Q4V8B0; Q5XIL2; Q9Z0W5; Q920Q0; O88767; P04785; Q5U318; P31044; O55012; Q99MC0; P10111; Q6J4I0; Q9R063; P02625; Q812D1; Q63475; P51583; P86252; Q68A21; P62986; P02401; P62982; P62859; Q64548; Q6RJR6; Q9JK11; O35314; P10362; Q63945; B0BN85; P60881; Q9Z2P6; P07632; Q66HL2; P28042; O35814; P13668; P21818; P09951; Q63537; O70441; Q58DZ9; P37377; Q63754; P21643; Q62880; P19332; P68370; Q6P9V9; Q6AYZ1; Q68FR8; Q5XIF6; Q66HC1; P62076; Q9WVA1; P48500; P04692; P58775; Q63610; P09495; P02767; P0CG51; Q63429; P63045; P20156; Q5BJU7 P31044; P37377; P45592; P11030; P02625; P29419; Q05175; P62775; P21571; O88767; P15146; Q63754; P02688; P08082; P62898; P63045; P62076; O35814; Q9WUC4; Q62658; P86252; Q63228; P07632; Q9WVA1; Q5XI73; P08081; P62959; P09951; P60881; P84087; P10362
Phosphoserine 0e+00 0e+00 3674 155 0.5378929 0.8396114 962 155 Over presented P0C1X8; Q63028; Q62847; O08838; Q99068; Q05140; Q62848; Q9WUC4; P29419; P21571; P15999; D3ZAF6; Q05175; O08839; O88778; P0DP29; P0DP30; P0DP31; O35783; O35397; P63039; P08081; P08082; P10354; P45592; Q91ZN1; P84087; Q63768; P11951; Q9JHL4; Q9QXU8; Q7M0E3; Q62950; P47942; Q07266; P84060; Q9WTP0; P62870; P15429; P07323; P60841; Q9Z1Z3; Q5RJL0; P55053; P07483; Q62658; Q32PX7; Q99PF5; Q920R4; Q5XI73; P47819; P01946; P02091; P11517; P62959; Q9Z2X5; P82995; P34058; P27321; Q5XI72; Q68FR3; P50411; Q6AXU6; Q5BK20; P07335; P11980; Q99MZ8; Q66HF9; P34926; P15205; Q5M7W5; Q63560; P30009; P02688; B0BN72; Q5FVH7; Q4KM98; Q6XVN8; O35763; Q9EPH2; P15146; P20428; Q05982; P97603; P07936; Q9JJ19; P13084; Q63083; Q9JI85; Q01205; P08461; Q4V8B0; Q5XIL2; Q9Z0W5; Q920Q0; P04785; Q5U318; P31044; O55012; Q99MC0; P10111; Q6J4I0; Q9R063; P02625; Q812D1; Q63475; P51583; P86252; Q68A21; P62986; P02401; P62982; P62859; Q64548; Q6RJR6; Q9JK11; O35314; P10362; Q63945; B0BN85; P60881; Q9Z2P6; P07632; Q66HL2; P28042; O35814; P13668; P21818; P09951; Q63537; O70441; Q58DZ9; P37377; Q63754; P21643; Q62880; P19332; P68370; Q6P9V9; Q6AYZ1; Q68FR8; Q5XIF6; Q66HC1; P62076; Q9WVA1; P48500; P04692; P58775; Q63610; P09495; P02767; P0CG51; Q63429; P20156; Q5BJU7 P31044; P37377; P45592; P02625; P29419; Q05175; P21571; P15146; Q63754; P02688; P08082; P62076; O35814; Q9WUC4; Q62658; P86252; P07632; Q9WVA1; Q5XI73; P08081; P62959; P09951; P60881; P84087; P10362; Q9Z0W5; Q63537; P07483; P15999; Q9JHL4; D3ZAF6; P62982; P0CG51; P62986; Q63429; O08838
Phosphothreonine 0e+00 0e+00 1555 92 0.5499037 0.8723362 922 92 Over presented P0C1X8; Q63028; O08838; Q05140; Q62848; P15999; Q05175; O08839; O88778; P0DP29; P0DP30; P0DP31; O35783; P26772; P08082; P45592; Q91ZN1; P11240; Q9JHL4; Q9QXU8; Q62950; P47942; Q07266; P84060; Q9WTP0; P62870; P15429; P07323; P60841; Q9Z1Z3; Q5RJL0; B0BN94; P07483; Q32PX7; Q99PF5; Q920R4; P47819; P62994; P01946; P02091; P11517; P82995; P34058; P27321; P50411; Q6AXU6; P07335; P11980; Q99MZ8; P34926; P15205; Q5M7W5; P30009; P02688; B0BN72; Q4KM98; O35763; Q9EPH2; P15146; P62775; P20428; P69682; P97603; P07936; Q9JJ19; P13084; Q63083; Q4V8B0; Q9Z0W5; Q920Q0; P31044; Q99MC0; P10111; Q6J4I0; Q812D1; Q63475; P51583; Q68A21; Q6RJR6; Q9JK11; B0BN85; P60881; Q66HL2; O35814; P09951; Q63537; Q62880; P19332; P48500; P58775; Q63610; P09495 P31044; P45592; Q05175; P62775; P15146; P02688; P08082; O35814
Sulfocysteine 0e+00 0e+00 1 1 -0.7596154 -33.2405275 213 1 Under presented P02767 P31044
N-acetylalanine 0e+00 0e+00 438 42 0.7139681 1.1228238 161 42 Over presented P31399; D3ZAF6; O08839; P0DP29; P0DP30; P0DP31; P26772; P45592; Q63768; Q7M0E3; P62628; Q07266; P15429; B0BN94; P55053; P07483; Q32PX7; Q5XI73; P62959; Q5XI72; P50411; Q792I0; P15205; Q5M7W5; P02688; O88767; P31044; Q9EPC6; P51583; Q68A21; Q6RJR6; B0BN85; P07632; P13668; P19332; Q6PEC1; P62078; Q9WV97; Q63610; P09495; Q7M767; Q9Z1A5 P62628; P31044; P45592; O88767; P31399; P02688; P07632; Q5XI73; P62959; Q9WV97; Q6PEC1; P07483
Phosphotyrosine 0e+00 0e+00 664 49 0.7022456 1.1093096 169 49 Over presented P0C1X8; P11030; P0DP29; P0DP30; P0DP31; O35783; P63039; P45592; Q5U2U2; Q63768; Q6AY72; P62898; Q9JHL4; Q62950; P47942; Q9WTP0; P15429; P07323; P55053; P07483; Q9JIX3; P62994; P01946; P82995; P34058; P07335; P11980; P34926; P15205; Q63560; P02688; B0BN72; O35763; P15146; P13084; Q9Z0W5; O88767; P51583; Q63945; Q66HL2; O35814; P09951; P37377; P19332; Q6AYZ1; Q68FR8; Q5XIF6; P04692; P58775 P37377; P45592; P11030; O88767; P15146; P02688; P62898; O35814
N6-succinyllysine 0e+00 0e+00 328 31 0.7518702 1.1755595 84 31 Over presented P11030; P31399; P21571; P15999; P26772; P63039; P62898; P47942; P07323; P56571; Q62658; Q5XI73; P01946; P02091; P11517; P34058; P11980; Q99MZ8; P30904; O35763; P13084; P08461; O88767; P04785; Q9R063; P02401; P07632; P28042; P11232; P62076; P48500 P11030; P21571; O88767; P31399; P62898; P62076; P11232; Q62658; P07632; Q5XI73
Methylation 0e+00 0e+00 507 39 0.3911697 0.6138598 994 39 Over presented P0C1X8; P60711; P63259; Q05140; P15999; O88778; P0DP29; P0DP30; P0DP31; P47942; Q9Z1Z3; Q32PX7; Q99PF5; P47819; P02091; P11517; P34058; Q5XI72; P11980; Q99MZ8; P15205; P02688; Q920Q0; Q63475; Q68A21; P63033; P62986; Q63945; Q66HL2; P13668; P09951; P19332; P68370; Q6P9V9; Q6AYZ1; Q68FR8; Q5XIF6; P48500; Q5BJU7 P02688; P09951; P15999; P62986; Q6P9V9; Q6AYZ1; P68370; P48500; Q5XI72
3’-nitrotyrosine 0e+00 2e-07 31 8 0.5834036 0.9520312 657 8 Over presented Q62950; P07335; P68370; Q6P9V9; Q6AYZ1; Q68FR8; Q5XIF6; P48500 Q6P9V9; Q6AYZ1; P68370; P48500
Nitration 1e-07 3e-07 32 8 0.5834036 0.9799961 664 8 Over presented Q62950; P07335; P68370; Q6P9V9; Q6AYZ1; Q68FR8; Q5XIF6; P48500 Q6P9V9; Q6AYZ1; P68370; P48500
N6-methyllysine 3e-06 1e-05 61 9 -0.3200000 -0.5203946 29 9 Under presented P60711; P63259; P0DP29; P0DP30; P0DP31; Q99MZ8; P13668; P19332; P48500 P31044; Q9WUC4; P0DN35; P62859; Q5PPG6; P10715; Q71UE8; O88778; Q6AXU6
Isopeptide bond 3e-06 1e-05 767 40 0.6527720 1.0342664 436 40 Over presented Q62847; Q05175; P0DP29; P0DP30; P0DP31; P63039; B0K020; P45592; P07323; Q99PF5; Q5XI73; P62994; P27321; Q68FR3; P07335; P11980; Q66HF9; Q5M7W5; Q05982; Q71UE8; P13084; O88767; O55012; P10111; Q812D1; P62986; P62982; Q63945; B0BN85; Q66HL2; O35814; P19332; P68370; Q6P9V9; Q66HC1; P48500; P0CG51; Q63429; Q5BJP3; P63025 P45592; Q05175; Q5BJP3; O88767; O35814; Q5XI73; B0K020
N-acetylvaline 2e-05 6e-05 14 4 0.3804878 0.8184281 791 4 Over presented P55051; P02091; P11517; P10111 P10111; P55051; P11517; P02091
N6-lipoyllysine 2e-05 6e-05 3 2 -0.7584541 -2.9983034 61 2 Under presented Q01205; P08461 P31044; Q63754
N6,N6,N6-trimethyllysine 2e-05 7e-05 35 6 0.5493333 0.9638404 694 6 Over presented P0DP29; P0DP30; P0DP31; P11980; P62986; Q6P9V9 P62986; Q6P9V9
Omega-N-methylarginine 3e-05 1e-04 260 18 0.4700971 0.7348251 925 18 Over presented P0C1X8; Q05140; P15999; O88778; Q9Z1Z3; Q32PX7; Q99PF5; P47819; Q5XI72; P15205; P02688; Q63475; Q68A21; Q66HL2; P09951; P19332; Q6P9V9; Q5BJU7 P02688; P09951; P15999; Q6P9V9; Q5XI72
S-nitrosocysteine 3e-05 1e-04 50 7 0.7165769 1.1920012 379 7 Over presented P47942; P82995; P34058; P11980; P15205; O35763; P11232 P11232
N6-malonyllysine 4e-05 1e-04 16 4 0.8782475 1.7988338 111 4 Over presented P11030; P26772; P63039; P34058 P11030
Oxidation 5e-05 1e-04 27 5 0.8799090 1.6172245 88 5 Over presented P60711; P63259; P10354; Q05982; O88767 O88767
N-acetylmethionine 5e-05 1e-04 391 23 0.6967950 1.0899098 329 23 Over presented P0C1X8; P60711; P63259; Q63028; P84060; P62870; P62994; Q6AXU6; Q5BK20; Q99MZ8; P13084; Q920Q0; P10111; Q6J4I0; P02401; P62859; Q9JK11; O35814; P37377; Q62880; P62076; P04692; P58775 P37377; P62076; O35814
S-nitrosylation 8e-05 2e-04 56 7 0.7165769 1.2311081 385 7 Over presented P47942; P82995; P34058; P11980; P15205; O35763; P11232 P11232
Ubl conjugation 1e-04 3e-04 1240 51 0.6788115 1.0688765 280 51 Over presented P60711; Q62847; Q05175; P0DP29; P0DP30; P0DP31; P63039; B0K020; P45592; Q91ZN1; P11240; Q7M0E3; Q07266; P07323; Q9Z1Z3; Q32PX7; Q99PF5; Q5XI73; P62994; P82995; P34058; P27321; Q68FR3; P07335; P11980; Q66HF9; Q5M7W5; Q62625; Q05982; P13084; O88767; O55012; P10111; Q812D1; P62986; P62982; Q63945; B0BN85; Q66HL2; O35814; P21818; P37377; P19332; Q6P9V9; Q66HC1; P48500; P0CG51; Q63429; Q5BJP3; Q9Z1A5; P63025 P37377; P45592; Q05175; Q5BJP3; O88767; O35814; Q5XI73; B0K020
Phosphatidylethanolamine amidated glycine 2e-04 4e-04 5 2 0.6231884 2.4443219 447 2 Over presented Q6XVN8; Q62625 Q62625; Q6XVN8
Phosphatidylserine amidated glycine 2e-04 4e-04 5 2 0.6231884 2.1583432 458 2 Over presented Q6XVN8; Q62625 Q62625; Q6XVN8
Methionine (R)-sulfoxide 3e-04 7e-04 6 2 -0.9661836 -3.5191455 0 2 Under presented P60711; P63259 P31044; P37377
5-glutamyl polyglutamate 5e-04 1e-03 7 2 0.7053140 2.4020302 350 2 Over presented P68370; Q6P9V9 Q6P9V9; P68370
ADP-ribosylation 8e-04 2e-03 44 5 0.7465420 1.3704412 338 5 Over presented P13084; P62986; P62982; P0CG51; Q63429 P62982; P0CG51; P62986; Q63429
Tele-methylhistidine 8e-04 2e-03 8 2 -0.9661836 -3.3597311 0 2 Under presented P60711; P63259 P31044; P37377
Deamidated glutamine 2e-03 4e-03 3 1 0.9230769 188.2426614 89 1 Over presented P02688 P02688
Arginine amide 4e-03 7e-03 4 1 0.5144231 -27.2388325 479 1 Under presented O35314 O35314
Glycine amide 4e-03 7e-03 4 1 -0.7644231 -101.9022364 240 1 Under presented P10354 P31044
N6-(2-hydroxyisobutyryl)lysine 4e-03 7e-03 26 3 0.9429546 2.3303419 46 3 Over presented P11030; P18418; P07323 P11030
Asymmetric dimethylarginine 6e-03 1e-02 108 7 0.5311510 0.8799775 736 7 Over presented Q05140; O88778; P47942; P02091; P11517; P09951; Q5BJU7 P09951
N,N,N-trimethylalanine 6e-03 1e-02 5 1 -0.7115385 -63.2300469 290 1 Under presented Q63945 P31044
N-acetylserine 8e-03 1e-02 212 11 0.8333308 1.3146531 84 11 Over presented P11030; Q62847; Q91ZN1; P07323; P60841; Q99PF5; Q63228; Q9JJ19; O55012; P02625; P63045 P11030; P02625; P63045; Q63228
N-acetylglycine 8e-03 1e-02 17 2 0.8299358 2.9288470 220 2 Over presented P10715; P62898 P62898
Methionine sulfoxide 9e-03 1e-02 6 1 -0.7644231 46.8552979 239 1 Over presented P10354 P31044
N6-methylated lysine 9e-03 1e-02 6 1 -0.8365385 215.4592593 167 1 Over presented P34058 P31044
Sulfation 1e-02 2e-02 34 3 0.6281641 1.5674287 491 3 Over presented O35314; P10362; P02767 P10362
Citrulline 2e-02 3e-02 39 3 0.8494968 2.1035945 171 3 Over presented P47819; P02688; Q812D1 P02688
4-carboxyglutamate 2e-02 3e-02 9 1 -0.7596154 50.5921220 247 1 Over presented P02767 P31044
Citrullination 2e-02 3e-02 42 3 0.8494968 2.0187903 191 3 Over presented P47819; P02688; Q812D1 P02688
Pyrrolidone carboxylic acid 2e-02 3e-02 43 3 0.4387468 1.1106568 699 3 Over presented P10354; P22057; P20156 P22057; P20156
Gamma-carboxyglutamic acid 3e-02 3e-02 10 1 -0.7596154 -21.4922559 222 1 Under presented P02767 P31044
N-acetylcysteine 3e-02 3e-02 10 1 0.9567308 22.2856984 49 1 Over presented P62775 P62775
Amidation 3e-02 4e-02 48 3 -0.4660194 -1.1220477 156 3 Under presented P10354; O35314; P20156 P31044; P29419; P48500

4.2 Plotting

We now introduce the plotting features for protein set enrichment analysis. Two functions are included to visualize the PSEA results returned by the runPSEA function. The first plot is generated by the plotPSEA function and shows the Normalized Enrichment Score (NES) for each PTM term. The user can restrict the number of PTM terms drawn by adjusting the sig.level parameter (default value 0.05). The coloring of the plot indicates whether the PTM term is enriched or not.

plotPSEA(x = psea_res)
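To see what a significance filter on the PSEA table amounts to, the sketch below applies one by hand to four rows copied from the table in the previous section. It uses only the PTM, pvaladj, and NES columns. Whether plotPSEA filters on pvaladj or pval, and whether it uses <= or <, is not stated here, so both choices are assumptions; the cutoff of 0.035 is chosen only so that the toy example drops a row.

```r
# Four rows copied from the PSEA results table (PTM, pvaladj, NES only)
toy_psea <- data.frame(
  PTM     = c("ADP-ribosylglycine", "N6-lipoyllysine", "Citrulline", "Amidation"),
  pvaladj = c(0e+00, 6e-05, 3e-02, 4e-02),
  NES     = c(1.5633001, -2.9983034, 2.1035945, -1.1220477),
  stringsAsFactors = FALSE
)

# Illustrative cutoff, stricter than the 0.05 default mentioned above
sig_level <- 0.035

# Keep the terms that pass the cutoff and order them by NES
kept <- toy_psea[toy_psea$pvaladj <= sig_level, ]
kept <- kept[order(kept$NES), ]

# Label the sign of the score, as a plot legend might
kept$direction <- ifelse(kept$NES > 0, "positive NES", "negative NES")
kept
```

Lowering the cutoff shortens the list of terms that would be drawn, which is useful when the full table is as long as the one above.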

The second plot is generated by the plotRunningScore function, which draws a running enrichment score plot for each PTM.

5 Translate PEIMAN results for mass spectrometry searching tools

In addition to the features and extensions introduced above, the results from PEIMAN can also be used in mass spectrometry searching tools. The enriched PTM terms produced by the runPSEA function in the previous step can be looked up in a subset of the protein modifications database. The psea2mass function takes PSEA results and a significance level (default value 0.05) and returns the protein modifications for the statistically significant PTM terms, for later searches in mass spectrometry tools. Note that the p-values obtained from permutation are used to identify significant PTM terms. As an example, continuing with the PSEA results for exmplData2, we call the psea2mass function as follows:

MS <- psea2mass(x = psea_res, sig.level = 0.05)
MS
#>      MOD_ID                       name
#> 1 MOD:00085         N6-methyl-L-lysine
#> 2 MOD:00322      1'-methyl-L-histidine
#> 3 MOD:00720 L-methionine (R)-sulfoxide
#> 4 MOD:00051   N-acetyl-L-aspartic acid
#> 5 MOD:00052        N-acetyl-L-cysteine
#> 6 MOD:00053   N-acetyl-L-glutamic acid
#>                                                                                                                                                                                                         def
#> 1                                                            "converts an L-lysine residue to N6-methyl-L-lysine." [ChEBI:17604, DeltaMass:165, PubMed:11875433, PubMed:3926756, RESID:AA0076, Unimod:34#K]
#> 2                              "converts an L-histidine residue to tele-methyl-L-histidine." [PubMed:10601317, PubMed:11474090, PubMed:11875433, PubMed:6692818, PubMed:8076, PubMed:8645219, RESID:AA0317]
#> 3                                             "oxygenates an L-methionine residue to L-methionine sulfoxide R-diastereomer." [ChEBI:45764, PubMed:21406390, PubMed:22116028, PubMed:23911929, RESID:AA0581]
#> 4                                                                            "converts an L-aspartic acid residue to N-acetyl-L-aspartic acid." [ChEBI:21547, PubMed:1560020, PubMed:2395459, RESID:AA0042]
#> 5 "converts an L-cysteine residue to N-acetyl-L-cysteine." [ChEBI:28939, PubMed:11857757, PubMed:11999733, PubMed:12175151, PubMed:14730666, PubMed:1500421, PubMed:15350136, PubMed:6725286, RESID:AA0043]
#> 6                                                                                            "converts an L-glutamic acid residue to N-acetyl-L-glutamic acid." [ChEBI:17533, PubMed:6725286, RESID:AA0044]
#>   FreqinSample
#> 1            9
#> 2            2
#> 3            2
#> 4            1
#> 5            1
#> 6            1

Note that the results generated by the runEnrichment function can be passed to the sea2mass function in the same way.

References

Gholizadeh, Elham, Reza Karbalaei, Ali Khaleghian, et al. 2021. “Identification of Celecoxib-Targeted Proteins Using Label-Free Thermal Proteome Profiling on Rat Hippocampus.” Molecular Pharmacology 99 (5): 308–18. https://doi.org/10.1124/molpharm.120.000210.
Nickchi, Payman, Mohieddin Jafari, and Shiva Kalantari. 2015. “PEIMAN 1.0: Post-translational modification Enrichment, Integration and Matching ANalysis.” Database 2015 (April). https://doi.org/10.1093/database/bav037.
Subramanian, Aravind, Pablo Tamayo, Vamsi K. Mootha, et al. 2005. “Gene Set Enrichment Analysis: A Knowledge-Based Approach for Interpreting Genome-Wide Expression Profiles.” Proceedings of the National Academy of Sciences 102 (43): 15545–50. https://doi.org/10.1073/pnas.0506580102.