Europe PMC is a repository of life science literature. Europe PMC ingests all PubMed content and extends its index with other literature and patent sources.
For more background on Europe PMC, see:
Levchenko, M., Gou, Y., Graef, F., Hamelers, A., Huang, Z., Ide-Smith, M., … McEntyre, J. (2017). Europe PMC in 2017. Nucleic Acids Research, 46(D1), D1254–D1260. https://doi.org/10.1093/nar/gkx1005
This client supports the Europe PMC search syntax. If you are unfamiliar with searching Europe PMC, try the Europe PMC query builder, a tool that helps you build queries. To use Europe PMC queries in R, copy and paste the search string into the search functions of this package.
The following examples demonstrate how to search Europe PMC with R.
epmc_search() is the main function to query Europe PMC. It searches both metadata and full texts.
library(europepmc)
europepmc::epmc_search('malaria')
#> # A tibble: 100 × 28
#> id source pmid pmcid doi title authorString journalTitle issue
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 36419237 MED 364192… PMC9… 10.1… Path… Walker IS, … Virulence 1
#> 2 37158217 MED 371582… PMC1… 10.1… Mobi… Kollipara A… Glob Health… 1
#> 3 37310126 MED 373101… PMC1… 10.1… Clin… Bi D, Huang… Ann Med 1
#> 4 37459385 MED 374593… <NA> 10.1… A co… Eisenberg S… Glob Health… 1
#> 5 36871259 MED 368712… PMC9… 10.1… Asse… Jantausch B… Med Educ On… 1
#> 6 37053493 MED 370534… <NA> 10.1… Opti… Kalula A, M… J Biol Dyn 1
#> 7 37191627 MED 371916… PMC1… 10.1… Huma… Ellis R, We… Hum Vaccin … 1
#> 8 37165851 MED 371658… PMC1… 10.1… Tria… Cho Y, Awoo… Glob Health… 1
#> 9 37074313 MED 370743… PMC9… 10.1… Deng… Asaga Mac P… Ann Med 1
#> 10 IND607962262 AGR <NA> <NA> <NA> Effe… Ojueromi OO… Journal of … 11
#> # ℹ 90 more rows
#> # ℹ 19 more variables: journalVolume <chr>, pubYear <chr>, journalIssn <chr>,
#> # pageInfo <chr>, pubType <chr>, isOpenAccess <chr>, inEPMC <chr>,
#> # inPMC <chr>, hasPDF <chr>, hasBook <chr>, hasSuppl <chr>,
#> # citedByCount <int>, hasReferences <chr>, hasTextMinedTerms <chr>,
#> # hasDbCrossReferences <chr>, hasLabsLinks <chr>,
#> # hasTMAccessionNumbers <chr>, firstIndexDate <chr>, …
Note that Europe PMC expands queries with MeSH synonyms by default, a behavior that can be turned off with the synonym parameter.
europepmc::epmc_search('malaria', synonym = FALSE)
#> # A tibble: 100 × 28
#> id source pmid pmcid doi title authorString journalTitle issue
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 37158217 MED 371582… PMC1… 10.1… Mobi… Kollipara A… Glob Health… 1
#> 2 36419237 MED 364192… PMC9… 10.1… Path… Walker IS, … Virulence 1
#> 3 37459385 MED 374593… <NA> 10.1… A co… Eisenberg S… Glob Health… 1
#> 4 37053493 MED 370534… <NA> 10.1… Opti… Kalula A, M… J Biol Dyn 1
#> 5 37310126 MED 373101… PMC1… 10.1… Clin… Bi D, Huang… Ann Med 1
#> 6 36871259 MED 368712… PMC9… 10.1… Asse… Jantausch B… Med Educ On… 1
#> 7 37165851 MED 371658… PMC1… 10.1… Tria… Cho Y, Awoo… Glob Health… 1
#> 8 37074313 MED 370743… PMC9… 10.1… Deng… Asaga Mac P… Ann Med 1
#> 9 IND607962262 AGR <NA> <NA> <NA> Effe… Ojueromi OO… Journal of … 11
#> 10 37580690 MED 375806… <NA> 10.1… Tren… Teka H, Gol… Malar J 1
#> # ℹ 90 more rows
#> # ℹ 19 more variables: journalVolume <chr>, pubYear <chr>, journalIssn <chr>,
#> # pageInfo <chr>, pubType <chr>, isOpenAccess <chr>, inEPMC <chr>,
#> # inPMC <chr>, hasPDF <chr>, hasBook <chr>, hasSuppl <chr>,
#> # citedByCount <int>, hasReferences <chr>, hasTextMinedTerms <chr>,
#> # hasDbCrossReferences <chr>, hasLabsLinks <chr>,
#> # hasTMAccessionNumbers <chr>, firstIndexDate <chr>, …
To get an exact match, use quotes as in the following example:
europepmc::epmc_search('"Human malaria parasites"')
#> # A tibble: 100 × 29
#> id source pmid doi title authorString journalTitle issue journalVolume
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 37452… MED 3745… 10.1… Mole… Lazrek Y, F… Sci Rep 1 13
#> 2 37277… MED 3727… 10.1… Sexu… Harris CT, … Nat Microbi… 7 8
#> 3 36777… MED 3677… 10.3… A no… Das R, Vash… Front Vet S… <NA> 10
#> 4 37365… MED 3736… 10.2… Virt… Yasir M, Pa… Curr Comput… <NA> <NA>
#> 5 37454… MED 3745… 10.1… Simi… Fornace KM,… Lancet Infe… <NA> <NA>
#> 6 PPR66… PPR <NA> 10.1… Gene… Suárez-Cort… <NA> <NA> <NA>
#> 7 37121… MED 3712… 10.1… The … Thompson TA… Trends Para… 7 39
#> 8 36007… MED 3600… 10.1… Bulk… Li X, Kumar… Parasitol I… <NA> 91
#> 9 PPR55… PPR <NA> 10.1… A co… Zhang X, Fl… <NA> <NA> <NA>
#> 10 36495… MED 3649… 10.1… A ra… Dong L, Li … Clin Chim A… <NA> 539
#> # ℹ 90 more rows
#> # ℹ 20 more variables: pubYear <chr>, journalIssn <chr>, pageInfo <chr>,
#> # pubType <chr>, isOpenAccess <chr>, inEPMC <chr>, inPMC <chr>, hasPDF <chr>,
#> # hasBook <chr>, hasSuppl <chr>, citedByCount <int>, hasReferences <chr>,
#> # hasTextMinedTerms <chr>, hasDbCrossReferences <chr>, hasLabsLinks <chr>,
#> # hasTMAccessionNumbers <chr>, firstIndexDate <chr>,
#> # firstPublicationDate <chr>, pmcid <chr>, versionNumber <int>
By default, 100 records are returned, but the number of results can be expanded or limited with the limit parameter.
europepmc::epmc_search('"Human malaria parasites"', limit = 10)
#> # A tibble: 10 × 28
#> id source pmid doi title authorString journalTitle issue journalVolume
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 37452… MED 3745… 10.1… Mole… Lazrek Y, F… Sci Rep 1 13
#> 2 37277… MED 3727… 10.1… Sexu… Harris CT, … Nat Microbi… 7 8
#> 3 36777… MED 3677… 10.3… A no… Das R, Vash… Front Vet S… <NA> 10
#> 4 37365… MED 3736… 10.2… Virt… Yasir M, Pa… Curr Comput… <NA> <NA>
#> 5 37454… MED 3745… 10.1… Simi… Fornace KM,… Lancet Infe… <NA> <NA>
#> 6 PPR66… PPR <NA> 10.1… Gene… Suárez-Cort… <NA> <NA> <NA>
#> 7 37121… MED 3712… 10.1… The … Thompson TA… Trends Para… 7 39
#> 8 36007… MED 3600… 10.1… Bulk… Li X, Kumar… Parasitol I… <NA> 91
#> 9 PPR55… PPR <NA> 10.1… A co… Zhang X, Fl… <NA> <NA> <NA>
#> 10 36495… MED 3649… 10.1… A ra… Dong L, Li … Clin Chim A… <NA> 539
#> # ℹ 19 more variables: pubYear <chr>, journalIssn <chr>, pageInfo <chr>,
#> # pubType <chr>, isOpenAccess <chr>, inEPMC <chr>, inPMC <chr>, hasPDF <chr>,
#> # hasBook <chr>, hasSuppl <chr>, citedByCount <int>, hasReferences <chr>,
#> # hasTextMinedTerms <chr>, hasDbCrossReferences <chr>, hasLabsLinks <chr>,
#> # hasTMAccessionNumbers <chr>, firstIndexDate <chr>,
#> # firstPublicationDate <chr>, pmcid <chr>
Results are sorted by relevance. Other options via the sort parameter are:
sort = 'cited': by the number of citations, descending from the most cited publication
sort = 'date': by date published, starting with the most recent publication
Sometimes you would like to check whether articles are indexed in Europe PMC using DOI names, a widely used identifier for scholarly articles. Use epmc_search_by_doi() for this purpose.
my_dois <- c(
  "10.1159/000479962",
  "10.1002/sctm.17-0081",
  "10.1161/strokeaha.117.018077",
  "10.1007/s12017-017-8447-9"
)
europepmc::epmc_search_by_doi(doi = my_dois)
#> # A tibble: 4 × 28
#> id source pmid doi title authorString journalTitle issue journalVolume
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 289578… MED 2895… 10.1… Clin… Schnieder M… Eur Neurol 5-6 78
#> 2 289413… MED 2894… 10.1… Conc… Doeppner TR… Stem Cells … 11 6
#> 3 290181… MED 2901… 10.1… One-… Psychogios … Stroke 11 48
#> 4 286236… MED 2862… 10.1… Defe… Carboni E, … Neuromolecu… 2-3 19
#> # ℹ 19 more variables: pubYear <chr>, journalIssn <chr>, pageInfo <chr>,
#> # pubType <chr>, isOpenAccess <chr>, inEPMC <chr>, inPMC <chr>, hasPDF <chr>,
#> # hasBook <chr>, hasSuppl <chr>, citedByCount <int>, hasReferences <chr>,
#> # hasTextMinedTerms <chr>, hasDbCrossReferences <chr>, hasLabsLinks <chr>,
#> # hasTMAccessionNumbers <chr>, firstIndexDate <chr>,
#> # firstPublicationDate <chr>, pmcid <chr>
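To see which of the requested DOIs were not found, the input vector can be compared with the doi column of the result. The following is a minimal sketch in base R, assuming that DOIs without a match are simply absent from the returned tibble. The helper name dois_not_indexed() is hypothetical, and the found vector is an illustrative stand-in for the doi column rather than real output.

```r
# Hypothetical helper: return the requested DOIs that do not occur in `found`.
# DOIs are case-insensitive, so both sides are lower-cased before comparing.
dois_not_indexed <- function(requested, found) {
  found <- found[!is.na(found)]
  requested[!tolower(requested) %in% tolower(found)]
}

requested <- c("10.1159/000479962", "10.1002/sctm.17-0081")

# Stand-in for the `doi` column of an epmc_search_by_doi() result
# (illustrative values only, not real output).
found <- c("10.1159/000479962", NA)

missing_dois <- dois_not_indexed(requested, found)
print(missing_dois)
```

With a real result, found would be replaced by the doi column of the tibble returned by epmc_search_by_doi().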
By default, a non-nested data frame printed as a tibble is returned. Other formats are output = "id_list", which returns a list of IDs and sources, and output = "raw", which returns the full metadata as a list. Please be aware that these lists can become very large.
Europe PMC provides text-mined annotations contained in abstracts and open access full-text articles.
These automatically identified concepts and terms can be retrieved at the article level:
europepmc::epmc_annotations_by_id(c("MED:28585529", "PMC:PMC1664601"))
#> # A tibble: 724 × 13
#> source ext_id pmcid prefix exact postfix name uri id type section
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 MED 28585529 PMC5467… "tive… Beta… " allo… Beta… http… http… Clin… Title …
#> 2 MED 28585529 PMC5467… "nomi… genes ".\nRa… gene http… http… Sequ… Title …
#> 3 MED 28585529 PMC5467… "nomi… genes " is o… gene http… http… Sequ… Abstra…
#> 4 MED 28585529 PMC5467… " One… genes " are … gene http… http… Sequ… Abstra…
#> 5 MED 28585529 PMC5467… " ide… beet " (Bet… Beta… http… http… Clin… Abstra…
#> 6 MED 28585529 PMC5467… "ify … Beta… " ssp.… Beta… http… http… Clin… Abstra…
#> 7 MED 28585529 PMC5467… "ulga… gene " Rz2 … gene http… http… Sequ… Abstra…
#> 8 MED 28585529 PMC5467… "e ge… geno… " sequ… geno… http… http… Sequ… Abstra…
#> 9 MED 28585529 PMC5467… "eque… beet ". Our… Beta… http… http… Clin… Abstra…
#> 10 MED 28585529 PMC5467… "disc… genes " rele… gene http… http… Sequ… Abstra…
#> # ℹ 714 more rows
#> # ℹ 2 more variables: provider <chr>, subType <chr>
To obtain a list of articles for which Europe PMC has text-mined annotations, either subset the resulting data.frame
tt <- epmc_search("malaria")
tt[tt$hasTextMinedTerms == "Y" | tt$hasTMAccessionNumbers == "Y",]
#> # A tibble: 76 × 28
#> id source pmid pmcid doi title authorString journalTitle issue
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 36419237 MED 364192… PMC9… 10.1… Path… Walker IS, … Virulence 1
#> 2 37158217 MED 371582… PMC1… 10.1… Mobi… Kollipara A… Glob Health… 1
#> 3 37310126 MED 373101… PMC1… 10.1… Clin… Bi D, Huang… Ann Med 1
#> 4 37459385 MED 374593… <NA> 10.1… A co… Eisenberg S… Glob Health… 1
#> 5 36871259 MED 368712… PMC9… 10.1… Asse… Jantausch B… Med Educ On… 1
#> 6 37053493 MED 370534… <NA> 10.1… Opti… Kalula A, M… J Biol Dyn 1
#> 7 37191627 MED 371916… PMC1… 10.1… Huma… Ellis R, We… Hum Vaccin … 1
#> 8 37165851 MED 371658… PMC1… 10.1… Tria… Cho Y, Awoo… Glob Health… 1
#> 9 37074313 MED 370743… PMC9… 10.1… Deng… Asaga Mac P… Ann Med 1
#> 10 IND607962262 AGR <NA> <NA> <NA> Effe… Ojueromi OO… Journal of … 11
#> # ℹ 66 more rows
#> # ℹ 19 more variables: journalVolume <chr>, pubYear <chr>, journalIssn <chr>,
#> # pageInfo <chr>, pubType <chr>, isOpenAccess <chr>, inEPMC <chr>,
#> # inPMC <chr>, hasPDF <chr>, hasBook <chr>, hasSuppl <chr>,
#> # citedByCount <int>, hasReferences <chr>, hasTextMinedTerms <chr>,
#> # hasDbCrossReferences <chr>, hasLabsLinks <chr>,
#> # hasTMAccessionNumbers <chr>, firstIndexDate <chr>, …
or expand the query choosing an annotation type or provider from the Europe PMC Advanced Search query builder.
epmc_search('malaria AND (ANNOTATION_TYPE:"Cell") AND (ANNOTATION_PROVIDER:"Europe PMC")')
#> # A tibble: 100 × 28
#> id source pmid doi title authorString journalTitle issue journalVolume
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 30925… MED 3092… 10.1… Cong… Fatima S, S… Pediatr Eme… 12 37
#> 2 31808… MED 3180… 10.1… Reti… Villaverde … J Pediatric… 5 9
#> 3 31782… MED 3178… 10.1… Incr… Jongo SA, C… Clin Infect… 11 71
#> 4 30989… MED 3098… 10.1… Clin… Enane LA, S… J Pediatric… 3 9
#> 5 31300… MED 3130… 10.1… Blac… Opoka RO, W… Clin Infect… 11 70
#> 6 31505… MED 3150… 10.1… Acut… Oshomah-Bel… J Trop Pedi… 2 66
#> 7 31687… MED 3168… 10.1… Eval… Ferdinand D… Trans R Soc… 3 114
#> 8 31693… MED 3169… 10.1… Redu… Kingston HW… J Infect Dis 9 221
#> 9 31843… MED 3184… 10.1… Arte… Pull L, Lup… Malar J 1 18
#> 10 31864… MED 3186… 10.1… Unde… Adhikari SR… Malar J 1 18
#> # ℹ 90 more rows
#> # ℹ 19 more variables: pubYear <chr>, journalIssn <chr>, pageInfo <chr>,
#> # pubType <chr>, isOpenAccess <chr>, inEPMC <chr>, inPMC <chr>, hasPDF <chr>,
#> # hasBook <chr>, hasSuppl <chr>, citedByCount <int>, hasReferences <chr>,
#> # hasTextMinedTerms <chr>, hasDbCrossReferences <chr>, hasLabsLinks <chr>,
#> # hasTMAccessionNumbers <chr>, firstIndexDate <chr>,
#> # firstPublicationDate <chr>, pmcid <chr>
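When several annotation types are of interest, the query string used above can be assembled programmatically. This is a small sketch in base R. The helper annotation_query() is a hypothetical name, not part of the package, and it reproduces only the query pattern shown in the example above.

```r
# Hypothetical helper: build a query restricted to one annotation type and
# provider, following the pattern of the example query above.
annotation_query <- function(term, type, provider = "Europe PMC") {
  sprintf(
    '%s AND (ANNOTATION_TYPE:"%s") AND (ANNOTATION_PROVIDER:"%s")',
    term, type, provider
  )
}

q <- annotation_query("malaria", "Cell")
print(q)

# The string can then be passed to the search function
# (requires network access, so it is not run here):
# europepmc::epmc_search(q, limit = 10)
```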
Another useful feature of Europe PMC is searching for cross-references between Europe PMC and other databases. For instance, to get publications from 2016 that are cited by entries in the Protein Data Bank in Europe:
europepmc::epmc_search('(HAS_PDB:y) AND FIRST_PDATE:2016')
#> # A tibble: 100 × 28
#> id source pmid pmcid doi title authorString journalTitle issue
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 28039433 MED 28039433 PMC5255… 10.1… Stru… Su HP, Rick… Proc Natl A… 3
#> 2 28036383 MED 28036383 PMC5201… 10.1… Stru… Kovaľ T, Øs… PLoS One 12
#> 3 27977122 MED 27977122 <NA> 10.1… Comp… De Deurwaer… ACS Chem Ne… 5
#> 4 28144358 MED 28144358 PMC5238… 10.3… Bioc… Ulrich V, B… Beilstein J… <NA>
#> 5 28028551 MED 28028551 <NA> 10.1… Stru… Zhou Z, Liu… Appl Microb… 7
#> 6 27958736 MED 27958736 <NA> 10.1… Glyc… Hamark C, B… J Am Chem S… 1
#> 7 27959534 MED 27959534 PMC6634… 10.1… Stru… Reed AJ, Vy… J Am Chem S… 1
#> 8 28083536 MED 28083536 PMC5183… 10.3… Conf… Paoletti F,… Front Mol B… <NA>
#> 9 28024148 MED 28024148 <NA> 10.1… Solu… Bibow S, Po… Nat Struct … 2
#> 10 28031486 MED 28031486 PMC5255… 10.1… Stru… Sevrioukova… Proc Natl A… 3
#> # ℹ 90 more rows
#> # ℹ 19 more variables: journalVolume <chr>, pubYear <chr>, journalIssn <chr>,
#> # pageInfo <chr>, pubType <chr>, isOpenAccess <chr>, inEPMC <chr>,
#> # inPMC <chr>, hasPDF <chr>, hasBook <chr>, hasSuppl <chr>,
#> # citedByCount <int>, hasReferences <chr>, hasTextMinedTerms <chr>,
#> # hasDbCrossReferences <chr>, hasLabsLinks <chr>,
#> # hasTMAccessionNumbers <chr>, firstIndexDate <chr>, …
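The same cross-reference query can be generated for several years at once. The sketch below only builds the query strings, reusing the FIRST_PDATE pattern from the example above. The year range is an arbitrary choice for illustration, and the commented call assumes that the package's epmc_hits() function is used to count matching records.

```r
# Build one cross-reference query per publication year, following the
# pattern of the example above. The years are illustrative.
years <- 2014:2016
queries <- sprintf("(HAS_PDB:y) AND FIRST_PDATE:%d", years)
names(queries) <- years
print(queries)

# Counting hits per year would need network access, so it is not run here:
# hits <- vapply(queries, europepmc::epmc_hits, numeric(1))
```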
The following sources are supported
To retrieve metadata about these external database links, use europepmc::epmc_db().
Europe PMC also lets us obtain citation metadata and reference sections. To retrieve citation metadata for an article, use
europepmc::epmc_citations("9338777", limit = 500)
#> # A tibble: 240 × 11
#> id source citationType title authorString journalAbbreviation pubYear
#> <chr> <chr> <chr> <chr> <chr> <chr> <int>
#> 1 36883860 MED research supp… "Iso… Rodrigues C… J Virol 2023
#> 2 36790562 MED review; journ… "Por… Liu Y, Niu … Funct Integr Genom… 2023
#> 3 36417007 MED research-arti… "Hum… Lowe JWE. Hist Philos Life S… 2022
#> 4 35729348 MED research supp… "Det… Ishihara S,… Sci Rep 2022
#> 5 35437972 MED research-arti… "Sca… Chen JQ, Zh… Zool Res 2022
#> 6 34834962 MED im; research … "Por… Denner J. Viruses 2021
#> 7 34578447 MED im; research … "Hig… Denner J, S… Viruses 2021
#> 8 33353186 MED im; review-ar… "Xen… Galow AM, G… Int J Mol Sci 2020
#> 9 31565893 MED research-arti… "Reg… Chung HC, N… J Vet Sci 2019
#> 10 30230709 MED research supp… "Bio… Legallais C… Adv Healthc Mater 2018
#> # ℹ 230 more rows
#> # ℹ 4 more variables: volume <chr>, issue <chr>, pageInfo <chr>,
#> # citedByCount <int>
To retrieve the reference section of an article:
europepmc::epmc_refs("28632490", limit = 200)
#> # A tibble: 169 × 19
#> id source citationType title authorString journalAbbreviation issue
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 12002480 MED JOURNAL ARTICLE Tricl… Adolfsson-E… Chemosphere 9-10
#> 2 18795164 MED JOURNAL ARTICLE In vi… Ahn KC, Zha… Environ Health Per… 9
#> 3 18556606 MED JOURNAL ARTICLE Effec… Aiello AE, … Am J Public Health 8
#> 4 17683018 MED JOURNAL ARTICLE Consu… Aiello AE, … Clin Infect Dis <NA>
#> 5 15273108 MED JOURNAL ARTICLE Relat… Aiello AE, … Antimicrob Agents … 8
#> 6 18207219 MED JOURNAL ARTICLE The i… Allmyr M, H… Sci Total Environ 1
#> 7 17007908 MED JOURNAL ARTICLE Tricl… Allmyr M, A… Sci Total Environ 1
#> 8 26948762 MED JOURNAL ARTICLE Press… Alvarez-Riv… J Chromatogr A <NA>
#> 9 23192912 MED JOURNAL ARTICLE Expos… Anderson SE… Toxicol Sci 1
#> 10 25837385 MED JOURNAL ARTICLE Obser… Vladar EK, … Methods Cell Biol <NA>
#> # ℹ 159 more rows
#> # ℹ 12 more variables: pubYear <int>, volume <chr>, pageInfo <chr>,
#> # citedOrder <int>, match <chr>, issn <chr>, essn <chr>,
#> # publicationTitle <chr>, publisherLoc <chr>, publisherName <chr>,
#> # externalLink <chr>, doi <chr>
Europe PMC gives access not only to metadata but also to full texts. Adding AND (OPEN_ACCESS:y) to your search query returns only those articles for which Europe PMC also has the full text.
The full text can be accessed as an XML document via the PMID or the PubMed Central ID (PMCID):
europepmc::epmc_ftxt("PMC3257301")
#> {xml_document}
#> <article article-type="research-article" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:mml="http://www.w3.org/1998/Math/MathML">
#> [1] <front>\n <journal-meta>\n <journal-id journal-id-type="nlm-ta">PLoS ...
#> [2] <body>\n <sec id="s1">\n <title>Introduction</title>\n <p>Atmosphe ...
#> [3] <back>\n <ack>\n <p>We would like to thank Dr. C. Gourlay and Dr. T. ...
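Because the result is an xml_document, it can be queried with XPath using the xml2 package. The sketch below assumes xml2 is installed and, to stay runnable offline, uses a tiny hand-written document as a stand-in for the object returned by epmc_ftxt(). The element names mirror the front/body/sec/title structure visible in the output above, but the content is made up.

```r
library(xml2)

# Stand-in for the object returned by epmc_ftxt(); the content is invented
# for illustration and only mimics the article structure shown above.
doc <- read_xml(
  '<article>
     <body>
       <sec id="s1"><title>Introduction</title><p>Some text.</p></sec>
       <sec id="s2"><title>Methods</title><p>More text.</p></sec>
     </body>
   </article>'
)

# Extract the titles of the top-level sections in the article body.
section_titles <- xml_text(xml_find_all(doc, "//body/sec/title"))
print(section_titles)

# Extract the paragraph text of the first section.
first_paragraphs <- xml_text(xml_find_all(doc, "//body/sec[1]/p"))
print(first_paragraphs)
```

With a real article, doc would be replaced by the value returned from epmc_ftxt().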