The function tree.merger was created to merge phylogenetic information derived from different phylogenies into a single supertree. Given a backbone tree (backbone) and a source tree (source.tree), tree.merger drops clades from the latter and attaches them to the former according to the information provided in the dataset object data. Individual tips to add can be indicated in data as well. The function was subsequently extended to build phylogenetic trees from scratch by using the data object only. In both cases, once the supertree is assembled, tip and node ages are calibrated based on user-specified values.
The backbone phylogeny serves as the reference to locate where single tips or entire clades extracted from the source.tree have to be attached. The backbone is assumed to be correctly calibrated, so that node and tip ages (including the age of the tree root) are left unchanged unless the user specifies otherwise. The source.tree is the phylogeny from which the clades to add are extracted. For each clade attached to the backbone, the time distances between the most recent common ancestor of the clade and its descendant nodes are kept fixed, unless the ages for any of these nodes are indicated by the user. All the new tips added to the backbone, irrespective of whether they are attached as a clade or as individual tips, are placed at the maximum distance from the tree root, unless calibration ages are supplied by the user. The data object is a data frame including information about what is attached, where, and how. data must be made of three columns:
bind: the tips or clades to be attached;
reference: the tips or clades where bind will be attached;
poly: a logical indicating whether the bind and reference pair should form a polytomy.
If different column names are supplied, tree.merger assumes the columns are ordered as described and fails if this requirement is not met. Similarly, if duplicated bind entries are supplied, the function stops with an error message. A clade, either to be bound or to be the reference, must be indicated by collating the names of the two phylogenetically furthest tips belonging to it, separated by the “-” symbol. Alternatively, if backbone$node.label/source.tree$node.label is not NULL, a bind/reference clade can be indicated as “Clade NAMEOFTHECLADE” when appropriate. Similarly, an entire genus on both the backbone and the source.tree can be indicated as “Genus NAMEOFTHEGENUS”. If the “Genus NAMEOFTHEGENUS” mode is used for a species/clade belonging to one or more different genera, the function automatically sets as reference the clade including all the species belonging to the reference genus, whether they are already on the backbone or have been bound. Regardless of the way it was attached, any ‘bound’ tip can be used as a reference for another tip (individually or as an element for clade identification, i.e. in the “species1-species2” form). The order in which the clades and tips to attach are supplied does not matter.
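The layout of the data object described above can be sketched in base R as follows. The tip and clade names are hypothetical, and check.merger.data is an illustrative helper that is not part of RRphylo: it only mirrors the requirements stated in the text (three columns read by position, a logical poly column, no duplicated bind) and does not reproduce the checks tree.merger performs internally.

```r
# Hypothetical tip and clade names, used only to illustrate the layout of 'data'.
dato.example <- data.frame(
  bind      = c("genusA_2", "genusB_1-genusB_3", "Genus genusC"),
  reference = c("genusA_1", "Genus genusA", "genusA_1-genusB_3"),
  poly      = c(FALSE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# check.merger.data is a hypothetical helper, NOT an RRphylo function.
check.merger.data <- function(data) {
  if (ncol(data) != 3) stop("data must have exactly three columns")
  # Columns are interpreted by position: bind, reference, poly.
  names(data) <- c("bind", "reference", "poly")
  if (!is.logical(data$poly)) stop("the third column (poly) must be logical")
  # Each tip or clade can be bound only once.
  dup <- unique(data$bind[duplicated(data$bind)])
  if (length(dup) > 0) stop("duplicated bind: ", paste(dup, collapse = ", "))
  invisible(data)
}

check.merger.data(dato.example)
```

Running such a check before calling tree.merger is optional; it simply surfaces a malformed data object earlier.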
Tips and nodes are calibrated within tree.merger by means of the function scaleTree. To this aim, named vectors of tip and node ages, meant as time distance from the youngest tip within the phylogeny, must be supplied. As for the data object, the nodes to be calibrated should be identified by collating the names of the two phylogenetically furthest tips each of them subtends, separated by a “-”.
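As an illustration of what “time distance from the youngest tip” means, the sketch below derives tip and node ages from a toy tree using ape only. The tree and the tip names A to D are hypothetical; the resulting vectors merely follow the naming scheme described above (tip labels for tips, “tip1-tip2” for nodes).

```r
library(ape)

# Toy tree with hypothetical tip names A-D.
tr <- read.tree(text = "((A:1,B:1):1,(C:0.5,D:1.5):0.5);")

# Distance of every tip and node from the root.
depth <- node.depth.edgelength(tr)
# Age = time distance from the youngest tip (the tip farthest from the root).
age <- max(depth) - depth

# Tip ages, named by tip label.
tip.ages <- setNames(age[seq_len(Ntip(tr))], tr$tip.label)

# Node ages, named by a pair of tips spanning the node, joined by "-".
pairs <- list(c("A", "B"), c("C", "D"))
node.ages <- sapply(pairs, function(p) age[getMRCA(tr, p)])
names(node.ages) <- sapply(pairs, paste, collapse = "-")

tip.ages
node.ages
```

In this toy tree the only tip with a non-zero age is C, which ends one time unit before the youngest tips.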
If only individual tips are attached, the source.tree can be left unspecified. Tips set to be attached to the same reference with poly=FALSE are considered to represent a polytomy. Tips set as bind which are already on the backbone tree are removed from the latter and placed according to the reference. In the example below, tips “genusE_1a” and “genusE_1b” are set to be attached to the same reference “genusE_1”, creating a polytomy. The species “genusC_4” and “genusC_5” are both set to be bound to the entire “Genus genusC” (including “genusC_1”, “genusC_2”, and “genusC_3”), but only the latter is explicitly indicated to create a polytomous clade (“poly=TRUE”). Once “genusC_5” is attached, the most recent common ancestor (MRCA) of the entire genusC changes with respect to the MRCA on the backbone, hence the reference for “genusC_6” is identified by selecting the two phylogenetically furthest tips within the ‘new’ genusC, that is “genusC_1-genusC_5”. This is unnecessary for “genusI_1”, as the function recognizes that it belongs to a different genus than “genusC” and therefore places it as sister to all the species in “genusC”, regardless of whether they are already on the backbone or are attached. “genusB_3”, which belongs to the backbone, is indicated to be moved, and “genusH_1” is attached at the tree root, thus changing the total height of the tree.
tree.merger(backbone=tree.back,data=dato,plot=FALSE)
#> Warning in tree.merger(backbone = tree.back, data = dato, plot = FALSE): genusB_3 removed from the backbone tree
#> Warning in doTryCatch(return(expr), name, parentenv, handler): Root age not indicated: the tree root arbitrarily set at
#> 3.32
As no tip.ages are supplied to tree.merger, all the new tips are placed at the maximum distance from the tree root. Since no age for the root of the merged tree is indicated, the function places it arbitrarily and produces a warning to inform the user about its position with respect to the youngest tip on the phylogeny.
To calibrate the ages of either tips or nodes within the merged tree, the arguments tip.ages and node.ages must be supplied.
ages.tip
#> genusH_1 genusE_1a genusE_1 genusE_1b genusF_1 genusC_5 genusC_3a genusG_1
#> 1.0 2.0 1.7 1.5 0.8 1.5 0.3 1.2
#> genusB_1a
#> 0.2
ages.node
#> genusB_1-genusF_1 genusE_1a-genusE_1b genusH_1-genusB_1
#> 2.2 2.9 3.5
tree.merger(backbone=tree.back,data=dato,tip.ages=ages.tip,node.ages = ages.node,plot=FALSE)
#> Warning in tree.merger(backbone = tree.back, data = dato, tip.ages = ages.tip, : genusB_3 removed from the backbone tree
When a clade is attached, the node subtending it on source.tree is identified as the MRCA of the tip pair, the “Genus”, or the “Clade” indicated in bind. In the example below, “Genus genusA” from the source is added as sister to “genusA_1” within the backbone. Then, “genusL_1” is bound to the newly created clade made of all the tips belonging to “genusA”, located by the two phylogenetically furthest tips within it.
bind | reference | poly |
---|---|---|
Genus genusA | genusA_1 | FALSE |
genusG_1-genusF_2 | Clade DC | FALSE |
Clade HI | Genus genusB | FALSE |
genusL_1 | Genus genusA | FALSE |
genusB_4 | genusB_3 | FALSE |
genusM_1-genusN_1 | Genus genusB | FALSE |
The machinery described above works in the same way when tree.merger is used to build a new phylogenetic tree from scratch. The only difference is that in this case the first row of data includes the first species pair, which serves as the reference for subsequent attachments. Additionally, since no a priori information about species ages and tree height is available (unless provided), the function automatically produces an uncalibrated version of the tree where all the internal branch lengths equal 1 and all the species are placed at the maximum distance from the tree root.
tree.merger(data=dato.new,plot=FALSE)
#> Warning in doTryCatch(return(expr), name, parentenv, handler): Root age not indicated: the tree root arbitrarily set at
#> 5
#### Merging phylogenetic information
### load the RRphylo example dataset including Cetaceans tree
data("DataCetaceans")
DataCetaceans$treecet->treecet # phylogenetic tree
treecet$node.label[(131-Ntip(treecet))]<-"Crown Mysticeti" # assigning node labels
### Select two clades and some species to be removed
tips(treecet,131)->crown.Mysticetes
tips(treecet,193)->Delphininae
c("Aetiocetus_weltoni","Saghacetus_osiris",
"Zygorhiza_kochii","Ambulocetus_natans",
"Kentriodon_pernix","Kentriodon_schneideri","Kentriodon_obscurus",
"Eurhinodelphis_cristatus","Eurhinodelphis_bossi")->extinct
plot(treecet,show.tip.label = FALSE,no.margin=TRUE)
nodelabels(frame="n",col="blue",font=2,node=c(131,193),text=c("crown\nMysticetes","Delphininae"))
tiplabels(frame="circle",bg="red",cex=.3,text=rep("",length(c(crown.Mysticetes,Delphininae,extinct))),
tip=which(treecet$tip.label%in%c(crown.Mysticetes,Delphininae,extinct)))
### Create the backbone and source trees
drop.tip(treecet,c(crown.Mysticetes[-which(tips(treecet,131)%in%
c("Caperea_marginata","Eubalaena_australis"))],
Delphininae[-which(tips(treecet,193)=="Tursiops_aduncus")],extinct))->backtree
keep.tip(treecet,c(crown.Mysticetes,Delphininae,extinct))->sourcetree
bind | reference | poly |
---|---|---|
Clade Crown Mysticeti | Fucaia_buelli-Aetiocetus_weltoni | FALSE |
Aetiocetus_weltoni | Aetiocetus_cotylalveus | FALSE |
Saghacetus_osiris | Fucaia_buelli-Tursiops_truncatus | FALSE |
Zygorhiza_kochii | Saghacetus_osiris-Fucaia_buelli | FALSE |
Ambulocetus_natans | Dalanistes_ahmedi-Fucaia_buelli | FALSE |
Genus Kentriodon | Phocoena_phocoena-Delphinus_delphis | FALSE |
Sousa_chinensis-Delphinus_delphis | Sotalia_fluviatilis | FALSE |
Kogia_sima | Kogia_breviceps | FALSE |
Eurhinodelphis_cristatus | Eurhinodelphis_longirostris | FALSE |
Grampus_griseus | Globicephala_melas-Pseudorca_crassidens | FALSE |
Eurhinodelphis_bossi | Eurhinodelphis_longirostris | FALSE |
### Merge the backbone and the source trees according to dato without calibrating tip and node ages
tree.merger(backbone = backtree,data=dato,source.tree = sourcetree,plot=FALSE)
#> Warning in tree.merger(backbone = backtree, data = dato, source.tree = sourcetree, : Kogia_sima, Grampus_griseus removed from the backbone tree
#> Warning in tree.merger(backbone = backtree, data = dato, source.tree = sourcetree, : Eubalaena_australis, Caperea_marginata, Tursiops_aduncus already on the source tree: removed from the backbone tree
#> Warning in doTryCatch(return(expr), name, parentenv, handler): Root age not indicated: the tree root arbitrarily set at
#> 45.06
### Set tips and nodes calibration ages
c(Aetiocetus_weltoni=28.0,
Saghacetus_osiris=33.9,
Zygorhiza_kochii=34.0,
Ambulocetus_natans=40.4,
Kentriodon_pernix=15.9,
Kentriodon_schneideri=11.61,
Kentriodon_obscurus=13.65,
Eurhinodelphis_bossi=13.65,
Eurhinodelphis_cristatus=5.33)->tipages
c("Ambulocetus_natans-Fucaia_buelli"=52.6,
"Balaena_mysticetus-Caperea_marginata"=21.5)->nodeages
### Merge the backbone and the source trees and calibrate tips and nodes ages
tree.merger(backbone = backtree,data=dato,source.tree = sourcetree,
tip.ages=tipages,node.ages=nodeages,plot=FALSE)
#> Warning in tree.merger(backbone = backtree, data = dato, source.tree = sourcetree, : Kogia_sima, Grampus_griseus removed from the backbone tree
#> Warning in tree.merger(backbone = backtree, data = dato, source.tree = sourcetree, : Eubalaena_australis, Caperea_marginata, Tursiops_aduncus already on the source tree: removed from the backbone tree
#### Building a new phylogenetic tree: build the phylogenetic tree shown in
#### Pandolfi et al. 2020 - Figure 2 (see reference)
### Create the data object
data.frame(bind=c("Hippopotamus_lemerlei",
"Hippopotamus_pentlandi",
"Hippopotamus_amphibius",
"Hippopotamus_antiquus",
"Hippopotamus_gorgops",
"Hippopotamus_afarensis",
"Hexaprotodon_sivalensis",
"Hexaprotodon_palaeindicus",
"Archaeopotamus_harvardi",
"Saotherium_mingoz",
"Choeropsis_liberiensis"),
reference=c("Hippopotamus_madagascariensis",
"Hippopotamus_madagascariensis-Hippopotamus_lemerlei",
"Hippopotamus_pentlandi-Hippopotamus_madagascariensis",
"Hippopotamus_amphibius-Hippopotamus_madagascariensis",
"Hippopotamus_antiquus-Hippopotamus_madagascariensis",
"Hippopotamus_gorgops-Hippopotamus_madagascariensis",
"Genus Hippopotamus",
"Hexaprotodon_sivalensis",
"Hexaprotodon_sivalensis-Hippopotamus_madagascariensis",
"Archaeopotamus_harvardi-Hippopotamus_madagascariensis",
"Saotherium_mingoz-Hippopotamus_madagascariensis"),
poly=c(FALSE,
TRUE,
FALSE,
FALSE,
TRUE,
FALSE,
FALSE,
FALSE,
FALSE,
FALSE,
FALSE))->dato
bind | reference | poly |
---|---|---|
Hippopotamus_lemerlei | Hippopotamus_madagascariensis | FALSE |
Hippopotamus_pentlandi | Hippopotamus_madagascariensis-Hippopotamus_lemerlei | TRUE |
Hippopotamus_amphibius | Hippopotamus_pentlandi-Hippopotamus_madagascariensis | FALSE |
Hippopotamus_antiquus | Hippopotamus_amphibius-Hippopotamus_madagascariensis | FALSE |
Hippopotamus_gorgops | Hippopotamus_antiquus-Hippopotamus_madagascariensis | TRUE |
Hippopotamus_afarensis | Hippopotamus_gorgops-Hippopotamus_madagascariensis | FALSE |
Hexaprotodon_sivalensis | Genus Hippopotamus | FALSE |
Hexaprotodon_palaeindicus | Hexaprotodon_sivalensis | FALSE |
Archaeopotamus_harvardi | Hexaprotodon_sivalensis-Hippopotamus_madagascariensis | FALSE |
Saotherium_mingoz | Archaeopotamus_harvardi-Hippopotamus_madagascariensis | FALSE |
Choeropsis_liberiensis | Saotherium_mingoz-Hippopotamus_madagascariensis | FALSE |
### Build an uncalibrated version of the tree
tree.merger(data=dato,plot=FALSE)->tree.uncal
#> Warning in doTryCatch(return(expr), name, parentenv, handler): Root age not indicated: the tree root arbitrarily set at
#> 8
### Set tips and nodes calibration ages
## Please note: the following ages are only used to show how to use the function
## they are not assumed to be correct.
c("Hippopotamus_lemerlei"=0.001,
"Hippopotamus_pentlandi"=0.45,
"Hippopotamus_amphibius"=0,
"Hippopotamus_antiquus"=0.5,
"Hippopotamus_gorgops"=0.4,
"Hippopotamus_afarensis"=0.75,
"Hexaprotodon_sivalensis"=1,
"Hexaprotodon_palaeindicus"=0.4,
"Archaeopotamus_harvardi"=5.2,
"Saotherium_mingoz"=4,
"Choeropsis_liberiensis"=0)->tip.ages
c("Choeropsis_liberiensis-Hippopotamus_amphibius"=13,
"Archaeopotamus_harvardi-Hippopotamus_amphibius"=8.5,
"Hexaprotodon_sivalensis-Hexaprotodon_palaeindicus"=6)->node.ages
### Build a calibrated version of the tree
tree.merger(data=dato,tip.ages=tip.ages,node.ages=node.ages,plot=FALSE)->tree.cal
The function scaleTree is a useful tool to deal with phylogenetic age calibration, written around Gene Hunt’s scalePhylo function (https://naturalhistory.si.edu/staff/gene-hunt). It rescales branches and leaves of the tree according to species and/or node calibration ages (meant as distance from the youngest tip within the tree).
If only species ages are supplied (argument tip.ages), the function changes leaf lengths, leaving node ages and internal branch lengths unaltered. When node ages are supplied (argument node.ages), the function shifts the position of nodes along their own branches while keeping the positions of other nodes and species unchanged.
#> 98 152 123 85 118 127 164 143
#> 10.7 0.7 1.2 12.6 5.1 5.8 18.8 12.8
It may happen that the species and/or node ages to be calibrated are older than the age of their ancestors. In such cases, after moving the species (node) to its target age, the function reassembles the phylogeny above it by assigning the same branch length (set through the argument min.branch) to all the branches along the species (node) path, so that the tree is well-formed and ancestor-descendant relationships remain unchanged. In this way, changes to the original tree topology only pertain to the path along the “calibrated” species.
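To see in advance which calibration ages fall into the situation just described, one can compare each target tip age with the current age of the tip's parent node. The sketch below uses ape only; older.than.parent is a hypothetical helper that is not part of RRphylo. It assumes that every name in tip.ages is present in the tree and that ages are expressed as distance from the youngest tip of the tree as it currently stands.

```r
library(ape)

# older.than.parent is a hypothetical helper, NOT an RRphylo function.
# It returns the names of the tips whose target age exceeds the
# current age of their parent node.
older.than.parent <- function(tree, tip.ages) {
  depth <- node.depth.edgelength(tree)   # distance from the root
  age <- max(depth) - depth              # distance from the youngest tip
  tip.id <- match(names(tip.ages), tree$tip.label)
  parent <- tree$edge[match(tip.id, tree$edge[, 2]), 1]
  names(tip.ages)[tip.ages > age[parent]]
}

# Toy tree with hypothetical tip names; the parent of C is 1.5 time units old.
tr <- read.tree(text = "((A:1,B:1):1,(C:0.5,D:1.5):0.5);")
flagged <- older.than.parent(tr, c(A = 0.5, C = 1.8))
flagged
```

Tips flagged this way are the ones whose path toward the root would be rebuilt with min.branch long branches.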
# load the RRphylo example dataset including Felids tree
data("DataFelids")
DataFelids$treefel->tree
# get species and nodes ages
# (meant as distance from the youngest species, that is the Recent in this case)
max(nodeHeights(tree))->H
H-dist.nodes(tree)[(Ntip(tree)+1),(Ntip(tree)+1):(Ntip(tree)+Nnode(tree))]->age.nodes
H-diag(vcv(tree))->age.tips
# apply Pagel's lambda transformation to change node ages only
rescaleRR(tree,lambda=0.8)->tree1
# apply scaleTree to the transformed phylogeny, by setting
# the original ages at nodes as node.ages
scaleTree(tree1,node.ages=age.nodes)->treeS1
# change leaf length of 10 sampled species
tree->tree2
set.seed(14)
sample(tree2$tip.label,10)->sam.sp
age.tips[sam.sp]->age.sam
age.sam[which(age.sam>0.1)]<-age.sam[which(age.sam>0.1)]-1.5
age.sam[which(age.sam<0.1)]<-age.sam[which(age.sam<0.1)]+0.2
tree2$edge.length[match(match(sam.sp,tree$tip.label),tree$edge[,2])]<-age.sam
# apply scaleTree to the transformed phylogeny, by setting
# the original ages at sampled tips as tip.ages
scaleTree(tree2,tip.ages=age.tips[sam.sp])->treeS2
# apply Pagel's kappa transformation to change both species and node ages,
# including the age at the tree root
rescaleRR(tree,kappa=0.5)->tree3
# apply scaleTree to the transformed phylogeny, by setting
# the original ages at tips and nodes as tip.ages and node.ages
scaleTree(tree3,tip.ages = age.tips,node.ages=age.nodes)->treeS3
As its name suggests, the function move.lineage allows moving a single tip or an entire clade to a different position within the tree. Similarly to tree.merger, the new position for the focal lineage is defined by using the sister as a reference. Both the focal and the sister can be specified as either tip names/numbers or node numbers. Additionally, exactly as with tree.merger, if tree$node.label is not NULL, a focal/sister clade can be indicated as “Clade NAMEOFTHECLADE”; similarly, an entire genus can be indicated as “Genus NAMEOFTHEGENUS”.
When moving clades, it can happen that the age of the focal clade (i.e. the age of its most recent common ancestor) is older than the age of its new ancestor (i.e. the node right above sister). In this case, the user can choose whether the focal clade must be rescaled to the height of the new ancestor (rescale=TRUE), or the topology of the tree must be modified by means of scaleTree to accommodate the height of focal as it is (rescale=FALSE). Finally, if focal is attached to the tree root, a new age for the latter can be provided as rootage.
require(phytools)
### load the RRphylo example dataset including Cetaceans tree
data("DataCetaceans")
DataCetaceans$treecet->treecet
### Moving a single tip
## sister to a tip
move.lineage(treecet,focal="Orcinus_orca",sister="Balaenoptera_musculus")->mol1
## sister to a clade
move.lineage(treecet,focal="Orcinus_orca",sister=131)->mol2
### Moving a clade
## sister to a tip
move.lineage(treecet,focal="Genus Mesoplodon",sister="Balaenoptera_musculus")->mol7
## sister to a clade
move.lineage(treecet,focal="Clade Delphinida",sister=131)->mol11
## sister to a clade by using treecet$node.label
move.lineage(treecet,focal="Clade Delphinida",sister="Clade Plicogulae")->mol14
## sister to the tree root with and without rootage
move.lineage(treecet,focal="Genus Mesoplodon",sister=117)->mol19
#> Warning in move.lineage(treecet, focal = "Genus Mesoplodon", sister = 117):
#> Argument 'rootage' is missing, the tree root will be arbitrarily moved back in
#> time
move.lineage(treecet,focal="Clade Delphinida",
sister=117,rootage=max(diag(vcv(treecet))))->mol23
The function cutPhylo is meant to cut the phylogenetic tree to remove all the tips and nodes younger than a reference (user-specified) age, which can also coincide with a specific node. When an entire clade is cut, the user can choose (through the argument keep.lineage) to keep its branch length as a tip of the new tree, or to remove it completely.
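A minimal usage sketch follows, on the Cetaceans tree used elsewhere in this document. It assumes that the cutting point is passed through arguments named age and node, as in the RRphylo reference manual; only keep.lineage is named in the text above. The cutting age and the node number are arbitrary values chosen for illustration.

```r
library(RRphylo)
library(ape)

data("DataCetaceans")
treecet <- DataCetaceans$treecet

# Cut the tree 5 time units before the youngest tip (arbitrary value),
# keeping the lineages that cross the cut as tips of the new tree.
# Assumption: the reference age is passed as 'age'.
cut.age <- cutPhylo(treecet, age = 5, keep.lineage = TRUE)

# Cut the tree at the age of node 129 (arbitrary internal node),
# removing the cut lineages completely.
# Assumption: the reference node is passed as 'node'.
cut.node <- cutPhylo(treecet, node = 129, keep.lineage = FALSE)
```

Comparing Ntip(cut.age) with Ntip(treecet) gives a quick check of how many lineages were removed or merged by the cut.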
The function fix.poly randomly resolves polytomies either at specified nodes or throughout the tree (Castiglione et al. 2020). This latter feature works like ape’s multi2di. However, contrary to multi2di, polytomies are resolved to non-zero length branches, to provide a credible partition of the evolutionary time among the nodes descending from the dichotomized node. This could be useful to gain realistic evolutionary rate estimates when applying RRphylo.
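The behavior of multi2di that fix.poly is contrasted with can be seen on a toy tree. The sketch below uses ape only and hypothetical tip names; it shows that the polytomy is resolved by inserting a zero-length branch. It does not call fix.poly.

```r
library(ape)

# Toy tree with a trichotomy (A, B, C); tip names are hypothetical.
tr <- read.tree(text = "((A:1,B:1,C:1):1,D:2);")
is.binary(tr)                    # the tree contains a polytomy

# ape's multi2di resolves the polytomy by inserting zero-length branches.
tr.bin <- multi2di(tr)
is.binary(tr.bin)                # now fully dichotomous
any(tr.bin$edge.length == 0)     # at least one inserted branch has length 0
```

A zero-length internal branch assigns no evolutionary time to the newly created node, which is the issue the non-zero branch lengths produced by fix.poly are meant to avoid.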
Under the type = collapse specification, the user is expected to indicate which node/s must be transformed into a multichotomous clade.
### load the RRphylo example dataset including Cetaceans tree
data("DataCetaceans")
DataCetaceans$treecet->treecet
### Resolve all the polytomies within Cetaceans phylogeny
fix.poly(treecet,type="resolve")->treecet.fixed
## Set branch colors
unlist(sapply(names(which(table(treecet$edge[,1])>2)),function(x)
c(x,getDescendants(treecet,as.numeric(x)))))->tocolo
unlist(sapply(names(which(table(treecet$edge[,1])>2)),function(x)
c(getMRCA(treecet.fixed,tips(treecet,x)),
getDescendants(treecet.fixed,as.numeric(getMRCA(treecet.fixed,tips(treecet,x)))))))->tocolo2
colo<-rep("gray60",nrow(treecet$edge))
names(colo)<-treecet$edge[,2]
colo2<-rep("gray60",nrow(treecet.fixed$edge))
names(colo2)<-treecet.fixed$edge[,2]
colo[match(tocolo,names(colo))]<-"red"
colo2[match(tocolo2,names(colo2))]<-"red"
par(mfrow=c(1,2))
plot(treecet,no.margin=TRUE,show.tip.label=FALSE,edge.color = colo,edge.width=1.3)
plot(treecet.fixed,no.margin=TRUE,show.tip.label=FALSE,edge.color = colo2,edge.width=1.3)
### Resolve the polytomies pertaining to the genus Kentriodon
fix.poly(treecet,type="resolve",node=221)->treecet.fixed2
## Set branch colors
c(221,getDescendants(treecet,as.numeric(221)))->tocolo
c(getMRCA(treecet.fixed2,tips(treecet,221)),
getDescendants(treecet.fixed2,as.numeric(getMRCA(treecet.fixed2,tips(treecet,221)))))->tocolo2
colo<-rep("gray60",nrow(treecet$edge))
names(colo)<-treecet$edge[,2]
colo2<-rep("gray60",nrow(treecet.fixed2$edge))
names(colo2)<-treecet.fixed2$edge[,2]
colo[match(tocolo,names(colo))]<-"red"
colo2[match(tocolo2,names(colo2))]<-"red"
par(mfrow=c(1,2))
plot(treecet,no.margin=TRUE,show.tip.label=FALSE,edge.color = colo,edge.width=1.3)
plot(treecet.fixed2,no.margin=TRUE,show.tip.label=FALSE,edge.color = colo2,edge.width=1.3)
### Collapse Delphinidae into a polytomous clade
fix.poly(treecet,type="collapse",node=179)->treecet.collapsed
# Set branch colors
c(179,getDescendants(treecet,as.numeric(179)))->tocolo
c(getMRCA(treecet.collapsed,tips(treecet,179)),
getDescendants(treecet.collapsed,as.numeric(getMRCA(treecet.collapsed,tips(treecet,179)))))->tocolo2
colo<-rep("gray60",nrow(treecet$edge))
names(colo)<-treecet$edge[,2]
colo2<-rep("gray60",nrow(treecet.collapsed$edge))
names(colo2)<-treecet.collapsed$edge[,2]
colo[match(tocolo,names(colo))]<-"red"
colo2[match(tocolo2,names(colo2))]<-"red"
par(mfrow=c(1,2))
plot(treecet,no.margin=TRUE,show.tip.label=FALSE,edge.color = colo,edge.width=1.3)
plot(treecet.collapsed,no.margin=TRUE,show.tip.label=FALSE,edge.color = colo2,edge.width=1.3)
Castiglione, S., Serio, C., Piccolo, M., Mondanaro, A., Melchionna, M., Di Febbraro, M., Sansalone, G., Wroe, S., & Raia, P. (2020). The influence of domestication, insularity and sociality on the tempo and mode of brain size evolution in mammals. Biological Journal of the Linnean Society, 132: 221-231.
Pandolfi, L., Martino, R., Rook, L., & Piras, P. (2020). Investigating ecological and phylogenetic constraints in Hippopotaminae skull shape. Rivista Italiana di Paleontologia e Stratigrafia, 126: 37-49.