W8_L16
Modelling trait evolution on phylogenies
modelling trait evolution on phylogenies
Before starting this practical set the appropriate working directory and load the necessary libraries in R:
Libraries:
setwd("~/Desktop/MP25/data/example_PCM/")
install.packages('ape')
install.packages('phytools')
install.packages('geiger')
library(ape)
library(phytools)
library(geiger)
As always in phylogenetic comparative methods, two fundamental components are required:
- a phylogeny - but ultrametric tree is much better
- either discrete or continuous trait(s)
Here we will leverage a Octopodiformes phylogeny and a discrete classification of the maximum depth at which they have been observed:
- P - pelagic
- E - euphotic, up to 100 meters in depth
- D1 - disphotic 1, up to 400 meters in depth
- D2 - disphotic 2, up to 400 meters in depth
- N - no-photic, more than 800 meters in depth
Let's start by importing the tree and the data:
tree <- read.tree("tree_octopodiformes.nwk")
tree <- force.ultrametric(tree,"extend")
data <- read.table("traits_octopodiformes.csv", sep = ",", row.names = 1, header = T)
data <- setNames(as.factor(data[,"discrete"]),rownames(data))
Then, as seen in other instances, we have to find an model appropriate for our data. Also here, the central component is a rates matrix, not dissimilar from the Q matrix we have seen for amino acids / nucleotides models. Let's fit three models to the data, which are:
- a timetree
- character states
The three models are:
- ER - Equal Rates
- SYM - SYMmetrical
- ARD - All Rates Different
ER <- fitDiscrete(tree, data, model="ER")
SYM <- fitDiscrete(tree,data,model="SYM")
ARD <- fitDiscrete(tree,data,model="ARD")
Here is a graphical visualisation of the models.

By recalling the object you can also see the Q matrix! WHich model do you think this one is?
D1 D2 E N P
D1 -2.195643e+00 1.247291e+00 6.094590e-01 3.388934e-01 1.600471e-20
D2 1.310616e+00 -1.385760e+00 5.926930e-19 3.286915e-24 7.514486e-02
E 2.149612e-01 1.223017e-40 -2.149612e-01 6.065468e-19 7.414863e-18
N 1.205376e-24 2.497221e-01 3.362381e-16 -2.497221e-01 5.080608e-26
P 4.972750e-62 7.598279e-16 1.330707e-29 2.225052e-59 -7.598279e-16
Then we will asses how good their fit is, using the Akaike weights - the smaller the better!
aicc<-setNames(c(ER$opt$aicc,
ARD$opt$aicc,
SYM$opt$aicc),
c("Equal Rates",
"All Rates Different",
"Symmetrical"))
aicw(aicc)
We can see that ARD is the better fit!
fit delta w
Equal Rates 123.9865 0.000000 9.893373e-01
All Rates Different 180.3405 56.354017 5.730930e-13
Symmetrical 133.0470 9.060562 1.066273e-02
Then we can use the best-fit model to perform the ancestral state reconstruction:
fitARD<-ace(data,tree,model="ARD",type="discrete")
Then we need to specify a color for each state:
levels(data)
cols <- c("red","purple","orange","blue","yellow")
trait_cols <- setNames(c(
levels(data)[1],
levels(data)[2],
levels(data)[3],
levels(data)[4],
levels(data)[5]),
c("red",
"purple",
"orange",
"blue",
"yellow"))
trait_cols
And then plot the ASR on the phylo:
plot.phylo(tree,lwd=7.5, edge.width = 7.5, label.offset = .2,show.tip.label = T,no.margin = TRUE,
edge.color = "lightgray")
nodelabels(node=1:tree$Nnode+Ntip(tree),pie=fitARD$lik.anc,cex=0.42, piecol = cols)
tiplabels(pie=to.matrix(data[tree$tip.label],levels(data)),cex=0.42, piecol = cols)

So ... do Octopodiformes come from the photic zone ... or they originated deep down in the sea where the light is extremely scarce?
modelling trait evolution on phylogenies: continuous traits
Before starting this practical set the appropriate working directory and load the necessary libraries in R:
Libraries:
setwd("~/Desktop/MP25/data/example_PCM/")
install.packages('ape')
install.packages('phytools')
install.packages('geiger')
library(ape)
library(phytools)
library(geiger)
a proof of concept on simulated data:
Let's simulate a tree! Everybody should change a bit these numbers:
- the random seed (set to 42)
- the number of tips (set to 100)
- the birth rate (set to 1.0)
- the death rate (set to 0.6)
- include only extant - and not extinct - species (set to TRUE)
set.seed(42)
tree<-NULL
while(is.null(tree))
tree<-pbtree(n=100,b=1,d=0.6,extant.only=T)
plotTree(tree,ftype="off")
Then we can simulate uncorrelated (Brownian) evolution:
x <- fastBM(tree)
y <- fastBM(tree)
and then plot the resulting trees!
par(mfrow = c(1,2))
obj <- contMap(tree,x,plot=T, ftype="off", lwd=5, outline = FALSE)
obj <- contMap(tree,y,plot=T, ftype="off", lwd=5, outline = FALSE)

Now, I want you to perform a linear regression on the two traits:
- a first time completely disregarding the phylogeny:
fit <- lm(x~y)
summary(fit)
- a second time adjusting the data based on the phylogeny:
x_PIC<-pic(x,tree)
y_PIC<-pic(y,tree)
fit.pic<-lm(x_PIC~y_PIC)
summary(fit.pic)