2022-02-16

A few slides from Lecture 6


What is a Likelihood?
What is a Maximum Likelihood Estimate?
What is AIC?
What does this have to do with model selection and parsimony?

An example!

Oakie Island

Orange Island

little known fact …

your professor once played in a band called the Black Squirrels.

The question:

Oakie Island

Orange Island

What is the “best” model for squirrel morph distribution?

Data and Models

Data / observations: \(X_{ij}\)

island       a. Orange             b. Oakie
squirrel 1   \(X_{a,1}\) = light   \(X_{b,1}\) = light
squirrel 2   \(X_{a,2}\) = light   \(X_{b,2}\) = dark

Models = Hypotheses

model n. parameters
M1: \(P(X_{ij} = light) = p = 0.5\) 1
M2: \(p = 0.75\) 1
M3: \(p_a = 1\); \(p_b = .5\) 2
M4: \(p_{a,1} = 1;\,\,\, p_{b,1} = 1 \\ p_{a,2} = 1;\,\,\, p_{b,2} = 0\) 4

Likelihood (of a model)

Product of probabilities of data given model.

\[{\cal L}(model) = \prod_{i = 1}^n Pr(X_i \mid model)\]

  • We never care about the absolute value of the likelihood!
  • Only the relative value of the likelihood.
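
As a minimal sketch in R (coding light as 1 and dark as 0 — an assumed encoding), the likelihood of model M1 for the four squirrels above is just the product of per-observation probabilities:

```r
# Observed morphs, coded light = 1, dark = 0 (assumed encoding):
# a1 = light, b1 = light, a2 = light, b2 = dark
x <- c(1, 1, 1, 0)

# Under M1, P(light) = 0.5 for every squirrel, so each observation
# contributes a factor of 0.5 and the likelihood is the product:
p <- 0.5
L_M1 <- prod(ifelse(x == 1, p, 1 - p))
L_M1  # 1/16 = 0.0625
```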

Squirrel Models:

model likelihood
M1: \(p = 0.5\) \({1\over2} \times {1\over2} \times {1\over2} \times {1\over2} = {1 \over 16}\) 0.0625
M2: \(p = 0.75\) \({3\over4} \times {3\over4} \times {3\over4} \times {1\over4} = {27 \over 256}\) 0.1055
M3: \(p_a = 1\); \(p_b = .5\) \(1\times1\times{1\over2}\times{1\over2} = {1\over 4}\) 0.25
M4: \(p_{a,1} = 1;\,\,\, p_{b,1} = 1 \\ p_{a,2} = 1;\,\,\, p_{b,2} = 0\) \(1\times1\times1\times1\) 1

\[{\cal L}(M4) > {\cal L}(M3) > {\cal L}(M2) > {\cal L}(M1)\]
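
The whole likelihood column can be checked the same way; a sketch, with the per-squirrel probabilities typed in by hand from the models above:

```r
# Probability each model assigns to the observed morph of each squirrel,
# in the order a1 (light), a2 (light), b1 (light), b2 (dark):
probs <- list(M1 = c(1/2, 1/2, 1/2, 1/2),  # p = 0.5 everywhere
              M2 = c(3/4, 3/4, 3/4, 1/4),  # p = 0.75 everywhere
              M3 = c(1,   1,   1/2, 1/2),  # p_a = 1, p_b = 0.5
              M4 = c(1,   1,   1,   1))    # one parameter per squirrel
sapply(probs, prod)
# M1 = 1/16, M2 = 27/256 (about 0.105), M3 = 1/4, M4 = 1
```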

A(kaike) Information Criterion

A good fit is great! But it is useless if it uses too much information (too many parameters). This is overfitting. One parameter per data point is TOO MANY parameters!

Hirotugu Akaike 赤池 弘次 (1927-2009)

Simple formula:

\[AIC = -2 \log({\cal L})+ 2k\]

(where \(k\) is the number of parameters)

  • Better fit = higher \(\cal L\) = lower AIC.
  • Too complicated = larger \(k\) = higher AIC.

Lowest AIC is “best” model

Compute AIC

model likelihood log-likelihood k AIC
M1: 0.0625 -2.77 1 7.55
M2: 0.1055 -2.25 1 6.50
M3: 0.25 -1.39 2 6.77
M4: 1 0 4 8

\[AIC(M2) < AIC(M3) < AIC(M1) < AIC(M4)\]

Most parsimonious model is M2!

Compute AIC in R

L <- c(1/2^4, 27/256, 1/4, 1)
k <- c(1,1,2,4)

data.frame(L, k, AIC = - 2 * log(L) + 2 * k)
##           L k      AIC
## 1 0.0625000 1 7.545177
## 2 0.1054688 1 6.498681
## 3 0.2500000 2 6.772589
## 4 1.0000000 4 8.000000

Let’s add one more observation …

Oakie Island

Orange Island

\[X_{a,3} = dark\]

Updated squirrel models:

model probs \({\cal L} = \prod P(X|M)\) \({\cal L}\) k AIC
M1 \(p = {1\over2}\) \({1\over2} \times {1\over2} \times {1\over2} \times {1\over2} \times {1\over2}\) 0.03125 1 8.93
M2 \(p = {3\over 4}\) \({3\over4} \times {3\over4} \times {3\over4} \times {1\over4} \times {1\over4}\) 0.02637 1 9.27
M2b \(p = {3\over 5}\) \({3\over5} \times {3\over5} \times {3\over5} \times {2\over5} \times {2\over5}\) 0.0346 1 8.73
M3 \(p_a = 1; p_b = .5\) \(1\times1\times{1\over2}\times{1\over2}\times0\) 0 (!!) 2 \(\infty\)
M3b \(p_a = {2\over3}; p_b = {1\over2}\) \({1\over2} \times {1\over2} \times {2\over3} \times {2\over3} \times {1\over3}\) 0.037 2 10.6
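
Extending the earlier R snippet to the five-squirrel data (likelihoods typed in from the table above) shows how a single impossible observation behaves:

```r
# Likelihoods of the updated models for all five squirrels
L <- c(M1  = (1/2)^5,
       M2  = (3/4)^3 * (1/4)^2,
       M2b = (3/5)^3 * (2/5)^2,
       M3  = 0,                         # assigns probability 0 to the new dark squirrel
       M3b = (2/3)^2 * (1/3) * (1/2)^2)
k <- c(1, 1, 1, 2, 2)
data.frame(L, k, AIC = -2 * log(L) + 2 * k)
# log(0) is -Inf, so M3's AIC is Inf: one impossible observation rules it out.
```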

Updated (1 parameter) squirrel models:

\(\widehat p = 3/5\) is the maximum likelihood estimate of the probability that a squirrel is light morph.

model probs \({\cal L} = \prod P(X|M)\) \({\cal L}\)
M1 \(p = {1\over2}\) \({1\over2} \times {1\over2} \times {1\over2} \times {1\over2} \times {1\over2}\) 0.03125
M2 \(p = {3\over 4}\) \({3\over4} \times {3\over4} \times {3\over4} \times {1\over4} \times {1\over4}\) 0.02637
M2b \(p = {3\over 5}\) \({3\over5} \times {3\over5} \times {3\over5} \times {2\over5} \times {2\over5}\) 0.0346
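
The MLE \(\widehat p = 3/5\) comes from maximizing the likelihood over \(p\). A numerical sketch with R's `optimize` (maximizing the log-likelihood, which peaks at the same \(p\) but is numerically better behaved):

```r
# Log-likelihood of p = P(light) given 3 light and 2 dark squirrels
loglik <- function(p) 3 * log(p) + 2 * log(1 - p)

# Maximize over p in (0, 1); the analytic answer is 3/5
p_hat <- optimize(loglik, interval = c(0, 1), maximum = TRUE)$maximum
p_hat  # approximately 0.6
```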

Likelihoods are the SOUL of INFERENCE!

They allow you to FIT models (i.e., estimate parameters) via maximization;

They are essential for model selection, e.g. with AIC;

They underlie Bayesian approaches;

They are used throughout Ecology!

  • Phylogenies, Coalescence Trees
  • Population dynamics
  • Species Distributions
  • Habitat Selection
  • Movement Ecology
  • Survival Analysis
  • and on and on and on and on and on
