What is a Likelihood?
What is a Maximum Likelihood Estimate?
What is AIC?
What does this have to do with model selection and parsimony?
2022-02-16
Your professor once played in a band called the Black Squirrels.
What is the “best” model for squirrel morph distribution?
island | a. Orange | b. Oakie |
---|---|---|
squirrel 1 | \(X_{a,1}\) = light | \(X_{b,1}\) = light |
squirrel 2 | \(X_{a,2}\) = light | \(X_{b,2}\) = dark |
model | n. parameters |
---|---|
M1: \(P(X_{ij} = light) = p = 0.5\) | 1 |
M2: \(p = 0.75\) | 1 |
M3: \(p_a = 1\); \(p_b = .5\) | 2 |
M4: \(p_{a,1} = 1;\,\,\, p_{b,1} = 1 \\ p_{a,2} = 1;\,\,\, p_{b,2} = 0\) | 4 |
The likelihood is the product of the probabilities of each (independent) observation given the model.
\[{\cal L}(model) = \prod_{i = 1}^n Pr(X_i | model)\]
model | likelihood | value |
---|---|---|
M1: \(p = 0.5\) | \({1\over2} \times {1\over2} \times {1\over2} \times {1\over2} = {1 \over 16}\) | 0.0625 |
M2: \(p = 0.75\) | \({3\over4} \times {3\over4} \times {3\over4} \times {1\over4} = {27 \over 256}\) | 0.1055 |
M3: \(p_a = 1\); \(p_b = .5\) | \(1\times1\times{1\over2}\times{1\over2} = {1\over 4}\) | 0.25 |
M4: \(p_{a,1} = 1;\,\,\, p_{b,1} = 1 \\ p_{a,2} = 1;\,\,\, p_{b,2} = 0\) | \(1\times1\times1\times1\) | 1 |
\[{\cal L}(M4) > {\cal L}(M3) > {\cal L}(M2) > {\cal L}(M1)\]
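The four likelihoods can be checked numerically. The following is a minimal sketch in R, assuming the morphs are coded as 1 = light and 0 = dark in the order a1, a2, b1, b2 from the data table; the names `x`, `likelihood` and `models` are illustrative and not part of the original slides.

```r
# Morphs coded 1 = light, 0 = dark, in the order a1, a2, b1, b2
x <- c(1, 1, 1, 0)

# Likelihood: product of the Bernoulli probabilities of each observation
likelihood <- function(p, x) prod(dbinom(x, size = 1, prob = p))

# One probability of "light" per squirrel (a single value is recycled)
models <- list(M1 = 0.5,
               M2 = 0.75,
               M3 = c(1, 1, 0.5, 0.5),  # p_a for island a, p_b for island b
               M4 = c(1, 1, 1, 0))      # one parameter per squirrel

L <- sapply(models, likelihood, x = x)
L
```

The resulting vector `L` holds one likelihood per model, so it can be compared directly with the last column of the table above.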
A good fit is great! But it is useless if it uses too much information (too many parameters). This is overfitting. One parameter per data point is TOO MANY parameters!
Hirotugu Akaike 赤池 弘次 (1927-2009)
Simple formula:
\[AIC = -2 \log({\cal L})+ 2k\]
(where \(k\) is the number of parameters)
Lowest AIC is “best” model
model | likelihood | log-likelihood | k | AIC |
---|---|---|---|---|
M1: | 0.0625 | -2.77 | 1 | 7.55 |
M2: | 0.1055 | -2.25 | 1 | 6.50 |
M3: | 0.25 | -1.39 | 2 | 6.77 |
M4: | 1 | 0 | 4 | 8 |
\[AIC_2 < AIC_3 < AIC_1 < AIC_4\]
Most parsimonious model is M2!

```r
L <- c(1/2^4, 27/256, 1/4, 1)
k <- c(1, 1, 2, 4)
data.frame(L, k, AIC = -2 * log(L) + 2 * k)
```

```
##           L k      AIC
## 1 0.0625000 1 7.545177
## 2 0.1054688 1 6.498681
## 3 0.2500000 2 6.772589
## 4 1.0000000 4 8.000000
```
\[X_{a,3} = dark\]
model | probs | \({\cal L} = \prod P(X|M)\) | \({\cal L}\) | k | AIC |
---|---|---|---|---|---|
M1 | \(p = {1\over2}\) | \({1\over2} \times {1\over2} \times {1\over2} \times {1\over2} \times {1\over2}\) | 0.03125 | 1 | 8.93 |
M2 | \(p = {3\over 4}\) | \({3\over4} \times {3\over4} \times {3\over4} \times {1\over4} \times {1\over4}\) | 0.02637 | 1 | 9.27 |
M2b | \(p = {3\over 5}\) | \({3\over5} \times {3\over5} \times {3\over5} \times {2\over5} \times {2\over5}\) | 0.0346 | 1 | 8.73 |
M3 | \(p_a = 1; p_b = .5\) | \(1\times1\times{1\over2}\times{1\over2}\times0\) | 0 (!!) | 2 | \(\infty\) |
M3b | \(p_a = {2\over3}; p_b = {1\over2}\) | \({2\over3} \times {2\over3} \times {1\over3} \times {1\over2} \times {1\over2}\) | 0.037 | 2 | 10.6 |
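The likelihood and AIC columns of this table can be recomputed in the same way as for the four-squirrel data. This is a sketch that simply multiplies out the products listed in the table; the object name `aic_table` is illustrative.

```r
# Likelihoods for the five-squirrel data, copied from the products in the table
L <- c(M1  = (1/2)^5,
       M2  = (3/4)^3 * (1/4)^2,
       M2b = (3/5)^3 * (2/5)^2,
       M3  = 1 * 1 * (1/2) * (1/2) * 0,   # one observation is impossible under M3
       M3b = (1/2)^2 * (2/3)^2 * (1/3))

# Number of parameters in each model
k <- c(1, 1, 1, 2, 2)

# AIC = -2 log(L) + 2k; log(0) is -Inf in R, so M3 gets an infinite AIC
aic_table <- data.frame(L, k, AIC = -2 * log(L) + 2 * k)
aic_table
```

Because a single impossible observation drives the whole product to zero, M3 is ruled out no matter how well it fit the first four squirrels.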
\(\widehat p = 3/5\) is the maximum likelihood estimate of the probability that a squirrel is light morph.
model | probs | \({\cal L} = \prod P(X|M)\) | \({\cal L}\) |
---|---|---|---|
M1 | \(p = {1\over2}\) | \({1\over2} \times {1\over2} \times {1\over2} \times {1\over2} \times {1\over2}\) | 0.03125 |
M2 | \(p = {3\over 4}\) | \({3\over4} \times {3\over4} \times {3\over4} \times {1\over4} \times {1\over4}\) | 0.02637 |
M2b | \(p = {3\over 5}\) | \({3\over5} \times {3\over5} \times {3\over5} \times {2\over5} \times {2\over5}\) | 0.0346 |
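Rather than comparing a handful of candidate values of \(p\), the likelihood can be maximized numerically. The sketch below assumes the five squirrels are coded as 1 = light (three of them) and 0 = dark (two of them), and uses base R's `optimize()`; the names `x`, `lik` and `fit` are illustrative.

```r
# Five squirrels: 3 light (1) and 2 dark (0)
x <- c(1, 1, 1, 0, 0)

# Likelihood of a single shared probability p of being light
lik <- function(p) prod(dbinom(x, size = 1, prob = p))

# Search the interval [0, 1] for the value of p that maximizes the likelihood
fit <- optimize(lik, interval = c(0, 1), maximum = TRUE)

fit$maximum    # numerical estimate of p-hat, approximately 3/5
fit$objective  # likelihood at the maximum, approximately 0.0346
mean(x)        # for this model the MLE is simply the proportion of light squirrels
```

For a single-probability model like this one, the numerical search is not strictly needed, since the maximum is at the observed proportion; the same approach carries over to models with no closed-form solution.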
Likelihoods allow you to FIT models (i.e., estimate parameters) via maximization;
They are essential for model selection, e.g. with AIC;
They underlie Bayesian approaches;
They are used throughout Ecology!