There are THREE exercises in this lab to submit by midnight Friday, March 1.

Paramecia are at least as cute as sea otters.

Loading and exploring Paramecium data

The paramecium.csv data set contains densities of paramecia (unicellular ciliates), in unspecified units, measured daily in a petri dish with limited resources.

paramecium <- read.csv("paramecium.csv")

Look at the columns in this data:

head(paramecium)
##               Species Day         Nt       Nt1        dN         r
## 1 Paramecium caudatum   2   5.849284        NA        NA        NA
## 2 Paramecium caudatum   3  13.672890  5.849284  7.823607 1.3375324
## 3 Paramecium caudatum   4         NA 13.672890        NA        NA
## 4 Paramecium caudatum   5  35.697880        NA        NA        NA
## 5 Paramecium caudatum   6  73.947840 35.697880 38.249959 1.0714910
## 6 Paramecium caudatum   7 103.855191 73.947840 29.907352 0.4044385

Note the output of the following commands:

nrow(paramecium)
ncol(paramecium)
dim(paramecium)
names(paramecium)

The columns are:

  • Species
  • Day of experiment
  • Nt - number (or density) of paramecia now, \(N_t\)
  • Nt1 - number one time step previous, \(N_{t-1}\)
  • dN - change in number, \(\Delta N = N_t - N_{t-1}\)
  • r - per-capita growth rate, \(\Delta N / N_{t-1}\)
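To see exactly how the derived columns relate, we can recompute them from the first two rows of the head() output above (the numbers are copied from that table; this check is not part of the lab script):

```r
# recompute dN and r from Nt and Nt1, using the first two rows above
Nt  <- c(5.849284, 13.672890)
Nt1 <- c(NA, 5.849284)
dN  <- Nt - Nt1      # change in density since the previous day
r   <- dN / Nt1      # per-capita growth rate
r[2]                 # compare with r = 1.3375324 in the table
```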

Fit Method I: By Hand

Open the script file logisticfitscript.R (you can click on the file name, and either download, or just paste into a NEW script). It contains just the following code:

paramecium <- read.csv("paramecium.csv")
N.logistic <- function(x, N0, K, r0) K/(1 + ((K - N0)/N0)*exp(-r0*x))

# try different values!
N0 <- 10
K <- 150
r0 <- 1

plot(Nt ~ Day, data = paramecium)
curve(N.logistic(x, N0, K, r0), add = TRUE, col = 2, lwd = 2)

The initial values guessed for N0, K, and r0 are not very good.

Lab Exercise 1

Play around with those numbers and see how close you can come (visually) to fitting the curve. Report your estimates for \(N_0\), \(K\), and \(r_0\) and include a plot of the fit.

Fit Method II: Using \(r(N)\)

Plotting N, dN, and r

Let's plot the three figures we talked about in class: \(N\) as a function of time, \(\Delta N\) as a function of \(N\), and \(r\) as a function of \(N\).

These roughly show the patterns we expect of a logistic growth model: sigmoidal growth in time, parabolic growth as a function of density, and linear decrease in growth as a function of density.
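The three panels can be sketched with a short simulated logistic series (the parameter values below are illustrative stand-ins, not fits to the paramecium data); with the real data loaded, the same plot() calls apply with the Nt, dN, and r columns in place of the simulated vectors:

```r
# simulate a logistic trajectory, then draw the three diagnostic plots
days <- 0:15
K <- 200; r0 <- 1; N0 <- 2                         # illustrative values only
N  <- K / (1 + ((K - N0) / N0) * exp(-r0 * days))  # logistic solution
dN <- c(NA, diff(N))                               # daily change in density
r  <- dN / c(NA, head(N, -1))                      # per-capita growth rate
par(mfrow = c(1, 3))
plot(days, N, type = "b", main = "N over time")
plot(head(N, -1), dN[-1], main = "dN vs. N")
plot(head(N, -1), r[-1], main = "r vs. N")
```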

Linear model of r

One form of the logistic growth model simply says that:

\[ r(N) = r_0 - {r_0 \over K} N\]

So if we do a straightforward linear regression, we can estimate both \(r_0\) and \(K\):

lm(r ~ Nt1, data = paramecium)
## 
## Call:
## lm(formula = r ~ Nt1, data = paramecium)
## 
## Coefficients:
## (Intercept)          Nt1  
##    1.088436    -0.005205

This suggests that \(r_0\) is roughly 1.09. We write: \[\widehat{r_0} = 1.09\]

What does the slope tell us? We can convert it to a carrying capacity, since: \[\beta = -{r_0 \over K}\]

Solving for \(\widehat{K}\) we get: \[\widehat{K} = -\widehat{r_0}/\widehat{\beta} = 1.0884 / 0.0052 \approx 209.1\]
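That conversion can be checked directly from the printed coefficients (the numbers below are copied from the lm() output above; with the data loaded, coef(lm(r ~ Nt1, data = paramecium)) gives the same values):

```r
# convert the fitted intercept and slope into r0 and K
# (values copied from the lm() output above)
r0.hat   <- 1.088436       # intercept = r0
beta.hat <- -0.005205      # slope = -r0/K
K.hat    <- -r0.hat / beta.hat
round(K.hat, 1)            # roughly 209.1
```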

Here is a plot of that linear model:

plot(r ~ Nt1, data = paramecium, main = "Growth rate vs. density")
abline(lm(r ~ Nt1, data = paramecium), col = 2, lwd = 2)

Lab Exercise 2a

Generate a plot of this fit of logistic growth over time on top of the paramecium data by using exactly the code in Exercise 1, but plugging in the estimates for \(K\) and \(r_0\). You’ll have to put in your own guess for \(N_0\). What is that guess?

Standard errors

Note that we can, again, get standard errors around these coefficients:

r.lm <- lm(r ~ Nt1, data = paramecium)
summary(r.lm)
## 
## Call:
## lm(formula = r ~ Nt1, data = paramecium)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.30712 -0.07164  0.02884  0.11450  0.27954 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.0884356  0.1218525   8.932 4.43e-06 ***
## Nt1         -0.0052052  0.0007293  -7.137 3.15e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1899 on 10 degrees of freedom
##   (12 observations deleted due to missingness)
## Multiple R-squared:  0.8359, Adjusted R-squared:  0.8195 
## F-statistic: 50.94 on 1 and 10 DF,  p-value: 3.153e-05
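A 95% interval is assembled from an estimate, its standard error, and the residual degrees of freedom; confint(r.lm) computes such intervals directly. As a sketch, here is the hand calculation for the slope (values copied from the summary table above):

```r
# 95% CI for the slope coefficient, built from the summary table above
est <- -0.0052052
se  <- 0.0007293
df  <- 10                              # residual degrees of freedom
ci  <- est + c(-1, 1) * qt(0.975, df) * se
ci
```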

Lab Exercise 2b

Report a point estimate and 95% confidence interval for the intrinsic growth rate \(r_0\).

Overall this method is … OK. But it doesn't provide an estimate for \(N_0\).

Fit Method III: Non-linear least squares

One method of fitting a curve is to use the “least squares” method, i.e. find the parameters for which the squared deviation of the data from the curve is minimized. This is an optimization problem that the computer solves numerically.

Here’s a fun illustration / simulation of how “least squares” works: https://seeing-theory.brown.edu/regression-analysis/index.html

The function that does this in R is called nls(), for "non-linear least squares", in contrast to the linear regression model above. To use it, we supply our function for the solution of the logistic curve and provide initial values for the parameters. The basic syntax is otherwise similar: the formula symbol is always ~, the response Nt goes on its left, and on its right is the logistic function of time (Day), followed by the parameters to be estimated.

N.logistic <- function(x, N0, K, r0) K/(1 + ((K - N0)/N0)*exp(-r0*x))

nls(Nt ~ N.logistic(Day, N0, K, r0), data = paramecium, 
    start = list(N0 = 1, K = 200, r0 = 0.75))
## Nonlinear regression model
##   model: Nt ~ N.logistic(Day, N0, K, r0)
##    data: paramecium
##       N0        K       r0 
##   1.3087 222.3641   0.7075 
##  residual sum-of-squares: 2172
## 
## Number of iterations to convergence: 8 
## Achieved convergence tolerance: 8.406e-06

As usual, we can save the results to an object and get a summary table with the fits:

summary(nls(Nt ~ N.logistic(Day, N0, K, r0), data = paramecium, 
    start = list(N0 = 1, K = 200, r0 = 0.75)))
## 
## Formula: Nt ~ N.logistic(Day, N0, K, r0)
## 
## Parameters:
##     Estimate Std. Error t value Pr(>|t|)    
## N0   1.30870    0.85195   1.536    0.148    
## K  222.36412    5.22413  42.565  2.4e-15 ***
## r0   0.70747    0.09028   7.836  2.8e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 12.92 on 13 degrees of freedom
## 
## Number of iterations to convergence: 8 
## Achieved convergence tolerance: 8.406e-06
##   (8 observations deleted due to missingness)
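As with the linear model, intervals can be assembled from the summary table (estimate, standard error, and the 13 residual degrees of freedom printed above). A sketch for \(K\), with the numbers copied from the output:

```r
# Wald-style 95% CI for K, built from the nls summary table above
est <- 222.36412
se  <- 5.22413
ci  <- est + c(-1, 1) * qt(0.975, df = 13) * se
ci
```

In practice you would save the fit to an object and ask R for intervals; the hand calculation just shows where the numbers come from.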

Lab Exercise 3

  1. Generate a plot of this third fit using exactly the code in Exercise 1, but plugging in estimates for \(K\), \(r_0\) and \(N_0\).
  2. Report 95% confidence intervals around all three estimates.
  3. Which is the “best” method? Why?