I. Basic manipulations of movement data

  1. Provide a brief summary of the data - what is it, where did it come from,

In part to illustrate some places where movement data are “tucked away” in R, here’s an example of a mouflon (Ovis gmelini) in the adehabitatLT package. This particular mouflon was tracked in 2003 somewhere in southern France, data from the Office National de la Chasse et de la Faune Sauvage - the national game and wildlife agency of France.

  1. Load the data, convert the coordinates to complex numbers, plot the data (with a fixed aspect ratio)

The following bit of code extracts one of two mouflons in the package, retaining only the x, y and date (time) columns, which as all we need. Note that the time vector is already in POSIX format.

require(adehabitatLT)
# See all the data in the package by running:
## >  data(package = "adehabitatLT")
data(mouflon)  # loads the data.  
mouflon <- as.data.frame(mouflon[[1]][,1:3])
str(mouflon)
## 'data.frame':    135 obs. of  3 variables:
##  $ x   : num  650634 650569 650589 650554 650468 ...
##  $ y   : num  1844951 1844938 1844963 1844959 1845157 ...
##  $ date: POSIXct, format: "2003-11-02 06:40:00" "2003-11-02 07:00:00" ...

There is a bit of an issue with some missing observations in these data

table(is.na(mouflon$x))
## 
## FALSE  TRUE 
##   125    10

A simple way to tidy those (for our purposes) is simply:

mouflon <- na.omit(mouflon)

Creating a complex vector and plotting the movement:

mouflon$z <- mouflon$x + 1i * mouflon$y
plot(mouflon$z, asp = 1, type = "o")

  1. Report some basic stats: how many observations are there; what is the sampling rate (aka duty cycle); what is the total extent in the east-west direction and north-south direction?

Check out this R markdown trick for reporting stats. If I type:

There are `r nrow(mouflon)` locations in the data base, spanning `r round(diff(range(mouflon$x)))` m East-West, and `r round(diff(range(mouflon$y)))` m North-South. 

The output is: There are 125 locations in the data base, spanning 695 m East-West, and 1044 m North-South., and so on. The other important piece of infromation I asked for was the sampling rate, which is obtained by:

dT <- difftime(mouflon$date[-1], mouflon$date[-nrow(mouflon)], "minutes")
table(round(dT,-1))
## 
##  20  40 140 
## 119   4   1

almost all of the time intervals are 20 minutes, with a few 40 minute intervals (because of single missing locations), and one of 140 minutes (because of 6 missing locations).

  1. Compute the step lengths (\(S\)), report the mean, standard deviation, and draw a histogram,
dS <- Mod(diff(mouflon$z))
mean(dS)
## [1] 49.45847
sd(dS)
## [1] 50.471
hist(dS, breaks = 50, col = "grey",xlab = "step length (m)")

So, in 20 minutes the mouflon typically moves 50 m (sd 50 m). POP QUIZ: What is a distribution where the mean and standard deviation are equal? Note also that since not ALL step durations are 20 minutes, this is a slightly biased analysis.

mouflon[which.max(dS) + c(-1:1),]
##            x       y                date               z
## 99  650801.5 1844614 2003-11-03 15:20:00 650802+1844614i
## 100 650704.7 1844669 2003-11-03 15:40:00 650705+1844669i
## 101 650755.1 1844946 2003-11-03 16:00:00 650755+1844946i
  1. Compute the absolute orientations of the steps (\(\Phi\)), report the mean (\(\widehat{\mu} = \overline{\Phi}\)), the mean cosine (\(\widehat{\rho} = \overline{\cos(\Phi)}\)), and draw a histogram and rose diagram of the angles.

The key is to take the argument of the steps, i.e. of diff(z):

Phi <- Arg(diff(mouflon$z))

The rest follows directly

mean(Phi)
## [1] -0.1762241
mean(cos(Phi))
## [1] 0.001033232
hist(Phi, breaks = 20, col = "grey")
require(circular)
rose.diag(Phi, bins = 24, prop = 2, col = "grey")

The angles are more less uniformy distributed between \(-\pi\) and \(\pi\) - as expected. The mean AND mean cosine are near 0, indiciating no overall orientation to the movement.

  1. Perform the steps above for turning angles (\(\Theta\)).

The turning angles are the differences in the absolute orientations:

Theta <- diff(Phi)

The rest follows directly as abbove:

mean(Theta)
## [1] 0.03993427
mean(cos(Theta))
## [1] -0.1625449
hist(Theta, breaks = 20, col = "grey")
rose.diag(Theta, bins = 24, prop = 2, col = "grey")

In this particular example, the turning angles are ALSO surprisingly uniform, and the mean cosine (\(\rho\)) is amazingly negative. In MOST cases, it will be positive, to reflect a general persistence of movement in the same direction. his is NOT typical of movement data, and (presumably) reflects the fact that the mouflon are (a) in a highly mountainous and often range limited. If you look at the movement data, you will see that there are highly stationary states and a few displacements. We can explore that hypothesis a little bit by filterin the data to larger step lengths. Thus:

steps <- diff(mouflon$z)
long_steps <- steps[Mod(steps) > 50]
mean(cos(diff(Arg(long_steps))))
## [1] 0.1606068
rose.diag(cos(diff(Arg(long_steps))), prop = 2, col = "grey")

Ah! Much better. The \(\widehat{\rho} > 0\), and most of the angles are “forward” looking.

  1. What do you think the statistic \(\widehat{\rho}\) tells you?

If \(\rho = 1\), all the movement is in the same direction. If \(\rho = 0\) the turning angles are completely random. If \(\rho < 0\) the movements tend to go back and forth.

Mouflon - by Karl Joseph Brodtmann

II. Simple statistics on spatial data (with some self-learning).

I emailed you a file, AmericanBeech.csv which contains the coordinates of American beech trees (Fagus grandifola) measured at SUNY-ESF’s Adirondack Research Station in 1985, 2000 and 2009 (data courtesy of M. Dovciak).

  1. Load those data, convert the coordinates to complex numbers, and plot.
## beech <- read.csv("AmericanBeech.csv")
str(beech)
## 'data.frame':    219 obs. of  5 variables:
##  $ EAST : num  17 46.9 42 19.5 32.6 ...
##  $ NORTH: num  10.5 49.9 16.3 34.9 15.7 ...
##  $ DBH85: num  54.9 42.7 46.4 45.2 44 47.8 41.4 46.4 44.8 42.1 ...
##  $ DBH00: num  57 49.5 50.9 48.5 47.8 49 45.5 46.8 46.9 42.8 ...
##  $ DBH09: num  58.5 53.3 52.8 51.2 50.8 50.3 49.1 48.1 47.2 47 ...
Z <- beech$EAST + 1i*beech$NORTH
plot(Z, asp = 1)

  1. Use outer and apply to compute the distance from each tree to it’s nearest neighbor,

To use outer(), you need a function that takes two arguments and returns a single value. But those arguments can be complex! So the distance between two points functions can be:

getDistance <- function(z1, z2) Mod(z2 - z1)

Applying this across all combinations of points giv es:

D.matrix <- outer(Z, Z, getDistance)

Let’s take a quick peek at this matix:

D.matrix[1:5,1:5]
##          [,1]     [,2]      [,3]     [,4]      [,5]
## [1,]  0.00000 49.47798 25.663079 24.53798 16.461290
## [2,] 49.47798  0.00000 33.964640 31.22298 37.052843
## [3,] 25.66308 33.96464  0.000000 29.16821  9.407665
## [4,] 24.53798 31.22298 29.168213  0.00000 23.209425
## [5,] 16.46129 37.05284  9.407665 23.20942  0.000000

This matrix is symmetric, because the distance between tree 1 and tree 2 is equal to the distance between tree 2 and tree 1. And it has 0’s along the diagonal, because the distance between a tree and itself is 0. Actually, any symmetric matrix with 0’s along the diagonals can be considered a distance matrix (or - formally - a distance matrix in a metric space). The diagonal is not very interesting, so we could replace all of those 0’s with NA’s:

diag(D.matrix) <- NA

Now that we have a distance matrix, we can use apply along the rows (or columns) get the distance of the nearest tree:

NN.distances <- apply(D.matrix, 1, min, na.rm = TRUE)

If you want to do everthing in one line, it would look like this:

NN.distances <- apply(outer(Z, Z, function(z1, z2) Mod(z1-z2)), 
                      1, function(x) min(x[x>0]))

High-level stuff! Two bespoke functions defined within a single line of code!

  1. Report the mean nearest neighbor distance, the inter-quartile intervals, and plot a histogram. Describe.
summary(NN.distances)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
## 0.02279 1.32507 1.91916 1.94243 2.40744 5.91733
hist(NN.distances, breaks = 40, col = "grey")

The mean & median nearest neighber distance between trees is about 2 meters, but it looks like there might be some inhibition in the growth of the beech trees, i.e. below the 2 meter peak, it is much less likely to find another tree within, say 1 meter. There are real, biological reasons for this, of course, that even non-plant ecologists might surmise!