In part to illustrate some places where movement data are “tucked away” in R, here’s an example of a mouflon (Ovis gmelini) in the
adehabitatLT
package. This particular mouflon was tracked in 2003 somewhere in southern France, data from the Office National de la Chasse et de la Faune Sauvage - the national game and wildlife agency of France.
The following bit of code extracts one of two mouflons in the package, retaining only the x, y and date (time) columns, which as all we need. Note that the time vector is already in POSIX format.
require(adehabitatLT)
# See all the data in the package by running:
## > data(package = "adehabitatLT")
data(mouflon) # loads the data.
mouflon <- as.data.frame(mouflon[[1]][,1:3])
str(mouflon)
## 'data.frame': 135 obs. of 3 variables:
## $ x : num 650634 650569 650589 650554 650468 ...
## $ y : num 1844951 1844938 1844963 1844959 1845157 ...
## $ date: POSIXct, format: "2003-11-02 06:40:00" "2003-11-02 07:00:00" ...
There is a bit of an issue with some missing observations in these data
table(is.na(mouflon$x))
##
## FALSE TRUE
## 125 10
A simple way to tidy those (for our purposes) is simply:
mouflon <- na.omit(mouflon)
Creating a complex vector and plotting the movement:
mouflon$z <- mouflon$x + 1i * mouflon$y
plot(mouflon$z, asp = 1, type = "o")
Check out this R markdown trick for reporting stats. If I type:
There are `r nrow(mouflon)` locations in the data base, spanning `r round(diff(range(mouflon$x)))` m East-West, and `r round(diff(range(mouflon$y)))` m North-South.
The output is: There are 125 locations in the data base, spanning 695 m East-West, and 1044 m North-South., and so on. The other important piece of infromation I asked for was the sampling rate, which is obtained by:
dT <- difftime(mouflon$date[-1], mouflon$date[-nrow(mouflon)], "minutes")
table(round(dT,-1))
##
## 20 40 140
## 119 4 1
almost all of the time intervals are 20 minutes, with a few 40 minute intervals (because of single missing locations), and one of 140 minutes (because of 6 missing locations).
dS <- Mod(diff(mouflon$z))
mean(dS)
## [1] 49.45847
sd(dS)
## [1] 50.471
hist(dS, breaks = 50, col = "grey",xlab = "step length (m)")
So, in 20 minutes the mouflon typically moves 50 m (sd 50 m). POP QUIZ: What is a distribution where the mean and standard deviation are equal? Note also that since not ALL step durations are 20 minutes, this is a slightly biased analysis.
mouflon[which.max(dS) + c(-1:1),]
## x y date z
## 99 650801.5 1844614 2003-11-03 15:20:00 650802+1844614i
## 100 650704.7 1844669 2003-11-03 15:40:00 650705+1844669i
## 101 650755.1 1844946 2003-11-03 16:00:00 650755+1844946i
The key is to take the argument of the steps, i.e. of
diff(z)
:
Phi <- Arg(diff(mouflon$z))
The rest follows directly
mean(Phi)
## [1] -0.1762241
mean(cos(Phi))
## [1] 0.001033232
hist(Phi, breaks = 20, col = "grey")
require(circular)
rose.diag(Phi, bins = 24, prop = 2, col = "grey")
The angles are more less uniformy distributed between \(-\pi\) and \(\pi\) - as expected. The mean AND mean cosine are near 0, indiciating no overall orientation to the movement.
The turning angles are the differences in the absolute orientations:
Theta <- diff(Phi)
The rest follows directly as abbove:
mean(Theta)
## [1] 0.03993427
mean(cos(Theta))
## [1] -0.1625449
hist(Theta, breaks = 20, col = "grey")
rose.diag(Theta, bins = 24, prop = 2, col = "grey")
In this particular example, the turning angles are ALSO surprisingly uniform, and the mean cosine (\(\rho\)) is amazingly negative. In MOST cases, it will be positive, to reflect a general persistence of movement in the same direction. his is NOT typical of movement data, and (presumably) reflects the fact that the mouflon are (a) in a highly mountainous and often range limited. If you look at the movement data, you will see that there are highly stationary states and a few displacements. We can explore that hypothesis a little bit by filterin the data to larger step lengths. Thus:
steps <- diff(mouflon$z)
long_steps <- steps[Mod(steps) > 50]
mean(cos(diff(Arg(long_steps))))
## [1] 0.1606068
rose.diag(cos(diff(Arg(long_steps))), prop = 2, col = "grey")
Ah! Much better. The \(\widehat{\rho} > 0\), and most of the angles are “forward” looking.
If \(\rho = 1\), all the movement is in the same direction. If \(\rho = 0\) the turning angles are completely random. If \(\rho < 0\) the movements tend to go back and forth.
I emailed you a file, AmericanBeech.csv
which contains
the coordinates of American beech trees (Fagus grandifola)
measured at SUNY-ESF’s Adirondack Research Station in 1985, 2000 and
2009 (data courtesy of M.
Dovciak).
## beech <- read.csv("AmericanBeech.csv")
str(beech)
## 'data.frame': 219 obs. of 5 variables:
## $ EAST : num 17 46.9 42 19.5 32.6 ...
## $ NORTH: num 10.5 49.9 16.3 34.9 15.7 ...
## $ DBH85: num 54.9 42.7 46.4 45.2 44 47.8 41.4 46.4 44.8 42.1 ...
## $ DBH00: num 57 49.5 50.9 48.5 47.8 49 45.5 46.8 46.9 42.8 ...
## $ DBH09: num 58.5 53.3 52.8 51.2 50.8 50.3 49.1 48.1 47.2 47 ...
Z <- beech$EAST + 1i*beech$NORTH
plot(Z, asp = 1)
outer
and apply
to compute the
distance from each tree to it’s nearest neighbor,To use
outer()
, you need a function that takes two arguments and returns a single value. But those arguments can be complex! So the distance between two points functions can be:
getDistance <- function(z1, z2) Mod(z2 - z1)
Applying this across all combinations of points giv es:
D.matrix <- outer(Z, Z, getDistance)
Let’s take a quick peek at this matix:
D.matrix[1:5,1:5]
## [,1] [,2] [,3] [,4] [,5]
## [1,] 0.00000 49.47798 25.663079 24.53798 16.461290
## [2,] 49.47798 0.00000 33.964640 31.22298 37.052843
## [3,] 25.66308 33.96464 0.000000 29.16821 9.407665
## [4,] 24.53798 31.22298 29.168213 0.00000 23.209425
## [5,] 16.46129 37.05284 9.407665 23.20942 0.000000
This matrix is symmetric, because the distance between tree 1 and tree 2 is equal to the distance between tree 2 and tree 1. And it has 0’s along the diagonal, because the distance between a tree and itself is 0. Actually, any symmetric matrix with 0’s along the diagonals can be considered a distance matrix (or - formally - a distance matrix in a metric space). The diagonal is not very interesting, so we could replace all of those 0’s with NA’s:
diag(D.matrix) <- NA
Now that we have a distance matrix, we can use
apply
along the rows (or columns) get the distance of the nearest tree:
NN.distances <- apply(D.matrix, 1, min, na.rm = TRUE)
If you want to do everthing in one line, it would look like this:
NN.distances <- apply(outer(Z, Z, function(z1, z2) Mod(z1-z2)),
1, function(x) min(x[x>0]))
High-level stuff! Two bespoke functions defined within a single line of code!
summary(NN.distances)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.02279 1.32507 1.91916 1.94243 2.40744 5.91733
hist(NN.distances, breaks = 40, col = "grey")
The mean & median nearest neighber distance between trees is about 2 meters, but it looks like there might be some inhibition in the growth of the beech trees, i.e. below the 2 meter peak, it is much less likely to find another tree within, say 1 meter. There are real, biological reasons for this, of course, that even non-plant ecologists might surmise!