Introduction

During this workshop, we will demonstrate the TuktuTools package, which has been developed for studying Tuktu (caribou) movement ecology and spatial patterns.

To date, this package contains functions that prep and filter movement data:

  • process_moveNWT - processes data from Movebank;
  • removeoutliers - flags fixes considered as outliers;
  • prepData - prepares and filters data to a specific period of time and/or a minimum number of fixes per day;
  • scan_tracks - visualizes individual paths through time

The package also contains functions that analyze spatial patterns, including: - getSpeed - computes individual movement rates, displacements, speeds; - getDailyMean - estimates individual mean daily location (i.e., the mean x and y coordinates of all the daily locations for a given individual); - getLoCoH and getKernelUD - estimates ranging areas with Local Convex Hulls (LoCoH) or Kernel Utilization Distributions (KUD); - getPairwiseDistance and getPairwiseOverlap - estimates daily pairwise distance or pairwise overlap between pairs of individuals; - estimateCalving - estimates calving status (non-calving or calving, calving with survival or calving with calf death), calving timing, and calving location for given females, using movement rate; - estimateMigration_stan - estimates population-level migration timing using STAN

The package contains the following data:

  • caribou - anonymized movement data for 4 individual Barren-ground caribou;
  • simulated_migrations - simulated movement tracks for 18 individuals

Most of the functions in this package have been developed to handle simple features. Simple features, supported by the package sf, are georeferenced spatial objects, allowing to easily project data on a map, transform the projection and manipulate the data (e.g., link points into lines, union polygons, etc.).

Installation

To install the current version of this package in R:

install.packages("devtools")
devtools::install_github("ocouriot/TuktuTools", build_vignettes = TRUE)

Load the package:

require(TuktuTools)

Loading Data

The TuktuTools package contains real movement data from 4 barren-grond caribou from the Bathurst herd in the Northwest Territories, which have been anonymized and shifted spatially and temporally. The package also contains data of 18 simulated tracks of caribou, during the spring migration.

To see a list of these datasets, enter:

data(package = 'TuktuTools')

The Three (or Four) First Operations!

The caribou dataset contains real movement data for 4 caribou individuals. It contains the ID of the individuals, the sex, as well as the Time and Lon,Lat (in WGS 84, epsg: 4326) and x,y (UTM zone 10N, epsg: 32610) coordinates of their GPS locations.

data(caribou)
head(caribou)
##       ID sex                Time Year       Lon      Lat         x       y
## 1 Dancer   f 2002-04-01 00:00:00 2002 -133.8427 62.77028 -51589.97 7006600
## 2 Dancer   f 2002-04-01 08:00:00 2002 -133.8394 62.76716 -51480.43 7006228
## 3 Dancer   f 2002-04-01 16:00:00 2002 -133.8453 62.77767 -51584.51 7007437
## 4 Dancer   f 2002-04-02 00:00:00 2002 -133.8643 62.80010 -52121.93 7010074
## 5 Dancer   f 2002-04-02 08:00:00 2002 -133.8621 62.79918 -52024.50 7009953
## 6 Dancer   f 2002-04-02 16:00:00 2002 -133.8625 62.80007 -52030.49 7010055

These data are very typical examples of movement data. The key columns are individual identifies (ID), a time stamp (Time), and coordinates. Here there are longitude and latitude coordinates.

The following steps are absolutely essential to master - in general - for dealing with movement data. They require the following three packages: sf, lubridate and mapview.

1. Simple features

Currently, the most powerful tool for working with data is the simple feature (sf) package, which is a versatile, flexible, relatively easy to use structure for spatial data. Creating a “simple feature” object with st_as_sf:

require(sf)
caribou.sf <- st_as_sf(caribou, 
                       coords = c("Lon","Lat"), crs = 4326)
caribou.sf
## Simple feature collection with 18074 features and 6 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -134.4759 ymin: 62.60674 xmax: -122.6872 ymax: 66.12819
## Geodetic CRS:  WGS 84
## First 10 features:
##        ID sex                Time Year         x       y
## 1  Dancer   f 2002-04-01 00:00:00 2002 -51589.97 7006600
## 2  Dancer   f 2002-04-01 08:00:00 2002 -51480.43 7006228
## 3  Dancer   f 2002-04-01 16:00:00 2002 -51584.51 7007437
## 4  Dancer   f 2002-04-02 00:00:00 2002 -52121.93 7010074
## 5  Dancer   f 2002-04-02 08:00:00 2002 -52024.50 7009953
## 6  Dancer   f 2002-04-02 16:00:00 2002 -52030.49 7010055
## 7  Dancer   f 2002-04-03 00:00:00 2002 -51569.55 7009647
## 8  Dancer   f 2002-04-03 08:00:00 2002 -53286.09 7008459
## 9  Dancer   f 2002-04-03 16:00:00 2002 -50269.61 7005972
## 10 Dancer   f 2002-04-04 00:00:00 2002 -49748.15 7005049
##                      geometry
## 1  POINT (-133.8427 62.77028)
## 2  POINT (-133.8394 62.76716)
## 3  POINT (-133.8453 62.77767)
## 4   POINT (-133.8643 62.8001)
## 5  POINT (-133.8621 62.79918)
## 6  POINT (-133.8625 62.80007)
## 7  POINT (-133.8523 62.79717)
## 8  POINT (-133.8815 62.78412)
## 9  POINT (-133.8153 62.76672)
## 10 POINT (-133.8022 62.75936)

The crs = 4326 refers to the coordinate system being - basically - Longitude and Latitude on a nearly spherical Earth. This particular CRS is detailed here, and the “4326” is the so-called “EPSG” code. This is the coordinate system used by all GPS satellite navigation systems.

1b. Projecting

In general, we do not want to work in Longitude-Latitude, but in a projected coordinate system such that the unit of North-South and East-West is in meters. For these data, for example, since we’re working in northern Canada we might choose EPSG:3347, aka Statistics Canada Lambert:

caribou.sf <- st_transform(caribou.sf, 3347)
caribou.sf
## Simple feature collection with 18074 features and 6 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: 4268069 ymin: 3484118 xmax: 4863478 ymax: 3771396
## Projected CRS: NAD83 / Statistics Canada Lambert
## First 10 features:
##        ID sex                Time Year         x       y
## 1  Dancer   f 2002-04-01 00:00:00 2002 -51589.97 7006600
## 2  Dancer   f 2002-04-01 08:00:00 2002 -51480.43 7006228
## 3  Dancer   f 2002-04-01 16:00:00 2002 -51584.51 7007437
## 4  Dancer   f 2002-04-02 00:00:00 2002 -52121.93 7010074
## 5  Dancer   f 2002-04-02 08:00:00 2002 -52024.50 7009953
## 6  Dancer   f 2002-04-02 16:00:00 2002 -52030.49 7010055
## 7  Dancer   f 2002-04-03 00:00:00 2002 -51569.55 7009647
## 8  Dancer   f 2002-04-03 08:00:00 2002 -53286.09 7008459
## 9  Dancer   f 2002-04-03 16:00:00 2002 -50269.61 7005972
## 10 Dancer   f 2002-04-04 00:00:00 2002 -49748.15 7005049
##                   geometry
## 1  POINT (4268706 3594395)
## 2  POINT (4268630 3594028)
## 3  POINT (4269093 3595106)
## 4  POINT (4269837 3597598)
## 5  POINT (4269864 3597451)
## 6  POINT (4269906 3597540)
## 7  POINT (4270113 3596983)
## 8  POINT (4268108 3596753)
## 9  POINT (4269545 3593258)
## 10 POINT (4269569 3592233)

2. Mapview

To easily and interactively visualize spatial data, load the mapview package and simply run the mapview function on these spatial data:

require(mapview)
mapview(caribou.sf)

Click around! Experiment! Here is a version of a mapview that is colored by individual ID:

mapview(caribou.sf, zcol = "ID")

The following code is a bit cryptic (also to us), but it will work to turn the data into spatial lines rather than points. Note that we start using the “piping” logic from the magrittr package, which is convenient notation for passing data through multiple processing functions, and the dplyr package to process.

require(magrittr); require(dplyr)
caribou.lines <- caribou.sf %>% group_by(ID) %>% 
  summarize(do_union=FALSE) %>% st_cast("LINESTRING")
mapview(caribou.lines[,"ID"])

3. Taming time

Movement data consists not just of locations but, importantly, of time. If you first load data, you might have time show up in some sort of format which will (usually) be interpreted as a character string. The stable format for time objects in R is called POSIX - which stands for I have no idea. Converting character strings to POSIX is extremely simple and convenient with the lubridate package. This is not strictly necessary for the caribou data, where time is already POSIX. However, for the record, this is how it would be done if these were some observations:

SomeTime <- (caribou$Time[1:10] + ddays(1/12)) %>% as.character
##  [1] "2002-04-01 02:00:00" "2002-04-01 10:00:00" "2002-04-01 18:00:00"
##  [4] "2002-04-02 02:00:00" "2002-04-02 10:00:00" "2002-04-02 18:00:00"
##  [7] "2002-04-03 02:00:00" "2002-04-03 10:00:00" "2002-04-03 18:00:00"
## [10] "2002-04-04 02:00:00"
##  [1] "2002-04-01 02:00:00 UTC" "2002-04-01 10:00:00 UTC"
##  [3] "2002-04-01 18:00:00 UTC" "2002-04-02 02:00:00 UTC"
##  [5] "2002-04-02 10:00:00 UTC" "2002-04-02 18:00:00 UTC"
##  [7] "2002-04-03 02:00:00 UTC" "2002-04-03 10:00:00 UTC"
##  [9] "2002-04-03 18:00:00 UTC" "2002-04-04 02:00:00 UTC"

The ymd_hms function says: take the character string and separate into Year-Month-Date Hour-Minute-Second, but it totally doesn’t care what the separators are. Very useful!

Some handy summaries

Number of individuals and sex

unique(caribou$ID)
## [1] Dancer  Prancer Vixen   Comet  
## Levels: Dancer Prancer Vixen Comet

There are 4 individuals: Dancer, Prancer, Vixen and Comet.

Their sex:

with(caribou, unique(data.frame(ID, sex)))
##            ID sex
## 1      Dancer   f
## 8564  Prancer   f
## 11195   Vixen   f
## 13715   Comet   f

They are all females.

R range and duration of their monitoring?

What are the years each individual was monitored?

with(caribou, unique(data.frame(ID, Year)))
##            ID Year
## 1      Dancer 2002
## 2469   Dancer 2003
## 4240   Dancer 2004
## 8356   Dancer 2005
## 8564  Prancer 2014
## 9369  Prancer 2015
## 10464 Prancer 2016
## 11195   Vixen 2007
## 12019   Vixen 2008
## 13122   Vixen 2009
## 13715   Comet 2006
## 15390   Comet 2007
## 17799   Comet 2008

One approach is to use some dplyr commands. This function allows you apply functions to different groups (subsets) of a data set. Here’s an example:

require(dplyr)

caribou_summary <- caribou %>% group_by(ID) %>% 
  summarize(start= min(Time), end = max(Time)) %>%  
  mutate(duration = difftime(end, start, units = "days"))

We can visualize the monitoring duration for each individual.

ggplot(caribou_summary, aes(y = ID, xmin = start, xmax = end)) + 
    geom_linerange() 

Eah line on those figures represents the duration of the monitoring (x axis) for a given individual (y axis).

Fix rates

The fix rate, or the time lag between successive locations, can be extracted by using the difftime() function on the Time column. Again, this function needs to be applied to each individual separately. Here, we are subsetting the data set per ID, and applying a function which is adding a column to each subset. Note that since the vector of time difference is smaller than the vector of time, we add a missing value at the beginning of each vector, for each value to represent the difference in time to the previous location.

caribou <- caribou %>% group_by(ID) %>% 
  mutate(dtime = c(NA, difftime(Time[-1], Time[-length(Time)], units = "hours")))
head(caribou)
## # A tibble: 6 × 9
## # Groups:   ID [1]
##   ID     sex   Time                Year    Lon   Lat       x        y dtime
##   <fct>  <chr> <dttm>              <chr> <dbl> <dbl>   <dbl>    <dbl> <dbl>
## 1 Dancer f     2002-04-01 00:00:00 2002  -134.  62.8 -51590. 7006600.    NA
## 2 Dancer f     2002-04-01 08:00:00 2002  -134.  62.8 -51480. 7006228.     8
## 3 Dancer f     2002-04-01 16:00:00 2002  -134.  62.8 -51585. 7007437.     8
## 4 Dancer f     2002-04-02 00:00:00 2002  -134.  62.8 -52122. 7010074.     8
## 5 Dancer f     2002-04-02 08:00:00 2002  -134.  62.8 -52024. 7009953.     8
## 6 Dancer f     2002-04-02 16:00:00 2002  -134.  62.8 -52030. 7010055.     8

What are the statistics (min, max, mean, median, …) of this fix rate?

caribou$dtime %>% summary
##      Min.   1st Qu.    Median      Mean   3rd Qu.      Max.      NA's 
##   0.01667   1.00000   7.98333   4.70526   8.00000 160.00000         4

On average, the fix rate is ~4 hours (median is 8 hours), the shortest fix is 1 minute and the longest is 160 hours, or ~ 7 days.

Which animal is it and when did it occur?

# which row corresponds to the big fix rate
which(caribou$dtime > 159 & is.na(caribou$dtime) == FALSE)
## [1] 13428
# look at the rows around the one with a big fix rate
caribou[13426:13430,]
## # A tibble: 5 × 9
## # Groups:   ID [1]
##   ID    sex   Time                Year    Lon   Lat       x        y  dtime
##   <fct> <chr> <dttm>              <chr> <dbl> <dbl>   <dbl>    <dbl>  <dbl>
## 1 Vixen f     2009-04-11 08:00:00 2009  -134.  63.2 -35215. 7056887.   8   
## 2 Vixen f     2009-04-11 16:00:00 2009  -134.  63.2 -35511. 7056849.   8   
## 3 Vixen f     2009-04-18 08:00:00 2009  -134.  63.3 -31746. 7064711. 160   
## 4 Vixen f     2009-04-18 16:00:00 2009  -134.  63.3 -31696. 7064751.   8   
## 5 Vixen f     2009-04-19 00:01:00 2009  -134.  63.3 -31739. 7064807.   8.02

Vixen has missing data between April 11, 2009 and April 18, 2009.

Filtering data

Now that we know what the data look like, we can prepare and filter them depending on what we are examining.

Later in this workshop, we will identify calving timing and calving location for each female. Barren-ground caribou calve between mid-May to mid-June. Studies examining calving dates considered data between May 19 and July 7 (@cameronMovementbasedMethodsInfer2018; @couriotContinentalSynchronyLocal2023).

So let’s filter and prepare the data during this period for each individual.

The prepData function (1) filters data for a defined period, using a start date and an end date, irrespective of the year; (2) sets an average minimum number of fixes (relocations) per day, which exclude tracks for individuals having fewer daily fixes than this threshold; and (3) finally, it defines a maximum number of successive days of missing data. The two last parameters are optional, but help .

This function returns a data frame filtered as desired, and indicates the number of individual-year that have been excluded.

caribou_prepped <- prepData(caribou, start = "05-19", end = "07-07")
## Period clipped to 05-19 - 07-07
## Number of excluded individuals-years: 2

2 individual-year have been excluded. We can compare the individual-year with our original dataset, to know which individuals have been lost.

ID_year = with(caribou, paste(ID, Year))
ID_year_filtered = with(caribou_prepped, paste(ID, Year)) 

(exclude <- setdiff(ID_year, ID_year_filtered))
## [1] "Dancer 2005" "Comet 2008"

The durations of observations are shorter post-filtering.

Visualizing data

We covered above how to transform movement data to a spatial feature. It is highly recommended to do this before exploring some plotting tools:

caribou_prepped <- st_as_sf(caribou_prepped, coords = c("Lon", "Lat"), crs = 4326)

Scanning some tracks

TuktuTools contains a few functions to visualize animal tracks: scan_tracks. It is helpful to see individuals’ trajectories, as well as examine the variation of their x and y locations through time.

Here are some examples:

scan_tracks(caribou_prepped)

The animals are tracked only for brief periods of time with major gaps! Let’s try some separate years

scan_tracks(caribou_prepped %>% subset(Year == 2007), legend = TRUE)

scan_tracks(caribou_prepped %>% subset(Year == 2008), legend = TRUE)

There is a whole other fun data set in this package called simulated_migrations which can also be excellently visualized:

data(simulated_migrations)
scan_tracks(simulated_migrations)

Visualization on a map

We can visualize the trajectories on a static map as well, using the package basemap, which allows to download a base map as a raster. Note that it takes coordinates in WGS84 (crs = 4326), and return a map with a different crs (crs = 3857).

First, download the map using the basemaps::basemap_raster function. Important here, to limit the Longitude and Latitude range by your data, or use a bounding box.

bbox <- st_bbox(caribou_prepped) + c(-1,-1,1,1)
map <- basemaps::basemap_raster(ext = bbox, 
                                map_service = "osm",
                                map_type = "topographic")

You can collapse the caribou_prepped simple feature into lines as per the (cryptic) code above. Note that here, we group by ID and Year and - importantly - transform the CRS of the lines to the projection of the raster:

caribou_lines <- caribou_prepped %>% 
  group_by(ID, Year) %>% arrange(Time) %>%
  summarize(do_union = FALSE) %>%
  st_cast("LINESTRING") %>% 
  st_transform(raster::crs(map))

Finally, we Draw the map and add the individuals tracks with legend

raster::plotRGB(map, asp = 1, alpha = 100)
plot(st_geometry(caribou_lines), add = TRUE, 
     col = 1:4, lwd = 2)
legend("right", legend = levels(caribou_lines$ID), 
     col = 1:4, lwd = 2,  ncol=1)