Function description

The TuktuTools package provides a function, getWeatherAtLocation, that takes a known geographical location, downloads the raw text file of observations for the closest station from “https://www.ncei.noaa.gov/pub/data/ghcn/daily/gsn/”, and converts the daily average, maximum, and minimum temperatures into a data frame. The function also reports the distance between the given location and the corresponding station.
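
To give a feel for what "converting the raw text file into a data frame" involves, here is a minimal sketch of a parser for the fixed-width GHCN-Daily `.dly` layout (station ID in columns 1-11, year in 12-15, month in 16-17, element in 18-21, then 31 blocks of 8 characters whose first 5 hold the daily value). It assumes the downloaded files follow that layout. The function name `parse_dly_lines` is hypothetical and this is not necessarily how getWeatherAtLocation does it internally.

```r
# Hypothetical helper, not part of TuktuTools: parse raw GHCN-Daily ".dly" lines.
parse_dly_lines <- function(lines, elements = c("TAVG", "TMAX", "TMIN")) {
  # Keep only the temperature elements of interest (columns 18-21).
  lines <- lines[substr(lines, 18, 21) %in% elements]
  # Day d's value occupies 5 characters starting at column 22 + 8 * (d - 1).
  starts <- 22 + 8 * (0:30)
  records <- lapply(lines, function(line) {
    value <- as.integer(substring(line, starts, starts + 4))
    value[value %in% -9999L] <- NA       # -9999 marks a missing observation
    data.frame(Station = substr(line, 1, 11),
               Year    = as.integer(substr(line, 12, 15)),
               Month   = as.integer(substr(line, 16, 17)),
               Day     = 1:31,
               Element = substr(line, 18, 21),
               Value   = value / 10)     # tenths of degrees C -> degrees C
  })
  do.call(rbind, records)
}
```

Each input line becomes 31 rows, one per day of the month; days that do not exist (for example 31 February) simply come out as NA.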

Map GSN stations

The package includes a dataset, gsn_stations, with location information for all stations. Let’s first map the stations in the Alaska region, i.e. those with longitude < -130 and latitude > 60.

require(TuktuTools)
data("gsn_stations")
gsn_stations %>% head
##       station     lat      lon   alt                              name   GSN
## 1 ACW00011604 17.1167 -61.7833  10.1    ST JOHNS COOLIDGE FLD               
## 2 ACW00011647 17.1333 -61.7833  19.2    ST JOHNS                            
## 3 AE000041196 25.3330  55.5170  34.0    SHARJAH INTER. AIRP              GSN
## 4 AEM00041194 25.2550  55.3640  10.4    DUBAI INTL                          
## 5 AEM00041217 24.4330  54.6510  26.8    ABU DHABI INTL                      
## 6 AEM00041218 24.2620  55.6090 264.9    AL AIN INTL                         
##   StationID
## 1          
## 2          
## 3     41196
## 4     41194
## 5     41217
## 6     41218
gsn.sf <- st_as_sf(gsn_stations %>% subset(lon < -130 & lat > 60), 
                   coords = c("lon", "lat")) %>% st_set_crs(4326)

mapview(gsn.sf[,"station"])

Let’s define our location of interest and plot it to see which stations are nearest.

point <- data.frame(lon = -152.5, lat = 68.25) 
point.sf <- point %>% st_as_sf(coords = c("lon", "lat"), crs = st_crs(4326))

mapview(gsn.sf[,"station"]) + mapview(point.sf, col.regions = "red")
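
The map shows which stations are nearest, but we can also check this numerically. Below is a minimal sketch using the haversine (great-circle) formula on a spherical Earth of radius 6371 km. The helper `haversine_km` is hypothetical and is not the distance calculation used inside TuktuTools, so its values may differ slightly from the distances the package reports.

```r
# Hypothetical helper: great-circle distance in km between lon/lat points (degrees).
haversine_km <- function(lon1, lat1, lon2, lat2, radius_km = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  # pmin() guards against rounding pushing sqrt(a) just above 1.
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# Rank the Alaska-region stations by distance from the point of interest.
# This part only runs if TuktuTools (and its gsn_stations dataset) is installed.
if (requireNamespace("TuktuTools", quietly = TRUE)) {
  data("gsn_stations", package = "TuktuTools")
  alaska <- subset(gsn_stations, lon < -130 & lat > 60)
  alaska$dist_km <- haversine_km(-152.5, 68.25, alaska$lon, alaska$lat)
  print(head(alaska[order(alaska$dist_km), c("station", "name", "dist_km")], 5))
}
```

The function is vectorised over its station arguments, so a single call computes the distance from one point to every station in the table.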

Get weather at location

Now let’s run the function. It takes a few minutes. As it runs, notice that it also reports the distance between the given geographical location and the corresponding station.

realWeather <- getWeatherAtLocation(point$lon, point$lat, 
                          start = 2010, end = 2022) 
## [1] "1 USC00500270"
## The 1th closest station is USC00500270
## The distance is 31.655 km.
## Data to be covered from 2020 to 2022.
## [1] "2 USR0000APAM"
## The 2th closest station is USR0000APAM
## The distance is 54.658 km.
## Data to be covered from 2012 to 2022.
## [1] "3 USC00504683"
## [1] "4 USC00509859"
## [1] "5 USC00509858"
## [1] "6 USR0000ACHM"
## The 6th closest station is USR0000ACHM
## The distance is 100.017 km.
## Data to be covered from 2012 to 2022.
## [1] "7 USR0000ARMC"
## [1] "8 USR0000AKLP"
## [1] "9 USC00501497"
## The 9th closest station is USC00501497
## The distance is 122.884 km.
## Data to be covered from 2010 to 2022.
## [1] "10 USW00026537"
## [1] "11 USW00026508"
## [1] "12 USR0000AUMI"
## [1] "13 USS0049T03S"
## [1] "14 USC00503210"
## [1] "15 USC00506144"
## [1] "16 USC00502425"
## [1] "17 USW00096409"
## [1] "18 USW00026564"
## [1] "19 USC00509869"
## [1] "20 USS0049T01S"
## [1] "21 USC00502103"
## [1] "22 USC00502104"
## [1] "23 USS0050S01S"
## [1] "24 USS0051R01S"
## [1] "25 USW00026517"
## [1] "26 USW00026533"
## [1] "27 USC00508130"
## [1] "28 USR0000ANOR"
## [1] "29 USC00507778"
## [1] "30 USC00503558"
## [1] "31 USR0000AHOW"
## [1] "32 USS0050R04S"
## [1] "33 USC00501492"
## [1] "34 USC00505354"
## [1] "35 USC00500230"
## [1] "36 USR0000AINI"
# plot July daily average temperature (with min-max ribbon), one panel per year
require(ggplot2)

realWeather %>% subset(Month == 7) %>% arrange(Day) %>% 
  ggplot(aes(Day, Tavg, col = Station)) + 
  geom_ribbon(aes(ymin = Tmin, ymax = Tmax)) + 
  geom_point() + geom_path() + facet_wrap(.~Year)
## Warning: Removed 1 rows containing missing values (`geom_point()`).
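
If a table is more useful than a plot, the same data can be summarised per station and year. This is a minimal sketch in base R. The helper `summarise_month` is hypothetical, and it assumes only the columns already used in the plot above (Station, Year, Month, Tavg, Tmin, Tmax).

```r
# Hypothetical helper: mean daily Tavg, Tmin and Tmax for one month,
# by station and year. Rows with a missing value in any of the three
# temperature columns are dropped before averaging (aggregate's default).
summarise_month <- function(weather, month = 7) {
  one_month <- weather[weather$Month == month, ]
  aggregate(cbind(Tavg, Tmin, Tmax) ~ Station + Year,
            data = one_month, FUN = mean)
}
```

For example, `summarise_month(realWeather)` would return one row per station and year with the July means.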