This document provides R code to solve the most common issues encountered when transforming tracking data into the format required for the Seabird Tracking Database.
The script uses an artificial example that can be downloaded here:
R is a free, open-source software environment, and we recommend running it through RStudio.
library(dplyr) #for general data wrangling
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
## filter, lag
## The following objects are masked from 'package:base':
## intersect, setdiff, setequal, union
library(leaflet) #for maps
#If you don't have a package installed use the following after removing "#"
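If a package is missing, a minimal sketch along these lines lists what still needs installing. The vector `pkgs` is an assumption based on the packages called in this document (dplyr, leaflet and tidyr); adjust it to match your own script.

```r
# Packages used in this document (an assumption; edit to match your script)
pkgs <- c("dplyr", "leaflet", "tidyr")

# Names of the packages that are not yet installed
missing_pkgs <- setdiff(pkgs, rownames(installed.packages()))
missing_pkgs

# Remove the "#" below to install whatever is missing
# install.packages(missing_pkgs)
```

The install line is left commented out so that running the chunk only reports what is missing.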
Read in your tracking data csv (see the section at the end if your data are not all in one csv).
data <- read.csv("C:/Users/bethany.clark/OneDrive - BirdLife International/STDB/STDB_admin_shared_folder/GPS_stdb_bad_example.csv")
#Change the filepath to the location of your csv
head(data) #check the format
## datetime latitude longitude bird_id sex breed_stage tag_type
## 1 2015.12.05 12:13:00 -100.005 6.86970 Bird1 M brood-guard GPS
## 2 2015.12.05 13:13:00 -100.345 6.87014 Bird1 M brood-guard GPS
## 3 2015.12.05 14:13:00 -100.685 6.86029 Bird1 M brood-guard GPS
## 4 2015.12.05 15:13:00 -101.025 6.85092 Bird1 M brood-guard GPS
## 5 2015.12.05 16:13:00 -101.365 6.86155 Bird1 M brood-guard GPS
## 6 2015.12.05 17:13:00 -101.705 6.87218 Bird1 M brood-guard GPS
#Remove NAs in the key variables
nrow(data) #Number of rows before removing NAs
## [1] 46
data <- data %>% tidyr::drop_na(latitude, longitude, datetime)
nrow(data) #Check the difference between the number of rows before and after, and investigate if needed
## [1] 44
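If the row count drops, it may help to look at the affected rows before discarding them. The sketch below uses a small made-up data frame (`toy`, not the example csv) with the same column names, and base R's `complete.cases()` to pull out the rows that `drop_na()` would remove.

```r
# Made-up data for illustration only; column names follow the example csv
toy <- data.frame(
  datetime  = c("2015.12.05 12:13:00", NA, "2015.12.05 14:13:00"),
  latitude  = c(6.8697, 6.87014, NA),
  longitude = c(-100.005, -100.345, -100.685),
  bird_id   = "Bird1"
)

key_vars <- c("latitude", "longitude", "datetime")

# Rows with an NA in any key variable (the rows drop_na() would remove)
na_rows <- toy[!complete.cases(toy[, key_vars]), ]
na_rows

# Number of NAs in each key variable
colSums(is.na(toy[, key_vars]))
```

Looking at which variable is missing, and for which bird, can show whether the gaps come from the tag itself or from how the file was exported.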
Common issues include:
- Positions outside the valid boundaries, e.g. lat > 90 or < -90, lon > 180 or < -180
- Locations before or after deployment (e.g. of the institute, not the
- Lat/lon reversed
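summary(data$latitude) #Latitude should fall between -90 and 90
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -109.53 -105.87 -102.22 -98.34 -98.07 0.00
summary(data$longitude) #Longitude should fall between -180 and 180
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.000 6.870 6.965 6.842 7.102 7.448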
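The boundary and lat/lon checks listed above can be sketched row by row as follows. The data frame `toy` and its values are invented for illustration. The swap rule (treat a row as possibly reversed when its latitude is impossible but its longitude would be a valid latitude) is an assumption, not the database's own validation.

```r
# Made-up positions: row 2 has lat/lon reversed, row 3 has an impossible longitude
toy <- data.frame(
  latitude  = c(6.87, -100.345, 6.86),
  longitude = c(-100.005, 6.87014, 200)
)

# Flag positions outside the valid boundaries
toy$lat_out <- abs(toy$latitude) > 90
toy$lon_out <- abs(toy$longitude) > 180

# Assumed heuristic: latitude is impossible but longitude would be a valid latitude
toy$maybe_reversed <- toy$lat_out & abs(toy$longitude) <= 90

toy
```

A flag like `maybe_reversed` only catches swaps that push latitude out of range. Tracks where both coordinates lie within -90 to 90 still need checking on a map.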