This page will demonstrate one situation where the difference between site and core location coordinates matters when using paleodata from the Neotoma Paleoecology Database: getting sediment depths associated with locations.
Our sites and cores are mapped out below. Sites and associated cores
are filled with the same color. Sites have black outlines, and cores
have white outlines, and if you click on a site or core its name should
pop up. For those sites that are described as polygons, the point
location plotted by the neotoma2::plotLeaflet()
function is
also rendered as a big point with a black outline.
Let’s compare the depths associated with the point site locations to
the depths associated with the core locations. We grab all the sites
from Neotoma with “Tanganyika” in their names. Then we download the
sample data from all the sites just to get the elevation field, using
the neotoma2::samples()
, get_downloads()
, and
get_datasets()
functions.
Meanwhile, we define the object site_sf as the point locations of our sites, and we download and filter for the relevant collection units associated with those sites. (See this tutorial for more detail about the collection units and site information.)
tanganyikaSites = get_sites(sitename = "%Tanganyika%")
samps = get_downloads(get_datasets(tanganyikaSites))
## ........................
site_sf = as.data.frame(tanganyikaSites)[c(4,3,2,1)] %>% st_as_sf(coords=c("long","lat"), crs="WGS84")
text="collectionunits"
collunits = content(GET(paste0('https://api.neotomadb.org/v2.0/data/dbtables/',text,'?count=false&limit=99999&offset=0')))$data
collunit_mat = matrix(nrow=length(collunits),ncol=20)
for (i in seq(length(collunits))) {
for (j in seq(20)) {
if (!is.null(collunits[[i]][[j]])) {
collunit_mat[[i,j]] = collunits[[i]][[j]]
}
}
}
collunit_df = collunit_mat %>% as.data.frame()
names(collunit_df) = c("collectionunitid","handle","siteid","colltypeid","depenvtid","collunitname","colldate","colldevice","gpslatitude","gpslongitude","gpsaltitude","gpserror","waterdepth","substrateid","slopeaspect","slopeangle","location","notes","recdaterecreated","recdatemodified")
filtered_colls = collunit_df %>% dplyr::filter(collectionunitid %in% siteMetadata_df$collectionunitid)
filteredColl_sf = collunit_df %>% dplyr::filter(collectionunitid %in% siteMetadata_df$collectionunitid) %>% st_as_sf(coords=c("gpslongitude","gpslatitude"), crs="WGS84")
Now that we’ve defined all our objects, we are going to look for all the relevant depth and elevation data. We can load up a bathymetric map of Lake Tanganyika and vectors of the depths of Lake Tanganyika provided by Mike McGlue and tcarta. When we plot the bathymetric raster map, we get a qualitative sense of the depth profile of Lake Tanganyika, but we want actual numbers, so we need to use the depth contour vectors. (The difference between raster and vector data is beyond the scope of this tutorial, but if you need a refresher, see this page.)
## Reading layer `TCarta_tanganyika_contours' from data source
## `C:\Users\Blois Lab Generic\Documents\GitHub\neotomaTutorials\TCarta_tanganyika_contours.shp'
## using driver `ESRI Shapefile'
## Simple feature collection with 8101 features and 4 fields
## Geometry type: LINESTRING
## Dimension: XY
## Bounding box: xmin: 29.08808 ymin: -8.81098 xmax: 31.20154 ymax: -3.347897
## Geodetic CRS: WGS 84
## stars object downsampled to 625 by 1600 cells. See tm_shape manual (argument raster.downsample)
To get the values of our site and collection unit points, we need to
do some spatial interpolation, because the contours don’t overlap our
points of interest. We’ll use a simple method called inverse
distance-weighted spatial interpolation, implemented in the R package
gstat
. First we get 100,000 points from our contour lines,
filtering out any unsampled contour lines, and then we build our spatial
interpolation model, which simply looks for the nearest neighbors of of
our site and collection unit points from the points we sampled, and
averages them, giving more weight to points that are closer. Finally, we
use our model to predict depths based on site location and based on core
location, and display them, as well as their difference, in our table
below.
Note: while creating this exercise, the Neotoma team noticed an error in the Neotoma elevation metadata and corrected it. People curate these data, which makes them high value, but sometimes people make mistakes!
## although coordinates are longitude/latitude, st_sample assumes that they are
## planar
sf_object <- st_sf(depth = contours$ELEVATION, geometry = depth_multipoints) %>% filter(!st_is_empty(geometry))
idw_model <- gstat(
formula = depth ~ 1, # Depth is the dependent variable
locations = sf_object,
nmax = 10, # Optional: Limit to nearest neighbors
set = list(idp = 2) # Set power for IDW
)
# Predict at specific points
predictions_site <- predict(idw_model, newdata = rbind(site_sf))
## [inverse distance weighted interpolation]
## [inverse distance weighted interpolation]
site_df = st_drop_geometry(site_sf) %>% cbind(round(predictions_site$var1.pred))
coll_df = st_drop_geometry(filteredColl_sf)
coll_df$depth_coll = (round(predictions_coll$var1.pred))
names(site_df) = c("sitename","siteid","depth_site")
depths_all = as.data.frame(samps) %>% left_join(coll_df,by=join_by("siteid" == "siteid")) %>% left_join(site_df,by=join_by("siteid" == "siteid")) %>% dplyr::select(siteid,collectionunitid, elev,gpsaltitude,location,depth_site,depth_coll)
depths_all = depths_all %>% dplyr::mutate(depth_difference = depth_site - depth_coll)
datatable(depths_all,rownames=FALSE, options = list(pageLength=11))
The difference between the depth estimated from the site location and the depth estimated from the core location is in the column “depth_difference” in the above table. “Depth_site,” “depth_coll”, and “depth_difference” are all measured in meters. We can see there is one case where the depth based on the collection unit is equal to the depth based on the site - but that’s our one point where the site is in fact the location of a single core, so we expect that. Otherwise, there’s always a difference between the purported site depth and the actual collection unit depth that can be as big as 500 m. That’s a significant difference if you’re interested in the aquatic ecology of the collection unit area. Also notice that when the depth is reported in the Neotoma metadata (in the location field), except for our one site where there’s no difference, the core location depth is always closer to the reported depth.
We compared the depths associated with nominal point site locations and with actual core locations for Neotoma paleodata from Lake Tanganyika. We found that there’s always a difference between the site and core location depth, confirming our intuition: we better be careful about the difference between sites and collection units in Neotoma.
If you would like to provide feedback about this tutorial, please complete this form or reach out to Nick Hoffman at nicholashoffman@ucmerced.edu.