Ethical Open Science for Past Global Change Data

Improving Interoperability

Building interconnected and open data resources

Overview

We are developing a network of data managers, research practitioners, disciplinary experts, and early career researchers focused on identifying gaps and mismatches among Quaternary data resources, in order to improve interoperability among them. We are focusing on:

  • outreach to data managers and assessment of the current Quaternary informatics landscape,
  • identifying areas for crosswalk development to resolve conflicts between existing ontologies and to help support the adoption of CARE aims,
  • identifying points of connection between repositories, where the addition of related identifiers could help bridge silos, and
  • building on existing initiatives to make data resources more open and interconnected.

Activities



One of our focal repositories, the Neotoma Paleoecology Database, is a federated database made up of more specialized paleoecological databases, each of which curates data from a particular combination of time period, region, and proxy type. We are developing landing pages for these constituent Neotoma databases that offer up-to-date information on each database's spatial and temporal coverage, the kinds of datasets it contains, the people who have contributed data to it, and its growth over time. These constituent database landing pages increase the transparency of Neotoma's holdings, allowing for enhanced exploration of paleoecological data. They also help us identify errors in our stewardship (for instance, they make it easy to spot records that purport to concern 8-billion-year-old pollen).
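As a rough illustration of the kind of check these landing pages make easy, the sketch below flags records whose reported age exceeds a plausibility bound. The record structure, the field names (`dataset_id`, `age_older`), and the use of the approximate age of the Earth as the bound are assumptions made for illustration; they do not reflect Neotoma's actual schema or tooling.

```python
# Approximate age of the Earth in years, used here as an absolute upper bound.
EARTH_AGE_YEARS = 4.54e9


def flag_implausible_ages(records, max_age_years=EARTH_AGE_YEARS):
    """Return the records whose oldest reported age exceeds max_age_years.

    Records with no reported age are skipped rather than flagged.
    """
    flagged = []
    for rec in records:
        age = rec.get("age_older")  # hypothetical field name
        if age is None:
            continue
        if age > max_age_years:
            flagged.append(rec)
    return flagged


# Hypothetical records, invented for this example.
records = [
    {"dataset_id": 1, "age_older": 12_000},
    {"dataset_id": 2, "age_older": 8_000_000_000},
    {"dataset_id": 3, "age_older": None},
]

flagged = flag_implausible_ages(records)
flagged_ids = [rec["dataset_id"] for rec in flagged]
print(flagged_ids)
```

In practice a tighter, database-specific bound would likely catch more errors, since each constituent database covers a known time range.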

In addition to developing these database landing pages, we have modified Neotoma's existing dataset landing pages to include cultural provenance information at the site level. We have begun to map the extent of Indigenous territories that intersect Neotoma sites, as recorded by the Native Land project, and to link to the Native Land landing pages for those territories.
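The intersection step can be sketched as a point-in-polygon test. The example below is a minimal illustration using the standard ray-casting method on planar coordinates; the territory outline, site names, and coordinates are invented placeholders, and the code does not use the Native Land or Neotoma APIs. A production workflow would more likely rely on a geospatial library and handle multipolygons, holes, and map projections.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: True if (lon, lat) falls inside the polygon.

    polygon is a list of (lon, lat) vertices; the last vertex is
    implicitly joined back to the first.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Consider only edges that straddle the point's latitude.
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that latitude.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def territories_for_site(site, territories):
    """Return the names of all territories whose outline contains the site."""
    return [
        name
        for name, polygon in territories.items()
        if point_in_polygon(site["lon"], site["lat"], polygon)
    ]


# Placeholder data, invented for this example.
territories = {
    "Territory A": [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0)],
}
sites = [
    {"name": "Site 1", "lon": 5.0, "lat": 5.0},
    {"name": "Site 2", "lon": 15.0, "lat": 5.0},
]

matches = {s["name"]: territories_for_site(s, territories) for s in sites}
print(matches)
```

Because territories can overlap, the function returns a list, so a single site may link to several territory landing pages.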

Quaternary science often yields heterogeneous datasets. These complex records can include physical specimens (e.g., faunal remains, sediment cores, human artifacts), assemblage-level occurrence data, trait measurements (e.g., stable isotopic ratios, osteometrics), and chronologies. A diverse set of data repositories now curates Quaternary data, but different elements of the data and metadata from a single site may fit some repositories better than others, and repositories may overlap in scope. This presents disciplinary practitioners with the daunting task of navigating an increasingly opaque data ecosystem, and choosing the best resource for data upload often depends on tacit expertise. Through case studies of particular sites, this working group will develop a set of tools and recommendations for navigating this complex landscape.

Led by Thomer and Raia, we are conducting interviews with data resource users and repository managers to understand how community-curated data resources are used, what needs and visions exist for the interoperability of these resources, and how data ethics relate to the FAIR and CARE principles. The findings from this study will inform technical improvements to community-curated data resources and the development of guidelines that enable researchers to make their data more interoperable, reproducible, and ethical.