ARL Views

Conversations with US Federal Agency Representatives: Exploring Data Management and Sharing Expenses

Last Updated on January 22, 2024, 2:14 pm ET


The Realities of Academic Data Sharing (RADS) research entered its second phase in July 2023 thanks to funding from the US Institute of Museum and Library Services (IMLS). The initial exploratory RADS work funded by the US National Science Foundation (NSF) revealed the complexity of the institutional research ecosystem at six Data Curation Network (DCN) institutions. This IMLS grant will allow for further research into developing cost models for data sharing, as well as provide insights into how institutions, specifically libraries, can best plan for, and allocate resources for, research support. As we begin this three-year project, an early outcome is to refine our questions to federally funded researchers about their data management and sharing (DMS) expenses and to institutional administrators who provide service support for DMS activities.

Our prior study, conducted from 2021 to 2023, focused on identifying the expenses researchers and service units incurred to meet federal policies for research data management and sharing (see the forthcoming report, Publicly Shared Data: A Gap Analysis of Researcher Actions and Institutional Support throughout the Data Life Cycle). Phase two of RADS will also capture institutional DMS expenses, but from a more diverse set of institutions.

Given recent developments, the fall of 2023 was an opportune time to connect with funding agencies for a series of conversations regarding DMS expenses. Specifically, we aimed to better understand how federal agencies are thinking about DMS expenses in light of new federal policies emerging as a result of the 2022 US Office of Science and Technology Policy (OSTP) Memo “Ensuring Free, Immediate, and Equitable Access to Federally Funded Research” (also known as the Equitable Access Memo or Nelson Memo). Funded researchers and institutions will be affected by these new and developing DMS requirements, which include identifying and budgeting for DMS expenditures during the proposal stage of research projects. These discussions with funding agency representatives were invaluable: they will help ensure that our forthcoming studies with researchers and administrators align with funder expectations and with how researchers will need to consider and budget for their DMS expenses.

The RADS team convened conversations with the following funding agency representatives in the fall of 2023:

  • Leighton Christianson, Librarian/Data Curator, National Transportation Library, Bureau of Transportation Statistics, US Department of Transportation
  • Michael Cooke, Senior Technical Advisor, Office of the Deputy Director for Science Programs, US Department of Energy
  • Martin Halbert, Science Advisor for Public Access, US National Science Foundation
  • Alan Moss, American Association for the Advancement of Science (AAAS) Science & Technology Policy Fellow, National Agricultural Library, US Department of Agriculture
  • Cynthia Parr, Assistant Chief Data Officer for the Research Education and Economics Mission Area, US Department of Agriculture
  • Ashley Sands, Senior Program Officer, US Institute of Museum and Library Services

The conversations were semi-structured, featuring dedicated time for follow-up questions and ensuing in-depth discussions. Below are the questions presented to each funder representative, along with the prevailing themes that emerged from these insightful conversations.

How are the agencies making the distinction between DMS activities and activities that support good scientific practice, and the expenses for each?

The distinction between DMS activities and good scientific practice is somewhat nebulous, with agencies taking note of common or established practices from the scientific and discipline-based communities they fund and support. Several of the funder representatives indicated that they do not necessarily see DMS activities as practices separate from good scientific practice, and DMS expenses and budget lines are not entirely separate from broader expenditure categories, such as personnel and infrastructure. However, some funders have heard from their researcher stakeholders that they have not yet been budgeting enough to support all that is required for curation and preservation activities in particular. Additionally, data management plans (DMPs) were noted as key tools, or roadmaps, for predicting what DMS activities will look like in a given project. However, DMPs were recognized as dynamic, with acknowledgments that additional DMS activities and their subsequent expenses may be added at a later date, depending on the project needs. All agency representatives encouraged conversations between researchers and agency program officers to ensure DMS activities are identified and included in project budgeting.

How are the agencies determining what are allowable or non-allowable expenses for DMS activities?

Interestingly, no activities were specifically identified as being entirely out of the scope of data management and sharing practices. Activities reported as direct expenses may be considered allowable as long as they are not “double-charged” via institutional indirect expenses, follow the Office of Management and Budget (OMB) “Uniform Administrative Requirements, Cost Principles, and Audit Requirements for Federal Awards” (Uniform Guidance), and can be justified in the budget justification. The typical exceptions are activities that fall under the project and/or proposal development phase, such as developing DMPs or budgeting for DMS expenses. Once past the proposal phase, however, discipline and scientific community guidance is leveraged to determine if or when DMS activities and expenditures are appropriate.

On another note, IMLS in particular funds many projects that may not be considered “scholarly research” or even “research” per se. However, these important funded projects do generate outputs and deliverables, such as curricula, databases, or white papers, which should be shared. Expenses, in terms of staff time or other direct expenses, are allowable by IMLS to enable the sharing and management of these outputs.

How are the agencies thinking about data retention?

For context, we asked funders this question specifically, as we learned in the first phase of the RADS study that funding was often the most significant factor in determining how long to retain research data. However, we know that using a contingent variable such as funding as a key decision-making point for data retention is problematic for ensuring the reusability and reproducibility of research data.

Requirements on data retention varied, with some funders expressing a five-year minimum and others a ten-year minimum. However, there were no blanket requirements for retention beyond these minimums, which may or may not be sufficient for the retention of all data. Funders who rely on the peer review process once again emphasized community expertise and the need to surface answers to broader questions about best practices for data retention. The peer review process may identify data that should be retained and/or preserved for longer than the agency minimum. The US Department of Energy (DOE), however, already supports certain mission-driven long-term data retention and preservation efforts for high-priority DOE-funded research data. These strategic data resources are typically supported through national laboratory system infrastructure to ensure longer-term stewardship of data.

What would be the most valuable institutional DMS expense information for us to report on through this research?

Our team posed this question specifically to better understand how RADS research might be of use to funding agencies, and we will use this information to further guide our work. We found the insight provided to be useful, especially when considering the role of research libraries and institutions in supporting researchers. The key expense questions of interest to funding agencies were:

  • How much are libraries spending for data management and sharing services in relation to overall institutional expenses?
  • How much are “human” curation costs (staff, researchers, librarians, etc.) and what is the average cost of curation per gigabyte?
  • What are the average data storage costs per gigabyte?
  • What are the post-award work costs (specifically around data retention and storage)?
  • What percentage do DMS expenses represent for individually awarded projects, and in total against all research funds brought into an academic institution?
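
As a rough illustration of the arithmetic behind these questions, the per-gigabyte and percentage figures could be derived along the following lines. This is only a sketch: the `DmsExpenses` record, its field names, and every dollar and gigabyte value are hypothetical placeholders, not RADS data or an agency-endorsed method.

```python
from dataclasses import dataclass


@dataclass
class DmsExpenses:
    """Hypothetical expense record for one institution (illustrative fields only)."""
    curation_staff_cost: float   # "human" curation costs, in dollars
    storage_cost: float          # data storage costs, in dollars
    data_curated_gb: float       # volume of data curated, in gigabytes
    data_stored_gb: float        # volume of data stored, in gigabytes
    total_research_funds: float  # all research funds brought into the institution


def cost_per_gb(cost: float, gigabytes: float) -> float:
    """Average cost per gigabyte for a given expense category."""
    if gigabytes <= 0:
        raise ValueError("gigabytes must be positive")
    return cost / gigabytes


def dms_share_of_funds(expenses: DmsExpenses) -> float:
    """DMS expenses as a percentage of all research funds."""
    if expenses.total_research_funds <= 0:
        raise ValueError("total_research_funds must be positive")
    total_dms = expenses.curation_staff_cost + expenses.storage_cost
    return 100.0 * total_dms / expenses.total_research_funds


# Placeholder values chosen only to make the arithmetic easy to follow.
example = DmsExpenses(
    curation_staff_cost=50_000.0,
    storage_cost=10_000.0,
    data_curated_gb=2_000.0,
    data_stored_gb=5_000.0,
    total_research_funds=3_000_000.0,
)

curation_rate = cost_per_gb(example.curation_staff_cost, example.data_curated_gb)
storage_rate = cost_per_gb(example.storage_cost, example.data_stored_gb)
share = dms_share_of_funds(example)
```

A real analysis would also need to decide which staff time and infrastructure count as DMS expenses, a boundary the funder representatives described as somewhat nebulous.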

Finally, during these conversations, we solicited feedback on the DMS activities used in the surveys of phase one of our study. Suggested revisions and feedback were incorporated into the RADS Public Access DMS activities v3, and will serve as a foundation for how we will continue to ask researchers and institutional administrators about their DMS expenses in forthcoming project studies. To view these activities and learn more about how they can aid researchers and supporting research services, see the ARL Views blog post “Navigating the Complex Landscape of Research Data Management and Sharing (DMS): DMS Activities from the RADS Initiative.”

Conclusion

Insights from funding agency representatives reaffirmed the pivotal role of information professionals in facilitating public access to research data. As federal funding agencies proceed with plans following the 2022 Equitable Access/Nelson Memo, research libraries, particularly, must be cognizant of policy developments impacting their institutions. Library leaders are well positioned not only to stay informed but also to adapt services, ensuring tailored support for their research communities. Beyond offering essential services like DMP reviews and hosting institutional repositories, libraries play a crucial role in assisting disciplinary communities in establishing standards for access to research data. While disciplinary communities lead in defining norms, information professionals possess a holistic view of the data sharing landscape. Institutions that offer baseline support and collaborative efforts, such as the Data Curation Network, may gain prominence as institutions seek specialized assistance for their researchers.

Acknowledgements

The phase two RADS project research team, listed below, looks forward to continuing these conversations with federal and private funders.

  • Jake Carlson, Associate University Librarian for Research, Collections, and Outreach, University at Buffalo Libraries, University at Buffalo
  • Joel Herndon, Director of the Center for Data and Visualization Sciences, University Libraries, Duke University
  • Jennifer Moore, Head of Data Services, University Libraries, Washington University in St. Louis
  • Mikala Narlock, Director, Data Curation Network
  • Shawna Taylor, Project Manager, Open Science, Association of Research Libraries
  • Cynthia Hudson Vitale (RADS Principal Investigator), Director, Science Policy and Scholarship, Association of Research Libraries

This project was made possible in part by the Institute of Museum and Library Services award LG-254930-OLS-23.
