ARL Views

Persistent Identifiers Connect a Scholarly Record with Many Versions

Last Updated on July 9, 2022, 10:02 am ET

[Image: photo of an open padlock with a key in it. CC0 by PDPics.]

In the past few months, we’ve seen large commercial publishers express renewed concern that green open access (OA), the sharing of author manuscripts in open repositories, threatens the scholarly record because multiple versions undermine the “version of record” (VOR). These publishers are making two arguments: (1) that multiple “inferior” versions will confuse researchers (and won’t be linked to related assets such as data and code), and (2) that if research funders fund gold OA (publication in OA journals) and also allow authors to share their work in open repositories, the free availability of manuscripts will threaten the gold funding stream, which publishers need in order to maintain the “version of record.”

When publishers speak about linked research and scholarship only in terms of the market transition to open access, they take an inherently limiting view of scholarly research. In this context, concern for the version of record reflects a business interest, not a scholarly value. As a stewardship strategy, insisting on only publisher-hosted versions of record does not align with a modern research workflow that includes multiple tools and potential repositories. Recently, a number of publishers have expressed the “version of record” concern with regard to the Plan S “Rights Retention Strategy.” Yet, as cOAlition S pointed out in its response, establishing and maintaining relationships to other versions of articles or research assets has already been shown to be successful in disciplinary and scholarly communities.

Whereas the published, printed version of the research article was once the authoritative source of research, new modes of publishing and the publishing of other research outputs (postprints, protocols, data, code, etc.) have made the term “version of record” all but irrelevant. The scholarly communications landscape has already moved into what Herbert Van de Sompel, Bianca Kramer, and Jeroen Bosman call a “record of versions,” where persistent identifiers (PIDs) enhance the discoverability and linking of research outputs regardless of where those outputs are housed.

The “record of versions” terminology better captures the outputs of the rich, holistic, and increasingly open research process that is critical to ensuring research integrity. As institutions, scholarly societies, funding agencies, and others have made commitments to open access and the transparency of robust, scholarly research, these outputs are more widely available and accessible—but often distributed across varying platforms. For example, a preprint or postprint may be available through an institutional repository (IR), a related data set may be published in a discipline’s or funder’s data repository, and related code may be available on GitLab (ideally backed up in an IR). The distributed nature of the assets is actually key for ensuring that each output is properly curated, findable, and preserved. When they are distributed in specialized repositories, research assets are more likely to have digital object identifiers (DOIs) minted, metadata created and shared, and deposits checked and preserved.

This need not cause confusion for researchers, though, because these versions and assets can be linked through persistent identifiers. With PIDs (such as DOIs, PURLs, ORCID IDs, ROR IDs, and RRIDs), distributed research assets can be discovered, linked to one another, and described through asserted relationships.
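To make the idea of asserted relationships concrete, here is a minimal Python sketch of a “record of versions” assembled from PID-to-PID assertions. The identifiers are hypothetical placeholders, not real DOIs, and the function names are invented for illustration. The relation names (`IsVersionOf`, `IsSupplementTo`) follow the DataCite relation-type vocabulary, but this is a sketch under those assumptions, not how any particular repository or registry implements linking.

```python
from collections import defaultdict, deque

# Hypothetical PIDs for illustration only; none of these resolve.
# Each tuple is (subject PID, relation type, object PID).
ASSERTIONS = [
    ("doi:10.5555/postprint.1", "IsVersionOf", "doi:10.5555/article.1"),
    ("doi:10.5555/dataset.1", "IsSupplementTo", "doi:10.5555/article.1"),
    ("doi:10.5555/code.1", "IsSupplementTo", "doi:10.5555/article.1"),
]


def build_index(assertions):
    """Index assertions in both directions so any PID can be a starting point."""
    index = defaultdict(set)
    for subject, _relation, obj in assertions:
        index[subject].add(obj)
        index[obj].add(subject)
    return index


def record_of_versions(start, assertions):
    """Return every PID reachable from `start` through asserted relationships."""
    index = build_index(assertions)
    seen = {start}
    queue = deque([start])
    while queue:
        pid = queue.popleft()
        for neighbor in index[pid]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return sorted(seen)


if __name__ == "__main__":
    # Starting from the data set, find the article, postprint, and code.
    for pid in record_of_versions("doi:10.5555/dataset.1", ASSERTIONS):
        print(pid)
```

The point of the sketch is that discovery does not depend on where an output is hosted: starting from any one identifier, a reader or a service can follow the assertions to the rest of the related outputs.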

The continued insistence on “version of record”—including using the VOR date instead of the issue date to calculate journal impact factors—is also a subtle attempt by some commercial publishers to continue to exert control over the entire scholarly communications ecosystem and to be seen as sole authorities or stewards of research publishing. Thus, there is a pressing need to shift the dialogue from a single “version of record” to a “record of versions” that encompasses multiple versions and outputs, and makes room for a more diverse and inclusive publishing environment.

Libraries play a key role in partnering across the scholarly ecosystem to ensure equitable and enduring access to the full scholarly record—working with creators, publishers, and standards bodies, and across different business models. So let’s embrace the “record of versions” and promote PIDs to connect research outputs to one another and to other related work.