In the scholarly communications environment, the evolution of a journal article can be traced by the relationships it has with its preprints. Those preprint–journal article relationships are an important component of the research nexus. Some of those relationships are provided by Crossref members (including publishers, universities, research groups, funders, etc.) when they deposit metadata with Crossref, but we know that a significant number of them are missing. To fill this gap, we developed a new automated strategy for discovering relationships between preprints and journal articles and applied it to all the preprints in the Crossref database. We made the resulting dataset, containing both publisher-asserted and automatically discovered relationships, publicly available for anyone to analyse.
The second half of 2023 brought with itself a couple of big life changes for me: not only did I move to the Netherlands from India, I also started a new and exciting job at Crossref as the newest Community Engagement Manager. In this role, I am a part of the Community Engagement and Communications team, and my key responsibility is to engage with the global community of scholarly editors, publishers, and editorial organisations to develop sustained programs that help editors to leverage rich metadata.
STM, DataCite, and Crossref are pleased to announce an updated joint statement on research data.
In 2012, DataCite and STM drafted an initial joint statement on the linkability and citability of research data. With nearly 10 million data citations tracked, thousands of repositories adopting data citation best practices, thousands of journals adopting data policies, data availability statements and establishing persistent links between articles and datasets, and the introduction of data policies by an increasing number of funders, there has been significant progress since.
Have you attended any of our annual meeting sessions this year? Ah, yes – there were many in this conference-style event. I, as many of my colleagues, attended them all because it is so great to connect with our global community, and hear your thoughts on the developments at Crossref, and the stories you share.
Let me offer some highlights from the event and a reflection on some emergent themes of the day.
Having joined the Crossref team merely a week previously, the mid-year community update on June 14th was a fantastic opportunity to learn about the Research Nexus vision. We explored its building blocks and practical implementation steps within our reach, and within our imagination of the future.
Read on (or watch the recording) for a whistlestop tour of everything – from what on Earth is Research Nexus, through to how it’s taking shape at Crossref, to how you are involved, and finally – to what concerns the community surrounding the vision and how we’re going to address that.
Summary of presentations
The idea is simple in principle: scholarly records ought to be transparent – available to examine and learn from for all. Much of scientific production and communication these days has a heavy digital footprint so the Nexus is nothing but simply connecting the loose strands, right? Yet, as the scholarly record is a reflection of the continuous progress made by multiple actors within the context of scientific structures and processes, bringing the Nexus to life is a little short of simple.
“What we think of as metadata is expanding, and the notion of ‘record types’ is changing” – said Ginny Hendricks. A great majority of scholarly ‘objects’, whether they are data sets, research articles, monographs, or others, undergo many processes (including review, publication, licensing, correction, derivation) and influence knowledge and practice over time.
Making that progress visible and discoverable will allow for tracing the development of ideas and changes in our thinking over time. Transparency of the complete scholarly records will help to understand the impact of science funding and changing policies. It can support a more robust and comprehensive assessment of research, and contribute to improving integrity within as well as public trust in sciences.
Patricia Feeney has given us reasons for optimism in building a robust Nexus. She’s shown areas of greatest growth in metadata reported to Crossref and shared a public roadmap of types of information we’re asked to enable in the future. We’re seeing a true boom of datasets and peer review reports registrations, and the relationship metadata for our records is improving too. At the dawn of defaulting to open references, 44% of records we hold have associated references and that is growing. Provision of the newly enabled affiliation information (ROR IDs) is on the rise, as is the funder information. Some conversations and questions followed highlighting the need for further guidance in these areas.
To make a case for enriching metadata records, Martyn Rittman demonstrated examples of traceability of research influence on realities outside academia. He captured recent examples of data citations and other references present not just between scholarly papers, but also in policy documents and popular media. These allow for greater discoverability of literature – but also show the public influence and impact of the research and the work’s context in our wider society.
While Martyn shared our blue-skies aspiration to streamline Crossref’s APIs to offer insight to all these relationships with a single service, Joe Wass grounded those ambitions in the reality of technical work underway. His team’s attention is divided between three main areas. They continue to maintain and de-bug our existing infrastructure. They are developing self-service solutions for members. Finally, they are mapping and planning improved infrastructure, evaluating technology against the Research Nexus vision.
Bringing it back to the source (of metadata), Rachael Lammey offered a very practical guide to key activities enabling Research Nexus that all members can take on now. She highlighted the benefits of collecting and registering data citations, ROR IDs, and grant funding information. She went on to talk about challenges of subject classification (at a journal level) that our research and development efforts are focusing on at the moment.
Summary of discussions
Publishing has changed dramatically and our members recognise increasing opportunities for transparency of the scholarly record. Breaking the distant vision of Research Nexus down into actionable chunks made it more relatable for call participants. Many reflected on seeing their place in it properly for the first time. Yet, challenges remain and many were brought to the fore in the discussions.
The reliability and usability of the technology for registering metadata with Crossref needs to improve. We need to do better in supporting multi-language and multi-alphabet information. Not just developing systems anew, but also streamline the way content is registered and annotated, and continue to disambiguate the competing identifiers. Different record types, chiefly books, present specific challenges in this regard. Finally, making all that metadata accessible and usable is key to enabling insights from the rich data we collectively make available.
Technology is important, but won’t overcome the barriers that exist in the mindsets. Siloed thinking means that publishers may not be sensitive to benefits that improved relationship metadata could have for colleagues working on assessment, even within the same institutions. Greater guidance or best practices for new identifiers, such as ORCID, ROR, grants, would allow more publishers to get on board with the changes. Researchers often don’t help the cause either – many don’t realise the role and benefits of metadata for their work and are reluctant to provide rich information related to it, perceiving it as a bureaucratic burden.
In a nutshell, I learnt that – while the concept of Research Nexus is pretty complex – we’re all already participating in making it a reality. I’m grateful to the call participants for sharing their challenges and ideas so generously. It means we can work to address those. I’ll be sure to follow-up on requests for support and clearer guidelines about citing data, recording ROR IDs and grants information in the metadata, and we’ll engage our community on complex topics of record updates (corrections, retractions and versions). Be sure to keep in touch with the conversations on the Community Forum. I’ll see you there!