Quality metadata is foundational to the research nexus and all Crossref services. When inaccuracies creep in, these create problems that get compounded down the line. No wonder that reports of metadata errors from authors, members, and other metadata users are some of the most common messages we receive into the technical support team (we encourage you to continue to report these metadata errors).
We make members’ metadata openly available via our APIs, which means people and machines can incorporate it into their research tools and services - thus, we all want it to be accurate.
The other day I was out and about and got into a conversation with someone who asked me about my doctoral work in English literature. I’ve had the same conversation many times: I tell someone (only if they ask!) that my dissertation was a history of the villanelle, and then they cheerfully admit that they don’t know what a villanelle is, and then I ask them if they’re familiar with Dylan Thomas’s poem “Do not go gentle into that good night.
Having joined the Crossref team merely a week previously, the mid-year community update on June 14th was a fantastic opportunity to learn about the Research Nexus vision. We explored its building blocks and practical implementation steps within our reach, and within our imagination of the future.
Read on (or watch the recording) for a whistlestop tour of everything – from what on Earth is Research Nexus, through to how it’s taking shape at Crossref, to how you are involved, and finally – to what concerns the community surrounding the vision and how we’re going to address that.
TL;DR A year ago, we announced that we were putting the “R” back in R&D. That was when Rachael Lammey joined the R&D team as the Head of Strategic Initiatives.
And now, with Rachael assuming the role of Product Director, I’m delighted to announce that Dominika Tkaczyk has agreed to take over Rachael’s role as the Head of Strategic Initiatives. Of course, you might already know her.
We will also immediately start recruiting for a new Principal R&D Developer to work with Esha and Dominika on the R&D team.
Crossref strives for balance. Different people have always wanted different things from us and, since our founding, we have brought together diverse organizations to have discussions—sometimes contentious—to agree on how to help make scholarly communications better. Being inclusive can mean slow progress, but we’ve been able to advance by being flexible, fair, and forward-thinking.
We have been helped by the fact that Crossref’s founding organizations defined a clear purpose in our original certificate of incorporation, which reads:
“To promote the development and cooperative use of new and innovative technologies to speed and facilitate scientific and other scholarly research.”
As Crossref prepares to turn 20 in January 2020, it’s an opportunity to reflect on achievements and highlights from 2018-19 and also ponder the preceding decades. Change is a constant at Crossref but the organization has never strayed from its initial defined purpose. Our services and value now extend well beyond persistent identifiers and reference linking, and our connected open infrastructure benefits our 11,000+ membership as well as all those involved in scholarly research. This expansion is exactly what was envisioned to meet the goal of “speeding and facilitating” research.
This year’s annual report is different from previous years’; it has been expanded into a ‘fact file’ so that we can invite comments on the path ahead, based on transparent access to data about our membership, activities, and finances. As we were pulling together the charts and tables for this annual report we noticed stark differences in where Crossref is today compared to years past.
The rate of membership growth has accelerated and we now have over 180 new members joining every month, leading to one of the most striking changes we found. The lowest three membership tiers now account for 46% of revenue (up from 25% in 2011) while the highest three tiers account for 36% (down from 56% in 2011).
Today, the typical Crossref member has just a few hundred registered content items.
One way we have been able to accommodate this growth efficiently is by collaborating with sponsors in different countries. Very small members can join via a local sponsor that is able to provide technical, financial, language, and administrative support. We now have more members joining via sponsors, who otherwise would largely not be able to join at all. While you’d need to be a millionaire by US standards to join directly from Indonesia in our lowest fee tier (calculated using Purchasing Power Parity), the sponsor program—supported often by government investment in science and education—has enabled Indonesian organizations to join Crossref in large numbers, supporting their aim to become one of the fastest-growing nations in open research, and to help that research be discovered.
Crossref has repeatedly stayed ahead of developments in the community
In 2007, when the Similarity Check working group discussions and pilot started, there was disagreement on the board about whether Crossref should provide such a service and whether it was a strategic priority for members. By the end of the pilot, when the decision came to launch a production service, it was seen as essential and a top priority. This conclusion has been borne out in recent research into the value of Crossref; Similarity Check is one of the services of most importance to members.
Adding preprints as a content type was controversial at the time. The board discussed the topic of “duplicative works” for about two years with strong opinions on all sides. The working group delivered a good set of policies and technical specifications and in the July 2015 board meeting there was a majority—but not 100%—agreement on the motion to approve. We implemented preprints as a content type just in time to accommodate the snowballing of preprint servers emerging from existing and new members.
Another example of a former—and current—area of contention is the approach to metadata. When Crossref first launched, there were lengthy discussions about what metadata we should collect. The initial focus was on the minimal set of metadata to enable reference matching in support of reference linking. In the beginning, neither article titles, lists of authors, references, nor abstracts were included in the minimal metadata set. We supported them as optional but most members opted out. However, the huge set of metadata that Crossref collects and disseminates now is seen as essential, providing a lot of value for members in terms of discoverability.
Today, Crossref enables metadata retrieval on a large scale—an average of more than 600 million queries per month—through a variety of interfaces, most notably the REST API (Public, Polite, and Plus versions). The metadata is used by thousands of organizations and services—both commercial and not-for-profit—increasing the discoverability of member content. In fact, members of all stripes have long initiated projects to expand the metadata Crossref is able to collect and disseminate: from facilitating text mining (through license and full-text URLs); to enabling better connections with and evidence of contributions (through Funder IDs, ORCID iDs, and soon CRediT roles and ROR IDs).
These are all examples of where Crossref has successfully “promoted the cooperative use of new and innovative technologies” and where we are meeting our mission to make scholarly communications a little bit better. As ever, we need to thank our brilliant staff for their unfailing resilience, balance, and diligence, in these times of dynamic change.
Considering the value and future of Crossref
Research is global, and supporting a diverse global community is a challenge. This year, we conducted our first wide-ranging investigation into what people value from Crossref. This involved telephone interviews with over 40 community members as well as an online survey of 600+ respondents.
“The Crossref of 2040 could be an even more robust, inclusive, and innovative consortium to create and sustain core infrastructures for sharing, preserving, and evaluating research information.”
But only if Crossref is not:
“held back, and its remit circumscribed, by legacy priorities and forces within the industry that may perceive open data and infrastructure as a threat to their own evolving business interests.”
We welcome this public commentary and encourage others in the community to respond and report what value Crossref offers as community-owned infrastructure, and how they’d like to see the organization evolve.
More than ever, we need to have this discussion with a broad and representative group. So please, read the value research report and the annual report/fact file, and get ready to voice your opinions!