POSI fan tutte
Just over a year ago, Crossref announced that our board had adopted the Principles of Open Scholarly Infrastructure (POSI).
It was a well-timed announcement, as 2021 yet again showed just how dangerous it is for us to assume that the infrastructure systems we depend on for scholarly research will not disappear altogether or adopt a radically different focus. We adopted POSI to ensure that Crossref would not meet the same fate.
POSI proposes three areas that an Open Infrastructure organization can address to garner the trust of the broader scholarly community: accountability (governance), funding (sustainability), and protection of community interests (insurance). POSI also proposes a set of concrete commitments that an organization can make to build community trust in each area. There are 16 such commitments.
In our announcement of Crossref’s adoption of POSI, we made two critical points:
- One doesn’t have to meet all the commitments of POSI already to adopt it. For one thing, this would make it impossible for new organizations to adopt POSI. So instead, we should view the adoption of the POSI principles as a “statement of intent” against which stakeholders can measure an organization’s progress.
- Conversely, meeting all of the POSI principles doesn’t mean an organization can relax. It is always possible for an organization to regress on a particular commitment. For example, an emergency expenditure might mean that the organization no longer maintains a 12-month contingency fund and therefore has to replenish it.
With these two points made, we ended our announcement with a candid self-audit against the principles. We concluded that Crossref was already entirely or partially meeting the requirements of 15 of the 16 POSI commitments. And adopting the 16th commitment would just formalize a direction Crossref had already been heading toward for several years. We also said that we would update our self-audit regularly.
But before we continue with the Crossref POSI audit update, we should talk about the immediate aftermath of our adopting the principles.
Since Crossref adopted POSI, nine other organizations have made the same commitment and conducted similar self-audits. We affectionately call them the “POSI Posse”.
- OA Switchboard
- Europe PMC
These organizations represent a critical part of the hidden infrastructure that scholarly research depends on every day. By committing to POSI, they are helping ensure their accountability to the research community. They are also emphasizing that stakeholders must participate in the governance and stewardship of organizations running that infrastructure.
But perhaps most importantly, these ten organizations that have publicly committed to adopting POSI will not suddenly disappear or change priorities without giving the community time to react and, if need be, intervene.
There are also more quotidian advantages to these organizations adopting POSI. Adopting the principles makes it easier for the respective organizations to collaborate to make research infrastructure more effective and efficient. The foundation of effective collaboration is trust. And so, by agreeing that we share basic principles of operation, we virtually eliminate a whole slew of negotiations that typically need to occur before two organizations trust each other enough to collaborate closely on projects.
One of Crossref’s strategic priorities is to “collaborate and partner” with other organizations on improving our open scholarly infrastructure. And the easiest way to collaborate with us is to adhere to the same principles. So we look forward to more scholarly infrastructure organizations adopting POSI in 2022 so that, together, we can make research infrastructure work better.
Establishing this level of trust has already paid significant dividends with the Research Organization Registry (ROR), a relatively new infrastructure project founded jointly by DataCite, CDL, and Crossref.
Having nine organizations adopt POSI so soon after our announcement was a wonderful feeling. It is hard for us to convey how happy we are about this without gushing.
Here is a picture of me gushing.
But now we have some outstanding business: updating our self-audit.
This post is the first of our regular updates on our progress (or regress) on meeting the POSI principles.
We didn’t regress on any commitment. We’ve improved a little bit where we were not meeting the POSI principles, but we have still not met all our POSI commitments.
| Area | Commitment |
|---|---|
| Governance | Coverage across the research enterprise |
| | Formal incentives to fulfill mission & wind-down |
| Sustainability | Time-limited funds are used only for time-limited activities |
| | Goal to generate surplus |
| | Goal to create a contingency fund to support operations for 12 months |
| | Mission-consistent revenue generation |
| | Revenue based on services, not data |
| Insurance | Available data (within constraints of privacy laws) |
| | Open data (within constraints of privacy laws) |
Stakeholder governance moves from red to yellow
Our only red mark in our POSI self-audit was against the principle of stakeholder governance. Our board did not yet reflect our members’ diversity or the broader stakeholder community. In particular, as funders have become more central to shaping the scholarly communications landscape, it seemed important that Crossref have funder representation in our governance.
So this year, the Crossref nominations committee was charged with proposing a board slate that addressed some of our representational gaps. They did this, and as a direct result, two of the members elected to next year’s board were a funder (Melanoma Research Alliance) and a significant preprint platform (Center for Open Science).
These new additions to our board mark a significant improvement in stakeholder governance, but we can do more. Researchers and research institutions are also substantial Crossref stakeholders. We need to have a better representation of their concerns.
Also, there are still members of the scholarly communications community who depend on Crossref but cannot afford to join it because our fees are too high for them. Since membership is a prerequisite to participation in Crossref governance, we are also placing emphasis on figuring out how to further extend Crossref membership to those who still cannot afford it, through programs like Sponsorship, country-level journal gap analyses work, and a forthcoming fee review. So this is a source of stakeholder governance inequity that may be best handled by our membership & fees committee rather than our nominations committee.
In short, we’ve made progress on our stakeholder governance commitment. Still, we need to do more, so we are updating our adherence to the POSI stakeholder governance principle from red to yellow.
Another place where we have improved things is under the banner of “transparency.” But here, we see one of the shortcomings of the “traffic light” representation used in the self-audit. The degree to which one meets a commitment falls along a gradient. And this gradient cannot be represented accurately in the ternary classification of red/yellow/green. So while last year we marked ourselves as “green” under the commitment to transparency, over the past year we have become greener. We did this by creating sections on our website that provide further detail on our governance and finances, even including the 990 forms that US tax authorities require non-profits to file. So what do we do here? Make it neon-green? Make it blink?
Contingency fund ~~moves from yellow to chartreuse~~ stays yellow
In our first self-audit, we had several yellow marks: places where we were doing OK, but where we needed to make improvements.
The first yellow mark involved one of the principles of “sustainability,” which stipulates that an organization should have a goal to create a contingency fund to support operations for 12 months. At the time, we had a contingency fund of 9 months. The board instructed the finance committee to develop a plan for meeting the new 12-month goal. To do this, the board decided to create three funds. The first is fairly flexible and holds operating expenses for three months. Staff leadership can use this fund at their discretion to manage cash flow issues and support budgeted expenses. The second fund holds operating expenses for 12 months. This fund is board-restricted and is only meant to be used in emergencies to help with substantial changes in our financial position or to, in extremis, fund an orderly wind-down of Crossref’s operations. Furthermore, the board’s investment committee established guidelines for investing our operating and investment surpluses. Any surpluses are first applied to supporting the 3-month fund. Once that funding goal is met, any surpluses are applied to the 12-month fund. And once both the 3-month and 12-month funding goals are met, any further surpluses will be put into a third fund, also board-restricted, that can be used to fund new investments or new Crossref initiatives.
But again, the simple yellow mark against this item does not capture this level of detail. We only get to turn it green once we have the 12-month fund in place.
It looks like we will meet the goal in 2022, but it is hard to say exactly when. If we did shades of color, we might make it chartreuse. But nobody wants to see chartreuse. So while we have made significant progress here, our commitment to maintaining a 12-month contingency fund remains yellow until we have reached our goal.
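The order in which surpluses flow into the three funds can be sketched in a few lines of Python. This is only an illustration of the priority rule described above, not Crossref’s actual accounting: the `Fund` class, the `allocate_surplus` function, the fund names, and every figure in the example are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Fund:
    """A hypothetical fund with a funding goal and a current balance."""
    name: str
    target: float          # funding goal (e.g. 3 or 12 months of operating expenses)
    balance: float = 0.0

    def shortfall(self) -> float:
        """Amount still needed to reach the funding goal (never negative)."""
        return max(self.target - self.balance, 0)


def allocate_surplus(surplus, operating, contingency, investment):
    """Apply a surplus in priority order: the 3-month fund first, then the
    12-month fund, then whatever is left to the investment fund.
    Returns the amount added to each fund, keyed by fund name."""
    if surplus < 0:
        raise ValueError("surplus must not be negative")
    added = {}
    remaining = surplus
    for fund in (operating, contingency):
        portion = min(remaining, fund.shortfall())
        fund.balance += portion
        added[fund.name] = portion
        remaining -= portion
    # The third fund has no funding goal here; it simply receives the remainder.
    investment.balance += remaining
    added[investment.name] = remaining
    return added


# Made-up figures: operating expenses of 100 per month.
operating = Fund("3-month operating fund", target=300, balance=250)
contingency = Fund("12-month contingency fund", target=1200, balance=900)
investment = Fund("investment fund", target=0, balance=0)

print(allocate_surplus(400, operating, contingency, investment))
# {'3-month operating fund': 50, '12-month contingency fund': 300, 'investment fund': 50}
```

In this made-up case the surplus tops up the 3-month fund, then closes the gap in the 12-month fund, and only the remainder reaches the third fund. Until the 12-month balance reaches its goal, the self-audit mark stays yellow.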
Patent non-assertion stays yellow
The second yellow mark was against our publishing a patent-non-assertion statement. This feels like a missed opportunity because it will be straightforward for us to do, but we have not yet done it. We have never applied for patents, and we don’t intend to start. In short, nothing is blocking us from doing this other than our natural reluctance to have to draft anything that involves lawyers. Our lawyers are very nice people, but everything we have to draft with them makes our eyes glaze over. We need to get this done ASAP in 2022.
Open source remains yellow
The third yellow mark makes me cringe because, as technical director, it is firmly in my bailiwick. We have committed to open-sourcing all of our code. In last year’s self-audit, I predicted that we should be able to open all of our code within 12 to 18 months. I was wrong. That means this commitment remains yellow. What’s more, it is likely to remain yellow for a year or two. Let me try to explain why.
First, I should note that all new services that we’ve written since 2007 have been released as open-source (under an MIT license). These include our REST API, Crossmark, Metadata Search, and Event Data. You can find all our open-source code on GitLab.
This leaves us with our “content system” with its legacy code, which handles content registration, OAI-PMH, OpenURL, and XML APIs. This code was originally developed for Crossref by a third party (who I won’t name because they are in no way to blame for our predicament). Crossref only took over the development of the code base internally around 2010. But the system has accumulated over twenty years of technical debt and includes many once-common engineering practices that are deprecated (to put it delicately). Additionally, the code is a labyrinth of dependencies on very old libraries under very old licenses.
And although we have spent much of the past two years replacing critical parts of the system’s authentication and authorization code, I am certain that there remain swathes of code that, under scrutiny, would prove a security nightmare.
Now we know that so-called “security through obscurity” is bad practice. Our legacy code base illustrates the point. We had credentials embedded in the code. We had backdoors and application-level root access. We had countless places where we didn’t sanitize input. But the code was private, and so it gave developers a false sense of confidence when they occasionally took these shortcuts in the interest of developing new features more quickly. And in those early days of hyper-growth, we often had to develop things very, very quickly. Technical debt, like any debt, is a tradeoff.
As I said, we’ve cleaned a ton of this stuff up. For example, we’ve replaced our primary authentication system. But this experience has made us better appreciate just how difficult it would be to harden a system this old.
And besides, we are already replacing it, albeit incrementally. We have been extracting and rewriting key components of the old system, and we plan to continue to extract and rewrite until there is nothing left of the old code. All this new code is, naturally, open-source. And it follows modern security practices.
And so we face a difficult choice: do we try to fix code that is hard to fix and that we are replacing anyway, or do we focus just on replacing the code and making sure the new, open-source code follows modern security best practices? We’ve chosen to take the latter route. But it does mean this entry will have a yellow circle next to it for a few more years as we replace things.
Open data moves from yellow to green
And this brings us to our final yellow mark, which was next to the principle of open data. The root of the problem is that what we colloquially call “Crossref metadata” is a mix of elements, some of which come from our members, some from third parties, and some from Crossref itself. These elements, in turn, each have different copyright implications.
On top of this, Crossref has terms and conditions for its members and terms and conditions for specific services. These terms and conditions grant Crossref the right to do things with some classes of metadata and not do things with other classes of metadata, regardless of copyright.
The net result is that users can freely use and redistribute any metadata they retrieve via our APIs or in our periodic public data files. But it also means we cannot just slap a CC0 waiver on all the data. Instead, we have to specify exactly what copyright and terms apply to each class of data. We’d never done this in a clear and accessible way, so some of our users were understandably concerned that maybe we were hedging or perhaps the reuse rights were unclear. But we are not hedging; they are clear. They just weren’t documented. And now they are. In human-readable form. And soon-to-be in machine-readable form. So we can move this from yellow to green.
Reflections on the year since our adoption of POSI
When the Crossref board adopted POSI last year, frankly, a few of us were surprised. We never doubted Crossref’s direction as an open infrastructure organization, but we were not sure that others would see the value in making a public commitment to the principles. We’d heard some people say that they thought adopting them would be seen as “Virtue Signaling.” Which, to be fair, it is. This shouldn’t be surprising or contentious. Our entire scholarly communication system is based on virtue signaling. But, of course, the term “virtue signaling” (with scare quotes) is also sometimes used to insinuate that such signaling is disingenuous and designed primarily for marketing purposes. And that would be a real danger. But the principles were drafted with a built-in safeguard against disingenuous use. The commitments POSI lists are practical things that can be verified by anyone. Is our data open? Does the diversity of our board reflect the diversity of our stakeholders?
So from the start, we knew that the community would be able to hold us to our commitments. And knowing that made it imperative that we develop a mechanism and process for tracking whether we were meeting them. Thus was born the self-audit.
And the self-audit, in turn, has served as a forcing function to ensure that we didn’t just launch a proclamation and then forget about it. We needed to integrate our POSI commitments into all aspects of our day-to-day work. As such, “Live up to POSI” is now a prominent part of Crossref’s Strategic Agenda. POSI has become a fundamental part of our planning and our public product roadmap. POSI has even become a part of our internal staff annual development plans.
Adopting POSI has changed the way we work. It has changed the way the board works. It has changed the way staff works.
And we hope that it is having a similar effect on our fellow POSI Posse.
But how about changing the way POSI works?
Now that Crossref and the nine other members of the POSI Posse have had a year of considering and/or living up to the POSI standards, what would we change? What would we add?
A few themes have started to emerge as we’ve fielded questions from the current POSI Posse and others who have expressed an interest in adopting POSI.
- How does POSI apply to non-membership organizations?
- Can POSI apply to commercial organizations?
- How could POSI be extended to apply to open infrastructure organizations outside of scholarly communication?
- How in the hell do you pronounce “POSI?”
We’ve tried to answer some of these questions in the POSI FAQ, but can we update POSI so that we don’t need the FAQ? Or at least so that we can start a new FAQ?
And, critically, if we change POSI, how do we ensure we make it stronger and not weaker? Because, to be candid, some of the questions that we’ve fielded have come from parties concerned that POSI is too restrictive. That, for example, the stipulation that revenue should be based on services and not on data makes for inflexible business models. Yes. It does. Deliberately.
Because one of the biggest barriers to a community being able to fork digital infrastructure is closed (incl. fee-based) data. And one of the fundamental positions of POSI is one the authors learned from open-source communities. This is that these efforts can fail no matter how much care you take to ensure financial sustainability and how much care you take to ensure community-based governance. The ultimate power the open-source community has is to take the code and fork it. This is the insurance policy that helps keep open source projects honest. And we have tried our best to bake this lesson into the POSI principles. We don’t want to weaken POSI. They are, after all, principles.
So in 2022, we look forward to more organizations endorsing POSI. And the current POSI Posse has started a conversation about how we can strengthen the principles and also extend them so that they can more easily be applied to different kinds of organizations and perhaps even in different sectors. A summary of these discussions will be published in the coming weeks.
But how will we open these conversations to the broader community? How will we engage those who have yet to adopt the principles but are interested in doing so? What about those interested but perhaps only if they are adapted in some way?
We already have a mechanism for soliciting feedback, questions, and suggestions concerning POSI. However, it is a relatively primitive system, based on either sending email to one of the POSI Posse or raising a GitLab ticket. It was the best we could do in the short time we had to put together the POSI site. An MVP, if you will. The feedback mechanism served us well over the past year; we engaged with many interested parties and even managed to help nine of them adopt the principles.
But as with all things POSI, there is room for improvement. And so, we hope to have a more user-friendly way to solicit public feedback and hold discussions. This feedback and our own experiences with adopting POSI over the past year will, in turn, inform our efforts at revising POSI to take into account the things we’ve learned since POSI was originally written.
So look out for announcements on the POSI site. And we look forward to another year of expanding the list of POSI adopters and continuing our own POSI progress. If you’re POSI-curious, get in touch with any of the ten POSI adopters to start a conversation about your own path towards truly open infrastructure.