Wellcome Trust APC Data – Thank you!
Last week the Wellcome Trust publicly raised concerns about the cost of so-called ‘hybrid publishing’. This was a direct result of the incredible work carried out by many in the Open Knowledge Foundation Open Access working group, and others in the wider community, to enrich a data set of article processing charges (APCs) released by the Wellcome Trust.
‘Hybrid publishing’ occurs when a journal contains articles that authors have paid to make freely available online from the point of publication, alongside articles that can only be accessed through a personal or institutional subscription. To ensure academics can view the full contents of such a journal, a university library must still subscribe to it, so the publication is supported by two funding streams: fees from authors *and* subscriptions.
There have long been concerns among many open access advocates about traditional journal publishers exploiting universities through this publishing model – but there has traditionally been a lack of openly available data. Funders and universities have released little data about author charges, while libraries are subject to non-disclosure agreements that stop them from discussing the details of subscriptions.
The effort put into developing the data set was incredible – and I wanted to publicly acknowledge some of those who put so much energy into it:
- Theo Andrews – created the original Google document, and came up with the idea of crowd-sourcing, as well as spending much time hunting for data
- Cameron Neylon – carried out an initial tidying up of the data, and helped promote the idea widely
- Stuart Lawson – put in a super-human effort in data hunting
- Emanuil Tolev – fixing DOIs, general technical thoughts
- Alf Eaton – for flagging initial problems with DOIs, data hunting
- Sam Smith – for helping finish the last load of hybrid/pure journal identification
- Tom Pollard – who created a script to pull DOIs from PMCIDs
- Nic Weber
- Jackie Proven
- Fiona Wright
- Yvonne Budden
- Dawn Pike
And of course – thank you to the Wellcome Trust, and Robert Kiley, for releasing such a valuable data set that enables much better understanding of the current state of open access publishing.
I’m sure I’ve missed some names out, and certainly many were anonymously adding data. If I’ve missed you out, please do let me know as I’d love to ensure everyone is acknowledged for the effort put in.
The enriched Wellcome Trust data set is incredibly valuable. Not only has it enabled us to show the cost of hybrid publishing to the second-biggest medical research charity in the world (which in turn indicates the potential costs to other research funders), it also sets a very useful precedent for data release, provides a test bed for a number of tools currently being developed to automate much of this work, and enables exploration of the different licenses used by various publishers. I am sure many will continue to use this data set in the future.
The spreadsheet isn’t yet complete, but when it is, we’ll upload the data to Figshare.
Thank you to everyone who has helped with this work! And if you aren’t already, I’d urge you to sign up to the open access mailing list and join us for further discussions and activities around and towards open access.
Blogs that have come out of the data:
(if yours isn’t here, please email me)
- Wellcome Trust – ‘The cost of open access publishing: a progress report’
- Ernesto Priego – Provided an early analysis of the Wellcome Trust data here, and raised some questions here about a possible serials crisis
- Michelle Brook – A post on the sheer scale of open access publishing, and questioning some of the conditions included in a CC-BY license by Wiley-Blackwell
- Peter Murray-Rust – Noticed some Open Access articles from Elsevier existing behind paywalls, leading to a Times Higher Education article