Wellcome Trust APC Data – Thank you! | Open Access Working Group


April 1, 2014 in Comment

The Wellcome Trust publicly raised concerns about the cost of so-called ‘hybrid publishing’ last week. This was a direct result of the incredible work carried out by many in the Open Knowledge Foundation Open Access working group, and others in the wider community, in enriching a data set of article processing charges (APCs) released by the Wellcome Trust.

‘Hybrid publishing’ occurs when a journal contains articles whose authors have paid for them to be freely available online from the point of publication, alongside articles which can only be accessed through a personal or institutional subscription. To ensure academics can view the full contents of such a journal, a university library must still subscribe to it, so the journal is supported by two funding streams: fees from authors *and* subscriptions.

There have long been concerns among many open access advocates about traditional journal publishers exploiting universities through this publishing model – but there has historically been a lack of openly available data. Funders and universities have not released much data about author charges, while libraries are subject to non-disclosure agreements that stop them discussing the details of subscriptions.

The effort put into developing the data set was incredible – and I wanted to publicly acknowledge some of those who put so much energy into it:

Theo Andrews – created the original Google document and came up with the idea of crowd-sourcing, as well as spending much time hunting for data
Cameron Neylon – carried out an initial tidying up of the data, and helped promote the idea widely
Stuart Lawson – put in a super-human effort in data hunting
Emanuil Tolev – fixed DOIs and contributed general technical thoughts
Alf Eaton – flagged initial problems with DOIs, and hunted for data
Sam Smith – helped finish the last load of hybrid/pure journal identification
Tom Pollard – created a script to pull DOIs from PMCIDs
Daniel Mietchen
Rupert Gatti
Jenny Molloy
Nic Weber
Jackie Proven
Fiona Wright
Yvonne Budden
Dawn Pike

And of course – thank you to the Wellcome Trust, and Robert Kiley, for releasing such a valuable data set that enables much better understanding of the current state of open access publishing.

I’m sure I’ve missed some names out, and certainly many were anonymously adding data. If I’ve missed you out, please do let me know as I’d love to ensure everyone is acknowledged for the effort put in.

The enriched Wellcome Trust data set is incredibly valuable. It has enabled us to show the cost of hybrid publishing to the second biggest medical research charity in the world, which in turn indicates the potential costs to other research funders. It also sets a very useful precedent for data release, provides a test bed for a number of tools currently being developed to automate much of this work, and enables exploration of the different licenses used by various publishers. I am sure many will continue to use this data set in the future.

The spreadsheet isn’t yet complete, but when it is, we’ll upload the data to Figshare.

Thank you to everyone who has helped with this work! And if you aren’t already, I’d urge you to sign up to the open access mailing list and join us for further discussions and activities around and towards open access.

Blogs that have come out of the data:
(if yours isn’t here, please email me)

Wellcome Trust – ‘The cost of open access publishing: a progress report’
Ernesto Priego – Provided an early analysis of the Wellcome Trust data here, and raised some questions here about a possible serials crisis
Michelle Brook – A post on the sheer scale of open access publishing, and questioning some of the conditions included in a CC-BY license by Wiley-Blackwell
Peter Murray-Rust – Noticed some Open Access articles from Elsevier existing behind paywalls, leading to a Times Higher Education article


Science in the Open » Blog Archive » Fork, merge and crowd-sourcing data curation says:

April 26, 2014 at 3:10 pm

[…] followed by a substantial effort to clean that data up. This crowd-sourced data curation process has been described by Michelle Brook. Here I want to reflect on the tools that were available to us and how they made some aspects of […]

Editor’s Choice: Fork, Merge and Crowd-Sourcing Data Curation | Digital Humanities Now says:

May 1, 2014 at 4:01 pm

Impact of Social Sciences – Fork, merge and crowd-sourcing data curation: tools for collective data processing and analysis. says:

May 5, 2014 at 10:30 am

Wellcome Trust APC data: correlation of APC costs with Impact Factor | semantic rain says:

May 18, 2014 at 9:22 pm

[…] been cleaned up, improved upon, and augmented by a number of people working together (see these posts from the Open Knowledge Foundation for a summary of this work so far, and a follow-up post from […]

Written by
Michelle Brook
