Open Access: a remedy against bad science
December 4, 2012 in Uncategorized
Who has never been in the situation of having a set of data where some points just didn't seem to fit? A simple adjustment of the numbers, or omission of the strange ones, could solve the problem. Or so you would think. I have certainly been in such a situation more than once, and looking back, I am glad that I left the data unchanged. On at least one occasion my "pet" preformed theory proved to be wrong, and the 'strange data' I had found corresponded very well with another concept that I hadn't thought of at the time.
There has been a lot of attention in the media recently for cases of scientific fraud. Pharmaceutical companies are under fire for scientific misconduct (the Tamiflu story), and in the Netherlands the proven cases of fraud by Stapel (social psychology), Smeesters (psychology) and Poldermans (medicine/cardiology) have resulted in official investigations into the details of these scientists' malpractice. All this has led to a sharp decline in the trust that people used to have in science (Flawed science: The fraudulent research practices of social psychologist Diederik Stapel: report of the committee LND (Levelt, Noort, Drenth)). A report with recommendations for preventing scientific fraud, called "Sharpening policy after Stapel", was published by four Dutch social psychologists: Paul van Lange (Amsterdam), Bram Buunk (Groningen), Naomi Ellemers (Leiden) and Daniel Wigboldus (Nijmegen). One of the report's main recommendations is to share raw data and have them permanently and safely stored, accessible to everyone. The issue of scientific misconduct is by no means restricted to the Netherlands, nor to specific fields of research; other countries have similar stories in other scientific fields. For example, a committee similar to the LND committee mentioned above recently presented the outcome of an investigation into the scientific publications of Eric Smart of the University of Kentucky, in the field of food science. And then there is the essay by John Ioannidis, "Why most published research findings are false", in which he gives six reasons for the poor quality of (medical) science.
In this article I propose that in almost all of the instances where scientific misconduct was found, open access to articles AND raw data would either have prevented the fraud altogether, or at the very least would have caused it to be exposed much more rapidly than was actually the case. Especially in the field of medical research, such a change can literally change lives.
To illustrate this point I want to distinguish between different forms of 'bad science'. On the author's side we have selective publishing (omitting data that do not fit one's theory), non-reproducibility, data manipulation and, at the far end of the spectrum, outright data fabrication. On the publisher's side we have publication bias (the preferential publishing of positive results or of data that confirm an existing theory), fake peer review, and reviewers or editors pushing authors to make a clean story by omitting data (effectively resulting in selective publishing!).
PUBLICATION BIAS. The strategy of publishers to preferentially publish the most exciting stories, and stories in support of a new finding (publication bias), contributes to selective publishing and sloppy science. Under heavy pressure to publish their (exciting) results, researchers take less care than would be advisable when they submit their research to highly ranked journals. Small wonder that so-called high-impact journals also show very high retraction rates. Publication bias is also a real problem when validating scientific findings: published results are often unrepresentative of the true outcome of the many similar experiments that were not selected for publication. For example, an empirical evaluation of the 49 most-cited papers on the effectiveness of medical interventions, published in highly visible journals in 1990–2004, showed that a quarter of the randomised trials and five of six non-randomised studies had already been contradicted or found to have been exaggerated by 2005 (see: Why current publication practices may distort science). At the same time, negative findings tend to be dismissed. In the case of efficacy studies for a new drug, two positive studies are sufficient for registration with the FDA, while cases have been reported where the number of submitted negative studies was as high as 18 (see: Selective publication of antidepressant trials and its influence on apparent efficacy). I don't think I have to spell out the consequences this has for public health.
QUALITY CONTROL. In cases where scientists commit fraud, the main control mechanism currently consists of peer review and comments from colleagues who have read the article(s). This control sometimes suffices, but in many cases peer reviewers don't have, or don't take, the time to look at the actual content of an article in detail, let alone at the raw data. Often these data are not even available, or have somehow been inexplicably lost. Another complication is that, because of the enormous growth in the number of journals and in total scientific output, it has become increasingly difficult to carry out proper quality checks of all articles in the form of peer review. And the kind of rigorous study into malpractice like the one done by the committee LND in the case of social psychologist D. Stapel shows how much time (one and a half years) it can take to check on just one scientist. This underpins the notion that it will be impossible to check on all suspicious articles in this way. The solution, in my view, can be found in open access publishing. Making information available to virtually everybody automatically entails a control mechanism for scientific quality, through something like 'crowd-sourced peer review'. To state it more simply: the more people there are who can take a look at the complete data, the more likely it is that inconsistencies will be quickly spotted.
THE CASE FOR OPEN ACCESS. When articles and data are published open access, this fact alone discourages scientific misconduct. When the complete article, including the raw data, is available to a very large audience, one can be sure that if there is something wrong with the article, someone out there will spot it. The same mechanism is responsible for a major advantage of open access: scientific information made openly available reaches such a large audience that there will always be someone who can and will improve on the ideas described in a publication. At the recent Berlin10 conference in Stellenbosch this so-called "proximity paradox" was brilliantly explained by Elliot Maxwell. He described the effect with the single sentence "With enough eyeballs, all bugs are shallow", meaning that with enough dissemination any problem can be solved. Tim Gowers, a fervent proponent of open access, has exploited this in his now famous Polymath Project: sharing a very difficult mathematical problem with as many people as possible solved the problem in a fraction of the time that would have been needed in any other way. The company Innocentive.com exploits the same effect by broadcasting a problem to be solved and offering a fixed monetary reward to anyone who provides the solution. In this manner the "wisdom of the crowd" offers a way to keep science and scientists on track, while at the same time it stimulates a new way of doing science: speeding it up, promoting the pursuit of new research, increasing innovation potential through contributions from unforeseen sources, and accelerating the translation from discovery to practical solutions (E. Maxwell). And the crowd can only be wise with open access to information. Another effect of open access on the quality of science is that it effectively reduces duplicative and dead-end scientific projects.
And last but not least, open access facilitates accountability for research funding, and it helps focus research on real priorities.
THE ROLE OF OPEN ACCESS JOURNALS. While peer review remains indispensable for publishing good science, open access enables other forms of peer review than those traditionally in use. Open access articles can be peer reviewed by many more people. Post-publication peer review will certainly prove to be an effective control mechanism for the quality of scientific articles, and for the detection of scientific misconduct. But pre-publication peer review can be improved as well. BioMed Central recently started with a new system of peer review, Peerage of Science, which works with a "pool" of potential peer reviewers much larger than the small number of reviewers that other journals usually have on call. This will speed up the peer-review process, and placing less of a burden on individual reviewers will probably also improve the quality of the reviews themselves. Another very important point concerning the prevention of fraud is the open access publication of the underlying raw data together with the article. A number of initiatives exist in this area. Figshare and Dryad facilitate the storage and linking of raw research data, and journals are slowly starting to move towards the publication of raw data linked to the article. The National Science Foundation and other funders now accept data as first-class research output. However, we still have a long way to go. A survey published in PLOS ONE (Public availability of published research data in high-impact journals) concluded that although 44 of 50 journals had a data-sharing policy in place, research data had been made available online for only 47 of 500 scientific publications that appeared in those journals in 2009. Despite all the efforts described above, this situation has probably not changed substantially.
CONCLUSION. Implementation of open access, including full access to raw research data, would minimize the opportunities for scientific fraud, which can be anything from biased presentation to the fabrication of data or the dismissal of negative results. It would most certainly change the way science is done. Having the data available, open access, for the whole world to see and check, will provide a very strong incentive for scientists to publish good science. In my view it will prove to be very difficult indeed to present faulty data in an open access / open science system, and actually get away with it.