Category Archives: Opinion Pieces
Guest post by Raphael Calel, Ciriacy-Wantrup Postdoctoral Fellow at the Department of Agricultural and Resource Economics at the University of California, Berkeley.
One of the most important tools for enhancing the credibility of research is the pre-analysis plan, or the PAP. Simply put, we feel more confident in someone’s inferences if we can verify that they weren’t data mining, engaging in motivated reasoning, or otherwise manipulating their results, knowingly or unknowingly. By publishing a PAP before collecting data, and then closely following that plan, researchers can credibly demonstrate to us skeptics that their analyses were not manipulated in light of the data they collected.
Still, PAPs are credible only when the researcher can anticipate and wait for the collection of new data. The vast majority of social science research, however, does not satisfy these conditions. For instance, while it is perfectly reasonable to test new hypotheses about the causes of the recent financial crisis, it is unreasonable to expect researchers to have pre-specified their analyses before the crisis hit. To give another example, no one analysing a time series of more than a couple of years can reasonably be expected to publish a PAP and then wait for years or decades before implementing the study. Most observational studies face this problem in one form or another.
Guest post by Olivia D’Aoust, Ph.D. in Economics from Université libre de Bruxelles, and former Fulbright Visiting Ph.D. student at the University of California, Berkeley.
As a Fulbright PhD student in development economics from Brussels, my experience this past year on the Berkeley campus has been eye-opening. In particular, I discovered a new movement toward improving the standards of openness and integrity in economics, political science, psychology, and related disciplines, led by the Berkeley Initiative for Transparency in the Social Sciences (BITSS).
When I first discovered BITSS, it struck me how little I knew about research on research in the social sciences, the pervasiveness of fraud in science in general (from data cleaning and specification searching to faking data altogether), and the basic lack of consensus on what is the right and wrong way to do research. These issues are essential, yet too often they are left by the wayside. Transparency, reproducibility, replicability, and integrity are the building blocks of scientific research.
Roger Peng and Jeffrey Leek of Johns Hopkins University claim that “ridding science of shoddy statistics will require scrutiny of every step, not merely the last one.”
This blog post originally appeared in Nature on April 28, 2015.
There is no statistic more maligned than the P value. Hundreds of papers and blogposts have been written about what some statisticians deride as ‘null hypothesis significance testing’ (NHST; see, for example, go.nature.com/pfvgqe). NHST deems whether the results of a data analysis are important on the basis of whether a summary statistic (such as a P value) has crossed a threshold. Given the discourse, it is no surprise that some hailed as a victory the banning of NHST methods (and all of statistical inference) in the journal Basic and Applied Social Psychology in February.
Such a ban will in fact have scant effect on the quality of published science. There are many stages to the design and analysis of a successful study. The last of these steps is the calculation of an inferential statistic such as a P value, and the application of a ‘decision rule’ to it (for example, P < 0.05). In practice, decisions that are made earlier in data analysis have a much greater impact on results, from experimental design to batch effects, lack of adjustment for confounding factors, or simple measurement error. Arbitrary levels of statistical significance can be achieved by changing the ways in which data are cleaned, summarized or modelled.
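To make the last point concrete, here is a minimal simulation sketch. It is not taken from the Nature article: the sample size, the outlier-trimming rules, and the function names (`two_sided_p`, `trim`, `simulate`) are all assumptions chosen for illustration, and the p-value uses a normal approximation rather than an exact t-test. Both groups are drawn from the same distribution, so every "significant" result is a false positive. The sketch compares an analyst who runs one pre-specified test with one who tries several data-cleaning rules and reports the smallest p-value.

```python
import math
import random
import statistics


def two_sided_p(a, b):
    """Approximate two-sided p-value for a difference in means (normal approximation)."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    z = (statistics.mean(a) - statistics.mean(b)) / se
    return math.erfc(abs(z) / math.sqrt(2))


def trim(xs, k):
    """Drop points more than k sample standard deviations from the sample mean."""
    m, s = statistics.mean(xs), statistics.stdev(xs)
    return [x for x in xs if abs(x - m) <= k * s]


def simulate(n_sims=2000, n=30, seed=0):
    """Return (honest_rate, hacked_rate): false-positive rates under a true null."""
    rng = random.Random(seed)
    honest = hacked = 0
    for _ in range(n_sims):
        # Both groups come from the same distribution: there is no real effect.
        a = [rng.gauss(0, 1) for _ in range(n)]
        b = [rng.gauss(0, 1) for _ in range(n)]
        p_full = two_sided_p(a, b)
        # Hypothetical "cleaning" choices: three different outlier cut-offs.
        p_trimmed = [two_sided_p(trim(a, k), trim(b, k)) for k in (2.5, 2.0, 1.5)]
        honest += p_full < 0.05                      # one pre-specified analysis
        hacked += min([p_full] + p_trimmed) < 0.05   # best of four analyses
    return honest / n_sims, hacked / n_sims


if __name__ == "__main__":
    honest_rate, hacked_rate = simulate()
    print(f"pre-specified analysis: {honest_rate:.3f}")
    print(f"best of four cleaning rules: {hacked_rate:.3f}")
```

Because the flexible analyst's set of analyses includes the pre-specified one, the second rate can never be lower than the first. How much higher it is depends on the cleaning rules assumed here.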
On Dec 15th, Maggie Puniewska posted an article in the Atlantic Magazine summarizing the obstacles preventing researchers from sharing their data.
The article asks if “science has traditionally been a field that prizes collaboration […] then why [are] so many scientists stingy with their information.”
Puniewska outlines the most-cited reasons scientists refrain from sharing their data.
The culture of innovation breeds fierce competition, and those on the brink of making a groundbreaking discovery want to be the first to publish their results and receive credit for their ideas.
[I]f sharing data paves the way for an expert to build upon or dispute other scientists’ results in a revolutionary way, it’s easy to see why some might choose to withhold.
In a recent interview appearing in Discover Magazine, Brian Nosek, Co-founder of the Center for Open Science and speaker at the upcoming BITSS Annual Meeting, discusses the credibility crisis in psychology.
According to the article, psychology has lost much of its credibility after a series of published papers were revealed as fraudulent and many other study results were found to be irreproducible.
Fortunately, “psychologists, spurred by a growing crisis of faith, are tackling it [the credibility crisis] head-on. Psychologist Brian Nosek at the University of Virginia is at the forefront of the fight.” Below are excerpts from Nosek’s interview with Discover Magazine discussing what he and others are doing to increase the rigor of research.
What are you doing about the crisis?
BN: In 2011, colleagues and I launched the Reproducibility Project, in which a team of about 200 scientists are carrying out experiments that were published in three psychology journals in 2008. We want to see how many can reproduce the original result, and what factors affect reproducibility. That will tell us if the problem of false-positive results in the psychology journals is big, small or non-existent…
[W]e built the Open Science Framework (OSF), a web application where collaborating researchers can put all their data and research materials so anyone can easily see them. We also offer incentives by offering “badges” for good practices, like making raw data available. So the more open you are, the more opportunities you have for building your reputation.
Originally posted on the Open Science Collaboration by Denny Borsboom
This train won’t stop anytime soon.
That’s what I kept thinking during the two-day sessions in Charlottesville, where a diverse array of scientific stakeholders worked hard to reach agreement on new journal standards for open and transparent scientific reporting. The aspired standards are intended to specify practices for authors, reviewers, and editors to follow in order to achieve higher levels of openness than currently exist. The leading idea is that a journal, funding agency, or professional organization, could take these standards off-the-shelf and adopt them in their policy. So that when, say, The Journal for Previously Hard To Get Data starts to turn to a more open data practice, they don’t have to puzzle on how to implement this, but may instead just copy the data-sharing guideline out of the new standards and post it on their website.
The organizers of the sessions, which were presided over by Brian Nosek of the Center for Open Science, had approached methodologists, funding agencies, journal editors, and representatives of professional organizations to achieve a broad set of perspectives on what open science means and how it should be institutionalized. As a result, the meeting felt almost like a political summit. It included high officials from professional organizations like the American Psychological Association (APA) and the Association for Psychological Science (APS), programme directors from the National Institutes of Health (NIH) and the National Science Foundation (NSF), editors of a wide variety of psychological, political, economic, and general science journals (including Science and Nature), and a loose collection of open science enthusiasts and methodologists (that would be me).
In a recent post on Data Colada, University of Pennsylvania Professor Uri Simonsohn discusses what to do in the event you (a researcher) are accused of having altered your data to increase statistical significance.
It has become more common to publicly speculate, upon noticing a paper with unusual analyses, that a reported finding was obtained via p-hacking.
For example, “a Slate.com post by Andrew Gelman suspected p-hacking in a paper that collected data on 10 colors of clothing, but analyzed red & pink as a single color” (see the authors’ response to the accusation), or “a statistics blog suspected p-hacking after noticing a paper studying number of hurricane deaths relied on the somewhat unusual Negative-Binomial Regression.”
Instinctively, Simonsohn says, a researcher may react to accusations of p-hacking by attempting to justify the specifics of his/her research design, but if that justification is ex-post, the explanation will not be good enough. In fact:
P-hacked findings are by definition justifiable. Unjustifiable research practices involve incompetence or fraud, not p-hacking.