Guest Post by David Mehler, Cardiff University and University of Münster
Are fMRI studies valid? That is a question that has been posed across the news media over the past month – including most recently in the New York Times – in the wake of a new study by Anders Eklund and colleagues at Linköping University in Sweden and the University of Warwick. Several news articles claimed that the study “could invalidate 15 years of research”.
While the study demonstrates flaws in certain statistical approaches to fMRI data analysis that lead to high rates of false positives, it does not invalidate all, or even the majority of, fMRI work. Over the last several weeks, I have scrutinized the study’s conclusions, discussing it with a broad swath of cognitive neuroscientists and neuroimaging experts to sort out the myths and facts surrounding the new work.
My interest in the topic grew when I started analyzing fMRI data during my Ph.D. and had to decide which statistical approach was appropriate. I primarily work with fMRI data from mental imagery tasks — tasks during which participants imagine certain actions or visual images. This kind of data usually yields smaller effects than most other types of experiments. For example, imagining a house leads to less activation in the respective visual brain areas than seeing a picture of a house. It is thus important for my research to use statistics that are sensitive enough to detect relatively small effects, yet conservative enough to restrict false positives.
When we conduct an experiment, we usually aim to test a hypothesis for a certain effect against the hypothesis that there is no effect, the “null hypothesis”. The decision whether to accept or reject the null hypothesis is based on the calculated “p-value” and a pre-defined threshold. A p-value gives us the probability of observing a result at least as extreme as the one found, purely by chance, if the null hypothesis were true. In the life sciences, we call an effect significant if the p-value falls below a pre-defined threshold, typically p < 0.05. This means that up to 5% of the time we falsely reject the null hypothesis, claiming that there is an effect even though there is none. The threshold is therefore the upper bound on the false positive rate that we deem acceptable. However, when we run more than one test, these 5% of false positives mount up (in proportion to the number of tests conducted), unless we control for multiple testing with an appropriate correction procedure.
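To see how quickly uncorrected false positives mount up, here is a minimal sketch (the numbers are purely illustrative, not from the paper) of how the chance of at least one false positive grows with the number of independent tests:

```python
# Chance of at least one false positive across n independent tests,
# each run at the conventional alpha = 0.05 threshold.
alpha = 0.05

for n_tests in (1, 10, 100, 1000):
    # P(at least one false positive) = 1 - (1 - alpha)^n
    fwer = 1 - (1 - alpha) ** n_tests
    print(f"{n_tests:5d} tests -> chance of >= 1 false positive: {fwer:.3f}")
```

With 100 independent tests the chance of at least one false positive already exceeds 99%, which is why simply thresholding each test at 0.05 is not an option.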
In their PNAS paper, Eklund and colleagues studied how reliably different correction procedures widely used in fMRI research control for multiple testing. This was an important investigation at the core of fMRI analyses: fMRI data are essentially blurry 3-D images of the brain. Each image contains many thousands of image elements called voxels; think of them as the 3-D version of a pixel. Different types of fMRI data analysis exist, but in the predominant approach, a statistical test is conducted for every voxel. This means that we do multiple tests and can easily end up with 100,000 tests or more! We thus need a reliable method of correcting for multiple testing to keep the false positive rate from inflating massively beyond 5%.
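As a toy illustration (a pure simulation, not real fMRI data), the classic Bonferroni procedure simply divides the threshold by the number of tests. With 100,000 null “voxels”, it removes essentially all of the false positives that an uncorrected 0.05 threshold would let through:

```python
import numpy as np

rng = np.random.default_rng(0)
n_voxels = 100_000  # one test per voxel, as in a mass-univariate analysis

# Simulated null experiment: no true effects anywhere, so p-values are uniform.
p_values = rng.uniform(size=n_voxels)

alpha = 0.05
uncorrected = int(np.sum(p_values < alpha))            # expect roughly 5,000 false positives
bonferroni = int(np.sum(p_values < alpha / n_voxels))  # expect close to zero

print(f"uncorrected: {uncorrected}, Bonferroni-corrected: {bonferroni}")
```

Bonferroni is far more conservative than the cluster-based corrections the paper actually evaluated, but it makes the trade-off concrete: stricter thresholds buy fewer false positives at the cost of sensitivity to real effects.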
Eklund and colleagues reported that for some techniques, multiple testing correction does not guarantee the 5% false positive rate but leads to much higher rates. The findings are less alarming than the news headlines might suggest, however, for two reasons: 1) the settings that lead to inflated false positives have been used in only a subset of published fMRI studies; and 2) most of the identified flaws had already been reported in previous work.
That leads to the two primary myths that need to be debunked:
- Of the more than 40,000 fMRI studies published to date, certainly not all are affected. In fact, PNAS recently released an erratum in which the authors clarify that the results of their study do not call all published fMRI work into question. One of the co-authors, Tom Nichols, has provided a more realistic estimate of about 3,600 studies that used one of the problematic procedures leading to a false positive rate significantly above 5%. Even among these studies, however, not all results are necessarily invalid; that would have to be judged case by case. The false positive rate is not the same as the percentage of false positives reported in a given study; it only describes the chance of finding false positives. For instance, in experiments that show large effects and many true positives, even a high false positive rate would still result in a relatively small percentage of false positives among the reported findings.
- Some reports claimed a software bug caused the high false positive rates. While a bug was indeed found in one particular software package (AFNI’s 3dClustSim, which was fixed in May 2015), it accounts for only a relatively small proportion of the increased false positives (~15%) in an even smaller proportion of studies.
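The last point in the first bullet above can be made concrete with a back-of-the-envelope calculation (the numbers are hypothetical, chosen only to illustrate the distinction between the false positive rate and the share of reported findings that are false):

```python
# Hypothetical study: 1,000 tests, 800 of which probe a genuine effect.
n_null = 200      # tests where the null hypothesis is actually true
n_effect = 800    # tests where there is a real effect
fpr = 0.20        # inflated false positive rate, as in the problematic settings
power = 0.90      # chance of detecting a real effect

false_pos = n_null * fpr      # 40 expected false positives
true_pos = n_effect * power   # 720 expected true detections

# Share of reported positives that are actually false:
share_false = false_pos / (false_pos + true_pos)
print(f"{share_false:.1%}")  # only ~5% of reported findings, despite a 20% FPR
```

So even a badly inflated false positive rate need not mean that most of a study's reported findings are wrong; it depends on how many true effects the experiment detects.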
The study by Eklund and colleagues has clearly shown that certain practices in fMRI research are flawed, and it provides evidence that methods known to be more reliable produce valid results (for a more detailed explanation, please read on here). Their study has stimulated new discussions and new developments, and made researchers in the field more aware of the assumptions they make when analyzing fMRI data.
Therefore, for cognitive neuroscience in general, the study represents an important step in self-correction and learning. However, for science to improve itself and continue to be valued by society, it also requires rigorous reporting of results so that they can be understood in their full nuance and broader context.
If you are interested in the statistical details of the study and more practical advice based on the paper's findings, read on here for a journal club article I wrote about the Eklund paper.
David Mehler is an MD-PhD candidate in medicine and neuroscience at Cardiff University and University of Münster. He uses neuroimaging techniques (fMRI, EEG) to investigate neurofeedback training in healthy participants and patients with a focus on motor rehabilitation.
Are you a member of CNS with research you want to share? Consider contributing a guest post about your work or trends in the field. It could then become part of a HuffPost Science series exploring the surge of new research on the human brain or be featured on the online magazine Aeon. Email your ideas to CNS Public Information Officer, Lisa M.P. Munoz (cns.publicaffairs[@]gmail.com).