I will argue that the study of parapsychology has great value for assessing how bad the replication crisis is. First, I will discuss how chance and the incentives of academic researchers and journals lead to the replication crisis. Next, I will argue that parapsychology can be seen as a control group for the rest of the academic literature in a huge randomized controlled trial.
Statistical significance
Statistical significance is conventionally declared when the p-value falls below a threshold of 0.05 (5%). Roughly speaking, the p-value is the probability of observing data at least as extreme as yours, assuming there is no real effect. The 5% threshold means that even when nothing is going on, about one experiment in twenty will produce data that suggest the wrong answer.
Imagine you run an experiment: You select a group of people at random (a sample). You split them at random, giving half of them aspirin (the treatment group) and the other half a pill with nothing inside (the control group). Over the coming years you measure how many heart attacks occur in the treatment group vs. the control group. Oh, look at that: the people who got aspirin had fewer heart attacks.
But can you imagine a scenario in which the people who didn’t get the aspirin had fewer heart attacks? Yes! Many other factors come into play: people in the treatment group didn’t take the pill regularly, people in the control group took other medication that prevents heart attacks, or, out of pure chance, people who were very prone to heart attacks were randomly assigned to the treatment group. In rare cases, this will hand the researcher a significant p-value suggesting that aspirin causes heart attacks!
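To see how often pure chance alone produces a "significant" result, here is a minimal simulation sketch with made-up numbers (500 people per group, a 10% baseline heart-attack risk, and a pill that does nothing). Even though there is no real effect, roughly 5% of the simulated trials come out significant:

```python
import math
import random

random.seed(0)

def two_prop_p_value(x1: int, n1: int, x2: int, n2: int) -> float:
    """Two-sided p-value for a difference in proportions (normal approximation)."""
    p_pool = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    if se == 0:
        return 1.0
    z = (x1 / n1 - x2 / n2) / se
    # Phi(x) via the error function; p = 2 * (1 - Phi(|z|))
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# The pill does NOTHING: both groups face the same 10% heart-attack risk.
n_per_group, n_trials, risk = 500, 2_000, 0.10
significant = 0
for _ in range(n_trials):
    attacks_treatment = sum(random.random() < risk for _ in range(n_per_group))
    attacks_control = sum(random.random() < risk for _ in range(n_per_group))
    if two_prop_p_value(attacks_treatment, n_per_group,
                        attacks_control, n_per_group) < 0.05:
        significant += 1

print(f"Fraction of null trials declared 'significant': {significant / n_trials:.3f}")
```

The false-positive fraction hovers around the 5% threshold, which is exactly what the threshold promises: one in twenty null experiments looks like a discovery.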
P-Hacking
A significant p-value is the currency of the social sciences. With it you can buy journal publications, a tenure-track position, and a happy, solid career in academia. The system creates incentives for researchers, knowingly or unknowingly, to data-fish, or p-hack.
Journals don’t like publishing non-significant results. Admittedly, they make for the dullest reading: “We studied this, and nothing came of it.” Because of that, journals prefer to publish significant results. This creates a huge problem.
Now imagine replacing the aspirin with vitamin C pills. If you ran this experiment a million times, most of the results would be insignificant (since we assume that vitamin C does not help prevent heart attacks). But some runs will, by chance alone, show a significant result in which the vitamin C treatment did appear to prevent heart attacks. If you do something often enough, even the rarest findings will show up.
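The arithmetic behind "do it often enough" is simple. With a 5% false-positive rate per experiment, the chance of at least one "significant" null result grows quickly with the number of repetitions (the experiment counts below are illustrative):

```python
# Each experiment on a useless treatment still has a 5% chance of a
# false positive at the conventional threshold.
alpha = 0.05
for n in (1, 20, 1_000_000):
    p_any = 1 - (1 - alpha) ** n  # P(at least one false positive in n runs)
    print(f"{n:>9} experiments -> P(at least one 'significant' result) = {p_any:.4f}")
```

Already at 20 repetitions the odds of a spurious "discovery" are about 64%, and at a million repetitions a significant result is a certainty.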
That in itself is not the problem. The problem is that attention is given only to these rare examples of significant p-values. The vast majority of studies in which the p-value was not significant are not published; thus only the random, unlikely outlier is celebrated in the academic community.
Thus, if you read an academic article that reports a significant p-value, you cannot tell whether it reflects a real effect that would replicate or whether it is the one study out of a thousand attempts that happened to show statistical significance.
People took notice, replicated celebrated findings, and found no significant results. They then struggled to publish those null results. That is what’s called the replication crisis. Whole fields of study (e.g., priming effects) are inflated because of it.
Parapsychology might be the solution
Parapsychology is the study of alleged psychic phenomena such as extrasensory perception. This includes telepathy, precognition, psychokinesis, and others. There are peer-reviewed parapsychology journals, and hundreds of parapsychology studies have been published in mainstream psychology journals. There are three main points to stress here:
1. Parapsychology isn’t a thing (people don’t have extrasensory perception).
2. Most of the published articles in the parapsychology field use the same methodology and p-value threshold as in mainstream psychology.
3. Parapsychology is a thriving field of study that manages to publish significant results.
If positive results show up for things that don’t exist, we have a problem (this can most likely be attributed to the p-hacking described above). On the other hand, this might be a blessing in disguise: we can treat it as a randomized controlled trial in which mainstream journal articles are the treatment group and parapsychology articles are the control group.
These two fields (mainstream psychology and parapsychology) thrive within the same methodology and the same system of published journal articles. But we know that one of them must be producing false findings. For us to trust the mainstream, we need to see parapsychology fail. If it doesn’t, then we have no significant result about significant results.
Additional thoughts:
Are fields of study where the cost of experiments is low most prone to this problem? I imagine psychology studies are the cheapest to run: get 50 students in a room, have them do a task, and pay them 5 bucks each. With 250 bucks you are set. This means you could easily run the experiment 100 times and find a significant result at some point. The costlier the experiment, the less likely it is to have been conducted many times in search of a significant result. A medical trial might not be run 100 times because it is costly; psychology studies can afford it. Does that make sense?
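Putting rough numbers on the thought experiment above (50 subjects at 5 bucks each, repeated 100 times; all figures are the hypothetical ones from the paragraph, not real study costs):

```python
# Hypothetical cost figures from the paragraph above.
cost_per_run = 50 * 5           # 50 subjects at $5 each = $250 per experiment
runs = 100
total_cost = cost_per_run * runs

# Chance of at least one false positive across 100 null experiments
# at the conventional 0.05 threshold.
p_at_least_one_hit = 1 - 0.95 ** runs

print(f"Total cost for {runs} runs: ${total_cost}")
print(f"P(at least one 'significant' run): {p_at_least_one_hit:.3f}")  # -> 0.994
```

For about the price of a used car, a determined p-hacker is nearly guaranteed a publishable result on an effect that does not exist, whereas a clinical trial costing millions per run cannot be rerolled that way.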
References:
This topic is discussed on the Economics Detective podcast: https://economicsdetective.com/2020/08/science-fictions-with-stuart-ritchie/ (great episode, great series)