A blog on statistics, methods, and open science. Understanding 20% of statistics will improve 80% of your inferences.

Thursday, January 18, 2018

The Costs and Benefits of Replications


This blog post is based on a pre-print by Coles, Tiokhin, Scheel, Isager, and Lakens, “The Costs and Benefits of Replications”, submitted to Behavioral and Brain Sciences as a commentary on “Making Replication Mainstream”.

In a summary of recent discussions about the role of direct replications in psychological science, Zwaan, Etz, Lucas, and Donnellan (2017) argue that replications should be more mainstream. The debate about the importance of replication research is essentially driven by disagreements about the value of replication studies, in a world where we need to carefully think about the best way to allocate limited resources when pursuing scientific knowledge. The real question, we believe, is when replication studies are worthwhile to perform.

Goldin-Meadow stated that "it’s just too costly or unwieldy to generate hypotheses on one sample and test them on another when, for example, we’re conducting a large field study or testing hard-to-find participants" (2016). A similar comment is made by Tackett and McShane (2018) in their comment on ZELD: “Specifically, large-scale replications are typically only possible when data collection is fast and not particularly costly, and thus they are, practically speaking, constrained to certain domains of psychology (e.g., cognitive and social).”

Such statements imply a cost-benefit analysis, but these scholars do not quantify either the costs or the benefits. They hide their subjective expected utility (what is a large-scale replication study worth to me?) behind absolute statements: they write “is” and “are” but really mean “it is my subjective belief that”. Scientifically speaking, their statements are empty, because they are not quantifiable. What is “costly”? We cannot have a discussion about such an important topic if researchers do not specify their assumptions in quantifiable terms.

Some studies may be deemed valuable enough to justify even quite substantial investments to guarantee that a replication study is performed. For instance, because it is unlikely that anyone will build a second Large Hadron Collider to replicate the studies at CERN, there are two detectors (ATLAS and CMS) so that independent teams can replicate each other’s work. That is, not only do these researchers consider it important to use a very low (5 sigma) alpha level when they analyze data, they also believe it is worthwhile to let two teams independently do the same thing. As a physicist remarks: “Replication is, in the end, the most important part of error control. Scientists are human, they make mistakes, they are deluded, and they cheat. It is only through attempted replication that errors, delusions, and outright fraud can be caught.” Thus, high cost is not by itself a conclusive argument against replication. Instead, one must make the case that the benefits do not justify the costs. Again, I ask: what is “costly”?

Decision theory is a formal framework that allows researchers to decide when replication studies are worthwhile. It requires researchers to specify their assumptions in quantifiable terms. For example, the expected utility of a direct replication (compared to a conceptual replication) depends on the probability that a specific theory or effect is true. If you believe that many published findings are false, then directly replicating prior work may be a cost-efficient way to prevent researchers from building on unreliable findings. If you believe that psychological theories usually make accurate predictions, then conceptual extensions may lead to more efficient knowledge gains than direct replications. Instead of wasting time arguing about whether direct or conceptual replications are important, do the freaking math. Tell us at which probability that H0 is true you think it is efficient enough to weed out false positives from the literature through direct replications. Show us, by pre-registering all your main analyses, that you are building on strong theories that allow you to make correct predictions with a 92% success rate, and that you therefore do not feel direct replications are the more efficient way to gain knowledge in your area.
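To make this concrete, here is a minimal sketch (not taken from the pre-print) of what such a calculation could look like. All utility values and probabilities below are made-up assumptions purely for illustration; the point is only that once you write them down, the comparison between a direct replication and a novel study becomes an explicit, debatable computation instead of an absolute claim.

```python
# Toy expected-utility comparison: direct replication vs. novel study.
# Every number here is an illustrative assumption, not an empirical estimate.

def expected_utility_replication(p_true, benefit_confirm, benefit_refute, cost):
    """Expected utility of a direct replication, given the prior
    probability p_true that the original effect is real. Refuting a
    false positive and confirming a true effect can have different value."""
    return p_true * benefit_confirm + (1 - p_true) * benefit_refute - cost

def expected_utility_novel(p_success, benefit_discovery, cost):
    """Expected utility of a novel (or conceptual-extension) study that
    pays off with probability p_success."""
    return p_success * benefit_discovery - cost

# If you believe many published findings are false (low p_true), weeding
# out a false positive is worth a lot, and the replication can win out:
eu_rep = expected_utility_replication(p_true=0.3, benefit_confirm=1.0,
                                      benefit_refute=2.0, cost=0.5)
eu_nov = expected_utility_novel(p_success=0.3, benefit_discovery=3.0, cost=0.5)
print(f"EU(replication) = {eu_rep:.2f}, EU(novel) = {eu_nov:.2f}")
# With these assumptions: EU(replication) = 1.20, EU(novel) = 0.40
```

Change the prior that the effect is true to 0.9 and the novel study comes out ahead, which is exactly the point: the disagreement between camps is a disagreement about these inputs, not about whether replication "is" or "is not" worthwhile.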

I am happy to see that our ideas about the importance of using decision theory to determine when replications are worth performing were independently replicated in this commentary on ZELD by Hardwicke, Tessler, Peloquin, and Frank. We have been collaboratively working on a manuscript specifying the Replication Value of replication studies for several years, and with the recent funding I received, I’m happy that we can finally dedicate the time to complete this work. I look forward to scientists explicitly thinking about the utility of the research they perform. This is an important question, and I can’t wait for our field to start discussing how to quantify the utility of the research we perform. This will not be easy. But unless you never think about how to spend your resources, you are making these choices implicitly all the time, and this question is too important to give up on without even trying. In our pre-print, we illustrate how all concerns raised against replication studies basically boil down to a discussion about their costs and benefits, and how formalizing these costs and benefits would improve the way researchers discuss this topic.

5 comments:

  1. Perhaps it would be easier for people to get into justifying replications in this way if they were also in the habit of similarly justifying why they run their initial studies. I'm not convinced that this is always the case. "Because we have a grad student who thinks this is interesting and a participant pool who have quotas to meet" does not necessarily meet this criterion, I would suggest.

    ReplyDelete
  2. "Tell us at which probability that H0 is true you think it is efficient enough to weed out false positives from the literature through direct replications. Show us, by pre-registering all your main analyses, that you are building on strong theories that allow you to make correct predictions with a 92% success rate, and that you therefore do not feel direct replications are the more efficient way to gain knowledge in your area."

    Hmm, not sure if i agree with (my interpretation of) this. I reason:

    1) it seems to me that it is impossible to determine/gauge the percentage of hypotheses that (will) turn out to be "correct"/"proven".

    2) more importantly, it seems to me that a) it doesn't matter what this percentage is (within reasonable boundaries), and b) it is not even desirable to try and determine/gauge this percentage, because i reason both a) and b) are irrelevant for building knowledge.

    What matters in my reasoning is amassing things like optimally gathered (and thus maximally informational) data, and arguments/reasoning, which can both be used for things like theory-building and -testing.

    I am all for "(...) scientists explicitly thinking about the utility of the research they perform", but i reason it might make more sense to think about this for all research, not just replications. In fact, i reason that it is more important for "original" research, because i predict that (nearly) all else will follow automatically once things are done more optimally from the start.

    The bottom line to me is, many of the things that might be wrong in psychological science seem to me to be connected, and based on a few things that need to be improved. I reason the rest of the problems will solve themselves automatically.

    Here is an idea which tries to solve some of the basic things that might be connected, and which i reason will also help solve other issues. I named it after what i think is a summarization of the few basic problems which i reason can all easily be solved: “Science is dependent on scientists (old flawed model?) V scientists are dependent on Science (new improved model?)”:

    http://andrewgelman.com/2017/12/17/stranger-than-fiction/#comment-628652

    I hope you will (also) focus on how to optimally perform research in general, and not just on replications. I reason when the former is done, the rest will follow automatically. Good luck/ all the best with your work on this important topic !!

    ReplyDelete
  3. "Goldin-Meadow stated that "it’s just too costly or unwieldy to generate hypotheses on one sample and test them on another when, for example, we’re conducting a large field study or testing hard-to-find participants" (2016). A similar comment is made by Tackett and McShane (2018) in their comment on ZELD: “Specifically, large-scale replications are typically only possible when data collection is fast and not particularly costly, and thus they are, practically speaking, constrained to certain domains of psychology (e.g., cognitive and social).”"

    I always find this reasoning fascinating. To me, it doesn't make much sense, because these "hard to find" or "costly" participants are apparently only "hard to find" or "costly" for 1 study.

    After all, chances are high that a next study will use the same "hard to find" or "costly" participants, but now for a different study for which participants and money magically appear all of a sudden.

    ReplyDelete
  4. Wow that was odd. I just wrote a very long comment but after I clicked submit my comment didn't
    show up. Grrrr... well I'm not writing all that over again. Anyhow, just wanted to say
    wonderful blog!

    ReplyDelete
    Replies
    1. Sorry about that - Blogger is spammed like crazy, and it has been basically impossible to moderate comments in recent months, so they get caught in crappy spam filters or buried between spam.

      Delete