L. Held (University of Zurich)

‘A new standard for the analysis and design of replication studies’

A new standard is proposed for the evidential assessment of replication studies. The approach combines a specific reverse Bayes technique with prior-predictive tail probabilities to define replication success. The method gives rise to a quantitative measure for replication success, called the sceptical p-value. The sceptical p-value integrates traditional significance of both the original and the replication study with a comparison of the respective effect sizes. It incorporates the uncertainty of both the original and the replication effect estimates and reduces to the ordinary p-value of the replication study if the uncertainty of the original effect estimate is ignored. The framework proposed can also be used to determine the power or the required replication sample size to achieve replication success. Numerical calculations highlight the difficulty of achieving replication success if the evidence from the original study is only suggestive. An application to data from the Open Science Collaboration project on the replicability of psychological science illustrates the methodology proposed.
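To make the construction concrete, here is a minimal sketch of the sceptical p-value as the abstract describes it. Under the usual normal model, the reverse-Bayes condition (a sceptical prior just large enough to render the original result non-significant, combined with a prior-predictive tail probability for the replication estimate) leads to a quadratic in the squared sceptical z-value. The function name, the variance-ratio parameter `c`, and the two-sided convention are illustrative assumptions; consult the published paper for the authoritative definitions.

```python
from math import sqrt, erf

def sceptical_p(zo, zr, c=1.0):
    """Illustrative sceptical p-value (two-sided convention assumed).

    zo, zr : z-statistics of the original and replication studies
    c      : variance ratio of the effect estimates (original / replication);
             c = 1 corresponds to equal standard errors
    """
    zo2, zr2 = zo * zo, zr * zr
    if abs(c - 1.0) < 1e-12:
        # for c = 1 the quadratic collapses to a harmonic-mean-type form
        zs2 = zo2 * zr2 / (zo2 + zr2)
    else:
        # positive root of (c - 1) z_S^4 + (zo^2 + zr^2) z_S^2 - zo^2 zr^2 = 0
        a, b, d = c - 1.0, zo2 + zr2, -zo2 * zr2
        zs2 = (-b + sqrt(b * b - 4 * a * d)) / (2 * a)
    zs = sqrt(zs2)
    # convert the sceptical z-value to a two-sided normal tail probability
    return 2 * (1 - 0.5 * (1 + erf(zs / sqrt(2))))
```

For example, with `zo = zr = 2` (both studies individually significant at the two-sided 5% level) the sceptical z-value is only sqrt(2), giving a sceptical p-value of about 0.16, which illustrates the abstract's point that replication success is a more demanding criterion than two significant results.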

To be published in Series A; for more information go to the Wiley Online Library.

K. Rice (University of Washington, Seattle), T. Bonnett (National Laboratory for Cancer Research, Frederick) and C. Krakauer (University of Washington, Seattle)

‘Knowing the signs: a direct and generalizable motivation of two-sided tests’

Many well-known problems with two-sided p-values are due to their use in hypothesis tests, with ‘reject–accept’ conclusions about point null hypotheses. We present an alternative motivation for p-value-based tests, viewing them as assessments of only the sign of an underlying parameter, where we can conclude that the parameter is positive, negative or simply say nothing either way. Our approach is decision theoretic, but—unusually—we consider the whole set of possible utility functions available. Doing this we show how, in a specific sense, close analogues of familiar one- and two-sided tests are always the optimal decision. We argue that this simplicity could aid non-experts’ understanding and use of tests—and help them to think critically about whether or not tests are appropriate tools for answering their questions of interest. Several extensions are also considered, showing that the simple idea of determining the signs of parameters yields a rich framework for inference.
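The three-way conclusion the abstract describes can be illustrated with a short sketch. The authors' actual development is decision-theoretic over whole classes of utility functions; the function below only shows the resulting directional rule, analogous to a familiar two-sided test under a normal model. The function name and the default threshold are illustrative assumptions, not the paper's notation.

```python
from math import erf, sqrt

def sign_decision(z, alpha=0.05):
    """Three-way directional conclusion from a z-statistic:
    declare the parameter positive, negative, or say nothing."""
    # two-sided p-value under the standard normal model
    p = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    if p > alpha:
        return "no conclusion"
    return "positive" if z > 0 else "negative"
```

Note the asymmetry with 'reject–accept' testing: the rule never asserts that the parameter is exactly zero, it only declines to name a sign, which is the reframing the abstract argues is easier for non-experts to interpret correctly.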

To be published in Series A; for more information go to the Wiley Online Library.

Keywords: HDRUK

Venue: The Royal Statistical Society

City: London

Country: United Kingdom

Postcode: EC1Y 8LX

Organizer: Royal Statistical Society

Event types:

  • Workshops and courses
