65th Annual Meeting of the German Association for Medical Informatics, Biometry and Epidemiology (GMDS), Meeting of the Central European Network (CEN: German Region, Austro-Swiss Region and Polish Region) of the International Biometric Society (IBS)

06.09. - 09.09.2020, Berlin (online conference)

The significance function and confidence valleys for holistic depiction of inference on a parameter

Meeting Abstract

  • Jeremy Franklin - University of Cologne, Köln, Germany

Deutsche Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie. 65th Annual Meeting of the German Association for Medical Informatics, Biometry and Epidemiology (GMDS), Meeting of the Central European Network (CEN: German Region, Austro-Swiss Region and Polish Region) of the International Biometric Society (IBS). Berlin, 06.-09.09.2020. Düsseldorf: German Medical Science GMS Publishing House; 2021. DocAbstr. 92

doi: 10.3205/20gmds279, urn:nbn:de:0183-20gmds2796

Published: February 26, 2021

© 2021 Franklin.
This is an Open Access article distributed under the terms of the Creative Commons Attribution 4.0 License. See license information at http://creativecommons.org/licenses/by/4.0/.


Text

Background: Warnings against mis- and over-interpretation of null hypothesis significance testing and the p-value, especially the dichotomisation of results as ‘significant’ or ‘non-significant’, have long been expressed and have long remained unheeded. Building on previous suggestions (p-value function, confidence curves), a graphical presentation of estimation and testing results is proposed to foster valid interpretation.

Methods: Focussing on inference concerning a parameter X (for instance, X may be a treatment effect in a clinical trial), we use a measure of statistical significance, S = log(p)/log(p0), plotted on the vertical axis, where p is the p-value corresponding to the 100(1-p)% confidence interval and p0 is a reference p-value such as 0.05. Thus S = 0 at p = 1 and S = 1 at p = p0. Plots of S against X can be used to represent inferences on X.
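The definition of S can be sketched in a few lines of Python. This is only an illustration of the formula given above; the function name `significance` and the input checks are assumptions, not part of the original work, which used SAS.

```python
import math


def significance(p, p0=0.05):
    """Return S = log(p) / log(p0) for a p-value p and reference level p0.

    S is 0 at p = 1, equals 1 at p = p0, and exceeds 1 when p < p0.
    (Hypothetical helper name; only the formula comes from the abstract.)
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    if not 0.0 < p0 < 1.0:
        raise ValueError("p0 must lie in (0, 1)")
    return math.log(p) / math.log(p0)


if __name__ == "__main__":
    for p in (1.0, 0.5, 0.05, 0.0025):
        print(p, significance(p))
```

Because S is a ratio of logarithms, the base of the logarithm does not matter, and squaring the p-value (for example from 0.05 to 0.0025) doubles S.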

The resulting confidence limits form a U- or V-shaped curve whose base is the point estimate (at S = 0, which should also be plotted). The null effect and the region of clinical irrelevance can be marked on the graph. To give an impression of the relative ‘likelihood’ of the various values of X, the thickness of the curve is drawn proportional to the rate of change of p with X. Plots were created using repeated confidence limit calculations in SAS.
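The repeated confidence limit calculation can be sketched as follows. This is a minimal Python illustration, not the SAS code used for the abstract. It assumes a normally distributed estimator with known standard error, so that the limits are estimate ± z·se; the function name `confidence_valley`, the default grid of p-values and the use of |dp/dX| = 2·φ(z)/se as the thickness weight are all assumptions made for this sketch.

```python
from math import log
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def confidence_valley(estimate, se, p0=0.05, p_grid=None):
    """Tabulate the valley for a normally distributed estimate (assumption).

    For each p in p_grid, returns a tuple
    (p, S, lower limit, upper limit, |dp/dX| at the limits).
    """
    if p_grid is None:
        # Hypothetical default grid; p = 1 gives the point estimate itself.
        p_grid = [1.0, 0.8, 0.6, 0.4, 0.2, 0.1, 0.05, 0.01, 0.001]
    rows = []
    for p in p_grid:
        # Two-sided 100(1-p)% interval uses the (1 - p/2) normal quantile.
        z = _STD_NORMAL.inv_cdf(1.0 - p / 2.0)
        s = log(p) / log(p0)
        # p(x) = 2 * (1 - Phi(|x - estimate| / se)), hence
        # |dp/dx| = 2 * phi(z) / se at both limits.
        slope = 2.0 * _STD_NORMAL.pdf(z) / se
        rows.append((p, s, estimate - z * se, estimate + z * se, slope))
    return rows


if __name__ == "__main__":
    # Illustrative inputs only, not data from the abstract.
    for row in confidence_valley(estimate=1.0, se=0.5):
        print("p=%.3f  S=%.2f  [%.3f, %.3f]  weight=%.3f" % row)
```

Plotting the lower and upper limits against S, with line width scaled by the last column, would give a curve of the kind described above. Under the normal assumption the weight is largest at the point estimate and shrinks towards the rim of the valley.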

Results: Schematic examples are presented for various types of effect (mean difference, binomial odds ratio and risk difference). A real-data example is given for the result of a parallel-group clinical trial with strata and a clustering effect. The odds ratio for the primary endpoint, survival without bronchopulmonary dysplasia at gestational age 36 weeks in pre-term newborns, was not significant at the pre-specified alpha level of 0.05 (Figure 1).

Conclusion: The novel graph depicts the compatibility of the range of possible true effect values with the observed data, including the null effect and clinically (ir)relevant effects. P-values and confidence intervals can be read off from the graph, but dichotomous interpretation is avoided. Further, an impression of relative ‘likelihoods’ is given without having to resort to Bayesian methods. The graph is particularly informative for asymmetric confidence intervals.
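Reading a confidence interval off the graph amounts to inverting S, since p = p0^S. The Python sketch below illustrates this for an odds ratio, where the interval is asymmetric on the natural scale. It assumes a Wald-type interval on the log odds ratio scale, which the abstract does not specify; the function name `odds_ratio_limits_at` and the input values are hypothetical and are not taken from the trial.

```python
from math import exp, log
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def odds_ratio_limits_at(s, log_or, se_log_or, p0=0.05):
    """Return (p, lower, upper) for the odds ratio at significance level s.

    Inverts S = log(p)/log(p0) to p = p0**s, then forms a Wald interval
    on the log scale (an assumption made for this sketch).
    """
    if s < 0:
        raise ValueError("s must be non-negative")
    p = p0 ** s
    z = _STD_NORMAL.inv_cdf(1.0 - p / 2.0)
    return p, exp(log_or - z * se_log_or), exp(log_or + z * se_log_or)


if __name__ == "__main__":
    # Hypothetical inputs: odds ratio 2.0 with standard error 0.5 on the log scale.
    for s in (0.0, 0.5, 1.0, 2.0):
        p, lower, upper = odds_ratio_limits_at(s, log(2.0), 0.5)
        print("S=%.1f  p=%.4f  OR limits [%.3f, %.3f]" % (s, p, lower, upper))
```

With these illustrative inputs the upper limit at S = 1 lies much further from the point estimate than the lower limit, which is the kind of asymmetry the valley makes visible.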

The authors declare that they have no competing interests.

The authors declare that an ethics committee vote is not required.


References

1. Wasserstein RL, Lazar NA. The ASA Statement on p-Values: Context, Process, and Purpose. Am Stat. 2016;70:129-133.
2. Poole C. Beyond the confidence interval. Am J Public Health. 1987;77:195-199.
3. Wellek S. A critical evaluation of the current “p-value controversy”. Biom J. 2017;59:854-872.
4. Bender R, Berg G, Zeeb H. Using confidence curves in medical research. Biom J. 2005;47:237-247.
5. Infanger D, Schmidt-Trucksäss A. P value functions: An underused method to present research results and to promote quantitative reasoning. Stat Med. 2019;38:4189-4197.