Below is a link to my paper in the Proceedings of the 2020 Joint Statistical Meetings, "The Significance of Statistical Significance":
ww2.amstat.org/MembersOnly/proceedings/2020/data/assets/pdf/1505342.pdf
Abstract: Hypothesis testing and decision rules are in the news as never before. The reproducibility of experiments, one of the touchstones of the scientific method, is uncertain, while some warn that most scientific results are wrong; see Ioannidis, as well as van der Laan.
At the heart of the controversy is the significance of statistical significance: specifically, the significance of p-values. Poorly crafted decision rules have led to a loss of confidence in p-values, with some proposing to ban this incredibly useful tool altogether. We reject this over-reaction.
We will discuss three aspects of p-values: 1) improving model specification, thereby reducing the probability of a type II error (false negative), by introducing new families of transformations to reduce skewness and excess kurtosis; 2) setting the significance level as a decreasing function of sample size, thereby reducing the probability of a type I error (false positive), thus compromising between a fixed significance level and a fixed meaningful effect size; 3) continuous decision rules that assign plausibility levels to the null and alternative hypotheses.
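To make the second aspect concrete, here is a minimal sketch of a significance level that decreases with sample size. The abstract does not give the functional form used in the paper, so everything here is an assumption for illustration only: the function names, the baseline level alpha0 = 0.05, the reference sample size n0 = 100, and the square-root rate of decrease are all hypothetical.

```python
import math


def alpha_for_sample_size(n, alpha0=0.05, n0=100):
    """Hypothetical rule, NOT the paper's formula.

    Uses alpha0 for samples up to n0 observations, then shrinks the
    significance level in proportion to sqrt(n0 / n) for larger samples.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return alpha0 * min(1.0, math.sqrt(n0 / n))


def reject_null(p_value, n, alpha0=0.05, n0=100):
    """Reject the null hypothesis when p falls below the size-adjusted level."""
    return p_value < alpha_for_sample_size(n, alpha0, n0)


if __name__ == "__main__":
    # The same p-value can be significant in a small sample but not a large one.
    for n in (50, 100, 400, 10_000):
        level = alpha_for_sample_size(n)
        print(n, level, reject_null(0.03, n))
```

Under this assumed rule, a p-value of 0.03 rejects the null at n = 100 but not at n = 400, where the level has dropped to half its baseline. A fixed level would reject in both cases, while a fixed meaningful effect size would imply a level that falls much faster, so a rule of this shape sits between the two.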