Interesting; thanks. It looks like we could add the 'common language effect size' and its CI through the RProbSup package from Ruscio; the A function. When that function is called with the percentile method of determining the CI, it gives results that seem to match well what you presented in your paper with the somersd package in Stata from Newson. I'm not sure if there is a reason to prefer the percentile or BCa approaches, but I can do some digging. It looks like it should be possible to add this for multiple designs. I've got a bunch of higher-priority updates still to make, but I should be able to get this added in early 2024.
For a label for this effect size: common language effect size seems a bit opaque. Strong objections to 'stochastic superiority'?
Also, I'm curious about your comment about the limited utility of comparing medians. We're using the approach explained in Bonett & Price (2002) for between-subjects designs and the approach from Bonett & Price (2020) for within-subjects designs. For both, we're using the implementation provided by Bonett in his statpsych package for R. It was my understanding that this approach generalizes well from sample to population and does not require the often-unrealistic assumption of identically-shaped distributions. I've put references below. If you have a different take, I'm happy to hear it.
Bob
Bonett, Douglas G., and Robert M. Price. “Statistical Inference for a Linear Function of Medians: Confidence Intervals, Hypothesis Testing, and Sample Size Requirements.” Psychological Methods 7, no. 3 (2002): 370–83. https://doi.org/10.1037/1082-989X.7.3.370.
Bonett, Douglas G., and Robert M. Price. “Interval Estimation for Linear Functions of Medians in Within‐subjects and Mixed Designs.” British Journal of Mathematical and Statistical Psychology 73, no. 2 (May 7, 2020): 333–46. https://doi.org/10.1111/bmsp.12171.
-
The MW statistic can be turned into a very good measure of effect size which has been repeatedly rediscovered and renamed. I rather like the name "common language effect size". It is the probability that an observation from one group will be higher than an observation from the other. In a therapeutic trial, for instance, it's the probability of a person on the treatment scoring better than a person on the comparator.
The problem is that Mann and Whitney didn't compute it in the paper. U is the last step before the measure of stochastic superiority that the paper's title promises. To convert U to that probability, you must divide it by its highest possible value, which is N1 × N2.
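For what it's worth, the conversion can be sketched in a few lines of Python. This is only an illustration, not how RProbSup or somersd compute it: the function name `prob_superiority` and the toy scores are made up, and counting ties as one half is an assumption on my part (a common convention, but not something stated above).

```python
def prob_superiority(group1, group2):
    """Return (U, A) for group1 versus group2.

    U counts the pairs in which the group1 value is higher; ties count
    as one half (an assumed convention). A = U / (n1 * n2).
    """
    u = 0.0
    for x in group1:
        for y in group2:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u, u / (len(group1) * len(group2))


# Made-up scores, purely for illustration
treatment = [7, 9, 10, 12]
comparator = [5, 8, 9]

u, a = prob_superiority(treatment, comparator)
print(f"U = {u}, N1 x N2 = {len(treatment) * len(comparator)}, A = {a:.3f}")
```

Here U is 9.5 out of a possible 12, so the estimated probability that a treated person scores higher than a comparator is about 0.79.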
The difference between medians is also not very useful because, unlike the difference between means, it doesn't generalise: the median difference between two observations, one from each group, is not in general the difference between the group medians. The Hodges-Lehmann estimator is the measure of effect size of choice in comparisons of median values.
See
Conroy RM. What hypotheses do “nonparametric” two-group tests actually test? The Stata Journal. 2012;12(2):1–9.
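To make the point about medians concrete, here is a small Python sketch, assuming the usual two-sample definition of the Hodges-Lehmann estimator as the median of all pairwise differences between the groups. The function name `hodges_lehmann` and the data are invented for the example.

```python
from statistics import median


def hodges_lehmann(group1, group2):
    """Median of all pairwise differences x - y (x from group1, y from group2)."""
    return median(x - y for x in group1 for y in group2)


# Made-up data, chosen so that the two summaries disagree
a = [1, 2, 10]
b = [0, 3, 4]

diff_of_medians = median(a) - median(b)  # 2 - 3
hl = hodges_lehmann(a, b)                # median of the 9 pairwise differences

print("difference of medians:", diff_of_medians)
print("Hodges-Lehmann estimate:", hl)
```

With these numbers the difference of medians is -1 while the median pairwise difference is +1, so the two summaries can even disagree in sign.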
Many good wishes for this very useful module!