Bayesian t-tests vs. ANOVAs in within-subject designs (2 conditions)
I am very enthusiastic about Bayes factors and JASP, but there's one thing that troubles me.
Let's assume I have two conditions in a within-subject design. So I am testing the simplest model possible - a single factor with two levels - against the null model. Shouldn't I get the same Bayes factor from the Bayesian t-test and the Bayesian one-way ANOVA?
I tried it out for a dataset with a non-significant difference between conditions. With a Bayesian paired-samples t-test in JASP (version 0.7.5.5), I get BF₀₁ = 2.274:
Bayesian Paired Samples T-Test

        BF₀₁     error %
b - a   2.274    1.578e-5
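For reference, this default t-test Bayes factor can be reproduced outside JASP by numerical integration over the JZS prior (Rouder et al., 2009). Below is a minimal Python sketch; the t value and number of pairs are hypothetical placeholders, since the raw data are not part of this thread:

```python
# Minimal sketch of the default (JZS) paired-samples Bayes factor,
# following Rouder et al. (2009). The t value and number of pairs used
# below are placeholders; the actual data are not shown in this thread.
import numpy as np
from scipy.integrate import quad

def jzs_bf01_paired(t, n, r=np.sqrt(2) / 2):
    """BF01 for a paired t-test with n pairs and Cauchy prior scale r."""
    df = n - 1

    # Marginal likelihood under H1: integrate over g, the variance of the
    # effect-size prior (an inverse-chi-square prior with 1 df on g yields
    # the Cauchy(0, r) prior on the standardized effect delta).
    def integrand(g):
        a = 1 + n * g * r**2
        return (a ** -0.5
                * (1 + t**2 / (a * df)) ** (-(df + 1) / 2)
                * (2 * np.pi) ** -0.5 * g ** -1.5 * np.exp(-1 / (2 * g)))

    m1 = quad(integrand, 0, np.inf)[0]
    m0 = (1 + t**2 / df) ** (-(df + 1) / 2)  # marginal likelihood under H0
    return m0 / m1

# Placeholder call; plug in the actual paired t statistic and pair count.
print(jzs_bf01_paired(t=1.0, n=20))
```

With the actual paired t statistic and sample size from the dataset above, this should reproduce BF₀₁ = 2.274 up to integration error.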
Why am I not able to spot this number in the output of the one-way Bayesian ANOVA?
Model Comparison - dependent

Models                        P(M)    P(M|data)   BF_M    BF₀₁    error %
Null model (incl. subject)    0.500   0.637       1.758   1.000
RM Factor 1                   0.500   0.363       0.569   1.758   1.442

Note. All models include subject.
Analysis of Effects - dependent

Effects        P(incl)   P(incl|data)   BF_Inclusion
RM Factor 1    0.500     0.363          0.569
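(A note on reading the Model Comparison table: with equal prior model probabilities, the Bayes factor is simply the ratio of posterior model probabilities. From the rounded values above, BF₀₁ = P(M₀|data) / P(M₁|data) = 0.637 / 0.363 ≈ 1.755, which matches the tabled 1.758 up to rounding.)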
I have the feeling that I am missing something obvious. But what?
Tobi
Comments
Are you doing your t-test on the difference scores? In that case there can be a discrepancy. But if you do your t-test on the two columns of data, the results should be the same (under the default priors in the ANOVA and the t-test)... Hmm, I'll look into this and get back to you.
E.J.
Hi E.J., thanks for the quick reply. Yes, I did the t-test on the two conditions (comparing the two columns), not on their difference. And yes, I used the default priors.
cheers,
Tobi
Hi Tobi,
OK. Here's what's going on:
(1) In a between-subjects design, the Bayesian t-test (with default priors) gives the same result as the Bayesian ANOVA (with default priors).
(2) In both between-subjects and within-subjects designs, the classical t-test gives the same result as the corresponding classical ANOVA.
(3) In your within-subjects design, the correct comparison is between the Bayesian paired t-test and the Bayesian repeated measures ANOVA. With default priors, these do not give the same result. The reason is that the underlying statistical model is slightly different. I hope that Richard Morey can clarify this issue more.
Cheers,
E.J.
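In outline (this sketch follows the defaults described in Rouder et al., 2009 and Rouder et al., 2012, and is not taken from the thread itself): the default paired t-test models the difference scores,

d_i ~ Normal(σδ, σ²), with δ ~ Cauchy(0, √2/2),

whereas the default repeated measures ANOVA models the raw observations with explicit subject effects,

y_ij = μ + s_i + τ_j + ε_ij, with ε_ij ~ Normal(0, σ²),

and places zero-centered g-priors on the standardized subject effects (default scale 1) and the standardized condition effects (default scale 1/2). So the ANOVA retains the subject effects in the model with their own prior rather than differencing them out, and its default prior scale on the condition effect (1/2) differs from the t-test's Cauchy scale (√2/2 ≈ 0.707); the two Bayes factors therefore need not coincide.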
Hmm, I see. I have to admit that I find that somewhat worrying, because the discrepancy of ~0.5 (2.274 - 1.758 = 0.516) does not seem all that subtle. I wonder which of these approaches gives the better (more reliable) result - the t-test or the ANOVA?
That discrepancy is subtle. You would draw the same qualitative conclusion in either case. In my opinion, a BF of 2.2 would be seriously different from a BF of about 8; it is not seriously different from one of 1.8. As far as the different models go, often there is no better or worse -- the only thing you can debate is the plausibility of the model formulation. Once the model is formulated, the outcome is automatic and follows from the routine application of probability theory.
Cheers,
E.J.
Agreed, in this particular case it doesn't matter whether the BF is 1.7 or 2.2, because both values are below 3. But what if the values were 2.6 vs. 3.1 - then it would make a difference for the interpretation, wouldn't it? One might be tempted to pick the "more pleasant" value (depending on one's favorite hypothesis)... hence, I wonder which of these approaches should be preferred - the t-test or the ANOVA?
cheers
Tobi
Hi Tobi,
It still would not matter. The BF gives you a continuous measure of evidence. The thresholds are arbitrary conventions, useful for quick interpretation, but one should never lose sight of the underlying continuous values. In other words, God loves the BF=2.6 almost as much as he loves the BF=3.1.
E.J.
"God loves the BF=2.6 almost as much as he loves the BF=3.1."
Almost? I was told that God loves everyone and everything uniformly ;-)
Also, you forget that man is a god in ruins.
Humans don't love evidence that does not exceed arbitrary conventions.
Tobi