Understanding Bayes repeated measures ANOVA
Hello,
I am trying to get a practical understanding of Bayes ANOVA by playing around with some synthetic data. I am struck, however, by how many relationships that hold in frequentist ANOVA do not translate, so that I have to revise many of my prior intuitions. Could you confirm that my understanding is correct here?
- Equivalence between t-tests and ANOVA. In frequentist analyses, the p value for the difference between two levels of a factor is always identical whether it is computed with an ANOVA or with a paired t-test. This is not so with Bayes t-tests and Bayes ANOVAs. I assume this is because of different priors?
- Independence of interactions and main effects. In frequentist ANOVAs, main effects and interactions are evaluated independently of each other, but not in Bayes ANOVA. In fact, the Bayes ANOVA typically won't see an interaction at all if the two factors don't already show main effects. I find this a bit problematic for several classical stimulus-response compatibility designs, for example, where one predicts an interaction of two factors but no main effects (e.g. stimulus side (left, right) vs. response side (left, right)). Is there a way (in JASP) to force the ANOVA to consider the interaction by itself?
- Equivalence between main effects and interactions. In a frequentist repeated measures ANOVA, main effects are computed analogously to interactions (and all main effects and interactions are statistically independent of one another). For example, through a simple re-coding of columns (e.g. switching columns C and D) in a 2x2, one of the main effects turns into an interaction (with identical p values) and vice versa. I find that this equivalence does not exist in Bayes ANOVA, with BFs differing depending on whether something is coded as a main effect or as an interaction. Why is that?
- Between-subjects variability seems to be coded differently. In a frequentist repeated measures ANOVA, results will always be identical if the difference between two conditions is the same for each participant, irrespective of the overall scores in both conditions. Bayes t-tests show the same equivalence, but Bayes ANOVAs do not. I don't understand why this is.
It would be great if you had some advice for me here. I am trying to re-evaluate some of my old results with Bayes analyses, but some of the problems above have hampered my progress somewhat. I am also wondering whether there is some way of running Bayes ANOVAs that retains some of the above relationships?
Thank you!
Comments
Hi PBach,
I'll just give my two cents here, and I'll forward this post to some others as well -- they might want to chime in. Note that the "Bayesian ANOVA" is really a linear mixed-effects model. OK:
"1. Equivalence between t-tests and ANOVA. In frequentist analyses, the p value for the difference between two factors of a level is always identical when run with ANOVA or with a paired t-test. This is not so with Bayes t-tests and Bayes-ANOVAs. I assume this is because of different priors?"
The Bayesian one-way ANOVA with two levels does give the same BF as a Bayesian between-subjects t-test (with the default priors).
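For reference, the frequentist identity that the question starts from (F = t² for a within-subject factor with two levels) can be checked numerically. This is only a minimal sketch in plain Python: the scores are made-up illustration data, not anything from this thread, and it says nothing about how the Bayes factors themselves are computed.

```python
import math

# Hypothetical scores for 6 participants in two within-subject conditions.
cond_a = [512.0, 480.0, 530.0, 610.0, 455.0, 570.0]
cond_b = [498.0, 470.0, 505.0, 600.0, 450.0, 541.0]
n = len(cond_a)

# Paired t-test: a one-sample t-test on the per-participant differences.
diffs = [a - b for a, b in zip(cond_a, cond_b)]
mean_d = sum(diffs) / n
var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
t = mean_d / math.sqrt(var_d / n)

# One-way repeated measures ANOVA with two levels.
grand = (sum(cond_a) + sum(cond_b)) / (2 * n)
mean_a, mean_b = sum(cond_a) / n, sum(cond_b) / n
subj_means = [(a + b) / 2 for a, b in zip(cond_a, cond_b)]

# Sum of squares for the condition effect (1 degree of freedom).
ss_cond = n * ((mean_a - grand) ** 2 + (mean_b - grand) ** 2)

# Error term: the condition x subject interaction (n - 1 degrees of freedom).
ss_error = sum(
    (x - m_cond - m_subj + grand) ** 2
    for scores, m_cond in ((cond_a, mean_a), (cond_b, mean_b))
    for x, m_subj in zip(scores, subj_means)
)

f_stat = (ss_cond / 1) / (ss_error / (n - 1))

print("t^2 =", t ** 2)
print("F   =", f_stat)
```

Because F equals t² and both tests have the same error degrees of freedom, the p values coincide as well.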
"2. Independence of interactions and main effects. In frequentist ANOVAs, main effects and interactions are evaluated independent of each other, but not in Bayes ANOVA. In fact, typically, the Bayes ANOVA won't see an interaction at all, if the two factors don't already show main effects. I find this a bit problematic for several classical stimulus-response compatibility designs, for example, where one predicts an interaction of two factors, but no main effects (e.g. stimulus side (left, right) vs. response side (left, right)). Is there a way (in JASP) to force the ANOVA to consider the interaction by itself?"
Not in JASP. From https://link.springer.com/article/10.3758/s13423-017-1323-7:
"Consistent with the principle of marginality, JASP does not include interactions in the absence of the component main effects; for instance, the interaction-only model “Gender × Pitch” may not be entertained without also adding the two main effects (for details, examples, and rationale see Bernhardt & Jung, 1979, Griepentrog, Ryan, & Smith 1982, McCullagh & Nelder, 1989; Nelder, 1998, 2000; Peixoto, 1987, 1990; Rouder, Engelhardt, et al., in press; Rouder, Morey, et al., in press; Venables, 2000)."
"3. Equivalence between main effects and interactions. In a frequentist repeated measures ANOVA, main effects are computed analogously to interactions (and all main effects and interactions are statistically independent of one another). For example, through a simple re-coding of columns (e.g. switching of columns C and d) in a 2x2, one of the main effects turns into an interaction (with identical p values) and vice versa. I find this equivalence does not exist in Bayes ANOVA, with BFs being different depending on whether something is coded as main effect or interaction. Why is that?"
What I expect is that interactions are more flexible and able to explain/predict more data patterns. But this is speculation; I'll ask some people who know more.
"4. Between-subjects variability seems to be coded differently. In frequentist repeated measures ANOVA, results will always be identical, if the difference between to conditions is the same for each participant, irrespective of the overall scores for the values in both conditions. Bayes t-tests show the same equivalence, but not Bayes ANOVAs. I don't understand why this is."
Yes, I once had exactly this conversation with Richard Morey, who proceeded to explain why this made perfect sense. I'll ask him to reply to this as well.
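As a point of comparison for question 4, the t-test side of the claim can be sketched in a few lines. The sketch assumes that the default Bayes t-test is the JZS Bayes factor (Cauchy prior on effect size with scale 0.707), which for paired data depends on the data only through the t statistic and the number of participants. The function names `paired_t` and `jzs_bf10`, the data, and the per-participant baselines are all hypothetical, and the integral is approximated with a simple trapezoid rule rather than with whatever routine JASP uses.

```python
import math

def paired_t(x, y):
    """Paired t statistic, computed from the per-participant differences only."""
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((v - mean_d) ** 2 for v in d) / (n - 1)
    return mean_d / math.sqrt(var_d / n), n

def jzs_bf10(t, n, r=math.sqrt(2) / 2, steps=20000):
    """One-sample JZS Bayes factor, trapezoid rule over u = log(g)."""
    nu = n - 1
    lo, hi = -30.0, 60.0
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        g = math.exp(lo + i * h)
        a = 1.0 + n * g * r * r
        # Inverse-gamma(1/2, 1/2) density of g, multiplied by dg/du = g.
        prior = (2 * math.pi) ** -0.5 * g ** -0.5 * math.exp(-0.5 / g)
        like = a ** -0.5 * (1.0 + t * t / (a * nu)) ** (-(nu + 1) / 2)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * prior * like * h
    null = (1.0 + t * t / nu) ** (-(nu + 1) / 2)
    return total / null

# Hypothetical data: two conditions, 6 participants.
x = [512.0, 480.0, 530.0, 610.0, 455.0, 570.0]
y = [498.0, 470.0, 505.0, 600.0, 450.0, 541.0]

# Add an arbitrary baseline to both conditions of each participant.
baseline = [0.0, 200.0, -150.0, 75.0, 300.0, -40.0]
x_shifted = [v + b for v, b in zip(x, baseline)]
y_shifted = [v + b for v, b in zip(y, baseline)]

t1, n1 = paired_t(x, y)
t2, n2 = paired_t(x_shifted, y_shifted)
bf1 = jzs_bf10(t1, n1)
bf2 = jzs_bf10(t2, n2)

print("t:", t1, t2)
print("BF10:", bf1, bf2)
```

The shifted data leave every difference score untouched, so t and the t-test Bayes factor are unchanged. A model that also estimates participant effects, as the ANOVA does, sees different data after the shift, which is one place where the two analyses could part ways.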
As an aside, Jeff Rouder will give a lecture on ANOVA on Tuesday that might be relevant: https://www.sowi.uni-mannheim.de/erdfelder/forschung/one-world-cps/
Cheers,
E.J.
Thank you, E.J., for the answers and forwarding my questions on!
"The Bayesian one-way ANOVA with two levels does give the same BF as a Bayesian between-subjects t-test (with the default priors)"
Hmmm, it doesn't for me - I just tried it again in JASP. I am using the repeated measures ANOVA and t-test, though. From the t-test, I get a BF10 of 11.96, and from the ANOVA a BF10 of 36.38. The prior for the t-test is still set to the default Cauchy (scale 0.707) -- I don't think I can change the priors for the ANOVA.
If there were minor differences, I wouldn't worry -- but the ANOVA value is about three times the t-test value. Happy to send you the data if you want. I am sure I am doing something wrong somewhere. Then again: if (4) is true -- and between-subjects variance is treated differently by a Bayes ANOVA than by a Bayes t-test -- then we would expect to see differences, I assume?
"Consistent with the principle of marginality, JASP does not include interactions in the absence of the component main effects; for instance, the interaction-only model “Gender × Pitch” may not be entertained without also adding the two main effects (for details, examples, and rationale see Bernhardt & Jung, 1979, Griepentrog, Ryan, & Smith 1982, McCullagh & Nelder, 1989; Nelder, 1998, 2000; Peixoto, 1987, 1990; Rouder, Engelhardt, et al., in press; Rouder, Morey, et al., in press; Venables, 2000)."
I understand that -- but how would you then evaluate designs that *predict* interactions without main effects? As said, in some fields (e.g. stimulus response compatibility) such designs are quite common.
Would it be legitimate, for example, to simply re-code the analysis? Take the above 2x2 response side (left, right) X stimulus side (left, right) repeated measures ANOVA as an example. In a frequentist ANOVA, this could simply be re-coded into a 2x2 with factors response side (left, right) and Compatibility (stimulus on same side, different side) -- the results would be statistically (and conceptually) identical, except that the key comparison appears as an interaction in the first version and as a main effect of Compatibility in the second. My worry, though, is that I would run afoul of the problem described in (3).
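The frequentist side of this re-coding can be made concrete with a small sketch. In a 2x2 within-subject design each effect is a one-degree-of-freedom contrast, so it can be tested with a one-sample t-test on per-participant contrast scores. The reaction times below are invented for illustration, and the variable names and the helper `one_sample_t` are hypothetical.

```python
import math

# Hypothetical mean RTs per participant, keyed by (response side, stimulus side).
data = [
    {("L", "L"): 420.0, ("L", "R"): 455.0, ("R", "L"): 470.0, ("R", "R"): 430.0},
    {("L", "L"): 390.0, ("L", "R"): 440.0, ("R", "L"): 445.0, ("R", "R"): 400.0},
    {("L", "L"): 510.0, ("L", "R"): 540.0, ("R", "L"): 560.0, ("R", "R"): 505.0},
    {("L", "L"): 465.0, ("L", "R"): 480.0, ("R", "L"): 500.0, ("R", "R"): 470.0},
    {("L", "L"): 430.0, ("L", "R"): 475.0, ("R", "L"): 465.0, ("R", "R"): 425.0},
    {("L", "L"): 480.0, ("L", "R"): 500.0, ("R", "L"): 530.0, ("R", "R"): 490.0},
]

def one_sample_t(scores):
    """t statistic for testing whether the mean contrast score is zero."""
    n = len(scores)
    mean = sum(scores) / n
    var = sum((s - mean) ** 2 for s in scores) / (n - 1)
    return mean / math.sqrt(var / n)

# Coding 1: factors are response side and stimulus side.
interaction_c1 = [s["L", "L"] - s["L", "R"] - s["R", "L"] + s["R", "R"] for s in data]
stim_main_c1 = [s["L", "L"] + s["R", "L"] - s["L", "R"] - s["R", "R"] for s in data]

# Coding 2: replace stimulus side by Compatibility (same vs. different side).
def recode(s):
    return {(resp, "same" if resp == stim else "different"): rt
            for (resp, stim), rt in s.items()}

recoded = [recode(s) for s in data]
compat_main_c2 = [r["L", "same"] + r["R", "same"]
                  - r["L", "different"] - r["R", "different"] for r in recoded]
interaction_c2 = [r["L", "same"] - r["L", "different"]
                  - r["R", "same"] + r["R", "different"] for r in recoded]

print("interaction (coding 1):       t =", one_sample_t(interaction_c1))
print("Compatibility main (coding 2): t =", one_sample_t(compat_main_c2))
print("stimulus main (coding 1):      t =", one_sample_t(stim_main_c1))
print("interaction (coding 2):        t =", one_sample_t(interaction_c2))
```

The contrast scores are the same numbers under both codings, only with different labels, which is why the frequentist tests agree. Whether a Bayes ANOVA treats the two labelings alike is the open question raised in (3).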
Thank you again for all your thoughts already. I am looking forward to learning more. I love JASP and am keen to incorporate Bayes more and more -- but, for the reasons above, I worry that I am misunderstanding some fundamental parts of it.
Hello,
>>"The Bayesian one-way ANOVA with two levels does give the same BF as a Bayesian between-subjects t-test (with the default priors)"
>Hmmm, it doesn't for me - I just tried it again in JASP. I am using the repeated measures ANOVA and t-test though. From the t-test, i get a BF10 of 11.96. and a BF10 of 36.38 from the ANOVA. Prior for the t-test is still set to default Cauchy (scale 0.707) -- I don't think i can change the priors for the ANOVA.
>If there were minor differences, I wouldn't worry -- but this seems an order of magnitude different. Happy to send you the data if you want. I am sure I am doing something wrong somewhere. Then again: if (4) is true -- and between-subjects variance is treated differently by a Bayes ANOVA than by a Bayes t-test -- then we would expect to see differences I assume?
Indeed. I'll ask those who proposed the methodology again. It was explained to me once but I forgot.
>>"Consistent with the principle of marginality, JASP does not include interactions in the absence of the component main effects; for instance, the interaction-only model “Gender × Pitch” may not be entertained without also adding the two main effects (for details, examples, and rationale see Bernhardt & Jung, 1979, Griepentrog, Ryan, & Smith 1982, McCullagh & Nelder, 1989; Nelder, 1998, 2000; Peixoto, 1987, 1990; Rouder, Engelhardt, et al., in press; Rouder, Morey, et al., in press; Venables, 2000)."
>I understand that -- but how would you then evaluate designs that *predict* interactions without main effects? As said, in some fields (e.g. stimulus response compatibility) such designs are quite common.
There are some theoretical problems with omitting the main effects. I would have to look into it again. I think it would be rare to have a pure interaction (i.e., a perfectly symmetric "x"), but I don't believe this is the reason. Your example is about the Simon effect, arguing it is equally large for the left and the right side? That is a good example. I'll make a note to look into this again. The BayesFactor package does allow you to enter only an interaction, I believe.
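To make the restriction concrete, here is a minimal sketch of the model space for two factors with and without the principle of marginality. It uses the factor names from the quoted passage; the function `respects_marginality` is hypothetical and only illustrates the rule as quoted, not JASP's actual code. In a repeated measures analysis, a participant term would be added to every model; it is left out here for brevity.

```python
from itertools import combinations

main_effects = ["Gender", "Pitch"]   # factor names taken from the quoted example
interaction = "Gender:Pitch"
terms = main_effects + [interaction]

def respects_marginality(model):
    """An interaction may only appear together with both of its main effects."""
    if interaction in model:
        return all(effect in model for effect in main_effects)
    return True

# Every subset of the three terms, from the null model to the full model.
all_models = [set(c) for k in range(len(terms) + 1) for c in combinations(terms, k)]

allowed = [m for m in all_models if respects_marginality(m)]
excluded = [m for m in all_models if not respects_marginality(m)]

def label(model):
    return " + ".join(t for t in terms if t in model) or "null model"

print("Allowed under marginality:")
for m in allowed:
    print("  ", label(m))

print("Excluded:")
for m in excluded:
    print("  ", label(m))
```

Of the eight possible models, five satisfy the rule. The interaction-only model that a pure compatibility effect would call for is among the three excluded ones.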
>Would it be legitimate, for example, to simply re-code the analysis? Take the above 2x2 response side (left, right) X stimulus side (left, right) repeated measures ANOVA as an example. In a frequentist ANOVA, this could be simply re-coded as a into a 2x2 with factors response side (left, right) and Compatibility (stimulus on same side, different side) -- the results would be statistically (and conceptually) identical, only that the key comparison is reflected as an interaction in the first version and a main effect of Compatibility in the second. My worry though is that I run afoul of the problem described in (3).
Hmm yes. Not sure about this. The repeated measures case is tricky. But we'll get to the bottom of it.
Cheers,
E.J.
Thank you, E.J., for continuing to look into it -- I am looking forward to the responses!