Bayesian contingency tables
Hello, all.
I have been having a little look at two articles related to the Bayesian analysis of 2×2 contingency tables as implemented in JASP in the Frequencies module through the contingencyTableBF function in the BayesFactor package for R:
- E. Gûnel & J. Dickey, Bayes Factors for Independence in Contingency Tables. Biometrika, 1974. 61(3): p. 545–557. GD74. https://www.jstor.org/stable/2334738
- T. Jamil, A. Ly, R.D. Morey, J. Love, M. Marsman & E.-J. Wagenmakers, Default “Gunel and Dickey” Bayes factors for contingency tables. Behavior Research Methods, 2017. 49(2): p. 638–652. J+17. https://doi.org/10.3758/s13428-016-0739-8
I am mostly looking at the Poisson sampling scheme.
I'm sure that others here will be knowledgeable about those details. There were a few points I was wondering about.
- Comparing the first fraction of equation 7 in J+17 with the second equation given in §4.6 of GD74, they look almost identical. Is that a fluke, or are they supposed to correspond? If the latter, then what happened to the min() function appearing in GD74?
- In JASP and in the discussion of an example in J+17 it seems that the interest is in testing whether proportions in the respective rows are unequal (put another way, as in the JASP GUI: the alternative hypothesis specifies that the column-one group is not equal to the column-two group). Practically speaking it seems that the numerical results would be the same if the researcher had instead been interested in comparing proportions in the respective columns (that is, testing hypotheses about the two row groups). Is that correct? If so, it implies that equation 7 in J+17 is invariant to row swaps, column swaps and transpose, which, by inspection, it almost seems to be, except for the singling out of "y1." for special treatment.
- In discussing the "independent Poisson model" in §4.6, GD74 write that "the choice" of a_ij=a=1 and b=4/n.. "shall favour" the alternative hypothesis. (It's not obvious to me why.) Indeed J+17 also report that use of the "default" GD74 priors results in a Poisson model that is "most reluctant" in its support for the null hypothesis. If this is the case, then why not choose a value of "a" that would be more neutral? (What would that value be? Or how could it fairly be chosen?) Even though GD74 analysed the case of a=1, it wasn't apparent to me that they were necessarily recommending it. On the other hand, should this be the mandatory penalty for adopting a Poisson sampling scheme?!! In thinking about this, I wonder about data that would produce BF=1 for the various sampling schemes. (J+17 focussed on data producing either largish or smallish BF.)
- I had a go at implementing equation 7 of J+17 in code (not in R) that should be using 50+ digits of precision, with sample data of [50, 150; 10, 100]. While JASP reported BF = 96.085,266,860,211, my code returned BF = 96.085,266,860,139. Just a comment.
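The two figures quoted above differ by roughly 7 parts in 10^13, which is about the size of error one might expect if one of the calculations sums log-gamma terms in double precision and then exponentiates. That is only a guess; it has not been checked against the BayesFactor source. The sketch below does not implement equation 7 of J+17. As a stand-in, it evaluates a different ratio of factorials for the same table (the hypergeometric probability of the table given its margins), once exactly and once via `math.lgamma`, simply to show the order of magnitude of the rounding error. The function names are made up for illustration.

```python
import math
from fractions import Fraction

# 2x2 table from the post: rows [50, 150] and [10, 100].
table = [[50, 150], [10, 100]]

cells = [x for row in table for x in row]
row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(cells)


def table_probability_exact():
    """Stand-in quantity (NOT the Bayes factor): hypergeometric probability
    of the table given its margins, computed with exact integer arithmetic."""
    numerator = 1
    for m in row_totals + col_totals:
        numerator *= math.factorial(m)
    denominator = math.factorial(n)
    for x in cells:
        denominator *= math.factorial(x)
    return Fraction(numerator, denominator)


def table_probability_lgamma():
    """Same quantity, accumulated on the log scale in double precision."""
    log_p = sum(math.lgamma(m + 1) for m in row_totals + col_totals)
    log_p -= math.lgamma(n + 1)
    log_p -= sum(math.lgamma(x + 1) for x in cells)
    return math.exp(log_p)


exact = table_probability_exact()
approx = table_probability_lgamma()

# Fraction(approx) converts the float exactly, so the comparison itself
# introduces no further rounding.
rel_diff = abs(Fraction(approx) - exact) / exact

print(f"exact (rounded to double): {float(exact):.15e}")
print(f"via lgamma in double:      {approx:.15e}")
print(f"relative difference:       {float(rel_diff):.3e}")
```

If the relative difference printed here is of a similar order to the gap between the two BF values, rounding in the log-gamma route would be a plausible (though unconfirmed) explanation for the mismatch in the trailing digits.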
—DIV
Comments
Hello DIV,
Ah, this is a while ago. Some quick responses here. I will note that it takes some time to get to the bottom of this -- I recall this being a bit of a puzzle even when we were working on it.
Cheers,
E.J.
Thank you for your feedback, E.J.
—DIV
Further on point 2:
—DIV
OK, it is clear that this needs another look from me. I am not eager to do this, as I recall the Gunel and Dickey paper was not easy to understand (conceptually their approach was clear, but mathematically things weren't completely spelled out).
E.J.