Bayes factor for the comparison of independent correlations
Warning: long text.
Hello everyone,
I am not sure whether this is the right place to post my issue because it is not
directly related to JASP or BayesFactor. Rather, it is concerned with the
construction of Bayes factors in general. I created a Bayes factor for the comparison
of two independent correlation coefficients, which -- as far as I know -- is not
possible using either program. I would like to know whether my approach is in any way
reasonable, and I thought I might ask here, because I do not know anyone personally
who knows Bayes statistics.
In my approach I make use of the sampling distribution of effect size measure Cohen's
q, which is the difference between two Fisher transformed correlation coefficients
(see here). The sampling variance of q is known when the null
hypothesis is true as 1/(N1-3)+1/(N2-3) (here N1 and N2 are the sizes of both
independent samples in which the correlation coefficients were estimated). I verified
this in a simulation, see my code in
this github gist. Thus, I construct the likelihood curve for the point-null
hypothesis as a normal distribution with mean = 0 and variance =
1/(N1-3)+1/(N2-3). This R-Code illustrates how I generate the null hypothesis:
# code for the construction of the null hypothesis likelihood
n1 <- 1500
n2 <- 1500
var_q <- 1/(n1-3)+1/(n2-3)
null_pred <- function(x) dnorm(x, 0, sqrt(var_q)) # predictive density for the null
curve(null_pred, from=-0.5, to=0.5) # plot predicted density for null
I am using a subjective construction of the alternative likelihood. My prior belief
is that large effects (q > |.5|) are unlikely, they might only occur in less than 5%
of all cases. Additionally I think that effects are equally likely to occur in both
directions (r1 > r2 or r2 > r1). Therefore I assume that under the alternative
hypothesis, effects sizes q are distributed normally with mean = 0 and standard
deviation = 0.255 (in this case effects greater than |0.5| occur in ~ 5% of all
cases; see: pnorm(0.5, 0, 0.255)
).
Next I wish to turn this prior into a likelihood that describes how likely each value
of q is under the alternative hypothesis. I employ an approach -- hopefully correctly
-- that Jeff Rouder decribes
here. Effectively, I assume that for each true value of q, the observed values
are distributed normally with a variance of 1/(N1-3)+1/(N2-3). Using integration, the
total predictive density under the alternative hypothesis is obtained by weighing
each q with its prior probability. How this actually works is a point where my
understanding still has some shortcomings; using this logic however I created the
following R-Code to construct the predictive density of the alternative hypothesis:
# two-sided alternative prior
alt <- function(x) dnorm(x, 0, 0.255)
alt_inte <- function(x, effect) {
dnorm(effect, x, sqrt(var_q))*alt(x)
}
# predictive density for the alternative hypothesis:
alt_pred <- function(effect) {
integrate(alt_inte, lower=-Inf, upper=Inf, effect)$value
}
The bayes factor is then obtained by computing the ratio of both likelihoods for an
observed value of q.
eff_size <- 0.1 # small effect size q
BF <- alt_pred(eff_size)/null_pred(eff_size)
The Bayes factors that I get using this approach are somewhat reasonable when I
consider the statistical power for a significance test of a respective effect
size. For example, if I observe a small effect of q = 0.1 and 500 participants were
in both samples, the Bayes factor is very uninformative (BF = .77). Correspondingly,
the power to detect a true small effect of q = 0.1 in a significance test is very low
(~ 35%). If there are 1500 participants in each sample (which roughly corresponds to
a 80% power for the significance test), the Bayes factor for an observed q of 0.1
raises to 5.55. For an effect size of q = 0.2, the Bayes factor is 333,002 in this case.
I would be very thankful if someone could comment on my approach and point out
possible errors or misunderstandings. If there is something seriously wrong with this
approach, can anyone recommend some accessible literatur on constructing Bayes factors? I
am rather new to Bayesian statistics.
Best regards,
Martin
PS: This code can be used to plot the likelihood curves of null and alternative
hypothesis for illustrative purposes:
# plot predictions of null and alternative hypothesis
# create some predictions to plot the alternative hypothesis
effects <- seq(-1, 1, by=0.01)
alt_predicted <- vector(length=length(effects))
for (i in 1:length(effects)) alt_predicted[i] <- alt_pred(effects[i])
# plot hypotheses
lim <- 0.8
eff_size <- 0.1
dev.new(width=9)
curve(null_pred, from=-lim, to=lim, xlab="q", las=1, ylab="density", col="green", lwd=2)
lines(effects, alt_predicted, col="brown", lwd=2)
abline(v=eff_size, lty=2, col="red", lwd=2)
legend(0.2, 4.5, legend=c("Null hypothesis", "alternative hypothesis",
"observed effect"),
cex=1.2, col=c("green", "brown", "red"), lwd=2,
lty=c(1,1,2))
Comments
Hi Papi,
This is an interesting approach. I think Jeffreys (1961) also discusses the comparison between two correlations, but I might misremember. Your approach assumes that "q" captures the relevant information in the data -- to the extent that it doesn't, the result is approximate. I don't have an issue with the general approach though. My only worries are (1) "I assume that for each true value of q, the observed values
are distributed normally with a variance of 1/(N1-3)+1/(N2-3)." Are they? (2) your prior strikes me as relatively modest, so that it is difficult to find good evidence in favor of the null.
Cheers,
E.J.
Hello E.J.,
thanks a lot for your response, which is really helpful. Regarding your worries:
I worried about this as well, because I do not know this for sure in general. Using a simulation approach however, I created differently correlated data and I observed this variance for a variety of effect sizes; cannot prove that it holds in general, though.
Could you elaborate on what you mean by modest? Do you think the distribution should have longer tails to find good evidence for the null? That is probably true, I just thought that large effect sizes for the difference of correlations -- at least in my field -- are very rare and for this reason I thought this might be an appropriate prior. But I was primarily interested in whether the approach in general is acceptable, thanks for your judgement on that.
Cheers,
Martin
Interesting stuff! (and yes I did mean to hint at the fatter tails).
Cheers,
E.J.