# Bayes factor for the comparison of independent correlations

Warning: long text.

Hello everyone,

I am not sure whether this is the right place to post my issue because it is not

directly related to JASP or BayesFactor. Rather, it is concerned with the

construction of Bayes factors in general. I created a Bayes factor for the comparison

of two independent correlation coefficients, which -- as far as I know -- is not

possible using either program. I would like to know whether my approach is in any way

reasonable, and I thought I might ask here, because I do not know anyone personally

who knows Bayes statistics.

In my approach I make use of the sampling distribution of effect size measure Cohen's

q, which is the difference between two Fisher transformed correlation coefficients

(see here). The sampling variance of q is known when the null

hypothesis is true as 1/(N1-3)+1/(N2-3) (here N1 and N2 are the sizes of both

independent samples in which the correlation coefficients were estimated). I verified

this in a simulation, see my code in

this github gist. Thus, I construct the likelihood curve for the point-null

hypothesis as a normal distribution with mean = 0 and variance =

1/(N1-3)+1/(N2-3). This R-Code illustrates how I generate the null hypothesis:

```
# code for the construction of the null hypothesis likelihood
n1 <- 1500
n2 <- 1500
var_q <- 1/(n1-3)+1/(n2-3)
null_pred <- function(x) dnorm(x, 0, sqrt(var_q)) # predictive density for the null
curve(null_pred, from=-0.5, to=0.5) # plot predicted density for null
```

I am using a subjective construction of the alternative likelihood. My prior belief

is that large effects (q > |.5|) are unlikely, they might only occur in less than 5%

of all cases. Additionally I think that effects are equally likely to occur in both

directions (r1 > r2 or r2 > r1). Therefore I assume that under the alternative

hypothesis, effects sizes q are distributed normally with mean = 0 and standard

deviation = 0.255 (in this case effects greater than |0.5| occur in ~ 5% of all

cases; see: `pnorm(0.5, 0, 0.255)`

).

Next I wish to turn this prior into a likelihood that describes how likely each value

of q is under the alternative hypothesis. I employ an approach -- hopefully correctly

-- that Jeff Rouder decribes

here. Effectively, I assume that for each true value of q, the observed values

are distributed normally with a variance of 1/(N1-3)+1/(N2-3). Using integration, the

total predictive density under the alternative hypothesis is obtained by weighing

each q with its prior probability. How this actually works is a point where my

understanding still has some shortcomings; using this logic however I created the

following R-Code to construct the predictive density of the alternative hypothesis:

```
# two-sided alternative prior
alt <- function(x) dnorm(x, 0, 0.255)
alt_inte <- function(x, effect) {
dnorm(effect, x, sqrt(var_q))*alt(x)
}
# predictive density for the alternative hypothesis:
alt_pred <- function(effect) {
integrate(alt_inte, lower=-Inf, upper=Inf, effect)$value
}
```

The bayes factor is then obtained by computing the ratio of both likelihoods for an

observed value of q.

```
eff_size <- 0.1 # small effect size q
BF <- alt_pred(eff_size)/null_pred(eff_size)
```

The Bayes factors that I get using this approach are somewhat reasonable when I

consider the statistical power for a significance test of a respective effect

size. For example, if I observe a small effect of q = 0.1 and 500 participants were

in both samples, the Bayes factor is very uninformative (BF = .77). Correspondingly,

the power to detect a true small effect of q = 0.1 in a significance test is very low

(~ 35%). If there are 1500 participants in each sample (which roughly corresponds to

a 80% power for the significance test), the Bayes factor for an observed q of 0.1

raises to 5.55. For an effect size of q = 0.2, the Bayes factor is 333,002 in this case.

I would be very thankful if someone could comment on my approach and point out

possible errors or misunderstandings. If there is something seriously wrong with this

approach, can anyone recommend some accessible literatur on constructing Bayes factors? I

am rather new to Bayesian statistics.

Best regards,

Martin

PS: This code can be used to plot the likelihood curves of null and alternative

hypothesis for illustrative purposes:

```
# plot predictions of null and alternative hypothesis
# create some predictions to plot the alternative hypothesis
effects <- seq(-1, 1, by=0.01)
alt_predicted <- vector(length=length(effects))
for (i in 1:length(effects)) alt_predicted[i] <- alt_pred(effects[i])
# plot hypotheses
lim <- 0.8
eff_size <- 0.1
dev.new(width=9)
curve(null_pred, from=-lim, to=lim, xlab="q", las=1, ylab="density", col="green", lwd=2)
lines(effects, alt_predicted, col="brown", lwd=2)
abline(v=eff_size, lty=2, col="red", lwd=2)
legend(0.2, 4.5, legend=c("Null hypothesis", "alternative hypothesis",
"observed effect"),
cex=1.2, col=c("green", "brown", "red"), lwd=2,
lty=c(1,1,2))
```

## Comments

Hi Papi,

This is an interesting approach. I think Jeffreys (1961) also discusses the comparison between two correlations, but I might misremember. Your approach assumes that "q" captures the relevant information in the data -- to the extent that it doesn't, the result is approximate. I don't have an issue with the general approach though. My only worries are (1) "I assume that for each true value of q, the observed values

are distributed normally with a variance of 1/(N1-3)+1/(N2-3)." Are they? (2) your prior strikes me as relatively modest, so that it is difficult to find good evidence in favor of the null.

Cheers,

E.J.

Hello E.J.,

thanks a lot for your response, which is really helpful. Regarding your worries:

I worried about this as well, because I do not know this for sure in general. Using a simulation approach however, I created differently correlated data and I observed this variance for a variety of effect sizes; cannot prove that it holds in general, though.

Could you elaborate on what you mean by modest? Do you think the distribution should have longer tails to find good evidence for the null? That is probably true, I just thought that large effect sizes for the difference of correlations -- at least in my field -- are very rare and for this reason I thought this might be an appropriate prior. But I was primarily interested in whether the approach in general is acceptable, thanks for your judgement on that.

Cheers,

Martin

Interesting stuff! (and yes I did mean to hint at the fatter tails).

Cheers,

E.J.