Multiply BF10 factors?

Michael_Jasper · November 2024

Bayesian correlation of x and y.

Bayesian correlation of z and m.

Given that they don't share any variables, can I multiply their BF10 values?

What if x and z are highly correlated? [this is the complication]

If it has any bearing, this is a confirmatory study of a predefined hypothesis, which predicted that x and z would be highly correlated.

EJ · November 2024

You can multiply the BFs if knowledge of the result for x and y would not alter your knowledge for the correlation between z and m. Now you know that x and z are highly correlated, but I don't think this is relevant for the correlation between z and m. The test-relevant parameter is the correlation coefficient, and knowledge about just one of the variables in a correlation does not provide information on that. So I would be inclined to say that you are allowed to multiply the BFs here. A full modeling effort that takes into account the relation between x and z will only lead to a slightly different result.

What is more relevant is this: suppose you analyze the correlation between x and y, and it turns out to be near 0. Does this knowledge affect your expectations for the correlation between z and m? If it does not, then you can multiply the Bayes factors. If it does, then it would be incorrect to multiply the Bayes factors, because the evidence is not independent.
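The condition described above can be written out as a short derivation. This is only a sketch; the symbols D_xy and D_zm (the data bearing on each correlation) are notation introduced here, not taken from the thread.

```latex
\begin{align*}
\mathrm{BF}_{10}(D_{xy}, D_{zm})
  &= \frac{p(D_{xy}, D_{zm} \mid H_1)}{p(D_{xy}, D_{zm} \mid H_0)} \\
  &= \frac{p(D_{xy} \mid H_1)}{p(D_{xy} \mid H_0)}
     \times
     \frac{p(D_{zm} \mid D_{xy}, H_1)}{p(D_{zm} \mid D_{xy}, H_0)} \\
  &= \mathrm{BF}_{10}(D_{xy}) \times \mathrm{BF}_{10}(D_{zm} \mid D_{xy}).
\end{align*}
```

Plain multiplication is therefore exact only when BF10(D_zm | D_xy) = BF10(D_zm), that is, when learning the x-y result would not change what each hypothesis predicts for the z-m data.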

Cheers,

E.J.

Michael_Jasper · November 2024

Thank you Eric.

If the hypothesis predicts the correlation between x and z, which is observed, is independence then breached, so that one cannot multiply the Bayes factors?

-----

N.B. The hypothesis predicts correlations

between x and y,

between z and m,

and between x and z.

All of these are then observed in the data.

EJ · November 2024

More precisely: if the size of the correlation between x and z under H1 affects the knowledge about the size of the correlation between z and m under H1, then independence is breached.

EJ

Michael_Jasper · November 2024

Eric (and/or anyone else), would you agree that the following is valid?

(N.B., "extreme" or "decisive" is justified by references, the former by your own scale, Eric)

-------------------------

Combining (by multiplication) the Bayes factors for [x vs. y] and [z vs. m] gives a Bayes factor of 1338, which exceeds by more than an order of magnitude a commonly used cut-off value (100) for “extreme” or “decisive” evidence for the alternative hypothesis over the null hypothesis. These two Bayesian correlations have no shared variables. However, all four variables are highly intercorrelated, as the alternative hypothesis predicts (because of hypothesized causal relations), so the two Bayesian correlations are not independent under the alternative hypothesis, which would make it improper to combine their Bayes factors. Such perceived independence, however, depends on the observer and their prior beliefs. Independence holds for a sceptic who does not believe the alternative hypothesis, even if they anticipated the high correlation between x and m, because knowledge of this correlation does not predict any of the other correlations. So the sceptic can combine these Bayes factors, obtaining a combined Bayes factor exceeding 1000, “decisive” evidence, which may dampen their scepticism. Whether it does or not, they can then use this Bayes factor exceeding 1000 as prior odds in the Bayesian updating formula, in which case a subsequent Bayes factor, based on additional data, would need to be extremely small to meaningfully reduce the posterior odds. This is because posterior odds are obtained by multiplying the prior odds by the new Bayes factor, which makes strong prior evidence difficult to overturn unless the contradictory evidence is exceptionally compelling.

--------------------------------
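The updating step described in the draft text (posterior odds = prior odds × Bayes factor) can be sketched in a few lines of Python. The function names and the new Bayes factor of 0.1 are hypothetical, chosen only for illustration. Treating the combined Bayes factor of 1338 as prior odds also assumes the odds were 1:1 before the first data came in.

```python
def posterior_odds(prior_odds: float, bayes_factor: float) -> float:
    """Bayesian updating on the odds scale: posterior = prior * BF."""
    return prior_odds * bayes_factor


def odds_to_probability(odds: float) -> float:
    """Convert odds for H1 over H0 into a probability for H1."""
    return odds / (1.0 + odds)


# Combined BF10 quoted in the post, used as prior odds
# (assumption: 1:1 odds before any data were seen).
prior_odds = 1338.0

# Hypothetical new Bayes factor favouring the null by a factor of 10.
new_bf = 0.1
updated_odds = posterior_odds(prior_odds, new_bf)

# Bayes factor that new data would need to bring the odds back to 1:1.
bf_needed_for_even_odds = 1.0 / prior_odds

print(f"updated odds: {updated_odds:.1f}")
print(f"updated P(H1): {odds_to_probability(updated_odds):.4f}")
print(f"BF10 needed for even odds: {bf_needed_for_even_odds:.6f}")
```

Under these assumptions, even a new Bayes factor of 0.1 leaves the odds well above 100 in favour of the alternative.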

andersony3k · November 2024

Greetings Michael,

Questions of statistical independence aside, I'm not convinced that this sort of Bayes factor multiplication is capable of addressing your research question. Broadly, the question concerns a conjunction of statistical hypotheses, e.g., "Is it true that both the XY population correlation != 0 AND the MZ population correlation != 0?"

The difficulty is that the product of two BF values creates a 'compensatory' situation: for example, a very strong BF[10] for the XY correlation (which alone is consistent with the alternative hypothesis, XY population correlation != 0) can offset a BF[10] for the MZ correlation that is substantially below 1 (which alone is consistent with the null hypothesis).

I've illustrated this in the attached image. I would interpret the result to be INconsistent with your predictions. That is, the result shows that BF[10] XY is very strongly consistent with the statistical alternative hypothesis WHEREAS BF[10] MZ is substantially consistent with the null.

This contrasts with what you would appear to want to conclude, which is that 52003.7278 * 0.1522 ≈ 7919, thus constituting very strong consistency with the conjunctive hypothesis that both of the population correlations (XY and MZ) are non-zero.
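A quick way to spot this compensatory pattern is to look at the Bayes factors on a log scale, where multiplication becomes addition and the sign shows which hypothesis each factor favours. The sketch below uses the two values quoted above; the variable names are illustrative only.

```python
import math

# Values quoted in the post above.
bf_xy = 52003.7278   # BF10 for the XY correlation
bf_mz = 0.1522       # BF10 for the MZ correlation

product = bf_xy * bf_mz  # about 7915 from these rounded inputs

# On the log10 scale a positive value favours H1, a negative value favours H0.
log10_xy = math.log10(bf_xy)
log10_mz = math.log10(bf_mz)

# True only if both factors point in the same direction.
same_direction = (log10_xy > 0) == (log10_mz > 0)

print(f"product: {product:.1f}")
print(f"log10 BF10 XY: {log10_xy:+.2f}, log10 BF10 MZ: {log10_mz:+.2f}")
print(f"both favour the same hypothesis: {same_direction}")
```

Here the two log Bayes factors have opposite signs, so the large product hides the fact that the MZ data favour the null.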

EJ · November 2024

Good point! It may be argued that there are at least four hypotheses:

XY>0 and MZ>0
XY>0 and MZ=0
XY=0 and MZ>0
XY=0 and MZ=0
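A minimal sketch of how the four hypotheses could be compared, under several assumptions that are not stated in the thread: the two Bayes factors are one-sided (positive correlation vs. zero), the two data sets are conditionally independent given each hypothesis, and the four hypotheses get equal prior probability. The function name `posterior_over_four` is hypothetical; the input values are the ones quoted in the post above.

```python
def posterior_over_four(bf_xy, bf_mz, priors=None):
    """Posterior probabilities of the four joint hypotheses.

    Marginal likelihoods are expressed relative to 'XY=0 and MZ=0',
    assuming the XY and MZ data are conditionally independent.
    """
    relative = {
        "XY>0 and MZ>0": bf_xy * bf_mz,
        "XY>0 and MZ=0": bf_xy,
        "XY=0 and MZ>0": bf_mz,
        "XY=0 and MZ=0": 1.0,
    }
    if priors is None:
        # Assumption: equal prior probability for each hypothesis.
        priors = {h: 0.25 for h in relative}
    weighted = {h: relative[h] * priors[h] for h in relative}
    total = sum(weighted.values())
    return {h: w / total for h, w in weighted.items()}


post = posterior_over_four(bf_xy=52003.7278, bf_mz=0.1522)
best = max(post, key=post.get)

for hypothesis, prob in post.items():
    print(f"{hypothesis}: {prob:.4f}")
print("most probable:", best)
```

With these inputs the most probable hypothesis is "XY>0 and MZ=0", not the conjunction "XY>0 and MZ>0", even though the product of the two Bayes factors is large.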

Michael_Jasper · November 2024

On the compensatory issue raised: is it still an issue if both Bayes factors (one for each Bayesian correlation) are over 10, and so both in the realm of "very strong", and I am explicit about their values rather than reporting only their combined (multiplied) Bayes factor? One of the Bayes factors is around 12 and the other is over 100; multiplying them is how I get a combined Bayes factor of over 1000.

What do you think? In that case, is it OK to use the text from my earlier post in this thread?

andersony3k · November 2024

@Michael_Jasper As I tried to state above, though I've heard knowledgeable people argue otherwise, I believe Bayes factor multiplication is for situations in which all of the following apply:

(i) you have one set of hypotheses (e.g. R[xy] > 0 and R[xy] = 0)

(ii) you have a data set, a, in which the data pattern has mathematically defined likelihoods of occurring, given each of the hypotheses, independently:

BF10a = Likelihood[R[xy]a > 0] / Likelihood[R[xy]a = 0]

(iii) you have an additional data set, b, pertaining to the SAME hypotheses, which therefore gives rise to an additional Bayes factor pertaining to the SAME hypotheses:

BF10b = Likelihood[R[xy]b > 0] / Likelihood[R[xy]b = 0]

Now you can combine the evidence derived from the two data sets by multiplying BF10a and BF10b.
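As a hedged illustration of conditions (i) to (iii), here is a toy case where the multiplication is exact: two independent binomial data sets and two point hypotheses about the same success probability. The numbers, the point values 0.7 and 0.5, and the function names are all hypothetical and are not from the thread. With a composite alternative (a prior spread over many values, as in a correlation test), the second Bayes factor would strictly need to be computed using the prior updated by the first data set.

```python
from math import comb, isclose


def binom_likelihood(k: int, n: int, theta: float) -> float:
    """Probability of k successes in n trials given success probability theta."""
    return comb(n, k) * theta**k * (1 - theta) ** (n - k)


def bf10(k: int, n: int, theta1: float = 0.7, theta0: float = 0.5) -> float:
    """Bayes factor for the point hypothesis theta1 over the point hypothesis theta0."""
    return binom_likelihood(k, n, theta1) / binom_likelihood(k, n, theta0)


# Two hypothetical, independent data sets bearing on the SAME pair of hypotheses.
bf_a = bf10(k=14, n=20)   # data set a
bf_b = bf10(k=9, n=15)    # data set b

# Analysing the pooled data in one go gives the same evidence.
bf_pooled = bf10(k=14 + 9, n=20 + 15)

print(f"BF10a = {bf_a:.3f}, BF10b = {bf_b:.3f}")
print(f"BF10a * BF10b = {bf_a * bf_b:.3f}")
print(f"BF10 pooled   = {bf_pooled:.3f}")
```

The product of the two separate Bayes factors matches the pooled Bayes factor because both concern the same hypotheses and the data sets are independent.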

However, what you may NOT do is mix apples with oranges. You may NOT multiply BF10a (or BF10b) with a new Bayes factor computed as "Likelihood[R[mz] > 0] / Likelihood[R[mz] = 0]", as "Likelihood[Mean[v] > 0] / Likelihood[Mean[v] = 0]", or anything other than "Likelihood[R[xy]b > 0] / Likelihood[R[xy]b = 0]."

Why is there confusion about this? I believe it stems from a widespread looseness in use of the term "hypothesis." Bayesian calculations pertain to statistical hypotheses only. They don't pertain--at least not directly--to theoretical hypotheses or empirical prediction-sets that consist of multiple statistical hypotheses.

Michael_Jasper · November 2024

@andersony3k I think I understand your point, which collapses to

"Bayesian calculations pertain to statistical hypotheses only. They don't pertain....to theoretical hypotheses".

This restriction is new to me; is it well known? I've looked around and can't find any mention of this restriction. Can you (and/or anyone else) please point me to anything? Keen to learn. Thank you.

andersony3k · December 2024

I don't know if it's well-known in this particular linguistic expression. But the bottom line is that you need:

A data pattern, some mutually exclusive hypotheses, and for each hypothesis, a quantification of the likelihood that the observed data pattern would occur if the hypothesis were true.

You've indicated that you think your hypotheses are "Q is true" and "Q is not true."

Suppose you've already calculated the sample r.xy = .6 and the sample r.mz = .5. To begin your Bayesian analysis, you would need to compute the following four numbers:

# S: If Q were true, the likelihood that the sample correlation r.xy is .6.

# T: If Q were not true, the likelihood that the sample correlation r.xy is .6.

# U: If Q were true, the likelihood that the sample correlation r.mz is .5.

# V: If Q were not true, the likelihood that the sample correlation r.mz is .5.

The problem is, you have no way to know S, T, U, or V. So you would either have to be provided with those values, or re-conceptualize what your hypotheses are.

One way to re-conceptualize is as follows:

# Analysis 1: the hypotheses are the population correlation R.xy > 0, R.xy = 0. The data are r.xy = .6.

# Analysis 2: the hypotheses are the population correlation R.mz > 0, R.mz = 0. The data are r.mz = .5.

These are the kinds of hypotheses--statistical hypotheses--that JASP deals with. And ultimately there's no multiplication of Bayes factors because mathematically, the hypotheses in Analysis 1 have nothing to do with the hypotheses in Analysis 2.
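To make "a quantification of the likelihood under each hypothesis" concrete, here is a rough sketch for Analysis 1 and Analysis 2. It is not JASP's computation: it uses the Fisher z approximation to the sampling distribution of r and a uniform prior on the population correlation between 0 and 1, both assumptions made here for simplicity. The sample size n = 30 is hypothetical, since the thread does not give one, and the function names are illustrative.

```python
import math


def fisher_z_density(r: float, rho: float, n: int) -> float:
    """Approximate density of atanh(r) given population correlation rho.

    Fisher z approximation: atanh(r) ~ Normal(atanh(rho), 1 / sqrt(n - 3)).
    """
    se = 1.0 / math.sqrt(n - 3)
    z = (math.atanh(r) - math.atanh(rho)) / se
    return math.exp(-0.5 * z * z) / (se * math.sqrt(2.0 * math.pi))


def approx_bf_plus0(r: float, n: int, grid: int = 10_000) -> float:
    """Approximate Bayes factor for R > 0 versus R = 0.

    Assumption: uniform prior on rho over (0, 1) under the alternative,
    integrated numerically with the midpoint rule.
    """
    step = 1.0 / grid
    marginal_h1 = step * sum(
        fisher_z_density(r, (i + 0.5) * step, n) for i in range(grid)
    )
    likelihood_h0 = fisher_z_density(r, 0.0, n)
    return marginal_h1 / likelihood_h0


n = 30  # hypothetical sample size; not given in the thread
bf_analysis_1 = approx_bf_plus0(r=0.6, n=n)  # Analysis 1: r.xy = .6
bf_analysis_2 = approx_bf_plus0(r=0.5, n=n)  # Analysis 2: r.mz = .5

print(f"Analysis 1, approximate BF+0: {bf_analysis_1:.1f}")
print(f"Analysis 2, approximate BF+0: {bf_analysis_2:.1f}")
```

Each analysis yields its own Bayes factor about its own pair of statistical hypotheses. The values depend strongly on the assumed n and prior, so they should not be read as what JASP would report.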

EJ · December 2024

Well, I would still like to argue in favor of the 4 hypotheses I outlined above. And these hypotheses might just be associated with particular people/forecasters:

John believes that XY>0 and MZ>0
Mary believes that XY>0 and MZ=0
Amy believes that XY=0 and MZ>0
Jim believes that XY=0 and MZ=0

I don't see why we could not compute relative predictive performance for these four forecasters. They might for instance be weatherpersons, and the correlations might be some aspect of future weather. Or the four people might be the only four suspects in a murder trial:

If John is guilty, XY>0 and MZ>0
If Mary is guilty, XY>0 and MZ=0
If Amy is guilty, XY=0 and MZ>0
If Jim is guilty, XY=0 and MZ=0,

where the two correlations can be considered conditionally independent pieces of evidence.

So my main problem with the multiplication is that it compares John (both correlations non-zero) to Jim (both correlations zero). And I would also say that the main results are the two individual correlations.

EJ

andersony3k · December 2024

@EJ I'm open to being persuaded. For me, I think the categories "John," "Mary," "Amy," "Jim," serve as an organizational aid, but for mathematical clarity, I prefer your originally-worded, four hypotheses.

XY>0 and MZ>0
XY>0 and MZ=0
XY=0 and MZ>0
XY=0 and MZ=0

However, there seems to be lots of hidden, underlying complexity. Suppose the data yield a sample XY correlation of positive .995 and a sample MZ correlation of negative .995.

Whereas those two sample correlation coefficients (.995 and -.995) might be more consistent with some of the four hypotheses than with others of those four, the data are REALLY REALLY consistent with a fifth hypothesis which is XY>0 and MZ<0. So it's clear that those four hypotheses don't cover the entire scope of possible outcomes. What's more, intuitively, there's a real question in my mind as to whether the end result of an empirical assessment of the likelihoods related to all five of these hypotheses (actually there are more than that) amounts to anything different than simply conducting two tests in JASP:

Test i:

XY != 0, versus XY = 0

Test ii:

MZ != 0, versus MZ = 0

EJ · December 2024

Yes, you can conduct those two tests, but when you want to combine them then it would be prudent to add those other hypotheses. Of course you can add the negative correlations as well, expanding the hypothesis space. I had hoped you would have some substantive knowledge to prune the hypothesis tree.

EJ

andersony3k · December 2024

To simplify,

Given the hypotheses:

H1: XY>0 and MZ>0

H2: XY>0 and MZ=0

H3: XY=0 and MZ>0

H4: XY=0 and MZ=0,

the data may be more consistent with H1 than with any of the others, but that wouldn't imply that the data are also more consistent with H1B: MZ > 0 than with H4B: MZ = 0. This is because H1 and H1B are two different hypotheses (as are H4 and H4B). The reason why may become clearer if one takes the additional step of considering what hypothesis, beyond those four, is most consistent with the data, but one need not take that additional step.

Michael_Jasper · December 2024

@EJ Re your statement: "So my main problem with the multiplication is that it compares John (both correlations non-zero) to Jim (both correlations zero). And I would also say that the main result are the two individual correlations."

I think I understand what you are saying, and reporting the two individual correlations would be better than reporting only the combined value. BUT what about reporting the two individual Bayesian correlations, noting that both have BF10 > 10 (and reporting their explicit values in the write-up), and then multiplying them and reporting the combined value as well? So not either/or, but reporting all three BF10s. Would this be fair to your mind? In this way you have accounted for all the options, and shown this, by going stepwise and reporting all the steps.

EJ · December 2024

@Michael_Jasper: yes, you can multiply, but it needs to be clear which hypotheses you are comparing.

@andersony3k: Consider four factories, each of which creates four products: X, Y, M, Z. The factories create these products in different ways. In Factories 1 and 2, products X and Y are generated in part by the same machine (leading to a correlation between values for some quality characteristic, say their weight). Similarly, in Factories 1 and 3, products M and Z are generated in part by the same machine. So we have

Factory 1: XY>0 and MZ>0

Factory 2: XY>0 and MZ=0

Factory 3: XY=0 and MZ>0

Factory 4: XY=0 and MZ=0.

You are now given a sample of products and you are asked to determine which factory produced them. You would presumably multiply BFs in this case?

EJ

andersony3k · December 2024

@EJ. I think, yes, you would. But my point is this: suppose the result of multiplication shows that the data are most consistent with Factory 1. One would need to exercise caution and refrain from drawing the further inference that the data are (i) more consistent with XY>0 than with XY=0, or (ii) more consistent with MZ>0 than with MZ=0.
