Bayes Factor versus P (M1|data) / P (M2|data) ?
Please help me interpret. I am new to JASP and Bayesian stats, so please point out my mistakes and enlighten me.
Richard Morey writes on the BayesFactor blog: "The Bayes factor is the relative predictive success between two hypotheses: it is the ratio of the probabilities of the observed data under each of the hypotheses. If the probability of the observed data is higher under one hypothesis than another, then that hypothesis is preferred."
My question: "If the probability of the observed data is higher under one hypothesis than another, then that hypothesis is preferred" ???
Why ? The Bayes Factor is the evidence in the data. But if my prior distribution was very informed, I may still prefer that distribution, even after the data. It is only after repeated new data that my prior will be "swamped", right?
In other words, what is more useful:
a) the Bayes Factor (the relative probability of the data under two competing models); or
b) the relative probability of the models, given the data? (which is the Bayes Factor * the relative probability of the models, prior to the data)
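To make option (b) concrete, here is the relation written out as a sketch. M1 and M2 are the two models from the title, D stands for the observed data, and the numbers in the example are made up purely for illustration:

```latex
\underbrace{\frac{P(M_1 \mid D)}{P(M_2 \mid D)}}_{\text{posterior odds}}
  = \underbrace{\frac{P(D \mid M_1)}{P(D \mid M_2)}}_{\text{Bayes factor}}
  \times
  \underbrace{\frac{P(M_1)}{P(M_2)}}_{\text{prior odds}}

% Illustrative numbers only: prior odds of 1/10 against M_1 and a Bayes factor of 5
\frac{P(M_1 \mid D)}{P(M_2 \mid D)} = 5 \times \frac{1}{10} = \frac{1}{2}
```

With these illustrative numbers the data favour M1 by a factor of 5, yet M2 remains twice as probable as M1 after seeing the data, because the prior odds were strongly in favour of M2.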
In yet other words, what good are Bayes Factors?
I know I am making mistakes here, so please correct my thinking. Thanks!
Comments
Hi Pieter,
I can't speak for Richard but I'll do it anyway :-) What I think Richard meant by "preferred" is "receives the most support from the observed data" or perhaps he meant "preferred if the prior odds are equal".
Of course the overall belief in a model depends also on its prior plausibility, as you point out. What the BF gives you is the change in belief, aka the predictive updating factor. This is often useful because although researchers may differ a lot with respect to their prior enthusiasm about a hypothesis, they may agree on the extent to which the data change that enthusiasm. Everybody can take the BF and use it to adjust their own prior opinion.
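A minimal sketch of that last point, assuming a hypothetical Bayes factor of 6 in favour of H1 and three made-up researchers with different prior odds. The function names and all numbers are illustrative assumptions, not output from JASP:

```python
def posterior_odds(prior_odds: float, bayes_factor: float) -> float:
    """Update prior odds (H1 versus H0) with a Bayes factor (H1 versus H0)."""
    return prior_odds * bayes_factor


def odds_to_probability(odds: float) -> float:
    """Convert odds for H1 into the probability of H1 (only H1 and H0 considered)."""
    return odds / (1.0 + odds)


# Hypothetical Bayes factor in favour of H1; everybody shares this number.
BF_10 = 6.0

# Hypothetical prior odds for H1 versus H0; these differ per researcher.
PRIOR_ODDS = {"skeptic": 0.1, "neutral": 1.0, "enthusiast": 5.0}

if __name__ == "__main__":
    for name, prior in PRIOR_ODDS.items():
        post = posterior_odds(prior, BF_10)
        print(
            f"{name:>10}: prior odds {prior:.2f} -> posterior odds {post:.2f} "
            f"(P(H1 | data) = {odds_to_probability(post):.3f})"
        )
```

All three researchers multiply their own prior odds by the same factor of 6. In this made-up case the skeptic still ends up with posterior odds below 1 and so still leans towards H0, even though the data support H1.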
Cheers,
E.J.
Wow, thanks for the fast response! I am learning (basic) Bayesian statistics, by reading a bunch of papers (as collected on http://alexanderetz.com/understanding-bayes/ , but also others, including some by you). And by playing with JASP.
My goal is to teach students Bayesian statistics first and frequentist only after that (if at all). They study marketing, so it will be mostly market research. In this field, p-values are still king and misinterpretations galore. Everybody interprets confidence intervals as "I can be 95% sure that the true mean is between x and y". And of course low p-values "prove" that hypotheses are true.
Bayesian statistics would (in theory) fit marketing like a glove. Credible intervals go well with how students (and staff!) interpret intervals. Using prior knowledge is only logical. It fits how market research is usually done: find usable secondary data, and then (if needed) collect new data yourself. In the classical approach, it is hard to really integrate old and new data, so what you see (if you see it at all) is: "Hey! I find X, whereas existing market data says Y. This is remarkable..." sometimes followed by some ad hoc explanation.
I hope my students will welcome JASP. They do not like SPSS. JASP seems much more "user friendly". FYI, we are talking about students who mostly are not very interested in research and who are new to statistical packages. Using R or Python would be out of the question.
My plan is to start with Bayesian statistics, using JASP. I would prefer not to do frequentist analysis at all, but on the other hand, it is still the dominant method, so students should understand some key concepts. The problem is that frequentist statistics is conceptually difficult and prone to misinterpretation. Plus, "it answers the wrong question", because in my field no one is interested in how likely "data of this sort or more extreme" are under some hypothesis; everybody wants to know: "what do these data tell us?" Sorry for the rambling.
Well I strongly agree, of course :-)
We are working on a manual, videos, and a course book; I hope these may be useful to you when they come out (might take a while still).
E.J.
I'd be VERY interested in a course book
Sorry to necrobump, but still relevant. I am teaching some introductory statistics classes. In total we are talking 40 students, subject is basic market research. Some (totally subjective) impressions:
Simple things like clicking hypotheses (in frequentist analyses) (H1>H0; H1 != H0; H1<H0) are already helpful. The menus are logical. They like that analyses are done "immediately" and that there is no separate output file. Less confusion on their side. The output given (tables, plots) is simple but useful.
Sorry for the rambling, thanks for the software, the ideas and the support.
Thanks Pieter. Richard wrote a nice paper on confidence intervals (now that you mention them :-)):
https://learnbayes.org/papers/confidenceIntervalsFallacy/
Cheers,
E.J.
Thank you E.J. I know that paper. :-)