# Bayesian linear regression: predictive vs probable models

Hi EJ. Can I ask about Bayesian linear regression? I am running this for models with 8-10 predictors. When I use the beta-binomial (1,1) prior, the highest BF is for the best predictive model (i.e., highest R2), but the highest posterior probability is for a different model. (When I run it with a uniform prior, this is not the case.) The r scale is 0.5. I have read your most recent preprint and am wondering whether this is because the beta-binomial prior assignment acts as an automatic correction for multiplicity. My next question concerns reporting: do I report both the best predictive model and the highest probability model, or have I done this analysis wrong and my priors are biasing the results? Thank you very much.

## Comments

Dear BrittJane [sorry for the tardy reply -- I've responded elsewhere as well but I'll do this here too for completeness],

The model with the highest R2 is *not* the model that predicts best: it is the model with the best fit. This is always the model with all predictors included. So for model selection, R2 is not very informative. Predictive ability is given by the BFs. The model order is based on posterior probability, so if you are not using the uniform assignment you might find that, because of differences in prior model probability, the best predicting model (the one with the best BF against any other model) is not the one in which you should have most posterior belief. You might want to report your results both under the uniform prior and the beta-binomial prior -- if the outcomes are very different, this suggests you'll have to interpret the result with caution.

Cheers,

E.J.
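
To illustrate the point above about prior model probabilities, here is a minimal sketch. It assumes that the beta-binomial (1,1) prior gives every model size the same total prior mass, shared evenly among the models of that size, and that the uniform prior gives every model the same mass. The Bayes factors and model sizes are made-up numbers for illustration, not results from the analysis discussed in this thread, and the function names are hypothetical.

```python
from math import comb


def prior_uniform(k, p):
    """Prior probability of one specific model with k of p predictors
    when all 2**p models are equally likely."""
    return 1.0 / 2**p


def prior_beta_binomial_11(k, p):
    """Prior probability of one specific model with k of p predictors
    under a beta-binomial(1,1) prior: each model size 0..p gets mass
    1/(p+1), split evenly over the comb(p, k) models of that size."""
    return 1.0 / ((p + 1) * comb(p, k))


def posterior_odds(bf_a, k_a, bf_b, k_b, p, prior):
    """Posterior odds of model A over model B.
    bf_a and bf_b are Bayes factors against the same reference model."""
    return (bf_a * prior(k_a, p)) / (bf_b * prior(k_b, p))


# Hypothetical numbers: 8 candidate predictors.
p = 8
bf_a, k_a = 100.0, 4  # model A: 4 predictors, larger Bayes factor
bf_b, k_b = 40.0, 1   # model B: 1 predictor, smaller Bayes factor

odds_uniform = posterior_odds(bf_a, k_a, bf_b, k_b, p, prior_uniform)
odds_bb = posterior_odds(bf_a, k_a, bf_b, k_b, p, prior_beta_binomial_11)

print(f"Posterior odds A:B, uniform prior:       {odds_uniform:.3f}")
print(f"Posterior odds A:B, beta-binomial prior: {odds_bb:.3f}")
```

With these made-up numbers, model A has the larger Bayes factor under both priors, but only the uniform prior also makes it the more probable model. Under the beta-binomial (1,1) prior there are 70 four-predictor models sharing the mass that only 8 one-predictor models share, so each four-predictor model starts out with less prior probability.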

Thank you EJ, that was very helpful. You have specified elsewhere in your papers that the beta-binomial prior is best to use (e.g., citing the Scott 2010 paper as well), hence I have steered away from uniform priors. However, if the results differ, does that indicate this prior was incorrect, or simply that I should comment on it?

Dear BrittJane,

In general, I think commenting is the more responsible option. This part of statistics is still in flux. At the same time, comparing the results under both priors is a nice robustness check.

Cheers,

E.J.