Change removal p value of stepwise logistic regression in JASP?

Hi All,

When I use stepwise logistic regression in JASP, I notice that the last fitted model (i.e., when I have 8 models, the 8th model) is not always the best one: the p value of the whole model is higher than 0.05, and the p values of several predictors are higher than 0.05. This confuses me a bit, because in stepwise linear regression the last model is always the best one, with both the p value of the whole model and the p value of each predictor lower than 0.05. Also, I don't see a "Method Specification" option, as in linear regression, that would let me change the removal p value. So, my questions are:

  1. Would it be possible to provide a "Method Specification" option for stepwise logistic regression, as in stepwise linear regression, so that we can require the p value of each predictor to be lower than 0.05 and make the feature selection automatic?
  2. Is the logistic regression module in JASP not fully automatic at the moment? Do we have to check the p value of each model and each predictor manually, remove the predictors with non-significant p values, and gradually decide which model to choose?


Thank you very much,

Best regards,

Jian

Comments

  • edited October 2020

    Hi Jian,

    Sorry for the delayed response! Here are answers to your two questions, followed by a more general remark:

    1. We have not provided a "Method Specification" option, but if you do want it you should submit an issue on our GitHub. Then the programmers will see it and may (might!) implement your request.
    2. The method used in logistic regression is AIC selection, which is based on the overall likelihood of the model with an additional penalty for the number of parameters; it does not look at p-values specifically. So it is "automatic" in that it chooses the model with the best fit according to the AIC (a rough sketch of this kind of selection follows below).
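    To make the AIC idea a bit more concrete, here is a minimal Python sketch of what forward selection by AIC looks like for a logistic regression. This is only an illustration of the principle, not JASP's internal implementation; the data, the column names, and the helper forward_select_aic are made up for the example.

    ```python
    # Minimal sketch of forward selection by AIC for logistic regression.
    # Illustrative only -- not JASP's implementation; data and names are hypothetical.
    import numpy as np
    import pandas as pd
    import statsmodels.api as sm

    def forward_select_aic(X: pd.DataFrame, y: pd.Series) -> list:
        """Greedily add the predictor that lowers the AIC most; stop when nothing helps."""
        selected, remaining = [], list(X.columns)
        # Start from the intercept-only model
        best_aic = sm.Logit(y, np.ones((len(y), 1))).fit(disp=0).aic
        while remaining:
            # AIC of the current model plus each candidate predictor
            aics = {col: sm.Logit(y, sm.add_constant(X[selected + [col]])).fit(disp=0).aic
                    for col in remaining}
            candidate = min(aics, key=aics.get)
            if aics[candidate] >= best_aic:   # no candidate improves the AIC -> stop
                break
            best_aic = aics[candidate]
            selected.append(candidate)
            remaining.remove(candidate)
        return selected

    # Hypothetical usage with simulated data: only x0 and x2 truly matter
    rng = np.random.default_rng(1)
    X = pd.DataFrame(rng.normal(size=(300, 5)), columns=[f"x{i}" for i in range(5)])
    p = 1 / (1 + np.exp(-(0.8 * X["x0"] - 0.6 * X["x2"])))
    y = (rng.random(300) < p).astype(int)
    print(forward_select_aic(X, y))   # typically ['x0', 'x2']
    ```

    Note that no p-value threshold appears anywhere: at each step the procedure simply keeps the model with the lowest AIC and stops as soon as adding a predictor no longer lowers it.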

    In general, we are not fans of stepwise procedures for regression: there are several problems with them, and they change the statistical properties of the parameters. For example, they make the p-values hard to interpret (see, for example, this excellent answer on stack exchange: https://stats.stackexchange.com/questions/179941/why-are-p-values-misleading-after-performing-a-stepwise-selection), and there are few protections against overfitting (although the AIC at least has a penalty for the number of parameters -- p-value selection is even worse in my opinion). The small simulation below illustrates the p-value problem.
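    As a side note (my own sketch, not something from JASP), the following simulation shows why p-values are misleading after selection: even when every predictor is pure noise, the predictor that gets picked because it fits best ends up looking "significant" far more often than the nominal 5%.

    ```python
    # Sketch: selection makes p-values misleading, even with pure-noise predictors.
    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(7)
    n, k, reps = 100, 10, 200
    selected_pvals = []
    for _ in range(reps):
        X = rng.normal(size=(n, k))       # 10 noise predictors, no real effect
        y = rng.integers(0, 2, size=n)    # outcome unrelated to the predictors
        # "Select" the single predictor whose one-variable logistic fit has the best AIC
        fits = [sm.Logit(y, sm.add_constant(X[:, [j]])).fit(disp=0) for j in range(k)]
        best = min(range(k), key=lambda j: fits[j].aic)
        selected_pvals.append(fits[best].pvalues[1])   # p-value of the chosen slope
    print("share below 0.05:", np.mean(np.array(selected_pvals) < 0.05))
    # Far above the nominal 0.05, because the predictor was chosen *because* it fit well.
    ```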

    Erik-Jan

  • Hi Erik-Jan,

    Thank you for your answer, and I am very sorry for my delayed reply.

    1. I now understand that JASP uses the AIC for model selection, so in this situation the p-values -- as discussed on stack exchange -- are biased and do not carry their traditional meaning. My new question (also asked by other people in that stack exchange discussion) is: given that the procedure "chooses the model with the best fit according to AIC" for the whole model, should one instead consider all the variables left in the model as having a true regression coefficient different from zero?
    2. Given that you are not fans of stepwise procedures for regression, is there an alternative approach you would recommend for selecting the predictor variables (in linear or logistic regression) and identifying the final model?

    Jian
