feature selection in random forest regression
Hi there,
Is it possible to conduct feature selection (i.e., selecting the most important variables for predicting the outcome) when I run a random forest regression in JASP?
Best
Faming
Comments
I'll ask the experts...
E.J.
Hi Faming,
You can view the importance of the features in predicting the target variable by ticking the “Feature importance” option in the interface. This produces a table with, per feature, the mean decrease in accuracy (when the feature is excluded) and the total increase in node purity (these values can only be interpreted relative to each other). The features are ordered from most important to least important in this table.
Best,
Koen
Dear Koen,
Thanks for your response. Yes, I can rank the relative importance of features using “Feature importance” in the interface. However, I want to know which set of features is most important in predicting the outcome variable. In the R package "randomForest", the "rfcv" function can achieve this: it shows the cross-validated prediction performance of models with a sequentially reduced number of predictors (ranked by variable importance) via a nested cross-validation procedure. I did not find a similar function in JASP. Do you have any ideas? Thanks in advance.
Best
Faming
Dear Faming,
Unfortunately, I believe we do not yet support cross-validation for random forest analyses; however, it is on the agenda. We will take your request into account and see if we can implement it then!
Best,
Koen
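For readers who need this before it lands in JASP: the rfcv procedure Faming describes can be reproduced outside JASP. Below is a minimal sketch of the same idea in Python with scikit-learn (an assumption on my part; it is not JASP's or randomForest's code). Within each cross-validation fold, features are ranked by importance on the training data only, and models with sequentially smaller feature sets are scored on the held-out data.

```python
# Sketch of an rfcv-style nested cross-validation: rank features per
# training fold, then score random forests on shrinking feature subsets.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import KFold

# Synthetic data stands in for your own outcome and predictors.
X, y = make_regression(n_samples=200, n_features=10, n_informative=3,
                       random_state=0)

sizes = [10, 5, 3, 2, 1]          # feature-set sizes to compare
cv_mse = {s: [] for s in sizes}   # held-out error per size

for train_idx, test_idx in KFold(n_splits=5, shuffle=True,
                                 random_state=0).split(X):
    X_tr, X_te = X[train_idx], X[test_idx]
    y_tr, y_te = y[train_idx], y[test_idx]
    # Rank features by importance on the training fold only (the
    # "nested" part: the test fold never influences the ranking).
    rf = RandomForestRegressor(n_estimators=100, random_state=0)
    ranking = np.argsort(rf.fit(X_tr, y_tr).feature_importances_)[::-1]
    for s in sizes:
        keep = ranking[:s]
        sub = RandomForestRegressor(n_estimators=100, random_state=0)
        sub.fit(X_tr[:, keep], y_tr)
        pred = sub.predict(X_te[:, keep])
        cv_mse[s].append(float(np.mean((pred - y_te) ** 2)))

for s in sizes:
    print(s, "features -> mean CV MSE:", round(np.mean(cv_mse[s]), 1))
```

The size whose mean cross-validated error stops improving suggests how many top-ranked features are worth keeping; in R, `randomForest::rfcv(trainx, trainy)` produces the analogous error-versus-size curve directly.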