oversampling in machine learning module in JASP
I'm working on a project to discriminate a disease condition from a non-disease condition using a set of clinical data. However, I have far more non-disease samples than disease samples (around 100:1, with about 3000 samples in total). I know that we could do oversampling by combining the disease data with a subset of the non-disease data, but is there a convenient way to do so in JASP, or do I have to divide the sample myself?
I would also like to know whether there is a convenient way to do k-fold cross-validation in the machine learning modules for methods other than KNN classification. (I noticed that k-fold validation is available in KNN but not in the other modules.)
Thanks to anyone who can help!
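To make the manual route concrete, here is a minimal Python sketch of the split described above, done outside JASP before loading the data. It is not a JASP feature: the function name, the `disease` column, and the toy rows are all hypothetical. Strictly speaking, keeping every disease row plus a random subset of the non-disease rows is usually called undersampling the majority class.

```python
import random

def balance_by_undersampling(rows, label_key="disease", ratio=1, seed=0):
    """Keep every disease row and a random subset of the non-disease rows."""
    rng = random.Random(seed)
    positives = [r for r in rows if r[label_key] == 1]
    negatives = [r for r in rows if r[label_key] == 0]
    # Keep at most `ratio` non-disease rows per disease row.
    n_keep = min(len(negatives), ratio * len(positives))
    subset = positives + rng.sample(negatives, n_keep)
    rng.shuffle(subset)
    return subset

# Toy data mimicking the situation above: 30 disease rows among 3000 (hypothetical).
rows = [{"id": i, "disease": 1 if i < 30 else 0} for i in range(3000)]
balanced = balance_by_undersampling(rows, ratio=1)
```

The balanced subset could then be saved as a CSV file and opened in JASP like any other data set.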
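For comparison, this is roughly what k-fold cross-validation would look like if done by hand outside JASP. It is a minimal Python sketch under stated assumptions: the function name and the toy labels are hypothetical, and the folds are stratified so that each one receives a share of the rare disease cases.

```python
import random

def stratified_k_folds(labels, k=5, seed=0):
    """Split row indices into k folds, spreading each class evenly across folds."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal this class's indices round-robin so every fold gets its share.
        for position, i in enumerate(idx):
            folds[position % k].append(i)
    return folds

# Toy labels: 30 disease cases (1) among 3000 rows (hypothetical).
labels = [1] * 30 + [0] * 2970
folds = stratified_k_folds(labels, k=5)

for test_idx in folds:
    held_out = set(test_idx)
    train_idx = [i for i in range(len(labels)) if i not in held_out]
    # Fit on train_idx and evaluate on test_idx with the classifier of your choice.
```

If resampling is combined with cross-validation, it should be applied to the training folds only, so that the held-out fold keeps the original class ratio.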
Comments
I've alerted our expert to this, but I am not sure why a simple approach such as logistic regression would not work well in this case.
Cheers,
E.J.