koenderks

Yes, this looks good! You now have a balanced test set and you can use the 'testIndicator' variable as the 'Test set indicator' (as shown below) to index the test set when training any of the algorithms in the machine learning module! https://forum.…

in balanced test set Comment by koenderks October 2023

Hi may01dz, You cannot do this directly in the machine learning module; it requires manually inspecting the data and specifying a variable in the data set that indicates which observations belong to the test set (indicated by 1, you have …

in balanced test set Comment by koenderks October 2023
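
A minimal sketch of how such an indicator column could be built outside JASP before importing the data. It assumes you want the same number of test rows from every class; the function name `balanced_test_indicator`, the `target` values, and `n_per_class` are illustrative and not part of JASP.

```python
import random
from collections import defaultdict


def balanced_test_indicator(labels, n_per_class, seed=1):
    """Return a list of 0/1 flags with exactly n_per_class test rows (1) per class."""
    rng = random.Random(seed)

    # Collect the row numbers that belong to each class.
    rows_by_class = defaultdict(list)
    for row, label in enumerate(labels):
        rows_by_class[label].append(row)

    indicator = [0] * len(labels)
    for label, rows in rows_by_class.items():
        if len(rows) < n_per_class:
            raise ValueError(f"class {label!r} has only {len(rows)} rows")
        # Mark a random, equally sized subset of each class as test rows.
        for row in rng.sample(rows, n_per_class):
            indicator[row] = 1
    return indicator


# Hypothetical, imbalanced target column.
target = ["yes"] * 8 + ["no"] * 4
testIndicator = balanced_test_indicator(target, n_per_class=2)
```

The resulting 0/1 column can be saved alongside the data and then selected as the test set indicator in the analysis.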

Dear Faming, Unfortunately, I believe we do not yet support cross-validation for random forest analyses; however, this is on the agenda. We will take your request into account and see if we can implement it at that time! Best, Koen

in feature selection in random forest regression Comment by koenderks October 2022

Hi Faming, You can view the importance of the features in predicting the target variable by clicking the option “Feature importance” in the interface, which produces a table with the mean decrease in accuracy (when the feature is excluded) and the t…

in feature selection in random forest regression Comment by koenderks October 2022

They might have the same values for all variables? I can’t be sure without the data set. Best, Koen

in dendrogram Comment by koenderks July 2022

You can export the clusters assigned to each row in the data by clicking “export predictions/clusters” and filling in a name for the new column (to be added to the data set) that will contain these cluster assignments. Best, Koen

in K-means Comment by koenderks July 2022

Hi, You can visualize the weighting scheme via a plot in the k-nn analysis. This will plot the weight as a function of the relative distance of the neighbors. See also https://epub.ub.uni-muenchen.de/1769/1/paper_399.pdf about more exact info abou…

in Weighted KNN Comment by koenderks July 2022
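
As a rough illustration of "weight as a function of the relative distance", here is a sketch under two assumptions that are mine and not confirmed for JASP: distances are rescaled by the distance to the (k+1)-th neighbor, and the weight comes from a simple kernel. The names `relative_distances` and `KERNELS` are hypothetical.

```python
def relative_distances(distances, k):
    """Rescale the k smallest distances by the distance to the (k+1)-th neighbor."""
    ordered = sorted(distances)
    if len(ordered) <= k:
        raise ValueError("need at least k + 1 distances")
    scale = ordered[k]  # distance to the (k+1)-th nearest neighbor
    return [d / scale for d in ordered[:k]]


# Illustrative kernels mapping a relative distance in [0, 1] to a weight.
KERNELS = {
    "rectangular": lambda d: 1.0,                # every neighbor counts equally
    "triangular": lambda d: max(0.0, 1.0 - d),   # weight falls off linearly
    "epanechnikov": lambda d: max(0.0, 0.75 * (1.0 - d * d)),
}

# Hypothetical distances from one new observation to its four nearest points.
rel = relative_distances([1.0, 2.0, 3.0, 4.0], k=3)
weights = {name: [kernel(d) for d in rel] for name, kernel in KERNELS.items()}
```

With a rectangular kernel this reduces to ordinary (unweighted) k-nearest neighbors; the other kernels give closer neighbors more say.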

I would say this partially depends on whether parameters in the algorithm are optimized under "Training parameters". In the k-nearest neighbors algorithm, the optimal number of neighbors is trained on the training set and after that optimi…

in Machine learning Comment by koenderks June 2022

I’m not sure that I follow. If you fix the seed then the results should be the same every time you run the analysis :)

in Machine learning Comment by koenderks June 2022

Each time you run the analysis, it randomly selects a training (and, if applicable, validation) and test set to use, so it is expected that the results will differ across runs. You can disable this behavior of the analysis by fixing the seed in the training param…

in Machine learning Comment by koenderks June 2022
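
The effect of fixing the seed can be sketched in a few lines. This is not JASP's own code; `split_rows` and its parameters are hypothetical and only illustrate why a fixed seed yields the same training/test split on every run.

```python
import random


def split_rows(n_rows, test_fraction=0.2, seed=None):
    """Randomly split row numbers into (train, test); a fixed seed makes it repeatable."""
    rng = random.Random(seed)  # seed=None draws a fresh random state each run
    rows = list(range(n_rows))
    rng.shuffle(rows)
    n_test = round(n_rows * test_fraction)
    return sorted(rows[n_test:]), sorted(rows[:n_test])


# Same seed, same split, so any model fitted on it gives the same results.
train_a, test_a = split_rows(50, seed=1)
train_b, test_b = split_rows(50, seed=1)
same_split = test_a == test_b
```

Calling `split_rows(50)` without a seed will usually give a different test set on each run, which is why results vary when the seed is not fixed.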

Yes, the best way to go is to uncheck the ‘scale predictors’ box. This way the raw data is used for everything. If you are missing a feature in the evaluation metrics table, could you please suggest it via our GitHub page: https://github.com/jasp-s…

in Predictions with regression tools Comment by koenderks April 2022

Hi profwriter, Sorry for the late reply. I believe what you are saying is correct: not influencing purchase (2) is coded here as the value 1 in the logistic regression. That means the logistic regression is about not influencing purchase (2). You …

in Interpret logistic regression Comment by koenderks March 2022

Hi Manon, Can you try the following steps: Enter your raw data (no z-scores) in the "Variables" box in the cluster analysis. Go to "Advanced options" and make sure "scale variables" is off. Click the box "Set s…

in Help for standard deviations K-means clustering Comment by koenderks February 2022

Hi Manon, If the cluster memberships are in the variable "macro VS micro" then this variable should be dragged to the "Split" box. You should then insert all the variables you have used in your initial cluster analysis in the box…

in Help for standard deviations K-means clustering Comment by koenderks February 2022

Hi Manon, Here I am again. Off the top of my head, these 0’s are taken into account. However, it is hard to say without seeing what you are looking at. Is it possible for you to save your analyses and upload them as a .jasp file (the default jas…

in difficulties with my k-means clustering analysis Comment by koenderks February 2022

Hi Manon, This can be achieved by exporting the clusters to your data set by clicking “Add predictions to data” and filling in a name for the new column. Then, you can go to the descriptives analysis and use this new column to split the data (i.e., …

in Help for standard deviations K-means clustering Comment by koenderks February 2022
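
For readers who want to check the numbers by hand, here is a small sketch of what splitting the descriptives by the exported cluster column amounts to. The data and the name `descriptives_by_cluster` are made up, and the sample standard deviation (n - 1 denominator) is assumed.

```python
from collections import defaultdict
from statistics import mean, stdev


def descriptives_by_cluster(values, clusters):
    """Group values by cluster membership and report n, mean, and sample SD per cluster."""
    groups = defaultdict(list)
    for value, cluster_id in zip(values, clusters):
        groups[cluster_id].append(value)
    return {
        cluster_id: {
            "n": len(group),
            "mean": mean(group),
            # The sample SD is undefined for a single observation.
            "sd": stdev(group) if len(group) > 1 else float("nan"),
        }
        for cluster_id, group in groups.items()
    }


# Hypothetical raw scores and the cluster column added to the data set.
scores = [1.0, 2.0, 3.0, 10.0, 12.0, 14.0]
cluster = [1, 1, 1, 2, 2, 2]
summary = descriptives_by_cluster(scores, cluster)
```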

Hi Andrearicci, Whenever you have performed a cluster analysis and have obtained clusters, you can add a variable with the cluster memberships to your data set by clicking “Add predictions to data” and filling in a column name for the new column. Th…

in Screen Sharing Session to Get support for a Cluster Analysis ? Comment by koenderks February 2022

Dear Manon, Let me try to answer your questions! 1- Is there any easy-to-use guide about k-means clustering with detailed information (and maybe a step-by-step procedure)? I'm still struggling to understand the software. Currently ther…

in K-means clustering - Beginner help Comment by koenderks February 2022

Hi Johan, It seems to me like you want to be able to state after your sample that the misstatement in the population is lower than 5,000,000 (10 percent of the population size/value). If you want to make this statement with a certain amount of confi…

in Monetary Unit Sample - Poisson Comment by koenderks November 2021
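
To make the planning logic concrete, here is a sketch of the textbook Poisson calculation for the case where no misstatements are expected or found. The 95 percent confidence level is an assumption for illustration, the population value of 50,000,000 is inferred from 5,000,000 being 10 percent, and `poisson_sample_size` is a hypothetical helper, not necessarily the computation JASP performs.

```python
import math


def poisson_sample_size(materiality, confidence=0.95):
    """Smallest n with exp(-n * materiality) <= 1 - confidence (zero misstatements)."""
    return math.ceil(-math.log(1.0 - confidence) / materiality)


population_value = 50_000_000  # inferred: 5,000,000 is 10 percent of this value
n = poisson_sample_size(materiality=0.10, confidence=0.95)
sampling_interval = population_value / n  # monetary units per selected unit
```

Under these assumptions, finding no misstatements in the sample supports the statement that the population misstatement is below 10 percent at the chosen confidence level.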

Hi Mateus, I understand what you mean now! I actually don't think you can currently make the exact plot that you want in JASP. However, there are some ways in which we can cheat our way to something very similar. The first alternative I guess wou…

in Scatter Plot Comment by koenderks August 2020

Hi Mateus, I'm having a little trouble understanding what kind of plot it is that you desire. From your second plot, it seems like you have already obtained a grouped scatter plot that represents the distribution of the clusters, as well as their de…

in Scatter Plot Comment by koenderks August 2020

Hi vps2020, Good spot. That is because, by default, the predicted values are scaled/standardized to have a mean of zero and a standard deviation of one (see picture below). This is good practice in training an ML model, but this causes the predicted …

in Random Forest Regression Comment by koenderks April 2020
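
A short sketch of what standardizing does and how values on the standardized scale map back to the original units. It assumes the sample standard deviation is used; the helper names and data are hypothetical and this is not JASP's internal code.

```python
from statistics import mean, stdev


def standardize(values):
    """Return z-scores plus the mean and SD needed to undo the transformation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values], m, s


def unstandardize(z_values, m, s):
    """Map z-scores back to the original scale."""
    return [z * s + m for z in z_values]


# Hypothetical target values on their raw scale.
raw_target = [12.0, 15.0, 9.0, 20.0, 14.0]
z_target, m, s = standardize(raw_target)
restored = unstandardize(z_target, m, s)
```

Predictions made on the standardized scale can be compared with the raw data only after a back-transformation like `unstandardize`, or by turning scaling off so that the raw values are used throughout.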

Hi vps2020, Great that you like the program and the machine learning module! To elaborate on your question regarding the relative importance of the variables in a random forest regression model, these can be requested in JASP via the "Variable …

in Random Forest Regression Comment by koenderks April 2020

Alright, good! Sorry for the hassle. In the next release of JASP, Bain will get rid of its "Beta" status and these issues will be fixed.