koenderks
About
- Username: koenderks
- Visits: 147
- Roles: Member
Comments
-
Hi Emilie, No, there is no way to do this. You need a data file in which all data for a participant are in a single row.
-
You are right, this is a bug. I have fixed it for the next release.
-
Can you post a screenshot of the entire prediction output, including all tables?
-
Specifically, the test set indicator represents whether each case is included in (1) or excluded from (0) the test set (not the model). The cases for which the test set indicator is 0 are used for training and those for which it is 1 are used for evalu…
-
You would need to create this indicator outside of JASP. But this is typically only required if the classes are unbalanced in the training and/or test set. If the classes are relatively balanced then you can also let JASP sample a fixed % of data as …
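For anyone wondering how to build such an indicator outside JASP, here is a minimal sketch of a stratified 0/1 column that keeps the classes balanced in the test set. The function name and the use of NumPy are illustrative assumptions, not anything JASP requires — any tool that produces a 0/1 column in your data file will do.

```python
import numpy as np

def stratified_test_indicator(labels, test_frac=0.2, seed=42):
    """Return a 0/1 column: 1 = test set, 0 = training set.
    Sampling test_frac within each class keeps the classes balanced."""
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    indicator = np.zeros(len(labels), dtype=int)
    for cls in np.unique(labels):
        idx = np.flatnonzero(labels == cls)           # rows of this class
        n_test = int(round(test_frac * len(idx)))     # test cases per class
        indicator[rng.choice(idx, size=n_test, replace=False)] = 1
    return indicator
```

Save this column back into the data file and select it as the 'Test set indicator' in the analysis.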
-
Hi, It is applied before the dataset is split as train/test. Would the results be different if it was done afterwards? For descaling predictions, a feature request with more detailed info can be made at https://github.com/jasp-stats/jasp-issues/issu…
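To see why the two orders can give different results: when scaling happens before the split, the test rows influence the mean and SD used to scale the training rows. A toy illustration (NumPy, population SD; the numbers are purely illustrative):

```python
import numpy as np

x = np.array([1., 2., 3., 4., 100.])               # one feature; the outlier lands in the test set
is_test = np.array([False, False, False, False, True])

# Scaling before the split: statistics come from all rows, test rows included
z_before = (x - x.mean()) / x.std()

# Scaling after the split: statistics come from the training rows only
mu, sd = x[~is_test].mean(), x[~is_test].std()
z_after = (x - mu) / sd
```

The training rows get different scaled values under the two orders, so a model fit to them can differ too — though with similar train and test distributions the difference is usually small.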
-
You should be able to edit the range of the x-axis in this plot and set it to some value close to 1. See https://jasp-stats.org/wp-content/uploads/2021/09/Graph_Editor-1.gif for a gif on how this can be done.
-
Hi YCWang, We use the model_parts function from the DALEX R package to compute the variable/feature importance, more info on this method can be found here: https://ema.drwhy.ai/featureImportance.html. By default, the loss function is 1 - (minus) AUC…
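The idea behind model_parts is permutation importance: shuffle one feature, re-evaluate the loss, and call the resulting increase that feature's importance. A bare-bones sketch of that idea — not the DALEX code itself, and with a mean-squared-error loss standing in for 1 - AUC:

```python
import numpy as np

def permutation_importance(predict, X, y, loss, n_repeats=5, seed=1):
    """Mean increase in loss when each column of X is shuffled."""
    rng = np.random.default_rng(seed)
    base = loss(y, predict(X))                    # loss with the intact data
    importance = {}
    for j in range(X.shape[1]):
        increases = []
        for _ in range(n_repeats):
            Xp = X.copy()
            Xp[:, j] = rng.permutation(Xp[:, j])  # break the feature-target link
            increases.append(loss(y, predict(Xp)) - base)
        importance[j] = float(np.mean(increases))
    return importance
```

A feature the model never uses gets an importance of zero, because shuffling it leaves the predictions unchanged.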
-
No need, thanks! I've added some lines to the help files in an existing pull request.
-
It is briefly mentioned under the "Input --> Tables" section as "Explain predictions: Shows the decomposition of the model’s prediction into contributions that can be attributed to different explanatory variables". When clicki…
-
There is a way to see the relative impact of each feature in a machine learning model by clicking “explain predictions” in the interface. Under the hood, this uses the breakdown algorithm (https://ema.drwhy.ai/breakDown.html) instead of the shap alg…
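For intuition, the break-down idea attributes a single prediction by fixing the instance's feature values one at a time in a background data set and recording how the mean prediction shifts. A small sketch of that idea — not the exact breakDown/DALEX implementation:

```python
import numpy as np

def break_down(predict, X, x_new, order):
    """Attribute predict(x_new) by fixing features one at a time.
    X: background data (n, p); x_new: one instance (p,); order: feature indices."""
    X_work = X.copy()
    baseline = predict(X_work).mean()          # mean prediction before fixing anything
    prev, contributions = baseline, {}
    for j in order:
        X_work[:, j] = x_new[j]                # fix feature j at the instance's value
        cur = predict(X_work).mean()
        contributions[j] = cur - prev          # shift attributed to feature j
        prev = cur
    return baseline, contributions             # contributions sum to predict(x_new) - baseline
```

Unlike SHAP, which averages over all orderings, this attributes along a single ordering of the features, which is why the two methods can disagree for models with interactions.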
-
Yes, this looks good! You now have a balanced test set and you can use the 'testIndicator' variable as the 'Test set indicator' (as shown below) to index the test set when training any of the algorithms in the machine learning module! https://forum.…
-
Hi may01dz, You cannot do this in the machine learning module directly; it requires manual inspection of the data and the specification of a variable in the data set that indicates which observations belong to the test set (indicated by 1, you have …
-
Dear Faming, Unfortunately, I believe we do not yet support cross-validation for random forest analyses; however, this is on the agenda. We will take your request into account and see if we can implement it at the time! Best, Koen
-
Hi Faming, You can view the importance of the features in predicting the target variable by clicking the option “Feature importance” in the interface, which produces a table with the mean decrease in accuracy (when the feature is excluded) and the t…
-
They might have the same values for all variables? I can’t be sure without the data set. Best, koen
-
Hi, You can visualize the weighting scheme via a plot in the k-nn analysis. This will plot the weight as a function of the relative distance of the neighbors. See also https://epub.ub.uni-muenchen.de/1769/1/paper_399.pdf for more exact info abou…
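For reference, each kernel in that paper maps the relative distance d of a neighbor (its distance scaled so that d lies in [0, 1]) to a weight. A few common kernels, sketched in Python purely for illustration — the exact set of kernels JASP offers may differ:

```python
import numpy as np

# Weight as a function of relative distance d in [0, 1]; zero outside that range.
def rectangular(d):
    return np.where(np.abs(d) <= 1, 0.5, 0.0)          # all neighbors weighted equally

def triangular(d):
    return np.where(np.abs(d) <= 1, 1.0 - np.abs(d), 0.0)   # weight falls off linearly

def epanechnikov(d):
    return np.where(np.abs(d) <= 1, 0.75 * (1.0 - d ** 2), 0.0)  # quadratic fall-off
```

All three give closer neighbors at least as much weight as distant ones; they differ only in how fast the weight decays with distance.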
-
I would say this partially depends on whether parameters in the algorithm are optimized under "Training parameters". In the k-nearest neighbors algorithm, the optimal number of neighbors is trained on the training set and after that optimi…
-
I’m not sure that I follow. If you fix the seed then the results should be the same every time you run the analysis :)
-
Each time you run the analysis it randomly selects a training(, validation) and test set to use, so it is expected that the results will be different across runs. You can disable this behavior of the analysis by fixing the seed in the training param…
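In other words, the random split is the only source of run-to-run variation, and fixing the seed makes the split — and therefore the results — reproducible. A toy illustration in Python (JASP itself handles this via the seed option; the function below is only a sketch of the mechanism):

```python
import numpy as np

def split(n, test_frac=0.2, seed=None):
    """Randomly partition n row indices into a training and a test set."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    n_test = int(round(test_frac * n))
    return idx[n_test:], idx[:n_test]      # train indices, test indices

# Same seed -> same split -> same results; no seed -> a new split every run.
```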
-
Yes, the best way to go is to uncheck the ‘scale predictors’ box. This way the raw data is used for everything. If you are missing a feature in the evaluation metrics table, could you please suggest it via our GitHub page: https://github.com/jasp-s…
-
Hi profwriter, Sorry for the late reply. I believe what you are saying is correct: not influencing purchase (2) is coded here as the value 1 in the logistic regression. That means the logistic regression is about not influencing purchase (2). You …
-
Hi Manon, Can you try the following steps: Enter your raw data (no z-scores) in the "Variables" box in the cluster analysis. Go to "Advanced options" and make sure "scale variables" is off. Click the box "Set s…
-
Hi Manon, If the cluster memberships are in the variable "macro VS micro" then this variable should be dragged to the "Split" box. You should then insert all the variables you have used in your initial cluster analysis in the box…
-
Hi Manon, Here I am again. Off the top of my head, these 0’s are taken into account. However, it is hard to tell without seeing what you are looking at. Is it possible for you to save your analyses and upload them as a .jasp file (the default jas…
-
Hi Manon, This can be achieved by exporting the clusters to your data set by clicking “Add predictions to data” and filling in a name for the new column. Then, you can go to the descriptives analysis and use this new column to split the data (i.e., …
-
Hi Andrearicci, Whenever you have performed a cluster analysis and have obtained clusters, you can add a variable with the cluster memberships to your data set by clicking “Add predictions to data” and filling in a column name for the new column. Th…
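The same workflow outside JASP, as a rough sketch: compute cluster memberships, attach them as a new column, and split the descriptives by that column. The tiny hand-rolled k-means below is purely for illustration of the idea, not what JASP runs internally:

```python
import numpy as np

def kmeans_labels(X, k=2, iters=20, seed=0):
    """Very small k-means: return a cluster membership label per row of X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]  # random initial centers
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)                       # assign to nearest center
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)    # move center to cluster mean
    return labels

X = np.array([[0., 0.], [0.1, 0.], [5., 5.], [5.1, 5.]])
labels = kmeans_labels(X, k=2)                 # the exported membership column
# "Descriptives split by cluster": per-cluster means of the original variables
cluster_means = {int(c): X[labels == c].mean(axis=0) for c in np.unique(labels)}
```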
-
Dear Manon, Let me try to answer your questions! 1- Is there any easy-to-use guide about k-mean clustering with detailled informations (and maybe step-by-step procedure) ? I'm still struggling with the understanding of the software. Currently ther…
-
Hi Johan, It seems to me like you want to be able to state after your sample that the misstatement in the population is lower than 5,000,000 (10 percent of the population size/value). If you want to make this statement with a certain amount of confi…
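For intuition about the sample size behind such a statement: if the sample turns out to contain zero misstatements, the smallest sample that supports "misstatement below 10 percent with 95 percent confidence" follows from the binomial zero-error bound (1 - materiality)^n ≤ 1 - confidence. This is a simplified sketch of the general logic only, not necessarily the exact method the JASP Audit module uses:

```python
import math

def min_sample_size(materiality, confidence):
    """Smallest n such that, with zero misstatements in the sample, the
    misstatement rate is below `materiality` at the given confidence.
    Binomial zero-error bound: (1 - materiality)**n <= 1 - confidence."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - materiality))

min_sample_size(0.10, 0.95)   # → 29 under this bound
```

Observing any misstatements in the sample would of course require a larger n, which is where the full audit-sampling machinery comes in.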