Select Cases in JASP
Is it possible to select n cases at random from a JASP data file? SPSS has such a function and I am attempting to replicate an assignment for students in my statistics class.
Comments
I'll ask around :-)
Hi Eric,
One way to do this in JASP is via the compute column functionality and the filter.
You can compute a new column by pressing the "+" symbol on the right-hand side of the data view. After you click "+", you have to enter:
- a name, say, "selector",
- the type of your column, which should be nominal,
- whether you'd like to compute the column using point and click (default) or R code.
1.a Compute column: Click interface
I'll elaborate on the point and click interface first. On its right-hand side there is a function called binomDist(y). If you press it, it gives you binomDist(trials, prob). For trials you can enter 1, and for prob you want the proportion that matches the n you'd like. For instance, if you have 40 students and you want to select 30, then you'd set prob to 0.75. After you press compute column you will see the "selector" column being filled with 0s and 1s. Note that prob = 0.75 does not guarantee exactly 30 out of 40: each case is selected independently with probability 0.75, so the number of 1s varies around 30 from draw to draw.
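To illustrate why the count is not fixed, here is a minimal sketch in plain R, run outside JASP. It assumes that binomDist with trials = 1 behaves like base R's rbinom with size = 1; I have not checked how JASP implements it internally, so treat this as an analogy rather than JASP's own code.

```r
# Sketch in plain R (outside JASP): draw a 0/1 selector for 40 students,
# each kept independently with probability 0.75.
n_students <- 40    # class size from the example above
prob_keep  <- 0.75  # 30 / 40

selector <- rbinom(n_students, size = 1, prob = prob_keep)

# The number of selected cases is itself random: 30 on average,
# but a single draw can give a few more or a few fewer.
n_selected <- sum(selector)
expected   <- n_students * prob_keep

n_selected
expected
```

If you need exactly 30 cases, the R code route in 1.b is the safer choice.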
1.b Compute column: R code
Skip this part if you've done 1.a. If you prefer R code over the click interface and you'd like to select 30 out of 40 students, then you can use the following code:
    selector <- rep(0, 40)
    selector[sample(40, 30)] <- 1
    selector

This will create a selector variable with 40 entries and 30 ones at random positions. Alternatively, if you don't want to count all the students yourself but have a column called ID that holds the student IDs, you can use this code instead:

    selector <- rep(0, length(ID))
    selector[sample(length(ID), 30)] <- 1
    selector
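If you want to try the same idea in a plain R session before using it in JASP, it can be wrapped in a small helper. This is only a sketch: make_selector is a hypothetical name, not a JASP or base R function, and set.seed is used here just to make the draw repeatable in R (I have not checked whether JASP's compute column allows it).

```r
# Hypothetical helper (not part of JASP): build a 0/1 selector with
# exactly n_select ones at random positions among n_total cases.
make_selector <- function(n_total, n_select) {
  stopifnot(n_select >= 0, n_select <= n_total)
  selector <- rep(0, n_total)
  # sample.int() draws n_select distinct row positions from 1..n_total
  selector[sample.int(n_total, n_select)] <- 1
  selector
}

set.seed(123)  # optional: makes the random draw repeatable in plain R
selector <- make_selector(40, 30)

length(selector)  # number of cases
sum(selector)     # number of selected cases, fixed by construction
```

Unlike the binomial approach in 1.a, the number of ones here always equals n_select, because the positions are sampled without replacement.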
2. Filtering
Now that you have a "selector" variable, you can turn cases on and off using the filter. To do so, go to the computed column "selector" in the data viewer and click on its label. This opens a filter menu, where you can click on the tick in front of 0, which turns it into a cross. Your data set is now filtered and only the selected students are active in the analysis (exactly 30 if you used the R code in 1.b, roughly 30 if you used binomDist in 1.a).
Let me know if this works for you.
Cheers,
Alexander
And as a small aside to Alexander's excellent post: you can also enter all of this directly as a filter, using either the point-and-click interface or the R script entry.
This should save you the step where you later put a filter on the generated column.
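As a rough sketch of what such a filter could look like in R: a filter needs a TRUE/FALSE value per row rather than 0/1, so the selector can be written as a logical vector. The example below is plain R with a made-up ID column of 40 students. I'm assuming the R filter entry accepts a logical vector of this form and allows sample, which I have not verified.

```r
# Sketch in plain R: a logical vector that is TRUE for exactly 30 rows.
# ID is a stand-in for the student ID column (hypothetical data here).
ID <- 1:40

# Draw 30 distinct row positions, then mark those rows as TRUE.
keep <- seq_along(ID) %in% sample(length(ID), 30)

sum(keep)     # number of rows that pass the filter
length(keep)  # one TRUE/FALSE value per row
```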