Does the "R" method for creating computed variables use a kind of pseudo R coding?

When I use the "R" option for creating a computed variable, I see that I don't use the normal R methods for referring to variables within a data frame. For example, with the Tooth Growth data set, if I create a variable called "z" and make z equal to the square root of the variable "len", I specify "sqrt(len)" instead of the usual R methods such as "sqrt(data[, "len"])" or "sqrt(data$len)". So does JASP's R option (for computed variables) use what might be called a "pseudo R" coding? Or is some special R package being employed?

R

Comments

  • Actually, I meant to indicate something like "sqrt(data[i, "len"])" within an i loop.

    R

  • I'll ask our expert. Sorry for the tardy response, I stopped getting emails from the Forum for some reason.

  • It's not "pseudo R" coding; it's plain R code with some restrictions (for example, some functions are not allowed). We do make some custom functions available, such as z-score transformations, but in principle the R option lets you write your own R code to create a column or filter. The value of the last statement is used as the result, so it's also fine to split code over multiple lines.
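
    As a rough illustration of the "last statement is the returned value" point, here is a minimal sketch. The vector len below is only a stand-in so the snippet runs outside JASP (inside JASP the column would already exist), and the names root and centered are made up for the example.

    ```r
    # Stand-in for the JASP column `len`; in JASP this vector is already available.
    len <- c(4.2, 11.5, 7.3)

    # Intermediate steps can be spread over several lines.
    root <- sqrt(len)
    centered <- root - mean(root)

    # The value of the last statement is what would become the new column.
    centered
    ```

    The result has one value per row, which is what a computed column needs.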

  • Still, regarding the following data frame:

    mydata = data.frame(
       person = c(1, 2, 3),
       q = c(10, 100, 1000),
       r = c(20, NA, 2000),
       s = c(30, 300, 3000)
    )


    In base R, one way to get a mean for each person is . . .

    mydata$MyNewVar2 <- NA
    for (i in seq_len(nrow(mydata))) {
         mydata[i, "MyNewVar2"] <- mean(as.numeric(mydata[i, c("q", "r", "s")]), na.rm = TRUE)
    }


    Another way is as follows . . .

    mydata$MyNewVar3 <- apply(mydata[, c("q", "r", "s")], 1, mean, na.rm = TRUE)


    Using dplyr, one can write . . .

    mydata = mutate(rowwise(mydata), MyNewVar = mean(c(q, r, s), na.rm = TRUE))


    It seems that jamovi "R" syntax is closest to that of dplyr. Is it exactly dplyr, or is it something else? (It definitely is not base R.)


    Thanks.

    R

  • I'm not sure what jamovi does, but in JASP we simply execute the R code and try to use the return value.


    So if this is the dataset loaded in JASP:

    mydata = data.frame(
      person = c(1, 2, 3), 
      q = c(10, 100, 1000),
      r = c(20, NA, 2000),
      s = c(30, 300, 3000)
    )
    

    then doing

    MyNewVar2 <- rep(0, length(q))
    for (i in seq_along(q)) {
        # store each row's mean in position i rather than overwriting the whole vector
        MyNewVar2[i] <- mean(as.numeric(c(q[i], r[i], s[i])), na.rm = TRUE)
    }
    MyNewVar2
    

    is equivalent to what you wrote above. Here is a screenshot:

    You can do the same with apply, albeit that we don't make a dataset object available so you'd have to create your own first:
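
    A minimal sketch of that apply approach, assuming the columns are named q, r and s. The first three lines are stand-ins so the snippet also runs outside JASP, and tmp is a made-up name for the temporary data frame.

    ```r
    # Stand-ins for the JASP columns; in JASP these vectors already exist.
    q <- c(10, 100, 1000)
    r <- c(20, NA, 2000)
    s <- c(30, 300, 3000)

    # Build a data frame from the column vectors, since none is provided.
    tmp <- data.frame(q = q, r = r, s = s)

    # Row-wise mean, ignoring missing values; the last statement is the result.
    MyNewVar3 <- apply(tmp, 1, mean, na.rm = TRUE)
    MyNewVar3
    ```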

    Hope that helps!

  • Thanks.

    One thing I notice is that with your R code, you are creating a new vector rather than a new data-frame column (it was not obvious to me that JASP only wanted a new vector).

    However, it's still the case that if I use drag and drop to compute (q + r + s) / 3, JASP informs me that the corresponding R code is "(q + r + s) / 3" -- no looping or anything like that! (See the screen shot below.)

    Moreover, if I then make use of JASP's option to use "R" (rather than drag-and-drop) to recreate the same variable (this time as MyNewVar3 instead of MyNewVar2) -- and if I simply enter the "R" code, (q + r + s) / 3 -- it works! (See the screen shot, below.) Yet there is no looping, no na.rm, no nothing (except (q + r + s) / 3).

    So what's the explanation for this?


    R

  • Well, under the hood the drag-and-drop input is translated to R code. If you want, this can be shown to you. In R, many operations (e.g., +, -, *, /) are vectorized: applied to vectors of the same length, they work element by element. For example, you can add two vectors. So to compute the rowwise mean you can indeed do

    q <- 1 * 10 ^ (1:3) # does c(1 * 10^1, 1 * 10^2, 1 * 10^3)
    r <- 2 * 10 ^ (1:3)
    s <- 3 * 10 ^ (1:3)
    (q + r + s) / 3
    

    although missing values (NA) are propagated.
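
    To make the NA point concrete, here is a small sketch using the values from the data set earlier in the thread (where r has a missing value). The names plain_mean and robust_mean are made up for the example, and rowMeans is just one possible way to skip missing values.

    ```r
    q <- c(10, 100, 1000)
    r <- c(20, NA, 2000)
    s <- c(30, 300, 3000)

    # Vectorized arithmetic: the NA in r makes the second result NA as well.
    plain_mean <- (q + r + s) / 3

    # One alternative that ignores missing values row by row.
    robust_mean <- rowMeans(cbind(q, r, s), na.rm = TRUE)
    robust_mean
    ```

    Here plain_mean is 20, NA, 2000, while robust_mean is 20, 200, 2000.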

  • OK. Thanks. Here are the conclusions I've drawn with regard to instructing students and others on the use of R to compute new variables in JASP:


    "

    The JASP data set is accessed, *not* as an R data frame, but as a collection of *individual vectors*. Thus for example, if there's an existing JASP data column, x, and you want to create a new column in which each new value equals x + 5, you can use the expression: x + 5

    Because the x variable you interact with is *not* part of a data frame, you should not attempt anything resembling: df[ , "x"] + 5

    That said, if you happen to have used R computations within JASP to create your own R data frame (which, to re-emphasize, won't constitute a JASP data set), then you can manipulate the data frame's elements the way you normally would in R.

    "

    R
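
    A short sketch of the two situations described in these conclusions. The vector x is a stand-in for an existing JASP column, and my_df is a made-up name for a data frame built by hand.

    ```r
    # Stand-in for an existing JASP column; in JASP this vector already exists.
    x <- c(1, 2, 3)

    # Columns are plain vectors, so this expression is enough for a new column.
    new_col <- x + 5

    # A data frame you construct yourself can be indexed the usual way.
    my_df <- data.frame(x = x)
    from_df <- my_df[, "x"] + 5

    # Both routes give the same values.
    identical(new_col, from_df)
    ```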
