Applying functions (e.g. mean, sd) across variables

I'd like to apply some standard functions across rows. For example, to get the mean of some scale items on a survey I could use the R code (Q1 + Q2 + Q3) / 3, but if any one of those is missing, the result would be missing. I'd like the result to be the mean of the non-missing values. I'd like to do this in general, so that I could get the median, sd, etc. Is this possible? Thanks!

Comments

  • edited May 27

    I know that this is unlikely to be helpful, as I am not savvy in either DataMatrix (1) or R, but doesn't DataMatrix have NumPy as a dependency? In NumPy, you could exclude the missing values with np.nanmean(), or filter them out altogether using np.isnan(). Alternatively, most functions in Pandas (which I tend to use most) exclude missing values by default.

    (1) I wasn't sure if your question was only related to R or also DataMatrix, given its category.
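The NumPy suggestion above could look roughly like the sketch below. It assumes the survey items are stored as a 2D float array with one row per respondent and one column per item; the array `scores` and its values are made up for illustration and are not from the original question. `ddof=1` is used so that the standard deviation matches the sample SD that R's `sd()` reports.

```python
import numpy as np

# Hypothetical survey data: one row per respondent, columns are Q1, Q2, Q3.
# np.nan marks a missing answer.
scores = np.array([
    [4.0, 5.0, 3.0],
    [2.0, np.nan, 4.0],
    [np.nan, 3.0, 1.0],
])

# axis=1 computes each statistic across the columns of every row,
# skipping NaN values instead of propagating them.
row_mean = np.nanmean(scores, axis=1)
row_median = np.nanmedian(scores, axis=1)
row_sd = np.nanstd(scores, axis=1, ddof=1)  # ddof=1 gives the sample SD

# Alternative: filter out the missing values of a single row with np.isnan()
second_row = scores[1]
second_row_valid = second_row[~np.isnan(second_row)]

print(row_mean)
print(row_median)
print(row_sd)
```

The same pattern works for other statistics that have a NaN-aware variant, such as `np.nansum`, `np.nanmin` and `np.nanmax`.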

  • Hi Bob,

    Like @cesco, I'm not sure whether your question concerns the Python DataMatrix library or something else (perhaps an R data frame?). However, if this is about DataMatrix, then you can simply use the mean property of a column, which will only use non-nan numeric values.

    For more information, see also:

    Cheers!

    Sebastiaan

    There's much bigger issues in the world, I know. But I first have to take care of the world I know.
    cogsci.nl/smathot
