A question regarding resampling in within-subject designs using distribution parameters

I'm currently reviewing a paper that uses an analysis that goes beyond my statistical expertise, but I suspect it is invalid. It concerns a resampling analysis, but I nevertheless hope you JASPian/Bayesian folk can shed some light on it.

It's a within-subject design with 12 participants. Each of these participants has some effect; let's call it the dv(per-participant-observed). (To respect the authors' anonymity, I'll stay vague.) If I understand correctly, the authors then do the following:

  • They first determine the mean observed effect across the 12 participants. Let's call this dv(grand-observed).
  • Next, for each participant, they randomly shuffle the data 1000 times, and determine 1000 surrogate effects. Let's call these dv(per-participant-surrogate).
  • Next, they repeat the following procedure 1000 times:
    • For each participant, randomly select an (unspecified) number of dv(per-participant-surrogate) values and average those. Let's call the result dv(per-participant-surrogate-average).
    • Take the average across participants of this dv(per-participant-surrogate-average). Let's call the result dv(grand-surrogate).

Still with me?

So we end up with 1000 dv(grand-surrogate) values. Based on these, they determine the mean and standard deviation of the surrogate distribution. And then they look at how far into the tail of this distribution dv(grand-observed) is. The resulting p value is a whopping p=.00000000000000000001 (N=12!).
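To check that I've understood the procedure, here is a minimal Python sketch of it as I read it. Several things are my own assumptions and not taken from the paper: the per-participant effect is a difference in condition means, "shuffling" means permuting condition labels within a participant, the unspecified number of surrogates averaged per participant is a hypothetical `k = 10` drawn with replacement, the p value is a one-sided tail of a normal distribution fitted to the surrogates, and the data are simulated.

```python
import math
import random
import statistics

def effect(values, labels):
    """Assumed per-participant effect: mean(condition 1) - mean(condition 0)."""
    a = [v for v, g in zip(values, labels) if g == 1]
    b = [v for v, g in zip(values, labels) if g == 0]
    return statistics.fmean(a) - statistics.fmean(b)

def surrogate_effects(values, labels, n_shuffles, rng):
    """Shuffle the condition labels of one participant and recompute the effect."""
    shuffled = list(labels)
    out = []
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        out.append(effect(values, shuffled))
    return out

def grand_surrogates(per_participant, n_draws, k, rng):
    """Average k randomly drawn surrogates per participant, then average across participants."""
    out = []
    for _ in range(n_draws):
        per_avg = [statistics.fmean(rng.choices(s, k=k)) for s in per_participant]
        out.append(statistics.fmean(per_avg))
    return out

def normal_tail_p(observed, surrogates):
    """One-sided p value from a normal fit (mean, SD) to the surrogate distribution."""
    mu = statistics.fmean(surrogates)
    sd = statistics.stdev(surrogates)
    z = (observed - mu) / sd
    return 0.5 * math.erfc(z / math.sqrt(2)), z

rng = random.Random(0)
n_participants, n_trials = 12, 40
labels = [0, 1] * (n_trials // 2)
# Simulated data with a true effect of 0.5 in condition 1 (purely illustrative).
data = [[rng.gauss(0.5 * g, 1.0) for g in labels] for _ in range(n_participants)]

grand_observed = statistics.fmean(effect(v, labels) for v in data)
per_participant = [surrogate_effects(v, labels, 1000, rng) for v in data]
grand = grand_surrogates(per_participant, 1000, k=10, rng=rng)
p, z = normal_tail_p(grand_observed, grand)
print(f"grand observed = {grand_observed:.3f}, z = {z:.1f}, p = {p:.3g}")
```

One thing this sketch makes easy to try is rerunning `grand_surrogates` with a different `k`: averaging more surrogates per participant narrows the spread of dv(grand-surrogate), so in this sketch the resulting p value depends on a number the paper leaves unspecified.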

I have doubts about two things:

  • Is it valid to do this kind of hierarchical resampling, first at the individual participant level, and then at the across-participant level? The implications of this approach hurt my brain.
  • Is it valid to use distribution parameters to estimate p values when the observed value is so far out into the tail? Shouldn't you rather just look at the percentage of surrogate values that are smaller/larger than the observed value? (And not bother with the mean and standard deviation of the distribution?)
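To illustrate the second doubt, here is a small Python sketch contrasting the two ways of getting a p value from the same surrogates. It is only an illustration under my own assumptions: the surrogate values are simulated standard-normal draws standing in for dv(grand-surrogate), the observed value of 9.0 is hypothetical, and the test is one-sided. The `+1` in the empirical version is the common convention of counting the observed value itself, so with 1000 surrogates the empirical p value can never drop below 1/1001.

```python
import math
import random
import statistics

def empirical_p(observed, surrogates):
    """One-sided p value: share of surrogates at least as large as the observed value.
    The +1 counts the observed value itself, so the result is never exactly 0."""
    n_extreme = sum(s >= observed for s in surrogates)
    return (n_extreme + 1) / (len(surrogates) + 1)

def normal_fit_p(observed, surrogates):
    """One-sided p value from a normal distribution fitted to the surrogates."""
    mu = statistics.fmean(surrogates)
    sd = statistics.stdev(surrogates)
    return 0.5 * math.erfc((observed - mu) / (sd * math.sqrt(2)))

rng = random.Random(1)
surrogates = [rng.gauss(0.0, 1.0) for _ in range(1000)]  # stand-in null distribution
observed = 9.0  # hypothetical value far beyond every surrogate

print(empirical_p(observed, surrogates))   # 1/1001, the smallest value 1000 surrogates allow
print(normal_fit_p(observed, surrogates))  # many orders of magnitude smaller
```

The normal-fit value extrapolates far beyond anything the 1000 surrogates actually sampled, so it rests entirely on the assumption that the tail is Gaussian out there.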

I would be grateful if anyone could shed some light on this!


