Nonparametric Friedman test - repeated measures and repeated subjects
Hi all,
I would appreciate some advice on the most appropriate nonparametric test to evaluate the impact on some measures when subsampling my original datasets.
Number of subjects: 9
Number of original datasets: 14. Some subjects have multiple sessions (3 subjects have 2 sessions and 1 subject has 3 sessions); by session I mean that the data was acquired on a different day. I don't have matching sessions for every subject because I am still acquiring the data.
For each of the 14 original datasets I generated 6 subsets (84 subsets in total). I want to evaluate the impact of subsampling my original datasets on 6 different features that I extract from the corresponding histograms. I used the nonparametric Friedman test to evaluate the effect of subsampling on each histogram metric. However, I am concerned that my data contain repeated sessions from the same subject. Is there a way to account for that?
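A minimal sketch of this kind of setup, assuming Python with NumPy and SciPy (the software used for the original analysis is not stated) and synthetic values in place of the real histogram features. The array `feature` and the labels in `subject_ids` are hypothetical. It runs the Friedman test with every dataset as its own block, which is where the concern comes from: some of the 14 blocks belong to the same subject.

```python
import numpy as np
from scipy.stats import friedmanchisquare

rng = np.random.default_rng(0)

# Synthetic stand-in for ONE histogram feature:
# 14 datasets (rows, i.e. blocks) x 6 subsets (columns, i.e. conditions).
n_datasets, n_subsets = 14, 6
feature = rng.normal(loc=1.0, scale=0.1, size=(n_datasets, n_subsets))

# friedmanchisquare expects one sequence per condition, so pass the columns.
stat, p_value = friedmanchisquare(*feature.T)
print(f"Friedman chi-square = {stat:.3f}, p = {p_value:.3f}")

# Hypothetical subject label for each of the 14 rows (9 subjects in total):
# 3 subjects with 2 sessions, 1 subject with 3 sessions, 5 with 1 session.
subject_ids = np.array([0, 0, 1, 1, 2, 2, 3, 3, 3, 4, 5, 6, 7, 8])
n_repeated_rows = n_datasets - np.unique(subject_ids).size
print(f"{n_repeated_rows} rows are repeat sessions of an already-included subject")
```

The test itself has no notion of `subject_ids`, so the repeat sessions enter as if they were independent blocks.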
Thank you in advance for your time!
Comments
Hi ARF,
I do not know why you would want to subsample, but I will take this as a given. One issue is whether the subsamples are partially overlapping, as you'd get from a standard bootstrap approach. If so, this overlap needs to be taken into account, complicating matters. If you just split the original data into nonoverlapping subsets then each subset gives you a feature. For simplicity you could focus on the point-estimate and then analyze the results with a repeated measures ANOVA/linear mixed model?
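As a rough illustration of the linear mixed model option, here is a sketch assuming Python with pandas and statsmodels, and synthetic data in place of the real feature values. The column names (`subject`, `dataset`, `subset`, `value`) and the simulated effect sizes are made up for the example. A random intercept per subject lets repeated sessions of the same subject share a common baseline.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)

# Hypothetical design: 14 datasets from 9 subjects, 6 subsets per dataset.
subject_of_dataset = [0, 0, 1, 1, 2, 2, 3, 3, 3, 4, 5, 6, 7, 8]
subject_effect = rng.normal(0.0, 0.5, size=9)  # simulated between-subject spread

rows = []
for dataset, subject in enumerate(subject_of_dataset):
    for subset in range(6):
        # Simulated feature value: baseline + subject shift + small subset trend + noise.
        value = 1.0 + subject_effect[subject] + 0.02 * subset + rng.normal(0.0, 0.1)
        rows.append({"subject": subject, "dataset": dataset,
                     "subset": subset, "value": value})
data = pd.DataFrame(rows)

# Fixed effect of subset, random intercept for each subject.
model = smf.mixedlm("value ~ C(subset)", data, groups=data["subject"])
result = model.fit()
print(result.summary())
```

With so few subjects the variance estimates will be imprecise, so this is a starting point rather than a recommendation. If session-to-session variation matters, the `vc_formula` argument of `mixedlm` is one way to add a variance component for sessions nested within subjects.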
Cheers,
E.J.
Hi E.J.,
Sorry, I missed the notification of your reply. I really appreciate you taking the time to answer my question! :)
I am interested in subsampling because acquiring the full dataset takes 15 minutes. I am therefore testing how selecting fewer data points would affect the histogram features that I want to extract from that subset in a later step.
I took that detail into consideration, so my subsets do not overlap.
But should I use repeated measures ANOVA with such a small sample (total N) and unbalanced samples in each session?
Maybe I need to add 2 within-subject factors: subset (= full; subset1; subset2) and session (= session1; session2; session3; session4, to account for repetitions of the same subject)?
Sorry, I feel that I lack experience in applied biostatistics and that I am always violating some assumption by using parametric testing.
Thank you!
Hello ARF,
Well, this is a somewhat unusual situation and I'm not sure! Low N is never helpful, but you seem to have many observations per subject... you might want to plot the effects of interest per subject, perhaps.
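One way such a per-subject plot could look, sketched in Python with NumPy and matplotlib on synthetic values. The array `feature`, the labels in `subject_ids`, and the helper `plot_per_subject` are all hypothetical. Each panel is one subject and each line is one session, so repeat sessions of the same subject can be compared directly.

```python
import math

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove for an interactive window
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical data: 14 datasets x 6 subsets for one feature, 9 subjects.
subject_ids = np.array([0, 0, 1, 1, 2, 2, 3, 3, 3, 4, 5, 6, 7, 8])
feature = rng.normal(loc=1.0, scale=0.1, size=(subject_ids.size, 6))

def plot_per_subject(values, ids, ncols=3):
    """Draw one panel per subject with one line per session."""
    unique_ids = np.unique(ids)
    nrows = math.ceil(unique_ids.size / ncols)
    fig, axes = plt.subplots(nrows, ncols, sharex=True, sharey=True,
                             squeeze=False, figsize=(8, 6))
    x = np.arange(1, values.shape[1] + 1)  # subset index 1..6
    for ax, subject in zip(axes.ravel(), unique_ids):
        for session, row in enumerate(values[ids == subject], start=1):
            ax.plot(x, row, marker="o", label=f"session {session}")
        ax.set_title(f"subject {subject}")
        ax.set_xlabel("subset")
        ax.set_ylabel("feature value")
        ax.legend(fontsize="small")
    fig.tight_layout()
    return fig, axes

fig, axes = plot_per_subject(feature, subject_ids)
# Call fig.savefig(...) or plt.show() to view the result.
```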
Cheers,
E.J.
Hi EJ,
Thanks a lot for your feedback! :)
Cheers,
ARF