Am I violating assumptions by using this test?
This is my first time using JASP and Bayesian statistics, so please excuse my lack of knowledge. I am trying to figure out whether method A or method B is faster for finding a certain target in a scene. If one of the two methods is indeed faster, I also want to see whether that advantage (A faster than B, or vice versa) grows with the rarity of the target presented (there are 100 possible targets in my data, and each has a different probability of showing up in any given scene a person encounters). In other words, I want to figure out whether the methods differ, and whether there is an interaction with target rarity.
There are many limitations in the data I have collected, though. Participants are free to repeat the search task as many times as they want (practice effects are not a concern). Thus, some participants have only a single trial (a trial being a single search for a target in a scene), and others may have hundreds. Participants are also free to use either method A or B as they choose, so some people have only ever used one of the two methods. Those who use both tend to prefer method B far more often than method A, so I have different amounts of data for each method. And of course, not every participant has encountered every possible target (as some occur very rarely). The one good thing is that I have data from hundreds of thousands of participants. Should I be removing participants who have only used one of the two methods, and is it a problem that participants have contributed unequal amounts of data for each method? (Just as an example, participant X may have 20 total trials, 19 of which used method B, and they encountered 10 of the 100 possible targets.) Given all the uncertainty in the data, I was told to try using Bayesian methods.
Can someone point out what analysis I should be using to investigate these issues without falling into traps such as unequal variance and missing data? Is Bayesian Correlation Pairs a statistically sound way to investigate method A vs. method B? And how do I also use JASP to investigate an interaction?
Thanks very much for any help.
Comments
Hi Paul,
This sounds like a fantastic data set. In, say, a Bayesian ANOVA it is not an issue if you have more participants in condition A than in condition B, but when we take the mean for each participant it is an issue that some means are based on more data (and hence are more reliable) than others. I think this analysis is best done using a Bayesian framework (of course :-)) but it seems to me that a flexible program like JAGS or Stan is better suited to the job. This would probably require you to contact and collaborate with a Bayesian statistician or methodologist. Another piece of advice: do not spoil this data set by looking at it and constructing hypotheses on the basis of all of the data. You have hundreds of thousands of participants, and I urge you to split the data set in two: the first half you just analyze as you would normally, and then you still have the second half as a hold-out sample for a confirmatory check.
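For what it's worth, the split into an exploratory half and a hold-out half could look roughly like the sketch below. It assumes the trials are stored as records with a participant ID; the field names (`participant`, `method`, `rt`), the salt string, and the toy data are all made up for illustration. It also assumes you want to split by participant rather than by trial, so that nobody's data ends up in both halves.

```python
import hashlib


def assign_half(participant_id: str, salt: str = "holdout-v1") -> str:
    """Deterministically assign a participant to one of two halves.

    Hashing the ID (plus a fixed, hypothetical salt) gives the same answer
    every time, so the split can be reproduced without storing a lookup table.
    """
    digest = hashlib.sha256(f"{salt}:{participant_id}".encode("utf-8")).digest()
    return "exploratory" if digest[0] % 2 == 0 else "holdout"


def split_trials(trials):
    """Route every trial to the half its participant belongs to."""
    halves = {"exploratory": [], "holdout": []}
    for trial in trials:
        halves[assign_half(trial["participant"])].append(trial)
    return halves


# Toy data with hypothetical field names; real data would come from your file.
trials = [
    {"participant": "p1", "method": "A", "rt": 2.1},
    {"participant": "p1", "method": "B", "rt": 1.7},
    {"participant": "p2", "method": "B", "rt": 1.9},
    {"participant": "p3", "method": "A", "rt": 2.4},
    {"participant": "p3", "method": "B", "rt": 2.0},
    {"participant": "p4", "method": "B", "rt": 1.5},
]

halves = split_trials(trials)
for name, rows in halves.items():
    print(name, len(rows), "trials")
```

With only a handful of participants the two halves will not be equal in size, but with hundreds of thousands they should come out close to 50/50. The important part is to leave the hold-out half untouched until the hypotheses are fixed.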
Cheers,
E.J.
That does sound pretty awesome. Is this data from Airport Scanner by any chance? I saw a talk about that game a while ago (at VSS maybe, I'm not sure).
@EJ Thanks for the information! It seems I really should take it upon myself to learn Bayesian statistics in depth, or find someone who knows more than I do and would like to collaborate.
@sebastiaan Wow, it is actually. Nice guess!