Joining files from two different experiments

quartz_pool · July 2019

Hi Pascal,

I've been enjoying my work with your mousetrap package a lot! At the moment, I'm kind of trying to go along with your code & logic from your recent paper "Design factors in mouse-tracking: What makes a difference?", which is all kinds of interesting. I'm learning a lot.

I have run into a problem that I'm just too dense to solve, though:

I absolutely don't get how to join the .csv files from two different experiments in a manner that allows me to perform the comparisons you go through in your supplementary "exp2_exp1_comparison.Rmd". Among other things, I tried starting off by changing the "subject_id" so that they are consecutive (non-duplicated) across both experiments, but nothing seems to work.

Maybe this rings a bell with you and here's to hoping that you might have some pointers for me!

quartz_pool

eduard · July 2019

Hi,

I don't know what format the datasets for the mouse-tracking package have, but if they are "regular" R data frames, you can follow the description here:

https://stackoverflow.com/questions/8169323/r-concatenate-two-dataframes

So, for example: rbind(dataset1, dataset2)

Just make sure that in each dataset there is an identifier for that set, e.g. a variable experiment that is exp1 for the first set and exp2 for the second one.
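A minimal sketch of that idea, using two toy data frames (the column names subject_nr and rt are made up for illustration and are not taken from the real files):

```r
# Toy data frames standing in for the two experiments (hypothetical columns)
dataset1 <- data.frame(subject_nr = c(1, 2), rt = c(512, 634))
dataset2 <- data.frame(subject_nr = c(1, 2), rt = c(701, 455))

# Tag each set before stacking so the origin of every row stays recoverable
dataset1$experiment <- "exp1"
dataset2$experiment <- "exp2"

# rbind() requires both data frames to have the same column names
combined <- rbind(dataset1, dataset2)

# Count the rows contributed by each experiment
table(combined$experiment)
```

Note that subject_nr 1 and 2 now appear in both experiments, so the experiment column is what keeps those participants apart.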

Does that make sense?

Eduard

Pascal · July 2019

Hi,

glad to hear that you enjoy working with the mousetrap package. :)

Regarding the paper you mentioned: yes, you are looking in the right place (for anyone else interested: the R markdown file can be found on OSF at https://osf.io/s4wy5/).

Within the file, the relevant code is this:

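library(dplyr)  # provides bind_rows()

# Experiment 1: label the rows with their study and keep only the "click" group
raw_data1 <- read.csv("../data/exp1.csv", stringsAsFactors = FALSE)
raw_data1$study <- "study1"
raw_data1 <- subset(raw_data1, group == "click")

# Experiment 2: label the rows with their study and keep only the "default" group
raw_data2 <- read.csv("../data/exp2.csv", stringsAsFactors = FALSE)
raw_data2$study <- "study2"
raw_data2 <- subset(raw_data2, group == "default")

# Stack both datasets; columns are matched by name
raw_data <- bind_rows(raw_data1, raw_data2)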

As Eduard suggested, I created an identifier variable there (called study) that codes which experiment each row comes from.

To combine the datasets, I did not use the rbind function but instead the bind_rows function from the dplyr package, as it also allows you to combine datasets where the column order differs (and where one dataset may even contain columns that don't exist in the other dataset). However, for the combined dataset to be useful, you have to make sure that columns containing the same data have the same name in both datasets.
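To illustrate the column-name point with toy data (exp1, exp2 and the columns rt, RT and condition are hypothetical, not from the actual experiment files), here is a rough sketch of what happens when the same measure is named differently in the two datasets:

```r
library(dplyr)

# Hypothetical data: the same measure is called "rt" in one file and "RT" in the other
exp1 <- data.frame(subject_nr = 1:2, rt = c(512, 634), stringsAsFactors = FALSE)
exp2 <- data.frame(RT = c(701, 455), subject_nr = 1:2,
                   condition = c("a", "b"), stringsAsFactors = FALSE)

# bind_rows() matches columns by name and fills missing ones with NA,
# so "rt" and "RT" end up as two separate, half-empty columns
naive <- bind_rows(exp1, exp2)

# Harmonise the name first, then bind again
names(exp2)[names(exp2) == "RT"] <- "rt"
fixed <- bind_rows(exp1, exp2)

# "condition" only exists in exp2, so it is NA for the exp1 rows
fixed
```

The differing column order in exp2 does not matter here; only the names do.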

Regarding the subject identifier: yes, you should make sure that there are no overlapping identifiers across the datasets. One easy way to do this (if the identifier does not have to be a number) could be to do the following in the combined dataset:

raw_data$subject_nr_combined <- paste(raw_data$study,raw_data$subject_nr,sep="_")

But there are many different ways of doing this.
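For instance, if a consecutive numeric identifier is needed, one possible variant (just a sketch on toy data; the subject_id column name is my own choice here) is to number the unique combined labels in order of appearance:

```r
# Toy combined dataset with overlapping subject numbers across studies
raw_data <- data.frame(
  study = c("study1", "study1", "study1", "study2", "study2"),
  subject_nr = c(1, 1, 2, 1, 2),
  stringsAsFactors = FALSE
)

# Text identifier as above: unique across studies
raw_data$subject_nr_combined <- paste(raw_data$study, raw_data$subject_nr, sep = "_")

# Numeric alternative: position of each label among the unique labels,
# which gives consecutive integers starting at 1
raw_data$subject_id <- match(raw_data$subject_nr_combined,
                             unique(raw_data$subject_nr_combined))

raw_data
```

Either column can then serve as the subject identifier in later analyses, as long as the same one is used throughout.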

Hope this helps - feel free to ask if you have more questions.

Best,

Pascal

quartz_pool · July 2019

Hi eduard, hi Pascal,

another heartfelt thank you to you both for taking the time to help me out, especially with this low-level kind of stuff.

Both of your solutions work a treat, and while I can say that I was kind of on the right track, it's always incredible how the simplest things can stump a newbie.

Best,

quartz_pool
