Transferring data from a participant to JATOS server
I have a question about JATOS and data transfer. I might have missed the information somewhere else, but I am not really sure what I’m looking at. I am currently planning to run a self-paced reading experiment using the script from the end of this thread (https://forum.cogsci.nl/discussion/1235/solved-self-paced-reading-task). Due to the nature of the task, the dataset per participant is quite large (about 9.2 MB in JSON format). I have figured out the 5 MB limit on result data per study run and have increased it (by following these instructions: https://forum.cogsci.nl/discussion/6390/file-sizes-too-large).
However, the data transfer is still unusually slow for 10 MB. I have tested it at several upload speeds (4 Mbps: 68 s, 19 Mbps: 32 s, 93 Mbps: 21 s). Of course the time varied with the speed, but for practical purposes 20 seconds on a high-speed connection is not ideal, since I assume most of my participants will have connections that need up to a minute for the transfer. In that case, I cannot rely on them to leave the window open and wait for the transfer to finish. It also increases the risk of the connection being interrupted and data being lost. I am running this on a DigitalOcean droplet (8 GB / 4 AMD CPUs; 160 GB NVMe SSDs; 5 TB transfer).
Finally, my question is: is the slow data transfer due to some other restriction on JATOS I can’t manage to detect, or is it simply due to the connection and the size of the data? Do you have some advice on how to handle this issue? I am attaching the OS study files and some of the JSON data I obtained during this testing through this link: https://drive.google.com/drive/folders/1gITLlTmEEIavSgJ_b8Ya9jxUYB4CG_dE?usp=sharing
Thanks in advance!
Comments
10 MB is pretty large for result data (even for OpenSesame ones). I recommend you try one more time to reduce the size. I had a look at your example result file and saw many repeating fields.
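To illustrate what I mean by repeating fields, here is a minimal Python sketch that takes a list of per-trial records and separates the fields that never change from the ones that do. The field names and values are made up for illustration and are not taken from your file, and the function name split_constant_fields is my own. It works on data that has already been collected, so it only shows how much redundancy there is; inside the experiment itself the equivalent step would be to log fewer variables.

```python
import json


def split_constant_fields(trials):
    """Split a list of per-trial dicts into (constants, slim_trials).

    constants:   fields that are present with the same value in every trial.
    slim_trials: the trials with those constant fields removed.
    """
    if not trials:
        return {}, []
    first = trials[0]
    constants = {
        key: value
        for key, value in first.items()
        if all(key in t and t[key] == value for t in trials)
    }
    slim_trials = [
        {k: v for k, v in t.items() if k not in constants} for t in trials
    ]
    return constants, slim_trials


# Hypothetical example data; the field names are assumptions for illustration.
trials = [
    {"subject_nr": 7, "experiment_title": "spr", "word": "The", "rt": 412},
    {"subject_nr": 7, "experiment_title": "spr", "word": "dog", "rt": 388},
    {"subject_nr": 7, "experiment_title": "spr", "word": "barked", "rt": 455},
]

constants, slim_trials = split_constant_fields(trials)

# Compare the serialized size before and after storing constants only once.
before = len(json.dumps(trials))
after = len(json.dumps({"constants": constants, "trials": slim_trials}))
print(f"{before} -> {after} characters")
```

With thousands of trials and dozens of constant fields per trial, storing the constants once instead of in every record can shrink the file considerably.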
But to answer your question: No, JATOS has no other restriction that limits the data transfer, and there is no throttling built in. JATOS tries to handle the incoming result data as fast as possible. I sometimes run stress tests with JATOS using, for example, 10 MB of result data, and JATOS handles them pretty fast (definitely faster than the 21 s you measured at 93 Mbps upload speed). I guess something else is slowing down your data. I'm curious: how long does it take on a local JATOS, where no Internet connection slows you down?
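As a rough sanity check, the measured times can be compared with the bandwidth-only lower bound (size in bits divided by upload speed). This is a minimal sketch that assumes 1 MB = 10^6 bytes and ignores protocol overhead and latency; the helper name min_transfer_seconds is just for illustration, and the sizes and timings are the ones from the original post.

```python
def min_transfer_seconds(size_megabytes, speed_megabits_per_s):
    """Bandwidth-only lower bound: size in megabits divided by link speed."""
    return size_megabytes * 8 / speed_megabits_per_s


size_mb = 9.2  # result size reported in the original post
measured = {4: 68, 19: 32, 93: 21}  # upload speed (Mbps) -> measured seconds

for speed, seconds in measured.items():
    ideal = min_transfer_seconds(size_mb, speed)
    print(f"{speed:>3} Mbps: ideal {ideal:5.1f} s, measured {seconds} s")
```

Under these assumptions the bandwidth alone would account for less than a second of the 21 s at 93 Mbps, which is why I suspect the time is being spent somewhere other than the network.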
One thing you can try is turning off the study logs. JATOS calculates a hash of each incoming result data and keeps this hash in the study log. If you have large result data, calculating this hash uses some CPU and can take some time. You can put
jatos.studyLogs.enabled = false
in your production.conf to turn off the study logs.
Best,
Kristian
I had another idea: are you using MySQL with your JATOS? If so, I'd strongly recommend turning off the binary log (https://www.jatos.org/JATOS-with-MySQL.html#optional-deactivate-mysqls-binary-log).
Kristian
Thank you very much for the quick response. In the meantime, I significantly reduced the number of logged variables, which brought the data down to 1 MB per participant. I have not tested this extensively yet, but even on the smallest possible droplet on DigitalOcean it transferred very quickly, barely showing the data transfer screen. When I tested the original file (logging all variables) on a local JATOS, it took 18 seconds to send the data. This seems to fit your hypothesis that something else is slowing down the data, since it is not a matter of the connection.
Also, thank you for the suggestions about the study logs and the MySQL binary log. I tried disabling the study logs, and it did not seem to help much, although I tested this on a smaller droplet than the one mentioned in the original post. Maybe it really is an issue with the data itself.
Best,
Ksenija