
Avoid race conditions when working with Batch Session Data

Hi,

JATOS' batch session data seems like a very powerful way to implement things like counterbalancing. In the case of counterbalancing, you could start by defining a list of (say) 30 conditions in the batch session data, and then at the start of each experimental session get the available conditions from the server, select one, remove it from the list, and update the available conditions on the server (as described here).

This seems to work quite well but I'm afraid that it will run into race conditions when recruiting many participants through a service such as Prolific, because they all start around the same time. And in that case it can happen that two near-simultaneous sessions get the same list of available conditions, each remove one, and then when the available conditions are updated on the server, the second session actually overwrites the update of the first session. I hope that makes sense.

So my question is: is there a straightforward way to avoid race conditions like this when working with batch session data?

— Sebastiaan

Comments

  • krikri
    edited April 2021

    Hi Sebastiaan,

    The batch session (like the group session) handles concurrent data updates and therefore already avoids race conditions. It was actually the most difficult part of the batch session to implement. And since I'm by no means an expert in this field, I used a simple system based on optimistic concurrency control with versioning (each batch session has a version stored together with it) together with database transactions.

    You can watch it in action if you click in Chrome's Dev Tools -> Network -> request 'open' -> Messages.

    Each session state has a version, and each session update (a 'patch') has to wait for a 'SESSION_ACK' from the server to be acknowledged. On the server side, the version is compared to the one stored in the database, and the update is executed only if they are equal. All others (race condition) return a 'SESSION_FAIL'. This is why it's especially important when dealing with the batch session (or group session) to handle failed calls: all writing session functions return a Promise that can be used to catch fails, e.g.:

    jatos.batchSession.set("b", "koala")
       .then(() => console.log("Batch Session was successfully updated"))
       .catch(() => console.log("Batch Session synchronization failed"));
    

    Btw. the versioning can be switched off. If you are sure that your updates are conflict-free, this can be beneficial since versioning slows down updating the batch session.

    I tried to come up with a simple example to show a race condition happening with versioning turned off and then not happening anymore with versioning turned on - but I couldn't find one :(.

    I hope this all makes sense to you.

    Kristian

  • Hi @kri ,

    Thanks for your reply. So if I understand correctly, you're saying batch-session-data operations are themselves protected from race conditions. Right? I assumed as much, but it's good to see this confirmed.

    To clarify: I'm talking about race conditions that result from multiple clients interfering with each other by sending multiple batch-session-data operations. Let's say that there are three conditions available on the server, and each client needs to get one and remove it from the list of available conditions. Then the following might (but should not) happen:

    client 1: gets conditions from server (1, 2, 3) and randomly selects 1
    client 2: gets conditions from server (1, 2, 3) and randomly selects 2
    client 1: updates conditions on server to (2, 3)
    client 2: updates conditions on server to (1, 3)  # overwriting the previous update
    

    So in the end the server thinks that conditions 1 and 3 are still available, even though only condition 3 should still be available.
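For illustration, the interleaving above can be reproduced with a plain shared object. This is only a sketch of "last write wins" in a store without any version check; the `server` object and `planUpdate` helper are made up for the example and do not model how JATOS itself behaves.

```javascript
// Simplified in-memory stand-in for a shared store WITHOUT any version check.
// This is not JATOS code; it only models "last write wins".
const server = { conditions: [1, 2, 3] };

// Each client works on its own snapshot and computes the remaining conditions.
function planUpdate(snapshot, chosen) {
  return snapshot.filter((c) => c !== chosen);
}

const snapshot1 = [...server.conditions]; // client 1 reads (1, 2, 3)
const snapshot2 = [...server.conditions]; // client 2 reads (1, 2, 3)

server.conditions = planUpdate(snapshot1, 1); // client 1 writes (2, 3)
server.conditions = planUpdate(snapshot2, 2); // client 2 writes (1, 3)

// Condition 1 is listed again although client 1 already took it.
console.log(server.conditions); // [ 1, 3 ]
```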

    Do you see what I mean? This can happen, right? It's a race condition of sorts, but at a higher level than what you're talking about (if I understand correctly). How would you deal with this?

    — Sebastiaan

  • krikri
    edited March 2021

    Your example is covered by the built-in race condition protection. In more detail, it looks like this:

    1) client 1: gets conditions from server (1, 2, 3) and randomly selects 1
    2) client 2: gets conditions from server (1, 2, 3) and randomly selects 2
    3) client 1: sends updated conditions (2, 3) to server
    4) client 2: sends updated conditions (1, 3) to server
    5) server: updates conditions to (2, 3) because they came first
    6) server: rejects conditions (1, 3) because their version is not up-to-date
    7) client 1: receives SESSION_ACK
    8) client 2: receives SESSION_FAIL, gets the updated condition and can now make a new update
    

    Client 2 gets a SESSION_FAIL and should then react: do the random selection again and send the updated conditions again.
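The eight steps above can be sketched with a small in-memory model. This is not the JATOS implementation, just a hypothetical `VersionedSession` class that applies a write only when the client's version matches the stored one. To keep the run deterministic, client 2 picks the first remaining condition on retry instead of a random one.

```javascript
// Hypothetical in-memory model of a versioned session (not the JATOS implementation).
class VersionedSession {
  constructor(conditions) {
    this.version = 0;
    this.conditions = conditions;
  }
  // A client reads the data together with the version it is based on.
  read() {
    return { version: this.version, conditions: [...this.conditions] };
  }
  // The update is applied only if the client's version matches the stored one.
  write(baseVersion, conditions) {
    if (baseVersion !== this.version) return "SESSION_FAIL";
    this.conditions = conditions;
    this.version += 1;
    return "SESSION_ACK";
  }
}

const session = new VersionedSession([1, 2, 3]);
const client1 = session.read(); // (1, 2, 3), selects 1
const client2 = session.read(); // (1, 2, 3), selects 2

const result1 = session.write(client1.version, client1.conditions.filter((c) => c !== 1));
let result2 = session.write(client2.version, client2.conditions.filter((c) => c !== 2));
const firstAttempt2 = result2; // rejected: client 2's version is out of date

// Client 2 reacts to the fail: read fresh data, choose again, resend.
let chosen2 = 2;
if (result2 === "SESSION_FAIL") {
  const fresh = session.read();
  chosen2 = fresh.conditions[0]; // pick from what is still available
  result2 = session.write(fresh.version, fresh.conditions.filter((c) => c !== chosen2));
}

console.log(result1, firstAttempt2, result2, session.conditions);
```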

    But there is a chance for a race condition if one isn't careful while programming the experiment. Here is an example where this happens. If client 2 waits for some time before it updates the session, and in the meantime client 1's update is completely done (with SESSION_ACK), then the result is wrong data in the server's database.

    1) client 1: gets conditions from server (1, 2, 3) and randomly selects 1
    2) client 2: gets conditions from server (1, 2, 3) and randomly selects 2
    3) client 1: sends updated conditions (2, 3) to server
    4) server: updates conditions (2, 3) and sends them to all clients
    5) client 2: receives the updated conditions (2, 3)
    6) client 2: sends updated conditions (1, 3)
    

    The problem between 5) and 6): client 2 is working with stale data. It already received the updated conditions but continues to work with the old ones.

    A fix to this problem is to always check if you work on the latest data before you send any updates. jatos.js helps here with jatos.onBatchSession. The callback function gets called whenever the batch session gets updated (from other clients).

    1) client 1: gets conditions from server (1, 2, 3) and randomly selects 1
    2) client 2: gets conditions from server (1, 2, 3) and randomly selects 2
    3) client 1: sends updated conditions (2, 3) to server
    4) server: updates conditions (2, 3) and sends them to all clients
    5) client 2: receives updated conditions (2, 3)
    6) client 2: notices (via jatos.onBatchSession) that the conditions changed
    7) client 2: randomly selects 2 
    8) client 2: sends updated conditions (3) to server
    9) server: updates conditions (3) and sends them to all clients
    

    There is still a very small possibility for a race condition: between 7) and 8) a third client could update the conditions. JATOS and jatos.js don't handle this one yet. One could add a locking mechanism or let users handle the versioning themselves, but this would slow down the session or make it difficult to use.
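As a rough sketch of the jatos.onBatchSession idea: the helper below re-checks the current choice every time the batch session changes. The function name `trackConditions`, the session key "conditions" and the `pickFrom` parameter are assumptions for this example. `jatos` is passed in as an argument so the function can be tried with a stand-in object; inside a study you would pass the global `jatos` from jatos.js. It does not close the small remaining window described above.

```javascript
// Sketch: re-select whenever the batch session changes before we have committed.
// The key "conditions" and the helper name are assumptions for this example.
function trackConditions(jatos, pickFrom) {
  const state = { choice: null };

  function reselect() {
    // Always read the fresh list instead of reusing an older copy
    const available = jatos.batchSession.get("conditions") || [];
    // Keep the current choice only if it is still available
    if (!available.includes(state.choice)) {
      state.choice = available.length > 0 ? pickFrom(available) : null;
    }
  }

  reselect();
  // The callback is called whenever the batch session is updated;
  // any arguments jatos.js passes to it are ignored here.
  jatos.onBatchSession(reselect);
  return state;
}
```

With this, the update that is finally sent is based on `state.choice` as it stands right before sending, not on the choice made when the study started.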

    Kristian

    PS: There is a StackOverflow question discussing this.

  • Hi both,

    this is obviously an interesting feature to add and it’s nice to have a very robust tool that works without any hiccups. But I just want to add a very practical perspective:

    It seems to me that the number of times that this is an actual problem is really low. And not only because this is only relevant when two people access the study almost simultaneously, but also for less interesting reasons. We’re going through great pains to ensure balanced numbers of subjects, but the truth is that in online studies many people won’t even finish the study. Or data quality will be bad so you’ll have to exclude them anyway. And none of the checks that you’re talking about here can be made to depend on (the highly idiosyncratic) data quality.

    Or, if neither of these happen, then in the absolute worst case, we’ll end up having to pay an extra couple of participants.

    I don’t want to suggest we settle for a sloppy solution, but it does seem to me like we’re killing a mosquito with a bazooka.

    Best

    Elisa

  • (Unless, that is, there’s a situation that I haven’t thought about where this really is crucial)

  • @elisa

    But I just want to add a very practical perspective: (...) in the absolute worst case, we’ll end up having to pay an extra couple of participants. I don’t want to suggest we settle for a sloppy solution, but it does seem to me like we’re killing a mosquito with a bazooka.

    I like the pragmatic way you think ;-)

    In any case it seems that @kri already implemented most of the bazooka, so this wouldn't be something that would have to be implemented in JATOS. I'm just trying to wrap my head around how the batch-session data feature works so that we can use it properly in OSWeb.

  • krikri
    edited April 2021

    I'm trying to come up with some best practices for dealing with the batch/group session and avoiding data conflicts / race conditions. So far I have:

    • Always catch fails. A fail can happen if there is some kind of server or network problem, but also if there are conflicting data. One has to handle this: get the fresh session, recalculate whatever you were doing, and retry the session update. E.g.:
    function updateBatchSession() {
        var a = calculateA();
        jatos.batchSession.set("a", a)
            .then(() => console.log("Batch Session was successfully updated"))
            .catch(() => {
                console.log("Batch Session update failed");
                setTimeout(() => updateBatchSession(), 1000); // retry with a 1 sec delay
            });
    }
    
    • Always work with fresh session data (or avoid stale session data). Whenever you want to use something from the session get it shortly before using it.
    var a = jatos.batchSession.get("a");
    doSomethingWithA(a); // don't do other operations between getting 'a' and using 'a'
    
    • Update as little as possible in the session. Avoid using jatos.batchSession.setAll or jatos.groupSession.setAll. If you write only to specific fields in the session, chances are lower that you overwrite something accidentally. E.g. if you want to add an element to an array, don't overwrite the whole array; use jatos.batchSession.add("/array/-", "newElem") to add it to the end.
    • If participants' data are independent, put them in different objects in the session. Avoid updating the same object from different participants if it's not necessary. This way you won't have conflicting updates. Also turn off versioning (see the last point).
    // participant A writes into 'a'
    jatos.batchSession.set("a", "some data");
    
    
    // participant B writes into 'b'
    jatos.batchSession.set("b", "some other data");
    
    • If you are sure your session updates are non-conflicting / will never lead to a race condition, turn off versioning with jatos.batchSessionVersioning = false; or jatos.groupSessionVersioning = false; . This speeds up session updates.
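To illustrate the "update as little as possible" point, here is a small sketch with a hypothetical in-memory model that applies patches in arrival order without any version check. It is not jatos.js; `applyPatch` is made up for the example, and only the "/array/-" path (append to the end) follows the JSON Patch convention used above.

```javascript
// Hypothetical in-memory model: patches are applied in arrival order, no version check.
// Only "replace" on a top-level key and "add" to the end of an array are handled.
function applyPatch(doc, patch) {
  const [, key, index] = patch.path.split("/");
  if (patch.op === "replace") {
    doc[key] = patch.value;
  } else if (patch.op === "add" && index === "-") {
    doc[key].push(patch.value);
  }
  return doc;
}

// Two clients each want to append one element, both starting from the same snapshot.
const snapshot = ["x"];

// Variant 1: each client overwrites the whole array, so the second write wins.
const overwritten = { array: [...snapshot] };
applyPatch(overwritten, { op: "replace", path: "/array", value: [...snapshot, "fromA"] });
applyPatch(overwritten, { op: "replace", path: "/array", value: [...snapshot, "fromB"] });

// Variant 2: each client sends only its own element, so both survive.
const appended = { array: [...snapshot] };
applyPatch(appended, { op: "add", path: "/array/-", value: "fromA" });
applyPatch(appended, { op: "add", path: "/array/-", value: "fromB" });

console.log(overwritten.array); // [ 'x', 'fromB' ]
console.log(appended.array);    // [ 'x', 'fromA', 'fromB' ]
```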


  • Hi @kri ,

    Thanks for this explanation. I think I understand it better now. It's more complex (but also more elegant) than I initially thought.

    I attached an experiment that, at the start of the session gets the first element from the pending array. This element serves as the condition for the experiment. The condition is then removed from the pending array and appended to the started array. (The move() function doesn't work for array elements, right?) Then, at the end of the experiment, the condition is appended to the finished array.

    If at the start of the experiment the pending array is empty, then the participant gets a message that he or she can no longer participate in the experiment.

    From the point of view of the researcher, the Batch Session Data needs to be populated beforehand with something like the json data below, where the pending array should be changed so that it contains all the conditions that will be executed one after another. So here, participant 1 will do condition a, participant 2 will do condition b, etc.

    If a participant starts a session but doesn't finish it, then the researcher has to manually move the condition from the started array back to the pending array. I suppose this could be automated too somehow, but it strikes me as unnecessarily complicated.

    {
        "pending": ["a", "b", "a", "b"],
        "started": [],
        "finished": []
    }
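The bookkeeping on this structure can be sketched with two small functions that work on a plain object. The function names are hypothetical, and in a real study the changes would be sent through jatos.batchSession calls rather than applied to a local object.

```javascript
// Pure helpers on a plain object shaped like the JSON above.
// Function names are hypothetical; this only shows the intended bookkeeping.
function startCondition(data) {
  if (data.pending.length === 0) return null; // nothing left: participant cannot take part
  const condition = data.pending.shift();     // take the first pending condition
  data.started.push(condition);               // and record it as started
  return condition;
}

function finishCondition(data, condition) {
  data.finished.push(condition); // recorded at the end of the experiment
}

const data = { pending: ["a", "b", "a", "b"], started: [], finished: [] };
const first = startCondition(data);  // participant 1 gets "a"
const second = startCondition(data); // participant 2 gets "b"
finishCondition(data, first);        // participant 1 completes the study

console.log(data);
// { pending: [ 'a', 'b' ], started: [ 'a', 'b' ], finished: [ 'a' ] }
```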
    

    What do you think about this?

    — Sebastiaan


  • Hi Sebastiaan,

    What do you think about this?

    I think it looks fine. But let's ask @Eli for her opinion, she did several studies with counter balancing.

    Btw. you can use jatos.batchSession.move to move elements from one array to another:

    And don't forget to catch the fail in case you have a race condition, e.g.

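    function chooseCondition() {
        // Read the first pending condition right before moving it
        var condition = jatos.batchSession.find("/pending/0");
        // Return the Promise so the caller can wait for the chosen condition
        return jatos.batchSession.move("/pending/0", "/started/0")
            .then(() => {
                console.log("Condition is " + condition);
                return condition;
            })
            .catch(() => {
                console.log("Batch Session update failed");
                // Retry after a short delay and pass the retry's result on
                return new Promise((resolve) => setTimeout(resolve, 1))
                    .then(chooseCondition);
            });
    }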
    

    I didn't try this code, but I'm pretty confident it will work. I'm not sure the setTimeout is necessary, but I don't want a potentially endless loop without a wait in between, so that it doesn't block things :).

    Best,

    Kristian

  • edited April 2021

    But let's ask @Eli for her opinion, she did several studies with counter balancing.

    I would not ask that people have to include the names of their conditions in the batch, as you just defined it. Imagine somebody wants to test 300 people.. it will be annoying to have a string with (say) 150 times "a" (which is not the biggest deal) and then have to manually modify it (which is a bigger problem).

    Instead, I would write the name of the condition and the number of counts associated to it somehow.

    Or did I misunderstand something?

  • Thanks @kri and @elisa ! I added some information about this to the documentation.

    Imagine somebody wants to test 300 people.. it will be annoying to have a string with (say) 150 times "a" (which is not the biggest deal) and then have to manually modify it (which is a bigger problem).

    Yes, I considered that too. In the end, I decided to do it this way because it's the most general solution that also allows you to control the order of the conditions.

  • Nice you made docs about this. It's a common question. Now we have something that we can just link.

  • edited April 2022

    Hi, @sebastiaan !

    I am about to launch an online study in which I want to counterbalance some conditions across participants using batch session data. I have developed another possible method using only an array of conditions and 2 counters: one for conditions that have already been started and another for participants who have already finished the experiment with a given condition. When the started counter reaches a given number, the condition is moved to another array and is no longer eligible for selection.

    With this method, if I want to run an experiment with a very big sample, I don't have to use a very big array with one condition entry for every participant. And the 2 counters allow me to track the number of conditions already started or finished. If I see that participants abort the experiment or something, it is easier to adjust one of the counters than to keep track of a very big array.
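The actual implementation is not shown in this thread, but a counter-based scheme along these lines might look roughly like the sketch below. The data layout (`open`, `closed`, `started`, `finished`), the function names and the `maxPerCondition` parameter are all assumptions for the example. In a study, the resulting changes would still have to be written to the batch session with failed updates caught and retried.

```javascript
// Hypothetical data layout and names; the actual implementation is not shown in this thread.
function assignCondition(data, maxPerCondition) {
  if (data.open.length === 0) return null; // all conditions are full
  const condition = data.open[0];
  data.started[condition] = (data.started[condition] || 0) + 1;
  if (data.started[condition] >= maxPerCondition) {
    // Limit reached: move the condition out of the selectable array
    data.closed.push(data.open.shift());
  } else {
    // Rotate so the next participant gets the next condition
    data.open.push(data.open.shift());
  }
  return condition;
}

function markFinished(data, condition) {
  data.finished[condition] = (data.finished[condition] || 0) + 1;
}

const counters = { open: ["a", "b"], closed: [], started: {}, finished: {} };
const assigned = [];
for (let i = 0; i < 5; i++) assigned.push(assignCondition(counters, 2));
markFinished(counters, "a");

console.log(assigned); // [ 'a', 'b', 'a', 'b', null ]
console.log(counters.started, counters.finished);
```

If a participant aborts, the researcher would only need to lower the matching `started` counter (and move the condition back to `open` if it was already closed) instead of editing a long array.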

    I think that the batch session is a very powerful tool, and it is easier to implement than I expected, but I spent some time trying to figure out how it works from the documentation. If you want, I can give you the example, and if you agree, it could be used to improve that part of the documentation!

    Thank you very much for all your work!

    -Fran

  • Hi @frangfr ,

    Thanks for offering to help, and I agree that using a counter is also a good solution. The easiest way to suggest improvements to the documentation is by directly editing the relevant sections on GitHub:

    — Sebastiaan
