Large Bayes Factor changes with exclusion of single subject (Bayesian ANOVA)

Hi all,

I have run both frequentist and Bayesian 2x2x3 repeated measures ANOVAs for my analysis of some reaction time data. At one point I noticed an interesting discrepancy between the two: in the frequentist analysis, a main effect (of SESSION) was not significant (p = 0.09), but it received fairly strong support in the Bayesian analysis (BF10 of the model with only SESSION: 56, inclusion BF: 40).

This struck me as odd: I had only encountered the opposite situation before (significant p-value, BF in favor of the null). So I decided to look at the individual subject data (for a plot, see data.png file). It turns out there is one subject out of 26 who showed a much stronger effect than everyone else (one of the orange lines in the figure). I have no reason to exclude this subject, but because I'm still learning about Bayesian statistics I decided to see what would happen if I reran the analyses without this subject.

To my surprise, while the p-value changed only moderately (from 0.09 to 0.14), the Bayes factor plummeted from 57 to around 2. So by excluding just one subject I have gone from "strong" to "anecdotal" evidence for H1.

I don't really have a good intuition for Bayes factors yet, but this large change seems quite odd, especially compared to the frequentist analysis. I am also unsure how to interpret and report this effect now, if at all.

I'm left wondering under what circumstances such discrepancies between frequentist and Bayesian statistics can occur? Could it be that Bayes factors are indeed in a sense more sensitive to "outliers"?

Any input would be much appreciated. In case that's helpful, I've attached two .jasp files, each containing the frequentist and Bayesian ANOVAs either for all data (all.jasp) or for the dataset without this one subject (without.jasp).

-Leon
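A leave-one-subject-out check like the one described above can be sketched as follows. This is only an illustration on simulated per-subject differences (the values, the inflated first subject, and the helper `stats_for` are all made up for the example), and it uses a paired t-test and `ttestBF` rather than the full 2x2x3 ANOVA, so it will not reproduce the numbers in this post.

```r
library(BayesFactor)

set.seed(3)                        # simulated data, NOT the data from this thread
n <- 26                            # same number of subjects as in the post
d <- rnorm(n, mean = 4, sd = 12)   # hypothetical per-subject session differences
d[1] <- 60                         # hypothetical subject with a much larger effect

# p-value and default (rscale = "medium") Bayes factor for one set of differences
stats_for <- function(x) {
  c(p    = t.test(x)$p.value,
    bf10 = extractBF(ttestBF(x = x))$bf)
}

full <- stats_for(d)

# Recompute both statistics with each subject left out in turn
loo <- t(sapply(seq_len(n), function(i) stats_for(d[-i])))
rownames(loo) <- paste0("drop_S", seq_len(n))

full
# The exclusions that lower the Bayes factor the most
loo[order(loo[, "bf10"]), ][1:3, ]
```

Comparing `full` with the rows of `loo` shows how much any single subject moves the p-value and the Bayes factor in this simplified setting.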

Comments

  • That's interesting. I'll have to look at this a little later (some deadlines now, remind me if I haven't responded in a week), but some of the discrepancies may be due to violations of assumptions (homogeneity of variances). I'll take a look.

  • Hmm OK, that is a rather big effect of leaving out this single participant. Of course, you could also ask "how is it that the p-value only changes from .09 to .14 when I leave out this huge outlier in my relatively small data set?" but that's not constructive, and the change in the BF remains surprisingly large. To understand the problem better, perhaps you could break it down into a t-test. For a t-test, the t-value and sample size will determine the BF (hence the "Summary Stats module"). Do you find a similar discrepancy? (i.e., eliminate the noncrucial ANOVA variables and let's talk t-test instead)
    E.J.
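To illustrate the point that the t-value and sample size determine the t-test Bayes factor, here is a minimal sketch using the BayesFactor package. The data are simulated (not the data from this thread), and the default "medium" prior scale is assumed for both calls.

```r
library(BayesFactor)

set.seed(1)                        # simulated data, NOT the data from this thread
n <- 26                            # same number of subjects as in the post
d <- rnorm(n, mean = 5, sd = 15)   # hypothetical per-subject session differences

# Frequentist one-sample (i.e., paired) t-test on the differences
tt <- t.test(d)
t_val <- unname(tt$statistic)

# Bayes factor computed from the raw differences ...
bf_data <- extractBF(ttestBF(x = d))$bf

# ... and from the summary statistics alone (t-value and n)
bf_summary <- unname(ttest.tstat(t = t_val, n1 = n, simple = TRUE))

c(t = t_val, p = tt$p.value, bf_data = bf_data, bf_summary = bf_summary)
```

The two Bayes factors should agree, which is why a summary-statistics interface only needs t and n for this test.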

  • Hi E.J.,

    Thanks so much for the super swift reply! I agree that you could be equally surprised about the behavior of the p-value---it was more the discrepancy that struck me.

    Interestingly, averaging over the other variables and just doing t-tests indeed leads to a completely different picture. The p-values are the same of course, but now the BFs in both cases are also modest and don't diverge much anymore:

    • All subjects: p = 0.091, BF10 = 0.795 (see all_t-test.jasp)
    • Without this one subject: p = 0.139, BF10 = 0.588 (see without_t-test.jasp)

    Still, a discrepancy between the frequentist and Bayesian tests remains: for the former it doesn't matter how you specify the model, but for the latter it matters a lot.

    I take it this is because the "Bayesian t-test" and "Bayesian ANOVA" are just descriptive names, but the underlying tests differ quite a lot? That is, the t-test/ANOVA are equivalent in the frequentist framework in the case of one factor with two levels, but this does not hold for the Bayesian analogs.

    Would you say this means that (main) effects in Bayesian ANOVAs should generally be followed up with Bayesian t-tests? Or are the results of Bayesian t-tests and ANOVAs generally similar, which would mean that my data are just weird / violate certain assumptions?

    -Leon
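The frequentist equivalence mentioned above (one within-subject factor with two levels: repeated measures ANOVA versus paired t-test) can be checked with a short base-R sketch. The data are simulated and the column names simply mirror those used in this thread.

```r
set.seed(2)                                  # simulated data, NOT the thread's data
n <- 26
subj_eff <- rnorm(n, 0, 20)                  # hypothetical subject offsets

dat <- data.frame(
  subject = factor(rep(seq_len(n), times = 2)),
  session = factor(rep(c("A", "B"), each = n))
)
dat$latency <- rep(subj_eff, times = 2) +    # subject offset
  ifelse(dat$session == "B", 5, 0) +         # hypothetical session effect
  rnorm(2 * n, 0, 10)                        # residual noise

# Repeated measures ANOVA with one two-level within-subject factor
fit <- aov(latency ~ session + Error(subject/session), data = dat)
tab <- summary(fit)[["Error: subject:session"]][[1]]
F_aov <- tab[["F value"]][1]
p_aov <- tab[["Pr(>F)"]][1]

# Paired t-test on the same data (rows are ordered by subject within session)
tt <- t.test(dat$latency[dat$session == "A"],
             dat$latency[dat$session == "B"],
             paired = TRUE)

c(F = F_aov, t_squared = unname(tt$statistic)^2, p_aov = p_aov, p_t = tt$p.value)
```

Here F equals the squared t-value and the two p-values coincide; no such identity is assumed for the Bayesian analogs, which is the question raised in this comment.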

  • I'll need some time to digest this. Will get back to you later.

  • So I've tried reproducing this with simulated data, and I can't get it to work. See the following (change the seed for different data):

    library(BayesFactor)
    library(ggplot2)
    
    set.seed(8)
    N = 20           # number of subjects
    effect = .2      # true difference between the two conditions
    sigma.sub = 1    # SD of the subject means
    sigma.err = .2   # SD of the residual error
    # N x 2 matrix: subject mean + condition effect + noise
    y.mat = outer(rnorm(N,0,sigma.sub),c(-effect/2,effect/2), "+") + rnorm(2*N,0,sigma.err)
    
    # Long format: Var1 = subject, Var2 = condition, Freq = response
    y = as.data.frame.table(y.mat)
    summary(aov(Freq ~ Var2 + Error(Var1/Var2), data = y))
    
    anovaBF(Freq ~ Var2 + Var1, whichRandom = "Var1", data = y)
    
    # One line per subject
    ggplot(y, aes(x = Var2, y = Freq)) +
      geom_point(size = 5) +
      geom_line(aes(group = Var1))
    
    # Modify data: give the subject with the lowest first-condition score
    # a high second-condition score, creating one subject with a large effect
    
    y0.mat = y.mat
    lowest = which.min(y0.mat[,1])
    y0.mat[lowest,2] = quantile(y0.mat[,2], probs = .9)
    y0 = as.data.frame.table(y0.mat)
    
    summary(aov(Freq ~ Var2 + Error(Var1/Var2), data = y0))
    
    anovaBF(Freq ~ Var2 + Var1, whichRandom = "Var1", data = y0)
    
    ggplot(y0, aes(x = Var2, y = Freq)) +
      geom_point(size = 5) +
      geom_line(aes(group = Var1))
    

    Notice that the Bayes factors seem to track the p value as one would expect.

  • Dear E.J. and Richard,

    Thank you so much for taking another look at my post. Indeed, Richard's simulation seems to behave differently, so it must be something particular about my data?

    I've included the data (aov_data.csv) and the code that gets me these results (I mainly use the BayesFactor package directly; I only posted the JASP results first because I thought that would be more convenient). Sorry about the length; I wanted it to be reproducible and was unable to make it shorter. In case it's still not reproducible: the p and BF values I obtained are also included as comments in between the code.

    Please let me know if there's any more information I can provide you with, and thanks again for all your help so far!

    -Leon

    library(BayesFactor)
    #> Loading required package: coda
    #> Loading required package: Matrix
    #> ************
    #> Welcome to BayesFactor 0.9.12-2. If you have questions, please contact Richard Morey (richarddmorey@gmail.com).
    #> 
    #> Type BFManual() to open the manual.
    #> ************
    
    # Change this to match your filepath
    df <- read.csv("/Users/lcreteig/Desktop/aov_data.csv")
    
    df$session <- factor(df$session) # 2 levels: session A or B, (different brain stimulation)
    df$stimulus <- factor(df$stimulus) # 2 levels: on-screen target presented on LEFT or RIGHT
    df$block <- factor(df$block) # 3 different post-measurements
    # "latency" is the change in reaction time (ms) from baseline (so 0 is no change; baseline data not included)
    
    dfWithout <- df[df$subject != "S01", ] # make new data frame without "outlier" subject
    df$subject <- factor(df$subject) # original data has 26 subjects
    dfWithout$subject <- factor(dfWithout$subject) # new data has one less
    
    # All factors 
    # this matches the .jasp files in the 1st post
    
    ## Frequentist ANOVA
    
    ### all subjects
    freq <- aov(latency ~ session*stimulus*block + Error(subject/(session*stimulus*block)), dat = df)
    summary(freq)$'Error: subject:session'[[1]]$'Pr(>F)'
    #> [1] 0.09083114         NA
    
    #### without one subject
    freqWithout <- aov(latency ~ session*stimulus*block + Error(subject/(session*stimulus*block)), dat = dfWithout)
    summary(freqWithout)$'Error: subject:session'[[1]]$'Pr(>F)'
    #> [1] 0.1392397        NA
    
    ## Bayesian ANOVA
    
    ### all subjects
    bf <- anovaBF(latency ~ session*stimulus*block + subject, data = df, 
                  whichModels = "withmain", whichRandom = "subject") 
    extractBF(bf)$bf[1]
    #> [1] 60.51355
    
    #### without one subject
    bfWithout <- anovaBF(latency ~ session*stimulus*block + subject, data = dfWithout, 
                         whichModels = "withmain", whichRandom = "subject") 
    extractBF(bfWithout)$bf[1]
    #> [1] 1.962064
    
    # Only SESSION
    # this corresponds to the t-test .jasp files in my reply above (but uses anovaBF here)
    
    # Average over all the other variables, except subject and session
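    # (aggregate only latency; taking the mean of the factor columns would give NAs and warnings)
    dfSession <- aggregate(latency ~ subject + session, df, mean)
    dfSessionWithout <- aggregate(latency ~ subject + session, dfWithout, mean)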
    
    ## Frequentist ANOVA
    
    ### all subjects
    freqSession <- aov(latency ~ session + Error(subject/session), dat = dfSession)
    summary(freqSession)$'Error: subject:session'[[1]]$'Pr(>F)'
    #> [1] 0.09083114         NA
    
    #### without one subject
    freqSessionWithout <- aov(latency ~ session + Error(subject/session), dat = dfSessionWithout)
    summary(freqSessionWithout)$'Error: subject:session'[[1]]$'Pr(>F)'
    #> [1] 0.1392397        NA
    
    ## Bayesian ANOVA
    
    ### all subjects
    bfSession <- anovaBF(latency ~ session + subject, data = dfSession, 
                  whichModels = "withmain", whichRandom = "subject") 
    extractBF(bfSession)$bf[1]
    #> [1] 0.9600066
    
    #### without one subject
    bfSessionWithout <- anovaBF(latency ~ session + subject, data = dfSessionWithout, 
                         whichModels = "withmain", whichRandom = "subject") 
    extractBF(bfSessionWithout)$bf[1]
    #> [1] 0.680306
    

    Session info
    devtools::session_info()
    #> Session info --------------------------------------------------------------
    #>  setting  value                       
    #>  version  R version 3.4.0 (2017-04-21)
    #>  system   x86_64, darwin15.6.0        
    #>  ui       X11                         
    #>  language (EN)                        
    #>  collate  C                           
    #>  tz       Europe/Amsterdam            
    #>  date     2017-12-01
    #> Packages ------------------------------------------------------------------
    #>  package      * version  date       source         
    #>  BayesFactor  * 0.9.12-2 2015-09-19 CRAN (R 3.4.0) 
    #>  Matrix       * 1.2-9    2017-03-14 CRAN (R 3.4.0) 
    #>  MatrixModels   0.4-1    2015-08-22 CRAN (R 3.4.0) 
    #>  Rcpp           0.12.10  2017-03-19 cran (@0.12.10)
    #>  backports      1.0.5    2017-01-18 CRAN (R 3.4.0) 
    #>  coda         * 0.19-1   2016-12-08 CRAN (R 3.4.0) 
    #>  devtools       1.12.0   2016-12-05 CRAN (R 3.4.0) 
    #>  digest         0.6.12   2017-01-27 CRAN (R 3.4.0) 
    #>  evaluate       0.10     2016-10-11 CRAN (R 3.4.0) 
    #>  gtools         3.5.0    2015-05-29 CRAN (R 3.4.0) 
    #>  htmltools      0.3.6    2017-04-28 CRAN (R 3.4.0) 
    #>  knitr          1.15.1   2016-11-22 CRAN (R 3.4.0) 
    #>  lattice        0.20-35  2017-03-25 CRAN (R 3.4.0) 
    #>  magrittr       1.5      2014-11-22 cran (@1.5)    
    #>  memoise        1.1.0    2017-04-21 CRAN (R 3.4.0) 
    #>  mvtnorm        1.0-6    2017-03-02 CRAN (R 3.4.0) 
    #>  pbapply        1.3-2    2017-03-01 CRAN (R 3.4.0) 
    #>  rmarkdown      1.5      2017-04-26 CRAN (R 3.4.0) 
    #>  rprojroot      1.2      2017-01-16 CRAN (R 3.4.0) 
    #>  stringi        1.1.5    2017-04-07 cran (@1.1.5)  
    #>  stringr        1.2.0    2017-02-18 cran (@1.2.0)  
    #>  withr          1.0.2    2016-06-20 CRAN (R 3.4.0) 
    #>  yaml           2.1.14   2016-11-12 CRAN (R 3.4.0)
    
