Within a multi-factorial design, I have a factor A that gives a significant main effect (e.g. p = .009) in the traditional repeated measures ANOVA, but a BF10 in favour of the null (e.g. .31) in the Bayesian version using JASP. When I label one of the other (strongly supported) factors in the design as 'nuisance', factor A is then supported (e.g. BF10 = 6.15). The factor I'm labeling as nuisance here is of less interest than factor A, but it isn't of zero interest, so I'm not sure if and when it's appropriate to do so.

It would also be useful to understand more clearly why factor B is influencing the comparison of the factor A model vs. the null (when the two don't interact). This makes me less confident about interpreting any factor's model in the context of a multi-factorial design if it's potentially going to be obscured by other factors in the design.

I've seen this with other data sets too, so any help appreciated!

- Is it right to say that, based on the BFInclusion = 46.659, the data are 46.659 times more likely if we consider that disgust has an effect on the DV than if we consider that it has no effect at all?

- Is the formula to compute BFInclusion similar to the one used to compute BF10?

- If I have to present results in a paper, do you recommend presenting the BMA rather than the full model comparison? Or is it better to present both analyses?
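For what it's worth, the BFInclusion that JASP reports compares all models containing the effect against all models without it: it is the change from prior to posterior inclusion odds, averaged over the whole model space. A minimal sketch with hypothetical numbers (the inclusion probabilities below are made up for illustration, not taken from the analysis above):

```python
# Hypothetical prior and posterior probabilities that some model
# containing the effect is true, summed over all such models.
prior_incl = 0.5    # P(effect) before seeing the data
post_incl = 0.979   # P(effect | data), a made-up value for illustration

prior_odds = prior_incl / (1 - prior_incl)   # = 1.0
post_odds = post_incl / (1 - post_incl)
bf_inclusion = post_odds / prior_odds        # prior-to-posterior change in odds
print(round(bf_inclusion, 2))  # → 46.62
```

So a BFInclusion of 46.659 does mean the data shift the odds in favour of including the effect by a factor of about 46.7, but it is constructed differently from a single-model BF10, which compares exactly two models.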

Thank you very much in advance

My question is: should I tick X and Y as nuisance in the model? And why?

Thanks in advance for your help!

JASP is great, thanks for developing it! An issue that has persisted across past and current versions occurs when the number of cells in a repeated-measures ANOVA is relatively high.

When running a 3x9 rm-ANOVA, I get the following in JASP 0.8.2:

Error message, including stack trace:

```
This analysis terminated unexpectedly.
Error in complete.cases(x, y): not all arguments have the same length

Stack trace
analysis(dataset = NULL, options = options, perform = perform, callback = the.callback, state = state)
.resultsPostHoc(referenceGrid, options, dataset, fullModel)
t.test(unlist(postHocData[listVarNamesToLevel[[k]]]), unlist(postHocData[listVarNamesToLevel[[i]]]), paired = T, var.equal = F)
t.test.default(unlist(postHocData[listVarNamesToLevel[[k]]]), unlist(postHocData[listVarNamesToLevel[[i]]]), paired = T, var.equal = F)
complete.cases(x, y)

To receive assistance with this problem, please report the message above at: https://jasp-stats.org/bug-reports
```

The message in 0.7.1.12 used to be `Invalid comparison with complex values`, and this has been reported on GitHub before (raised in September 2015; closed in March 2017 pending confirmation of whether it was still a bug in the latest version).

Please note that the issue does not seem to be with the data: Running 2x9 rm-ANOVAs on the same data does work, and I've had the same issue with a completely different dataset. The only common factor seems to be the number of cells.

**UPDATE (2017-09-14, 12:34 UTC+0)**: The same data does work with 2x9 rm-ANOVAs on JASP 0.7.1.12, but not on 0.8.2. On 0.8.2 it does work in a 3x6 design.

Any idea what might be going on?

Cheers,

Edwin

I conducted a factor analysis in JASP. The method seems to have worked; however, I am missing important information about how the analysis proceeds. What is the method of extraction (maximum likelihood, principal axis factoring, something else)? What are the parameters chosen for this method? How many iterations were done? How were missing values treated?

Unfortunately, no helpfile for factor analysis exists (yet). Can I extract this information somehow?

Thanks!

I conducted an experiment with one between-subjects factor (Group) and two within-subjects factors (MemoryCue, Source). When analysing my results, a mixed ANOVA gives me two main effects for MemoryCue and Source and a significant interaction between the two, F(1, 128) = 11.92, p < .001.

When I conduct a Bayesian ANOVA using the BayesFactor package I get the following:

Input:

```
# Bayesian ANOVA
FR_bf = anovaBF(SourceRecall ~ Group*MemoryCue*Source + Subject, data = data_final, whichRandom="Subject")
FR_bf = sort(FR_bf, decreasing =TRUE)
FR_bf
```

The output looks as follows:

```
# Output
Bayes factor analysis
--------------
[1] MemoryCue + Source + MemoryCue:Source + Subject : 2.071683e+111 ±3.53%
[2] MemoryCue + Source + Subject : 1.827539e+111 ±2.5%
[3] Group + MemoryCue + Source + MemoryCue:Source + Subject : 1.630972e+110 ±12.65%
[4] Group + MemoryCue + Source + Subject : 1.317137e+110 ±2.54%
[5] Group + MemoryCue + Group:MemoryCue + Source + MemoryCue:Source + Subject : 7.646137e+107 ±4.55%
[6] Group + MemoryCue + Group:MemoryCue + Source + Subject : 7.20429e+107 ±8.43%
[7] Group + MemoryCue + Source + Group:Source + MemoryCue:Source + Subject : 1.263392e+107 ±2.86%
[8] Group + MemoryCue + Source + Group:Source + Subject : 1.154352e+107 ±2.84%
[9] MemoryCue + Subject : 3.594377e+105 ±2.23%
[10] Group + MemoryCue + Group:MemoryCue + Source + Group:Source + MemoryCue:Source + Subject : 6.333031e+104 ±2.96%
[11] Group + MemoryCue + Group:MemoryCue + Source + Group:Source + Subject : 5.811523e+104 ±3.46%
[12] Group + MemoryCue + Subject : 2.818147e+104 ±9.25%
[13] Group + MemoryCue + Group:MemoryCue + Source + Group:Source + MemoryCue:Source + Group:MemoryCue:Source + Subject : 1.253457e+102 ±2.71%
[14] Group + MemoryCue + Group:MemoryCue + Subject : 1.249367e+102 ±2.94%
[15] Source + Subject : 145095.3 ±2.37%
[16] Group + Source + Subject : 9694.612 ±9.93%
[17] Group + Source + Group:Source + Subject : 7.090814 ±1.43%
[18] Group + Subject : 0.06024572 ±2.97%
Against denominator:
SourceRecall ~ Subject
---
Bayes factor type: BFlinearModel, JZS
```

When comparing model 1 (including the interaction) with model 2 (only the two main effects) I get the following:

```
> FR_bf[1]/FR_bf[2]
Bayes factor analysis
--------------
[1] MemoryCue + Source + MemoryCue:Source + Subject : 1.133592 ±4.32%
Against denominator:
SourceRecall ~ MemoryCue + Source + Subject
---
Bayes factor type: BFlinearModel, JZS
```

Thus, no evidence for an interaction.
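For reference, this division works because every BF in the table shares the same denominator (the Subject-only null model), so the pairwise BF between any two models is just the ratio of their table entries. Checking the arithmetic by hand with the numbers above:

```python
# Bayes factors of models 1 and 2 against the common Subject-only null
bf_model1_vs_null = 2.071683e111   # MemoryCue + Source + MemoryCue:Source + Subject
bf_model2_vs_null = 1.827539e111   # MemoryCue + Source + Subject

# BF(model1 vs model2) = BF(model1 vs null) / BF(model2 vs null):
# the shared denominator cancels out.
bf_model1_vs_model2 = bf_model1_vs_null / bf_model2_vs_null
print(round(bf_model1_vs_model2, 3))  # → 1.134, matching FR_bf[1]/FR_bf[2]
```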

So, why is there such a huge discrepancy between the frequentist ANOVA and the Bayesian ANOVA results regarding the interaction?

Thanks for your help.

Cheers,

Ivan

Two questions about the effect size used in JASP:

- What is an effect size (delta)? Is it Glass's delta?
- How do you convert Cohen's d into delta? (this is straightforward if delta is Glass's delta)

Cheers,

Hannah

PS. Thank you Team JASP - the program is fantastic!!

From what I understand, one can set the Cauchy prior width according to a speculated effect size when performing a Bayesian t-test, such that if I think my effect should be around 0.6, I would set the Cauchy prior width to 0.6 (correct me if I'm wrong).

What I still don't understand is the Beta* prior width for correlations - can it be used in the same manner?

If I expect an r=~0.6, would I set the Beta* prior width to 0.6?

Or in other words, how do I translate an expected effect size/correlation to Beta* prior width?

Thanks,

Mattan

I'm wondering whether it is possible to generate indirect effects for mediation analysis in JASP? If not, could this be a feature in the program in the near future?

Regards,

NT

How sensitive is a model comparison using BFs to (multi)collinearity? Is there any mechanism within Bayesian model comparison that is sensitive to collinearity? If so, is it OK to choose models with a high BF and high collinearity?

Best wishes,

Ulrich Dettweiler & Christoph Becker

When I run and rerun Bayesian repeated measures ANOVAs in JASP using an identical dataset (Excel csv file), I find that the results will occasionally be different. This isn't dramatic, e.g., 0.046 vs. 0.041, but they are inconsistent.

I'm wondering if anyone knows why this might be and how I can fix it?
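For context: the Bayes factors in a Bayesian RM ANOVA are approximated by Monte Carlo sampling (that is what the "% error" column reflects), so small run-to-run differences like 0.046 vs. 0.041 are expected behaviour rather than a bug. A toy sketch of the phenomenon in plain Python (not the actual JASP sampler):

```python
import random

def mc_mean(seed, n=5_000):
    """Toy Monte Carlo estimate of a mean from n random draws."""
    rng = random.Random(seed)
    return sum(rng.gauss(0.0, 1.0) for _ in range(n)) / n

# Different random seeds -> slightly different estimates (like rerunning JASP)
a, b = mc_mean(seed=1), mc_mean(seed=2)
# Same seed -> bit-for-bit identical, reproducible results
c, d = mc_mean(seed=3), mc_mean(seed=3)
print(a != b, c == d)  # → True True
```

Increasing the number of samples (in the advanced options) shrinks the % error and hence the run-to-run spread.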

Thanks!

Sarah

In my experiment, a reaction time task with 2 conditions, I have tested 3 groups (N=22, N=23, N=21).

I am using a mixed factorial design and have run a Bayesian Repeated Measures ANOVA to find out whether performance in the task depends on condition and group (Repeated Measures Cells = Condition 1, Condition 2; Between-Subject Factors = Group).

**My questions are:**

1) Can I use this analysis with such sample sizes?

2) Right now, I have kept the default prior of 0.5 for "r scale fixed effects", as there isn't previous research in this area using this task and/or conditions. I assume that by using the default prior, JASP assumes that I'm testing for medium-size effects of 'condition' and 'group'?

Here is the main output table from JASP:

3) Am I right to assume that there is strong evidence for a main effect of condition, but that the BF of around 1 for group can be regarded as noise / the data being insensitive? Furthermore, there is no evidence for or against an interaction (for the interaction: 0.9; against: 1.11)? Given that the main effect of group does not add much improvement compared to the null model, I think I need to go with the model with just 'condition' as a main effect. However, when I select "Compare to best model", the main-effects model (condition + group) is at the very top. Does that mean the best model is actually the one with both main effects, despite the fact that 'group' doesn't add much improvement?

Here is the "Compare to best model" table:

Thanks a lot for any advice!

I have read that we could use the default priors contained in the BayesFactor package. It worries me that I have read that using Bayesian inference with default priors is like eating cake without sugar and flour (can't remember the metaphor, to be honest). Somehow they say it's pointless to do Bayesian inference without adjusting priors.

Also, about the interpretation: I read on Richard's blog that the magnitude of the BF depends on what you were expecting. How can I get the effect sizes of my factors from previous papers on my topic? He gives an example, saying:

Is 10 kilometers a long way? It is if you're walking, it isn't if you've just started a flight to Australia.

I can see how this 10 is different in this example, but how can I know, from previous literature without Bayesian inference, which factors have big effects?

I have a question about the interpretation of an interaction. I have attached the output as a picture. Here I would argue that a one-factor model of Time (pre vs. post intervention) is best supported by the data. Adding Group as an additional factor makes the BF smaller (by approximately a factor of 5). Here comes the tricky part. How should I interpret the two-factor model + interaction compared to the one-factor model of Time, and specifically: do the data support an interaction? It is slightly better (by a factor of 1.25). I would argue that it doesn't add a lot compared to Time only. My supervisor argues that the two-factor + interaction model isn't worse than the Time-only model (actually slightly better). So, how should I interpret this correctly? Is there evidence for an interaction or isn't there? Thanks in advance for any advice on this!

After finding the models with anovaBF, I extracted the two I wanted to compare:

```
bfMainEffects = lmBF(grow2 ~ AF + Zone, data = field)
bfInteraction = lmBF(grow2 ~ AF + Zone + AF:Zone, data = field)
bf = bfMainEffects / bfInteraction
bf
```

```
[1] AF + Zone : 26.58475 ±1.74%

Against denominator:
  grow2 ~ AF + Zone + AF:Zone
---
Bayes factor type: BFlinearModel, JZS
```

```
chains = posterior(bfMainEffects, iterations = 10000)
summary(chains)
```

```
Iterations = 1:10000
Thinning interval = 1
Number of chains = 1
Sample size per chain = 10000
```

```
Mean SD Naive SE Time-series SE
mu 1.072916 0.01051 0.0001051 1.051e-04
AF-foliar -0.059056 0.01656 0.0001656 1.700e-04
AF-roots 0.081018 0.01406 0.0001406 1.496e-04
AF-woods -0.021962 0.01375 0.0001375 1.375e-04
Zone-Zo1 0.009428 0.01404 0.0001404 1.404e-04
Zone-Zo2 -0.144139 0.01429 0.0001429 1.438e-04
Zone-Zo3 0.134711 0.01392 0.0001392 1.430e-04
sig2 0.034913 0.00267 0.0000267 2.711e-05
g_AF 0.565173 3.58640 0.0358640 3.586e-02
g_Zone 1.469067 11.78765 0.1178765 1.179e-01
2. Quantiles for each variable:
2.5% 25% 50% 75% 97.5%
mu 1.05256 1.0659519 1.07281 1.07995 1.092959
AF-foliar -0.09195 -0.0702200 -0.05905 -0.04786 -0.026719
AF-roots 0.05304 0.0715381 0.08098 0.09063 0.108607
AF-woods -0.04835 -0.0312259 -0.02203 -0.01263 0.004812
Zone-Zo1 -0.01759 -0.0001058 0.00926 0.01882 0.037344
Zone-Zo2 -0.17235 -0.1536586 -0.14409 -0.13449 -0.116400
Zone-Zo3 0.10754 0.1251910 0.13486 0.14416 0.161969
sig2 0.03014 0.0330497 0.03475 0.03661 0.040530
g_AF 0.05575 0.1331341 0.23853 0.46521 2.653797
g_Zone 0.14177 0.3328165 0.58771 1.18625 7.008814
```

This is how I interpret it:

When measuring from the roots, grow2 will increase by 0.081, and the values can be found within a credibility interval between 50% and 97.5%.

I want to know if I'm mixing the classical interpretation with the Bayesian, or if I can interpret it like this.
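One note on reading that quantile table: the 95% credible interval for a coefficient runs from its 2.5% quantile to its 97.5% quantile (roughly 0.053 to 0.109 for AF-roots), not "between 50% and 97.5%" (the 50% column is the posterior median). A small sketch with simulated draws standing in for the MCMC chain (the mean and SD below are taken from the summary; the draws themselves are made up):

```python
import random

rng = random.Random(42)
# Stand-in for the 10,000 posterior draws of AF-roots (mean 0.081, SD 0.014)
chain = sorted(rng.gauss(0.081, 0.014) for _ in range(10_000))

def quantile(sorted_samples, q):
    """Empirical quantile of an already-sorted sample."""
    return sorted_samples[int(q * (len(sorted_samples) - 1))]

# The 2.5% and 97.5% columns of the summary bracket the central 95% of the posterior
lo, hi = quantile(chain, 0.025), quantile(chain, 0.975)
print(lo < 0.081 < hi)  # → True
```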

I saw that you have multiple regression in JASP.

Are you considering adding multivariate regression (two or more DVs)?

Thank you

Christos


- Whenever possible, avoid using statistical significance or p values; simply omit any mention of null-hypothesis significance testing (NHST).

But I have found many papers saying that they obtained a certain BF supporting the null or alternative hypothesis. Is it possible to obtain all the information needed for publishing a paper by using **only** the BayesFactor package? I haven't tried JASP, so I don't know how much they differ from each other in terms of the information one can get using each one.

In the Assumption checks for the test of sphericity in AnovaRM with two levels, an annotation indicates "When the repeated measure has only two levels, the assumption of sphericity is always met". Do you have any references, please?

Thanks in advance.

Regards.

I would really appreciate any help with the following issue.

I have contrasting results from an RM ANCOVA and a Bayesian RM ANCOVA, and I would like some feedback on my reading of the outputs.

There is a significant Stim x Hand interaction in the RM ANCOVA (there seems to be a three-way interaction as well, but for the time being let's focus on the two-way interaction).

Then I run a Bayesian RM ANCOVA on the same data, and here is the output:

Given that I am interested in the interaction Stim x Hand, my next move is comparing the model with the main effects (Stim + Hand + Order) vs. the model with the same effects and the interaction (Stim + Hand + Stim*Hand + Order) --> BF10 main effects and interaction / BF10 main effects --> 0.1/0.015 = 6.67.
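That arithmetic is right as far as it goes: because both BF10s share the same null-model denominator, dividing them gives the head-to-head BF between the two models. With the (approximate) values quoted above:

```python
bf10_with_interaction = 0.1   # Stim + Hand + Stim*Hand + Order vs. null
bf10_main_effects = 0.015     # Stim + Hand + Order vs. null

# The common null-model denominator cancels, leaving the
# interaction-model vs. main-effects-model Bayes factor
bf_interaction_vs_main = bf10_with_interaction / bf10_main_effects
print(round(bf_interaction_vs_main, 2))  # → 6.67
```

Note that the two statements concern different comparisons: both models can be worse than the null (BF10 < 1) while the interaction model is still about 6.7 times better than the main-effects model.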

I would say that the model including the interaction term is more in favour of H1 (moderate evidence) compared to the model without interaction. However, the BF10s are all less than 1!

Therefore, the model with interaction does not support H1.

Here the discrepancy with the RM ANCOVA emerges.

I have the feeling that I am missing something here.

Could anyone help?

Many thanks in advance

Best wishes

Francesco

I've just started reading into Bayesian statistics and I'm pretty smitten by it, even though I haven't fully understood all of it yet.

I'm a PhD student and I'm considering using Bayesian statistics for my sample. I'm looking at humans in Isolated and Confined Environments (ICE), which means that by default the data are extremely difficult to obtain and the sample size is tiny. The issue is that the total possible population is exactly 11 people, of whom 1 has withdrawn from the study and 1 may need to be excluded from the data. The person who may need to be excluded has developed major psychiatric problems and has been evacuated. Even though I will be able to collect more data from them, this will be from their home country/town and not from within the ICE, so it makes no sense to include that data in the ICE analysis.

I do have an age & gender-matched control group.

So, would Bayesian statistics even be appropriate? I've found a paper (Stiger et al., 1998) that suggests you can use a normal ANOVA with small samples and ordinal data if you use a Huynh-Feldt correction, but I can't seem to find a direct answer for Bayesian statistics...

Thank you in advance!

eniseg the newbie

One of our reviewers now suggests that we misinterpreted the Bayes factor. To be honest, we are a bit confused/uncertain about how to reply. Although the comment does make sense to us, we based our phrasing on an earlier paper using the JASP Bayes factor, and were wondering if we could consult your expertise on this?

Specifically, we mentioned:

"The BF01 was 11.499, suggesting that these data are 11.499 more likely to be observed under the null hypothesis."

The reviewer commented:

"According to my understanding, the Bayes factors tell the relative odds that the (in this case) null hypothesis is correct relative to the alternative hypothesis, given these data. Namely, it is not the probability of the data (given the hypothesis) which is what null hypothesis testing tell us, but the probability of the hypothesis given the data."
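As a side note on the disagreement itself: a Bayes factor is the ratio of the probabilities of the *data* under the two hypotheses; it only becomes odds on the *hypotheses* after multiplying by prior odds. A sketch of the relationship (the 1-to-1 prior odds below are an added assumption, not something the BF itself provides):

```python
bf_01 = 11.499           # P(data | H0) / P(data | H1)
prior_odds_01 = 1.0      # assumed equal prior plausibility of H0 and H1

# Posterior odds = Bayes factor x prior odds
posterior_odds_01 = bf_01 * prior_odds_01
posterior_p_h0 = posterior_odds_01 / (1 + posterior_odds_01)
print(round(posterior_p_h0, 3))  # → 0.92
```

Under equal prior odds the numerical values coincide, but the two readings ("data 11.5 times more likely under H0" vs. "H0 11.5 times more probable") are conceptually distinct claims.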

I was wondering what is actually happening when you select "Manual (no. of samples 10000)" instead of leaving it on the Auto setting in the advanced options of the Bayesian RM ANOVA. Is this analogous to a bootstrapping setting, like "bayesboot"? And if so, does this override the need for normally distributed data in your sample?

Thanks for the help, loving JASP so far!

Larry

Following those tweets on the JASP Twitter, I was wondering if I could get an additional opinion on my statistical decision-making.

I studied a team which spent a whole year at an isolated and confined research station. The team was made up of 11 people, 10 of whom participated in the study. 1 team member had to leave the station early due to psychological complications so they did not complete the study – their data will be used as a case study instead of as a group study.

I did cognitive assessments with my team at three time points (autumn, polar night, midnight sun) and well-being questionnaires at two additional points (after arrival, spring).

My current statistical approach has been: if the data are non-normally distributed, I use a Friedman's test with Wilcoxon follow-ups, as suggested by Field (2009, p. 579-580), **in R**. For normally distributed data in this within-subjects design, I chose a parametric ANOVA with a Huynh-Feldt correction **in JASP**. The Huynh-Feldt correction decreases the ANOVA's chance of erroneously finding an effect that is not present at all (Type I error) despite my small sample, and allows me to use ordinal DVs, such as my questionnaire data (Stiger et al., 1998). For the ANOVA effect size, I will report omega squared (ω2) because it is reliable with small sample sizes (Levine & Hullet, 2002). I've also been reporting the Vovk-Sellke maximum p-ratio.

I remember enquiring on this forum and EJ saying that small sample size was not an issue, so I supplemented the above frequentist statistics with Bayesian analyses in JASP. I've usually done a within-subjects ANOVA with paired-samples t-tests for follow-up. There are no previous studies on teams like mine from which I could derive information to form a subjective prior.

For the JASP Bayesian ANOVA I've been reporting BF10, BF(M), BF(01), P(M), P(M|data), and % error. For the t-tests I have additionally been reporting the credible interval. I've been illustrating my PhD chapter with pizza plots for the Bayesian analyses and bar or line graphs for the frequentist stats.

Does this sound okay? Should I do anything else?

I'm really new to your program and Bayesian analysis and have been wondering: are the assumptions pertaining to the data (e.g. normality, homogeneity of variances) the same for the Bayesian versions of t-tests, ANOVA, etc.? If so, are they in some way more or less robust against violations of these assumptions? Thank you for your reply.

First, on page 30 it says: "Consider the 15th model, which is [...]. The Bayes factor is the comparison of this model against the null model with no age effect, and this age-effect model is less preferable to the null model by 0.69-to-1"

The line on the output looks like this:

[15] a + s : 1.181082 ±1.16%

And here is an image of the console:

**How do I get that .69-to-1?**
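One thing to check first is the direction of the reported Bayes factor: a phrase like "less preferable to the null by 0.69-to-1" describes a BF10 below 1 (favouring the null), and flipping the comparison is just a reciprocal. A hypothetical example:

```python
bf_10 = 0.69          # hypothetical: model vs. null (below 1 favours the null)
bf_01 = 1 / bf_10     # the same evidence, expressed as null vs. model
print(round(bf_01, 2))  # → 1.45
```

If the console output and the book's text disagree (1.181082 vs. 0.69), it is worth checking that both refer to the same model row and to the same direction of comparison.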

Second, to report Bayes factors, I read the classification of Harold Jeffreys (1961). He said that BF_{10} > 100 is extreme evidence for H1. Are the numbers in the second column this BF_{10}? Would this mean that all models from 1 to 14 are strong evidence in favour of H1?

Could model 1 be interpreted as strong evidence of an effect of age and distance, and is it correct to say that model 18 shows _extreme evidence_ of the lack of a distance effect?

Third, how can I report this last part? By saying that the best model predicts the data 1.0404717e+18 times better than the model including distance?

Sorry for the very long post. Any reading suggestions on how to report and interpret the results in R using BayesFactor is welcome.
