Assumption in correlation analysis
Hi
In JASP correlation analysis there is "multivariate normality" under "assumption checks". What is the difference between "multivariate normality" and "bivariate normality"? How are they used differently?
Per
Comments
I guess if you analyze 10 variables you can check multivariate normality -- or if you have a partial correlation where addition of the nuisance variables gives more than two variables in your analysis.
EJ
To assess the statistical significance of Pearson's correlation coefficient, the assumption is to have bivariate normality, but this assumption is difficult to assess. Therefore, in practice, a property of bivariate normality is relied upon; that is, if bivariate normality exists, both variables will be normally distributed. However, this does not work in reverse; two normally distributed variables do not mean you have bivariate normality, but it is a level of assurance that can be lived with.
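A minimal sketch of why the reverse direction fails, using a standard textbook counterexample rather than anything JASP itself computes. Draw X from a standard normal and set Y = S * X, where S is a random sign independent of X. Y is then also standard normal, but the pair (X, Y) is not bivariate normal, because X + Y is exactly zero about half the time, and a linear combination of jointly normal variables cannot behave that way. The function name `sign_flip_pairs` and the sample size are illustrative assumptions.

```python
import random


def sign_flip_pairs(n, seed=0):
    """Return n pairs (x, y) with x ~ N(0, 1) and y = s * x, s = +/-1 at random.

    Each margin is standard normal, yet the pair is not bivariate normal.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        s = rng.choice((-1.0, 1.0))  # random sign, independent of x
        pairs.append((x, s * x))
    return pairs


pairs = sign_flip_pairs(10_000)

# Under bivariate normality, X + Y would be normal (or constant), so it could
# not equal exactly zero about half the time while varying otherwise.
frac_zero_sum = sum(1 for x, y in pairs if x + y == 0.0) / len(pairs)
print(f"share of pairs with x + y == 0: {frac_zero_sum:.3f}")
```

Checking each variable separately would flag nothing here, which is the sense in which two normal margins only give partial assurance.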
But do I understand JASP correctly: is the pairwise option intended for when there are only two variables, while the multivariate option is for when more than two variables are assessed, or for partial correlations?
Per
To assess the statistical significance of Pearson's correlation coefficient, one assumption is to have bivariate normality, but this assumption is difficult to assess. Therefore, in practice, a property of bivariate normality is relied upon; that is, if bivariate normality exists, both variables will be normally distributed. However, this does not work in reverse; two normally distributed variables do not mean you have bivariate normality, but it is a level of assurance that can be lived with.
Do I understand correctly that in JASP we can use the pairwise normality check (Shapiro) when we check two variables, but the multivariate one (Shapiro) when we check more than two variables “overall”?
Somebody??
It seems to me that the Shapiro-Wilk test that we use is for multivariate normality; in the case of a simple correlation between X and Y, we would test bivariate normality -- but not the normality of each variable separately (this would also return two values, which is not the case). I'll ask the team.
Hi @PerPalmgren ,
When you specify only two variables, the two tests are equivalent: the multivariate normality check (which has as many dimensions as there are specified variables) then reduces to the bivariate normality check. I guess that when all variables are related, you would need the multivariate assumption check (although in that case you would probably be better off with a linear regression, where a residual plot would be the better check). But if the correlations are meant to be run as separate tests and only some pairs matter, bivariate normality suffices.
Cheers
Johnny