The textbook example examines the association between exam score (Exam), test anxiety (Anxiety) and hours revising (Revise) using the ExamAnxiety.dat
data. We also use the Big Liar data (liar.dat
) to practice non-parametric correlations.
As always, ignore anything related to R Commander in the textbook.
a. Import the data
library(tidyverse)
examdata <- as.tibble(read.delim("ExamAnxiety.dat"))
names(examdata) <- casefold(names(examdata))
Correlations in R
Table 6.2 in the textbook is helpful.
Function | Pearson | Spearman | Kendall | p-value | CI | Multiple Correlations |
---|---|---|---|---|---|---|
cor() | ✅ | ✅ | ✅ | ❌ | ❌ | ✅ |
cor.test() | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |
rcorr() | ✅ | ✅ | ❌ | ✅ | ❌ | ✅ |
Here is the function for cor(x,y, use = "everything", method = "correlation type")
The default is a Pearson correlation. Use cor()
to fill in the table below.
a. Anxiety and Number of Revisions
cor(examdata$anxiety, examdata$revise)
[1] -0.7092493
b. Anxiety and Exam Score
cor(examdata$anxiety, examdata$exam)
[1] -0.4409934
c. Exam Score and # of Revisions
cor(examdata$revise, examdata$exam)
[1] 0.3967207
Calculate the R^2 all at once. In order to do so, the dataframe can only include the variables that you want to correlate. In this case, we want to remove gender.
examdata2 <- examdata %>% select(revise, exam, anxiety)
cor(examdata2)
## revise exam anxiety
## revise 1.0000000 0.3967207 -0.7092493
## exam 0.3967207 1.0000000 -0.4409934
## anxiety -0.7092493 -0.4409934 1.0000000
R^2 is a type of effect size. Here you can times the result by 100 to get the % of variance accounted for in one variable from another.
(cor(examdata2)^2)*100
The coefficient of determinations ranges from 15-50%.
## revise exam anxiety
## revise 100.00000 15.73873 50.30345
## exam 15.73873 100.00000 19.44752
## anxiety 50.30345 19.44752 100.00000
Anxiety & Revise | Anxiety and Exam Score | Revise and Exam Score | |
---|---|---|---|
Pearson | ❓ | ❓ | ❓ |
r^2 % | ❓ | ❓ | ❓ |
Let’s test the significance using cor.test(x,y, alternative = "string", method = "correlation type", conf.level = 0.95)
. Report the exact p values and the confidence intervals (95%) for each of the pairs. Complete the table. You can copy some values from above.
a. Anxiety and Number of Revisions
cor.test(examdata$anxiety, examdata$revise)
##
## Pearson's product-moment correlation
##
## data: examdata$anxiety and examdata$revise
## t = -10.111, df = 101, p-value < 2.2e-16
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.7938168 -0.5977733
## sample estimates:
## cor
## -0.7092493
b. Anxiety and Exam Score
cor.test(examdata$anxiety, examdata$exam)
##
## Pearson's product-moment correlation
##
## data: examdata$anxiety and examdata$exam
## t = -4.938, df = 101, p-value = 3.128e-06
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## -0.5846244 -0.2705591
## sample estimates:
## cor
## -0.4409934
c. Exam Score and # of Revisions
cor.test(examdata$revise, examdata$exam)
##
## Pearson's product-moment correlation
##
## data: examdata$revise and examdata$exam
## t = 4.3434, df = 101, p-value = 3.343e-05
## alternative hypothesis: true correlation is not equal to 0
## 95 percent confidence interval:
## 0.2200938 0.5481602
## sample estimates:
## cor
## 0.3967207
Anxiety & Revise | Anxiety and Test Score | Revise and Test Score | |
---|---|---|---|
correlation | ❓ | ❓ | ❓ |
r^2 % | ❓ | ❓ | ❓ |
t (df) | ❓ | ❓ | ❓ |
p value | ❓ | ❓ | ❓ |
CI | ❓ to ❓ | ❓ to ❓ | ❓ to ❓ |
The data needs to be in a matrix format and requires the Hmisc
package which you might need to install. Generally, this function is not worth the hassle.
Load Hmisc
assuming you already installed it.
library(Hmisc)
exammatrix <- as.matrix(examdata2)
rcorr(exammatrix)
## revise exam anxiety
## revise 1.00 0.40 -0.71
## exam 0.40 1.00 -0.44
## anxiety -0.71 -0.44 1.00
##
## n= 103
##
##
## P
## revise exam anxiety
## revise 0 0
## exam 0 0
## anxiety 0 0