To calculate correlation by group in R, you can use the dplyr package, which provides functions to manipulate and summarize data by group.
Use the group_by() function to create groups based on a categorical variable, then use the summarize() function to apply cor() to the two variables within each group.
The following method shows the syntax.
Method: Calculate Correlation by Group
df %>%
  group_by(column1) %>%
  summarize(cor = cor(column2, column3))
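To try this pattern without building a data frame first, here is a minimal sketch that uses the iris dataset bundled with R. The choice of iris, its columns, and the result name iris_cor are illustrative assumptions and are not part of the method itself.

```r
# Load dplyr for group_by(), summarize() and the %>% pipe
library(dplyr)

# iris ships with R; Species is the categorical grouping column
iris_cor <- iris %>%
  group_by(Species) %>%
  summarize(cor = cor(Sepal.Length, Sepal.Width))

# One row per species, with the correlation computed inside each group
print(iris_cor)
```

The result has one row per level of the grouping column, so a grouping variable with three levels gives three correlations.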
The following example shows how to calculate correlation by group in R.
Calculate Correlation by Group
Let’s see how we can calculate correlation by group in R:
# Import library
library(dplyr)
# Create data frame
df <- data.frame(Machine_name = c("A", "B", "C", "D", "A", "B", "C", "D"),
                 Pressure = c(78.2, 78.2, 71.7, 80.21, 80.21, 82.56, 72.12, 73.85),
                 Temperature = c(35, 36, 36, 38, 32, 32, 31, 34))
# Calculate correlation between Pressure and Temperature grouped by 'Machine_name'
d <- df %>%
  group_by(Machine_name) %>%
  summarize(cor = cor(Pressure, Temperature))
# Print the correlation for each group
print(d)
Output:
# A tibble: 4 × 2
  Machine_name   cor
  <chr>        <dbl>
1 A               -1
2 B               -1
3 C               -1
4 D                1
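The output shows the correlation between the Pressure and Temperature columns of the data frame, grouped by the Machine_name column. Each value is exactly 1 or -1 because every machine has only two observations in this data.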
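When a grouped correlation comes out as exactly 1 or -1, it is worth checking how many rows each group contains, since two points always lie on a straight line. The sketch below repeats the example and adds a row count with n(). The column name n_obs and the result name d_n are illustrative choices, not part of the original example.

```r
library(dplyr)

# Same data as in the example above
df <- data.frame(Machine_name = c("A", "B", "C", "D", "A", "B", "C", "D"),
                 Pressure = c(78.2, 78.2, 71.7, 80.21, 80.21, 82.56, 72.12, 73.85),
                 Temperature = c(35, 36, 36, 38, 32, 32, 31, 34))

# Report the group size next to each correlation
d_n <- df %>%
  group_by(Machine_name) %>%
  summarize(n_obs = n(),
            cor = cor(Pressure, Temperature))

print(d_n)
```

With more observations per machine, the same code would usually return values strictly between -1 and 1, which are more informative about the relationship.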