To calculate correlation by group in R, you can use functions from the dplyr package. This package provides functions to manipulate and summarize data by groups.
In this article, we will explore how to calculate correlation by group in R with examples.
Method: Calculate Correlation by Group
You can use the group_by() function to create groups based on a categorical variable and the summarize() function to apply cor() to two numeric columns within each group. By default, cor() returns the Pearson correlation coefficient.
Here’s the syntax:
df %>%
  group_by(column1) %>%
  summarize(cor = cor(column2, column3))
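One practical detail of this syntax is how cor() treats missing values. The sketch below uses a small made-up data frame (df_na, grp, x, and y are hypothetical names, not part of this article's example) to show that cor() returns NA for a group containing a missing value unless you pass use = "complete.obs", which drops incomplete pairs within each group.

```r
library(dplyr)

# Hypothetical data: group "A" has one missing value in y
df_na <- data.frame(
  grp = c("A", "A", "A", "A", "B", "B", "B", "B"),
  x   = c(1, 2, 3, 4, 1, 2, 3, 4),
  y   = c(2, 4, NA, 8, 8, 6, 4, 2)
)

cor_na <- df_na %>%
  group_by(grp) %>%
  summarize(
    # Default behavior: NA for any group that contains a missing value
    cor_default  = cor(x, y),
    # Use only the rows where both x and y are present
    cor_complete = cor(x, y, use = "complete.obs"),
    .groups = "drop"
  )

print(cor_na)
```

Whether dropping incomplete pairs is appropriate depends on why the values are missing, so treat this as one option rather than a default.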
Example: Calculate Correlation by Group
The following example builds a small data frame and calculates the correlation between two variables within each group:
# Load necessary library
library(dplyr)
# Create data frame
df <- data.frame(Group=c("A","A","A","B","B","B","C","C","C"),
                 Variable1=c(12.39,11.25,12.15,13.48,13.78,12.89,12.21,12.58,11.45),
                 Variable2=c(78,89,85,84,81,79,77,85,82))
# Calculate correlation by group
correlation_by_group <- df %>%
  group_by(Group) %>%
  summarize(cor=cor(Variable1, Variable2))
# Display correlation by group
print(correlation_by_group)
Output: 👇️
# A tibble: 3 × 2
  Group    cor
  <chr>  <dbl>
1 A     -0.888
2 B      0.560
3 C      0.183
In this example, the group_by() function groups the data by the Group column, and the summarize() function calculates the correlation between Variable1 and Variable2 for each group.
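As a possible extension of the example above (not part of the original method), the same summarize() call can report the group size next to the coefficient and compute a rank-based correlation as well. The column names n, pearson, and spearman are chosen here for illustration; n() is dplyr's group-size helper and method = "spearman" is a standard argument of cor().

```r
library(dplyr)

# Same data frame as in the example above
df <- data.frame(Group=c("A","A","A","B","B","B","C","C","C"),
                 Variable1=c(12.39,11.25,12.15,13.48,13.78,12.89,12.21,12.58,11.45),
                 Variable2=c(78,89,85,84,81,79,77,85,82))

summary_by_group <- df %>%
  group_by(Group) %>%
  summarize(
    n        = n(),                                             # observations per group
    pearson  = cor(Variable1, Variable2),                       # default method
    spearman = cor(Variable1, Variable2, method = "spearman"),  # rank-based alternative
    .groups  = "drop"
  )

print(summary_by_group)
```

Showing n alongside each coefficient makes it clear that every group here has only three observations, so the per-group correlations should be read as illustrative rather than as stable estimates.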