To calculate correlation by group in R, you can use the dplyr package, which provides functions to split a data frame into groups and summarize each group separately.

In this article, we will explore how to calculate correlation by group in R with examples.

Method: Calculate Correlation by Group

Use the group_by() function to split the data by a categorical variable, then call cor() inside summarize() to compute the correlation between two numeric variables within each group. By default, cor() returns the Pearson correlation coefficient.

Here’s the syntax:

df %>%
  group_by(column1) %>% 
  summarize(cor=cor(column2, column3)) 
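The cor() call inside summarize() accepts the same arguments as it does anywhere else, so the template can be adapted to other correlation methods or to data with missing values. As a minimal sketch, assuming the built-in iris data set (the data set, the result name spearman_by_species, and the column name spearman_cor are choices made here for illustration only):

```r
library(dplyr)

# Spearman rank correlation per species, using only rows where
# both variables are present (use = "complete.obs")
spearman_by_species <- iris %>%
  group_by(Species) %>%
  summarize(spearman_cor = cor(Sepal.Length, Sepal.Width,
                               method = "spearman",
                               use = "complete.obs"))

print(spearman_by_species)
```

Without a use argument, cor() returns NA for any group that contains a missing value, so setting it explicitly is usually the safer choice on real data.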

The following example applies this syntax to a small data frame.

Calculate Correlation by Group

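The data frame below has three groups (A, B, and C) with three observations each. We calculate the correlation between Variable1 and Variable2 within each group: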

# Load necessary library
library(dplyr)

# Create data frame
df <- data.frame(Group=c("A","A","A","B","B","B","C","C","C"),
                 Variable1=c(12.39,11.25,12.15,13.48,13.78,12.89,12.21,12.58,11.45),
                 Variable2=c(78,89,85,84,81,79,77,85,82))

# Calculate correlation by group
correlation_by_group <- df %>%
  group_by(Group) %>%
  summarize(cor=cor(Variable1, Variable2))

# Display correlation by group
print(correlation_by_group)

Output: 👇️

# A tibble: 3 × 2
  Group    cor
  <chr>  <dbl>
1 A     -0.888
2 B      0.560
3 C      0.183

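In this example, the group_by() function groups the data by the Group column, and the summarize() function calculates the correlation between Variable1 and Variable2 for each group. The result is a tibble with one row per group and a cor column that holds the group's correlation coefficient. With only three observations per group, these coefficients are very sensitive to individual values, so treat them as an illustration of the syntax rather than as reliable estimates.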
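If you want to cross-check the grouped result without dplyr, one option is to split the data frame in base R and apply cor() to each piece. This is a sketch of an alternative approach, not part of the method above; the name cor_base is introduced here for illustration:

```r
# Same data frame as in the example above
df <- data.frame(Group=c("A","A","A","B","B","B","C","C","C"),
                 Variable1=c(12.39,11.25,12.15,13.48,13.78,12.89,12.21,12.58,11.45),
                 Variable2=c(78,89,85,84,81,79,77,85,82))

# Split the rows by Group, then compute the correlation within each piece
cor_base <- sapply(split(df, df$Group),
                   function(g) cor(g$Variable1, g$Variable2))

# Named numeric vector with one correlation per group
print(round(cor_base, 3))
```

Both approaches call cor() on the same subsets of rows, so the values should match the dplyr output.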