To calculate summary statistics by group in R, you can use the **tapply()** function, or compute each statistic yourself using the **group_by()** and **summarise()** functions from the **dplyr** package.

The following methods show the syntax for each approach.

**Method 1: Use tapply() Function**

```
# column2 holds the values to summarize, column1 holds the groups
tapply(df$column2, df$column1, summary)
```

**Method 2: Create Function Manually**

```
library(dplyr)
d <- df %>%
  group_by(column1) %>%
  summarize(min = min(column2),
            q1 = quantile(column2, 0.25),
            median = median(column2),
            mean = mean(column2),
            q3 = quantile(column2, 0.75),
            max = max(column2))
```

The following examples show how to calculate summary statistics by group in R.

## Use tapply() to Calculate Summary Statistics

Let’s see how to calculate summary statistics using the **tapply()** function:

```
# Create data frame
df <- data.frame(Machine_name=c("A","B","C","D","A","B","C","D"),
                 Pressure=c(78.2, 78.2, 71.7, 80.21, 80.21, 82.56, 72.12, 73.85),
                 Temperature=c(35, 36, 36, 38, 32, 32, 31, 34))
# Calculate summary statistics of 'Pressure' grouped by 'Machine_name'
s <- tapply(df$Pressure, df$Machine_name, summary)
# Print summary statistics
print(s)
```

Output:

```
$A
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  78.20   78.70   79.20   79.20   79.71   80.21
$B
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  78.20   79.29   80.38   80.38   81.47   82.56
$C
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  71.70   71.81   71.91   71.91   72.02   72.12
$D
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  73.85   75.44   77.03   77.03   78.62   80.21
```

The output shows the summary statistics of the **Pressure** column, grouped by the **Machine_name** column of the data frame.
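Because **tapply()** returns one summary per group as a list here, it can be convenient to stack those summaries into a single table. The following is a minimal sketch using base R's `do.call()` and `rbind()`; the object name `summary_table` is only illustrative and is not used elsewhere in this tutorial.

```r
# Same example data as above
df <- data.frame(Machine_name = c("A","B","C","D","A","B","C","D"),
                 Pressure = c(78.2, 78.2, 71.7, 80.21, 80.21, 82.56, 72.12, 73.85),
                 Temperature = c(35, 36, 36, 38, 32, 32, 31, 34))

# One summary() result per machine
s <- tapply(df$Pressure, df$Machine_name, summary)

# Stack the per-group summaries row-wise:
# one row per machine, one column per statistic
summary_table <- do.call(rbind, s)

print(summary_table)
```

The result is a numeric matrix, so a single value can be picked out by column name, for example `summary_table[, "Mean"]` for the group means.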

## Create Function to Calculate Summary Statistics by Group

Let’s see how to use the **group_by()** and **summarize()** functions from the **dplyr** package to calculate summary statistics by group:

```
# Import library
library(dplyr)
# Create data frame
df <- data.frame(Machine_name=c("A","B","C","D","A","B","C","D"),
                 Pressure=c(78.2, 78.2, 71.7, 80.21, 80.21, 82.56, 72.12, 73.85),
                 Temperature=c(35, 36, 36, 38, 32, 32, 31, 34))
# Calculate summary statistics of 'Temperature' grouped by 'Machine_name'
d <- df %>%
  group_by(Machine_name) %>%
  summarize(min = min(Temperature),
            q1 = quantile(Temperature, 0.25),
            median = median(Temperature),
            mean = mean(Temperature),
            q3 = quantile(Temperature, 0.75),
            max = max(Temperature))
# Print summary statistics
print(d)
```

Output (values shown unrounded; the printed tibble may round them to about three significant digits):

```
Machine_name  min     q1  median  mean     q3  max
A              32  32.75    33.5  33.5  34.25   35
B              32  33.00    34.0  34.0  35.00   36
C              31  32.25    33.5  33.5  34.75   36
D              34  35.00    36.0  36.0  37.00   38
```

The output shows the summary statistics of the **Temperature** column, grouped by the **Machine_name** column of the data frame.
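If the same set of statistics is needed for several columns, the pipeline can be wrapped in a reusable function. The sketch below assumes dplyr 1.0.0 or later (for the `{{ }}` operator and the `.groups` argument); the function name `summarise_by_group` and its parameter names are hypothetical, chosen for illustration only.

```r
library(dplyr)

# Hypothetical helper: six-number summary of value_col within each group_col
summarise_by_group <- function(data, group_col, value_col) {
  data %>%
    group_by({{ group_col }}) %>%
    summarise(min    = min({{ value_col }}),
              # names = FALSE drops the "25%" / "75%" labels from quantile()
              q1     = quantile({{ value_col }}, 0.25, names = FALSE),
              median = median({{ value_col }}),
              mean   = mean({{ value_col }}),
              q3     = quantile({{ value_col }}, 0.75, names = FALSE),
              max    = max({{ value_col }}),
              .groups = "drop")
}

# Same example data as above
df <- data.frame(Machine_name = c("A","B","C","D","A","B","C","D"),
                 Pressure = c(78.2, 78.2, 71.7, 80.21, 80.21, 82.56, 72.12, 73.85),
                 Temperature = c(35, 36, 36, 38, 32, 32, 31, 34))

# Column names are passed unquoted, as in other dplyr verbs
result <- summarise_by_group(df, Machine_name, Temperature)
print(result)
```

Calling `summarise_by_group(df, Machine_name, Pressure)` would give the same layout for the **Pressure** column without repeating the pipeline.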