To calculate summary statistics in R, you can use either the base R summary() function or the summarize() function from the dplyr package.

The basic syntax for each method is shown below.

Method 1: Use summary() Function

summary(df)

Method 2: Use summarize() Function from dplyr Package

library(dplyr)

# Store the result under a name that does not mask the base summary() function
summary_stats <- df %>%
  summarize(
    Mean = mean(column1),
    Median = median(column1),
    Min = min(column1),
    Max = max(column1),
    StdDev = sd(column1),
    Variance = var(column1)
  )
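
One detail the syntax above does not cover: mean(), median(), min(), max(), sd(), and var() all return NA when the column contains a missing value, unless na.rm = TRUE is passed. The following is a minimal base R sketch of that behavior; the vector x is made up for illustration and is not part of the data used in this article.

```r
# Hypothetical vector with one missing value (illustration only)
x <- c(110, 97, NA, 89)

# Without na.rm, the missing value propagates to the result
mean_with_na <- mean(x)

# With na.rm = TRUE, the NA is dropped before the statistic is computed
mean_without_na <- mean(x, na.rm = TRUE)
sd_without_na <- sd(x, na.rm = TRUE)

print(mean_with_na)
print(mean_without_na)
print(sd_without_na)
```

The same na.rm = TRUE argument can be passed to each function inside summarize(). summary() needs no such argument, since it reports the number of NA values separately.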

The following examples show how to use these methods to calculate summary statistics in R.

Use summary() Function

Let’s see how we can use the summary() function to calculate summary statistics for a dataframe.

# Create dataframe
df <- data.frame(
  Start_date = as.Date(c("2000-05-21", "2000-05-22", "2000-05-23",
                         "2000-05-24", "2000-05-25", "2000-05-26")),
  Machine_name = c("Machine1", "Machine2", "Machine1",
                   "Machine3", "Machine2", "Machine3"),
  Value = c(108, 120, 135, 95, 98, 105),
  Reading = c(110, 97, 91, 89, 80, 85)
)

# Calculate summary statistics of dataframe
d <- summary(df)

# Show summary statistics of dataframe
print(d)

Output:

   Start_date         Machine_name           Value           Reading     
 Min.   :2000-05-21   Length:6           Min.   : 95.00   Min.   : 80.0  
 1st Qu.:2000-05-22   Class :character   1st Qu.: 99.75   1st Qu.: 86.0  
 Median :2000-05-23   Mode  :character   Median :106.50   Median : 90.0  
 Mean   :2000-05-23                      Mean   :110.17   Mean   : 92.0  
 3rd Qu.:2000-05-24                      3rd Qu.:117.00   3rd Qu.: 95.5  
 Max.   :2000-05-26                      Max.   :135.00   Max.   :110.0 

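The output covers every column of the dataframe. For the date column (Start_date) and the numeric columns (Value and Reading), it shows the minimum, first quartile, median, mean, third quartile, and maximum. For the character column (Machine_name), it shows only the length, class, and mode.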
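
summary() reports on every column it is given, so if only the numeric columns are of interest, one option is to select them first. The following is a small sketch of that idea using base R only; the names numeric_cols, numeric_summary, and value_summary are chosen here for illustration.

```r
# Same dataframe as in the example above
df <- data.frame(
  Start_date = as.Date(c("2000-05-21", "2000-05-22", "2000-05-23",
                         "2000-05-24", "2000-05-25", "2000-05-26")),
  Machine_name = c("Machine1", "Machine2", "Machine1",
                   "Machine3", "Machine2", "Machine3"),
  Value = c(108, 120, 135, 95, 98, 105),
  Reading = c(110, 97, 91, 89, 80, 85)
)

# Keep only the columns for which is.numeric() is TRUE
numeric_cols <- df[sapply(df, is.numeric)]

# Summarize just those columns
numeric_summary <- summary(numeric_cols)
print(numeric_summary)

# summary() also accepts a single column
value_summary <- summary(df$Value)
print(value_summary)
```

Selecting columns first keeps the output focused on the numeric columns and avoids the Length/Class/Mode entries shown for character columns.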

Use summarize() Function from dplyr

Let’s see how we can use the summarize() function from the dplyr package to calculate summary statistics for a single column:

# Import library
library(dplyr)

# Create dataframe
df <- data.frame(
  Start_date = as.Date(c("2000-05-21", "2000-05-22", "2000-05-23",
                         "2000-05-24", "2000-05-25", "2000-05-26")),
  Machine_name = c("Machine1", "Machine2", "Machine1",
                   "Machine3", "Machine2", "Machine3"),
  Value = c(108, 120, 135, 95, 98, 105),
  Reading = c(110, 97, 91, 89, 80, 85)
)

# Get statistical values
summary_reading <- df %>%
  summarize(
    Mean_reading = mean(Reading),
    Median_reading = median(Reading),
    Min_reading = min(Reading),
    Max_reading = max(Reading),
    StdDev_reading = sd(Reading),
    Variance_reading = var(Reading)
  )

# Print statistical values
print(summary_reading)

Output:

  Mean_reading Median_reading Min_reading Max_reading StdDev_reading Variance_reading
1           92             90          80         110       10.50714            110.4

The output shows the mean, median, minimum, maximum, standard deviation, and variance of the Reading column.
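
A common extension of this pattern is to compute the same statistics separately for each group. The following sketch, which assumes dplyr is installed, groups the data by Machine_name before calling summarize(); the result name reading_by_machine and the column N are chosen here for illustration.

```r
library(dplyr)

# Same dataframe as in the example above
df <- data.frame(
  Start_date = as.Date(c("2000-05-21", "2000-05-22", "2000-05-23",
                         "2000-05-24", "2000-05-25", "2000-05-26")),
  Machine_name = c("Machine1", "Machine2", "Machine1",
                   "Machine3", "Machine2", "Machine3"),
  Value = c(108, 120, 135, 95, 98, 105),
  Reading = c(110, 97, 91, 89, 80, 85)
)

# One row of statistics per machine
reading_by_machine <- df %>%
  group_by(Machine_name) %>%
  summarize(
    Mean_reading = mean(Reading),
    Max_reading = max(Reading),
    N = n()  # number of rows in each group
  )

print(reading_by_machine)
```

Because group_by() comes before summarize(), the result has one row per distinct value of Machine_name instead of a single row for the whole dataframe.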