There are several ways to find outliers in R, and a common one is the Interquartile Range (IQR) rule. Values that fall more than 1.5 times the IQR below the first quartile (Q1) or above the third quartile (Q3) are identified as outliers.
The following syntax shows how to apply this rule.
Method: Use IQR() Function
q1 <- quantile(df$column, 0.25) # Compute the first quartile (25th percentile)
q3 <- quantile(df$column, 0.75) # Compute the third quartile (75th percentile)
iqr <- IQR(df$column) # Compute the interquartile range
outliers <- subset(df, df$column < (q1 - 1.5 * iqr) | df$column > (q3 + 1.5 * iqr)) # Keep rows outside the 1.5*IQR limits
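The same steps can be wrapped in a small reusable function. The sketch below is only an illustration: the function name find_iqr_outliers, its arguments, and the readings data frame are hypothetical and not part of the original syntax.

```r
# Hypothetical helper: return the rows of `data` whose `column` value
# lies more than k * IQR below Q1 or above Q3.
find_iqr_outliers <- function(data, column, k = 1.5) {
  x <- data[[column]]
  q <- quantile(x, c(0.25, 0.75), na.rm = TRUE)  # Q1 and Q3
  iqr_value <- q[[2]] - q[[1]]                   # interquartile range
  lower <- q[[1]] - k * iqr_value                # lower limit
  upper <- q[[2]] + k * iqr_value                # upper limit
  data[!is.na(x) & (x < lower | x > upper), , drop = FALSE]
}

# Made-up data for illustration only
readings <- data.frame(id = 1:6, value = c(10, 12, 11, 13, 12, 40))
res <- find_iqr_outliers(readings, "value")
print(res)  # only the row with value 40 is returned
```

Passing the column name as a string keeps the function usable with any data frame, and the k argument allows a stricter or looser multiplier than 1.5.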
The following example shows how to find outliers in R using the quantile() and IQR() functions.
Find Outliers Using IQR() Function in R
Let’s see how we can find outliers for one of the columns of a data frame in R:
# Create data frame
df <- data.frame(Machine_name=c("A","B","C","D","E","F","G","H"),
Pressure1=c(78.2, 28, 71.7, 80.21, 72.7, 30, 84.21, 76.2),
Temperature1=c(31, 33, 36, 37, 36, 33, 37, 31),
Status=c(TRUE,TRUE,FALSE,TRUE,FALSE,TRUE,TRUE,TRUE))
# Find Q1, Q3, and interquartile range for Pressure1 column
Q1 <- quantile(df$Pressure1, 0.25)
Q3 <- quantile(df$Pressure1, 0.75)
iqr <- IQR(df$Pressure1) # named iqr so it does not mask the IQR() function
# Subset rows where Pressure1 is more than 1.5*IQR below Q1 or above Q3
outliers <- subset(df, df$Pressure1 < (Q1 - 1.5 * iqr) | df$Pressure1 > (Q3 + 1.5 * iqr))
# Print outliers
print(outliers)
Output:
  Machine_name Pressure1 Temperature1 Status
2            B        28           33   TRUE
6            F        30           33   TRUE
The output shows the rows identified as outliers based on the Pressure1 column of the data frame: machines B and F, whose Pressure1 values (28 and 30) are far lower than the others.
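To see why these two rows are flagged, it can help to print the limits themselves. The following is a minimal sketch that repeats the Pressure1 values from the example. The names pressure, lower_limit, upper_limit, and flagged are chosen here for illustration.

```r
# Pressure1 values copied from the example data frame
pressure <- c(78.2, 28, 71.7, 80.21, 72.7, 30, 84.21, 76.2)

q1 <- unname(quantile(pressure, 0.25))  # first quartile
q3 <- unname(quantile(pressure, 0.75))  # third quartile
iqr_value <- IQR(pressure)              # interquartile range

lower_limit <- q1 - 1.5 * iqr_value     # about 35.13
upper_limit <- q3 + 1.5 * iqr_value     # about 104.84
print(c(lower = lower_limit, upper = upper_limit))

# Values outside the limits: 28 and 30
flagged <- pressure[pressure < lower_limit | pressure > upper_limit]
print(flagged)
```

Both 28 and 30 fall below the lower limit, while no value exceeds the upper limit, which matches the two rows in the output.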