In R, you can remove rows that contain NA values from a data frame with na.omit(), complete.cases(), rowSums() combined with is.na(), or the drop_na() function of the tidyr package. To remove only the rows in which every value is NA, combine rowSums() with is.na() and ncol(), either with base R indexing or inside the filter() function of the dplyr package.
Each of these approaches takes a data frame, checks its rows for missing (NA) values, and returns a data frame without the affected rows.
In this tutorial, we will discuss how to remove rows with NA (missing) values in R.
Remove Rows with NA in R using is.na() function
Using the rowSums() function along with the is.na() function, you can remove rows with NA values from a data frame.
Let’s practice with an example to understand how to remove NA rows from a data frame.
Create a data frame in R using the data.frame() function.
# Create a data frame
emp_info <- data.frame(
name = c("Tom","Keore","Kim","Harsh","Ola","Gui","Ted","Mike"),
age = c(27,34,NA,34,32,25,29,NA),
salary = c(4500,NA,8900,5433,NA,2350,6500,NA)
)
# Print the data frame
emp_info
The above R code creates a data frame with 3 columns and 8 rows. The age and salary columns have NA values for some observations.
name age salary
1 Tom 27 4500
2 Keore 34 NA
3 Kim NA 8900
4 Harsh 34 5433
5 Ola 32 NA
6 Gui 25 2350
7 Ted 29 6500
8 Mike NA NA
To remove rows with NA in R, use the following code.
# Using rowSums() along with is.na() to remove rows with NA
df2 <- emp_info[rowSums(is.na(emp_info)) == 0,]
# Print the data frame
df2
In the above R code, is.na() marks every missing value, rowSums() counts the NA values in each row, and only the rows with a count of 0 are kept.
The output no longer contains rows 2, 3, 5 and 8, as they contain NA values in the age or salary column.
name age salary
1 Tom 27 4500
4 Harsh 34 5433
6 Gui 25 2350
7 Ted 29 6500
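To see why the rowSums() and is.na() combination works, it can help to look at the intermediate values. The following is a minimal sketch that rebuilds the emp_info data frame and stores the per-row NA counts; the names na_counts and keep are illustrative and not part of the original example.

```r
# Rebuild the example data frame so the snippet is self-contained
emp_info <- data.frame(
  name = c("Tom", "Keore", "Kim", "Harsh", "Ola", "Gui", "Ted", "Mike"),
  age = c(27, 34, NA, 34, 32, 25, 29, NA),
  salary = c(4500, NA, 8900, 5433, NA, 2350, 6500, NA)
)

# is.na() returns a logical matrix with the same shape as the data frame;
# rowSums() then counts the TRUE values (the NAs) in each row
na_counts <- rowSums(is.na(emp_info))
na_counts

# Rows with a count of 0 have no missing values (illustrative helper name)
keep <- na_counts == 0

# Index the data frame with the logical vector to keep complete rows only
df2 <- emp_info[keep, ]
df2
```

Splitting the expression into steps like this makes it easy to change the condition, for example to keep rows with at most one NA by using na_counts <= 1.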
Remove Rows with NA using na.omit() function
Using the na.omit() function, we can remove the rows having NA values. A data frame is passed as an argument to the na.omit() function.
Let’s take the above emp_info data frame to illustrate removing rows with NA values.
# Using na.omit() to remove rows with NA from a data frame
df1 <- na.omit(emp_info)
# Print the data frame
df1
The output of the above R code is the data frame without the rows that contain NA values:
name age salary
1 Tom 27 4500
4 Harsh 34 5433
6 Gui 25 2350
7 Ted 29 6500
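na.omit() also records which rows it dropped in the "na.action" attribute of its result. The sketch below, which rebuilds emp_info so it runs on its own, reads that attribute; the variable name dropped is only illustrative.

```r
# Rebuild the example data frame so the snippet is self-contained
emp_info <- data.frame(
  name = c("Tom", "Keore", "Kim", "Harsh", "Ola", "Gui", "Ted", "Mike"),
  age = c(27, 34, NA, 34, 32, 25, 29, NA),
  salary = c(4500, NA, 8900, 5433, NA, 2350, 6500, NA)
)

# Remove every row that has at least one NA
df1 <- na.omit(emp_info)

# The positions of the removed rows are stored in the "na.action" attribute;
# as.integer() strips the names and class so only the positions remain
dropped <- as.integer(attr(df1, "na.action"))
dropped
```

This can be useful when you want to report or inspect the observations that were excluded.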
Remove Rows with All NA Values in R
The rowSums() function along with is.na() and ncol() can be used to remove the rows in which every value is NA.
Let’s create a new data frame to illustrate removing rows that consist entirely of NA values.
# Create a data frame
num_data <- data.frame(a = c(2, 8, 3, NA, 3, 2), # Create example data
b = c("A", NA, "C", NA, "B", "D"),
c = c(3, 7, NA, NA, 4, 5))
# Print the data frame
num_data
The above R code creates a data frame with 3 columns and 6 rows.
Let’s remove the rows in which all values are NA: a row is kept only if its NA count differs from the number of columns.
# Using rowSums() along with is.na() and ncol() to remove rows with all NA values
df3 <- num_data[rowSums(is.na(num_data)) != ncol(num_data),]
# Print the data frame
df3
The above R code removes only the rows that have NA values in all columns. Row number 4 is removed because all of its values are NA, while rows 2 and 3 are kept because they are only partially missing.
a b c
1 2 A 3
2 8 <NA> 7
3 3 C NA
5 3 B 4
6 2 D 5
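The condition used above can be split into named steps to make the logic easier to follow. This is a minimal sketch that rebuilds num_data; the name all_na is an illustrative helper, not part of the original code.

```r
# Rebuild the example data frame so the snippet is self-contained
num_data <- data.frame(
  a = c(2, 8, 3, NA, 3, 2),
  b = c("A", NA, "C", NA, "B", "D"),
  c = c(3, 7, NA, NA, 4, 5)
)

# TRUE for rows whose NA count equals the number of columns,
# that is, rows in which every value is missing
all_na <- rowSums(is.na(num_data)) == ncol(num_data)

# Positions of the fully missing rows
which(all_na)

# Keep every row that is not entirely NA
df3 <- num_data[!all_na, ]
df3
```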
Remove NA Rows from Data Frame using drop_na() function
R provides many methods to remove rows with NA values from a data frame. One of them is the drop_na() function of the tidyr package.
Let’s use the emp_info data frame for illustration purposes.
# Using drop_na() function of tidyr package to remove na rows
library(tidyr)
df4 <- emp_info %>% drop_na()
# Print the data frame
df4
In the above R code, we load the tidyr package and pass the data frame to the drop_na() function using the %>% operator.
The output of the above R code after removing NA rows from the data frame is shown below. Note that the row numbers are renumbered from 1.
name age salary
1 Tom 27 4500
2 Harsh 34 5433
3 Gui 25 2350
4 Ted 29 6500
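drop_na() also accepts column names, in which case only those columns are checked for missing values. The following sketch assumes the tidyr package is installed and rebuilds emp_info; the name df_age is illustrative.

```r
library(tidyr)

# Rebuild the example data frame so the snippet is self-contained
emp_info <- data.frame(
  name = c("Tom", "Keore", "Kim", "Harsh", "Ola", "Gui", "Ted", "Mike"),
  age = c(27, 34, NA, 34, 32, 25, 29, NA),
  salary = c(4500, NA, 8900, 5433, NA, 2350, 6500, NA)
)

# Drop rows only when the age column is NA;
# NA values in other columns (here: salary) are left in place
df_age <- drop_na(emp_info, age)
df_age
```

Here the rows for Kim and Mike are dropped because their age is missing, while Keore and Ola are kept even though their salary is NA.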
Remove Rows with Only NA Values from Data Frame using filter function
The dplyr package has a filter() function that keeps the rows of a data frame that satisfy a condition.
The filter() function takes a data frame as its first argument. Combined with rowSums(), is.na() and ncol(), it keeps every row that is not made up entirely of NA values.
Let’s use the above num_data data frame for illustration purposes.
# Using dplyr to remove rows having only NA values
library("dplyr")
df5 <- filter(num_data, rowSums(is.na(num_data)) != ncol(num_data))
# Print the data frame
df5
In the above R code, we have used the dplyr package. The filter() function, along with rowSums(), is.na() and ncol(), finds the rows that contain only NA values and removes them from the data frame.
Row number 4 in the num_data data frame has all NA values, so it is removed.
The output of the above R code is:
a b c
1 2 A 3
2 8 <NA> 7
3 3 C NA
4 3 B 4
5 2 D 5
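The same result can also be written with dplyr's own helpers instead of rowSums(). The sketch below is one possible alternative, assuming dplyr 1.0.4 or later (where if_any() is available); the name df5_alt is illustrative.

```r
library(dplyr)

# Rebuild the example data frame so the snippet is self-contained
num_data <- data.frame(
  a = c(2, 8, 3, NA, 3, 2),
  b = c("A", NA, "C", NA, "B", "D"),
  c = c(3, 7, NA, NA, 4, 5)
)

# Keep a row if at least one of its columns is not NA,
# which drops only the rows that are entirely NA
df5_alt <- filter(num_data, if_any(everything(), ~ !is.na(.x)))
df5_alt
```

Replacing if_any() with if_all() would instead keep only the rows with no NA values at all.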
Remove Rows with NA using complete.cases() function
Use the complete.cases() function in R to remove rows with NA values from a data frame.
Let’s use the above emp_info data frame.
# Using complete.cases() to remove rows with NA
df6 <- emp_info[complete.cases(emp_info),]
# Print the data frame
df6
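In the above R code, the complete.cases() function takes a data frame as an argument and returns a logical vector that is TRUE for every row without NA values. Indexing the data frame with this vector keeps only the complete rows.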
The output of the above R code is:
name age salary
1 Tom 27 4500
4 Harsh 34 5433
6 Gui 25 2350
7 Ted 29 6500
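Because complete.cases() returns a logical vector, it can be inspected on its own or applied to a subset of columns. The following is a minimal sketch that rebuilds emp_info; the names cc, cc_age and df_age are illustrative.

```r
# Rebuild the example data frame so the snippet is self-contained
emp_info <- data.frame(
  name = c("Tom", "Keore", "Kim", "Harsh", "Ola", "Gui", "Ted", "Mike"),
  age = c(27, 34, NA, 34, 32, 25, 29, NA),
  salary = c(4500, NA, 8900, 5433, NA, 2350, 6500, NA)
)

# One logical value per row: TRUE when the row has no NA in any column
cc <- complete.cases(emp_info)
cc

# Check only the name and age columns, ignoring NA values in salary
cc_age <- complete.cases(emp_info[, c("name", "age")])

# Keep the rows that are complete with respect to those two columns
df_age <- emp_info[cc_age, ]
df_age
```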
Conclusion
I hope the above article on how to remove rows with NA in R using na.omit(), complete.cases(), rowSums() with is.na(), drop_na() and filter() is helpful.
The R programming language provides several ways to remove rows with NA values from a data frame, whether a row contains a single NA or consists entirely of NA values.