The duplicated() function in R identifies duplicate values in a vector or data frame. It returns a logical vector in which TRUE marks each element (or row) that repeats an earlier one.
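As a minimal sketch of that behavior on a plain vector, the snippet below uses a made-up vector x (not part of the data set used later in this article) to show the logical flags and how they can be used for subsetting:

```r
# Hypothetical example vector, chosen only for illustration
x <- c("A", "B", "A", "C", "B", "A")

# TRUE marks an element that already appeared earlier in the vector
dup_flags <- duplicated(x)
print(dup_flags)
# [1] FALSE FALSE  TRUE FALSE  TRUE  TRUE

# Keep only the repeated elements
print(x[dup_flags])
# [1] "A" "B" "A"
```

Note that the first occurrence of each value is never flagged, only the later repeats.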
The following method shows the syntax for using the duplicated() function in R.
Method: Use duplicated() Function
dataframe[duplicated(dataframe$column_name), ]
The following example shows how to use this method.
First, create a data frame in which some values in the Machine_name column are repeated:
# Define data frame
df <- data.frame(Machine_name=c("A","B","C","D","A","A","G","B"),
Temperature=c(12,9,14,13,18,28,22,23),
Pressure=c(20,25,27,29,30,32,39,40))
# Print data frame
print(df)
Output:
  Machine_name Temperature Pressure
1            A          12       20
2            B           9       25
3            C          14       27
4            D          13       29
5            A          18       30
6            A          28       32
7            G          22       39
8            B          23       40
The output shows that the Machine_name values A and B appear in more than one row.
The following examples show how to use the duplicated() function in R.
Get Duplicate Rows Using duplicated() Function in R
Let’s see how we can use the duplicated() function on a data frame:
# Define data frame
df <- data.frame(Machine_name=c("A","B","C","D","A","A","G","B"),
Temperature=c(12,9,14,13,18,28,22,23),
Pressure=c(20,25,27,29,30,32,39,40))
# Find duplicate rows
df[duplicated(df$Machine_name), ]
Output:
  Machine_name Temperature Pressure
5            A          18       30
6            A          28       32
8            B          23       40
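As the output shows, three rows have a Machine_name value that already appeared in an earlier row. The first occurrence of each value (rows 1 and 2) is not flagged as a duplicate.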
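If every row that shares a repeated Machine_name is wanted, including the first occurrence, one possible approach is to combine a forward scan with a backward scan using the fromLast argument of duplicated(). This is a sketch of that idea on the same data frame; the variable name all_dups is only illustrative:

```r
# Define data frame (same data as above)
df <- data.frame(Machine_name=c("A","B","C","D","A","A","G","B"),
                 Temperature=c(12,9,14,13,18,28,22,23),
                 Pressure=c(20,25,27,29,30,32,39,40))

# Flag repeats scanning forward, then scanning backward, and combine both
all_dups <- duplicated(df$Machine_name) |
  duplicated(df$Machine_name, fromLast = TRUE)

# Rows 1, 2, 5, 6 and 8 all carry a Machine_name that occurs more than once
df[all_dups, ]
```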
Count Duplicate Rows Using sum() Function in R
We can count duplicate rows using the sum() function, because sum() treats each TRUE returned by duplicated() as 1. The syntax is sum(duplicated(dataframe$column)).
Here is an example that counts duplicate rows in the data frame:
# Define data frame
df <- data.frame(Machine_name=c("A","B","C","D","A","A","G","B"),
Temperature=c(12,9,14,13,18,28,22,23),
Pressure=c(20,25,27,29,30,32,39,40))
# Count duplicate rows
sum(duplicated(df$Machine_name))
Output:
[1] 3
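The count above is based on a single column. As a hedged illustration of the difference, the sketch below passes the whole data frame to duplicated(), which compares complete rows, and also lists which Machine_name values are repeated; the variable names full_row_count and repeated_names are only illustrative:

```r
# Define data frame (same data as above)
df <- data.frame(Machine_name=c("A","B","C","D","A","A","G","B"),
                 Temperature=c(12,9,14,13,18,28,22,23),
                 Pressure=c(20,25,27,29,30,32,39,40))

# Count rows that are identical to an earlier row in every column
full_row_count <- sum(duplicated(df))
print(full_row_count)
# [1] 0

# List the distinct Machine_name values that occur more than once
repeated_names <- unique(df$Machine_name[duplicated(df$Machine_name)])
print(repeated_names)
# [1] "A" "B"
```

No complete row is repeated in this data set, so the whole-row count is 0 even though three Machine_name values are repeats.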
Using the duplicated() function, you can easily find duplicate rows in a dataset and count them.