To calculate Jaccard similarity in R, you can define a custom function. The Jaccard similarity index measures how similar two sets are. It ranges from 0 to 1; the higher the value, the more similar the two sets. It is calculated as the number of distinct elements that appear in both sets divided by the number of distinct elements that appear in either set.
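Written as a formula, for two sets A and B (symbols introduced here only for illustration), the definition reads:

```latex
J(A, B) = \frac{|A \cap B|}{|A \cup B|}
```

For example, if A = {1, 2, 3} and B = {2, 3, 4}, the sets share 2 elements out of 4 distinct elements overall, so J(A, B) = 2/4 = 0.5.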
The following method shows the syntax for defining such a function.
Method: Define a Function for Jaccard Similarity
jaccard_similarity <- function(set1, set2) {
  # Number of distinct values found in both inputs
  intersect_set <- length(intersect(set1, set2))
  # Number of distinct values found in either input
  union_set <- length(union(set1, set2))
  return(intersect_set / union_set)
}
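One case the function above does not cover is empty input: if both sets are empty, the union has length 0 and the division returns NaN. The following is a minimal sketch of one way to guard against that. The name jaccard_similarity_safe is hypothetical, and returning NA for two empty sets and dropping missing values are assumptions made for this example, not part of the method above.

```r
# Hypothetical helper (not part of the original method): guards against empty input.
jaccard_similarity_safe <- function(set1, set2) {
  # Assumption: missing values should not count as shared elements, so drop them
  set1 <- unique(set1[!is.na(set1)])
  set2 <- unique(set2[!is.na(set2)])
  union_size <- length(union(set1, set2))
  # Assumption: the similarity of two empty sets is undefined, reported as NA
  if (union_size == 0) {
    return(NA_real_)
  }
  length(intersect(set1, set2)) / union_size
}

jaccard_similarity_safe(c(1, 2, 3), c(2, 3, 4))  # 2 shared / 4 distinct = 0.5
jaccard_similarity_safe(numeric(0), numeric(0))  # NA
```

Some conventions define the similarity of two empty sets as 1 instead; if that suits your data better, change the value returned in the guard.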
The following example shows how to calculate Jaccard similarity in R.
Calculate Jaccard Similarity
Let's see how we can use the function to calculate the Jaccard similarity between two columns of a data frame:
# Define the data frame
df <- data.frame(Pressure=c(12.39,11.25,12.15,13.48,13.78,12.89,12.21,12.58),
                 Temperature=c(7,7,1,8,8,7,7,5),
                 Humidity=c(5,7,1,2,7,8,9,4))
# Define the Jaccard similarity function
jaccard_similarity <- function(set1, set2) {
  intersect_set <- length(intersect(set1, set2))
  union_set <- length(union(set1, set2))
  return(intersect_set / union_set)
}
# Call the function on two columns
d <- jaccard_similarity(df$Temperature, df$Humidity)
# Print the Jaccard similarity
print(d)
Output:
[1] 0.5714286
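The output shows the Jaccard similarity between the Temperature and Humidity columns of the data frame. The two columns share 4 of the 7 distinct values that appear in either column, so the result is 4/7, or about 0.571.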
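The same function can also be applied to every pair of columns at once. The sketch below is one possible way to do that with nested sapply() calls; the variable name jaccard_matrix is introduced here for illustration. Because the function compares exact values, a continuous column such as Pressure shares no values with the other two columns and scores 0 against them.

```r
# Same data frame and function as in the example above
df <- data.frame(Pressure=c(12.39,11.25,12.15,13.48,13.78,12.89,12.21,12.58),
                 Temperature=c(7,7,1,8,8,7,7,5),
                 Humidity=c(5,7,1,2,7,8,9,4))

jaccard_similarity <- function(set1, set2) {
  intersect_set <- length(intersect(set1, set2))
  union_set <- length(union(set1, set2))
  return(intersect_set / union_set)
}

# Build a symmetric matrix with one row and one column per data frame column
# (jaccard_matrix is an illustrative name, not from the original example)
jaccard_matrix <- sapply(df, function(a) {
  sapply(df, function(b) jaccard_similarity(a, b))
})

# The diagonal is 1, since every column is identical to itself
print(round(jaccard_matrix, 3))
```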