Bivariate analysis in R examines the relationship between two variables. Common techniques include computing a correlation coefficient, fitting a regression model, and visualizing the data with a scatter plot.

To perform a bivariate analysis, follow the steps below:

## Step 1: Load required libraries

library(tidyverse)


This loads the tidyverse collection of packages. The steps below use only base R functions (cor(), lm(), and plot()), so they also run without it.

## Step 2: Create a dataset

You can load a built-in dataset or create your own data frame to analyze:

# Create data frame
df <- data.frame(Machine_name = c("A", "B", "C", "D", "E", "F", "G", "H"),
                 Pressure = c(12.39, 11.25, 12.15, 13.48, 13.78, 12.89, 12.21, 12.58),
                 Temperature = c(78, 89, 85, 84, 81, 79, 77, 85),
                 Status = c(TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, TRUE, FALSE))

print(df)


Output:

  Machine_name Pressure Temperature Status
1            A    12.39          78   TRUE
2            B    11.25          89   TRUE
3            C    12.15          85  FALSE
4            D    13.48          84   TRUE
5            E    13.78          81  FALSE
6            F    12.89          79  FALSE
7            G    12.21          77   TRUE
8            H    12.58          85  FALSE


The output shows the data frame created above: eight machines, each with a Pressure reading, a Temperature reading, and a logical Status flag.
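
The text above mentions built-in datasets as an alternative to creating a data frame by hand. As a minimal sketch, assuming the mtcars dataset that ships with base R (the dataset and the mpg and wt columns are illustrative choices, not part of the original example), two variables can be pulled out for analysis like this:

```r
# Load a built-in dataset (mtcars ships with R's datasets package)
data(mtcars)

# Keep only the two variables of interest for a bivariate analysis
# (mpg and wt are illustrative choices, not part of the original example)
cars_df <- mtcars[, c("mpg", "wt")]

# Inspect the first rows and the column types
print(head(cars_df))
str(cars_df)
```

The remaining steps would then apply to cars_df in the same way they apply to df.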

## Step 3: Correlation Analysis

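Compute the Pearson correlation coefficient between two numeric columns of the data frame with the cor() function:

# Calculate the correlation coefficient
# (stored as r to avoid reusing the name of the base function c())
r <- cor(df$Pressure, df$Temperature)

# Display the correlation coefficient
print(r)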


Output:

[1] -0.3579008


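The coefficient of about -0.36 indicates a weak negative linear relationship between Pressure and Temperature: in this sample, higher temperatures tend to go with slightly lower pressures.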
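
To make the coefficient less of a black box, the following sketch recomputes Pearson's r from its definition (covariance divided by the product of the standard deviations) and then checks whether it differs significantly from zero. The use of cor.test() and the variable names r_manual and ct are additions for illustration and are not part of the original steps.

```r
# Same Pressure and Temperature values as in the example data frame
df <- data.frame(
  Pressure = c(12.39, 11.25, 12.15, 13.48, 13.78, 12.89, 12.21, 12.58),
  Temperature = c(78, 89, 85, 84, 81, 79, 77, 85)
)

# Pearson r from its definition: cov(x, y) / (sd(x) * sd(y))
r_manual <- cov(df$Pressure, df$Temperature) /
  (sd(df$Pressure) * sd(df$Temperature))

# Confirm that the manual value agrees with cor()
print(all.equal(r_manual, cor(df$Pressure, df$Temperature)))

# Test the null hypothesis that the true correlation is zero
ct <- cor.test(df$Pressure, df$Temperature)
print(ct$estimate)
print(ct$p.value)
```

With only eight observations the test has little power, so a weak correlation like this one is unlikely to reach significance. In a simple linear regression of one variable on the other, the slope has the same p-value as this test.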

## Step 4: Regression Analysis

To fit a simple linear regression, use the lm() function. Here Pressure is the response and Temperature is the predictor:

# Fit a simple linear regression model
model <- lm(Pressure ~ Temperature, data = df)

# Compute the summary of the fitted model
model_summary <- summary(model)

# Display the summary
print(model_summary)


Output:

Call:
lm(formula = Pressure ~ Temperature, data = df)

Residuals:
     Min       1Q   Median       3Q      Max
-0.87778 -0.55523 -0.08842  0.38541  1.10292

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 18.23874    6.02198   3.029   0.0231 *
Temperature -0.06866    0.07313  -0.939   0.3840
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.8061 on 6 degrees of freedom
Multiple R-squared:  0.1281,	Adjusted R-squared:  -0.01722
F-statistic: 0.8815 on 1 and 6 DF,  p-value: 0.384


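The fitted model is Pressure = 18.24 - 0.069 × Temperature. The slope is not statistically significant (p = 0.384), and the model explains only about 13% of the variance in Pressure (R-squared = 0.1281), so this sample provides little evidence of a linear relationship.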
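
Beyond printing the summary, the fitted model object can be queried directly. The sketch below shows one way to extract the coefficients, get confidence intervals, and predict Pressure for a new Temperature. The value 80 and the names model, new_obs, and pred are hypothetical and chosen only for illustration.

```r
# Same Pressure and Temperature values as in the example data frame
df <- data.frame(
  Pressure = c(12.39, 11.25, 12.15, 13.48, 13.78, 12.89, 12.21, 12.58),
  Temperature = c(78, 89, 85, 84, 81, 79, 77, 85)
)

# Fit the same model as in the regression step
model <- lm(Pressure ~ Temperature, data = df)

# Estimated intercept and slope
print(coef(model))

# 95% confidence intervals for both coefficients
print(confint(model))

# In simple linear regression, R-squared equals the squared Pearson correlation
print(summary(model)$r.squared)
print(cor(df$Pressure, df$Temperature)^2)

# Predicted Pressure at a hypothetical Temperature of 80
new_obs <- data.frame(Temperature = 80)
pred <- predict(model, newdata = new_obs)
print(pred)
```

Because the slope is not significant, a prediction like this should be read as little more than the sample mean of Pressure adjusted by a small, uncertain amount.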

## Step 5: Visualization

To visualize the relationship between two columns of the data frame, draw a scatter plot with the plot() function:

# Create scatterplot of Pressure vs. Temperature
plot(df$Pressure, df$Temperature, pch = 16, col = 'steelblue',
     main = 'Pressure vs. Temperature',
     xlab = 'Pressure', ylab = 'Temperature')


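Output:

The code produces a scatter plot with Pressure on the x-axis and Temperature on the y-axis. The points are widely scattered with only a slight downward trend, which is consistent with the weak negative correlation between the two columns.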
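
A scatter plot is often easier to read with the fitted regression line drawn over it. The sketch below is one way to do that in base R. It swaps the axes relative to the plot above so that the predictor (Temperature) is on the x-axis, matching the formula Pressure ~ Temperature. The colors and line width are arbitrary choices.

```r
# Same Pressure and Temperature values as in the example data frame
df <- data.frame(
  Pressure = c(12.39, 11.25, 12.15, 13.48, 13.78, 12.89, 12.21, 12.58),
  Temperature = c(78, 89, 85, 84, 81, 79, 77, 85)
)

# Fit the same model as in the regression step
model <- lm(Pressure ~ Temperature, data = df)

# Scatter plot with the predictor on the x-axis and the response on the y-axis
plot(df$Temperature, df$Pressure, pch = 16, col = "steelblue",
     main = "Pressure vs. Temperature with fitted line",
     xlab = "Temperature", ylab = "Pressure")

# Overlay the fitted regression line using the model's intercept and slope
abline(model, col = "red", lwd = 2)
```

The nearly flat line and the spread of points around it give a visual counterpart to the low R-squared value reported in the regression summary.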