Bivariate analysis in R involves analyzing the relationship between two variables.

Common techniques include computing the correlation coefficient, fitting a regression model, and drawing visualizations such as scatter plots.

In this article, we will explore how to perform bivariate analysis in R.

## Step 1: Load Required Libraries

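To begin, load the libraries you plan to use. Here, we load the **tidyverse** collection, which bundles packages such as dplyr and ggplot2. Note that the functions used in the steps below (**cor()**, **lm()**, and **plot()**) ship with base R, so tidyverse is optional for this walkthrough.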

```
library(tidyverse)
```

## Step 2: Load Dataset

You can load an in-built dataset or create a dataframe to perform bivariate analysis on it. Here's an example of how to create a dataframe:

```
# Create data frame
df <- data.frame(Machine_name = c("A", "B", "C", "D", "E", "F", "G", "H"),
                 Pressure = c(12.39, 11.25, 12.15, 13.48, 13.78, 12.89, 12.21, 12.58),
                 Temperature = c(78, 89, 85, 84, 81, 79, 77, 85),
                 Status = c(TRUE, TRUE, FALSE, TRUE, FALSE, FALSE, TRUE, FALSE))
print(df)
```

Output:

```
  Machine_name Pressure Temperature Status
1            A    12.39          78   TRUE
2            B    11.25          89   TRUE
3            C    12.15          85  FALSE
4            D    13.48          84   TRUE
5            E    13.78          81  FALSE
6            F    12.89          79  FALSE
7            G    12.21          77   TRUE
8            H    12.58          85  FALSE
```

In the code above, we have defined a dataframe with four columns: Machine_name, Pressure, Temperature, and Status.
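If you would rather start from an in-built dataset, the sketch below shows one possible way to do it. The choice of `mtcars` and of its `wt` and `mpg` columns is only an illustrative assumption, and the name `cars_df` is hypothetical; any dataset with two variables of interest would work the same way.

```r
# Load a dataset that ships with base R (illustrative choice, not from the article)
data(mtcars)

# Keep two numeric columns to analyze as a pair: weight (wt) and miles per gallon (mpg)
cars_df <- mtcars[, c("wt", "mpg")]

# Inspect the structure and the first few rows before analyzing
str(cars_df)
print(head(cars_df))
```

Checking the structure first with `str()` confirms that both columns are numeric, which the correlation and regression steps require.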

## Step 3: Correlation Analysis

Let’s find the correlation between two columns of the dataframe using the **cor()** function:

```
# Calculate correlation coefficient
# (stored as r rather than c, so the name does not mask the base function c())
r <- cor(df$Pressure, df$Temperature)
# Display correlation coefficient
print(r)
```

Output:

```
[1] -0.3579008
```

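The output is the Pearson correlation coefficient (the default method of **cor()**) between the **Pressure** and **Temperature** columns of the dataframe. A value of about -0.36 indicates a weak negative linear relationship: in this sample, higher temperatures tend to go with slightly lower pressures.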
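A coefficient on its own does not say whether the relationship could be due to chance, especially with only eight observations. As a minimal sketch, the base R function `cor.test()` adds a significance test and a confidence interval, and `cor()` with `method = "spearman"` gives a rank-based alternative. The vector names `pressure` and `temperature` are introduced here only to keep the example self-contained.

```r
# Same values as the Pressure and Temperature columns of the dataframe
pressure    <- c(12.39, 11.25, 12.15, 13.48, 13.78, 12.89, 12.21, 12.58)
temperature <- c(78, 89, 85, 84, 81, 79, 77, 85)

# Pearson correlation test (the default method)
ct <- cor.test(pressure, temperature)
print(ct$estimate)  # the same coefficient that cor() returns
print(ct$p.value)   # p-value for the null hypothesis of zero correlation
print(ct$conf.int)  # 95% confidence interval for the coefficient

# Spearman rank correlation, which does not assume a linear relationship
rho <- cor(pressure, temperature, method = "spearman")
print(rho)
```

With a sample this small, expect a wide confidence interval; the test is shown here to illustrate the call, not to draw conclusions about the machines.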

## Step 4: Regression Analysis

To perform linear regression analysis, you can use the **lm()** function:

```
# Fit simple linear regression model
l <- lm(Pressure ~ Temperature, data=df)
# Calculate summary of linear regression model
s <- summary(l)
# Display summary of linear regression model
print(s)
```

Output:

```
Call:
lm(formula = Pressure ~ Temperature, data = df)

Residuals:
     Min       1Q   Median       3Q      Max
-0.87778 -0.55523 -0.08842  0.38541  1.10292

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 18.23874    6.02198   3.029   0.0231 *
Temperature -0.06866    0.07313  -0.939   0.3840
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.8061 on 6 degrees of freedom
Multiple R-squared:  0.1281,    Adjusted R-squared:  -0.01722
F-statistic: 0.8815 on 1 and 6 DF,  p-value: 0.384
```

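In this example, we regress **Pressure** on **Temperature**. The fitted line is Pressure = 18.24 - 0.069 × Temperature, so each additional unit of temperature is associated with a pressure decrease of about 0.069. The slope is not statistically significant (p-value 0.384), and the model explains only about 13% of the variance in Pressure (Multiple R-squared 0.1281).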
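Beyond printing the summary, a fitted model object can be queried directly. The following is a minimal sketch using the base R helpers `coef()`, `confint()`, and `predict()`; the object name `fit` and the temperature value of 80 are hypothetical choices made for illustration.

```r
# Rebuild the two numeric columns so the example runs on its own
df <- data.frame(
  Pressure    = c(12.39, 11.25, 12.15, 13.48, 13.78, 12.89, 12.21, 12.58),
  Temperature = c(78, 89, 85, 84, 81, 79, 77, 85)
)

# Fit the same simple linear regression model
fit <- lm(Pressure ~ Temperature, data = df)

# Extract the intercept and slope as a named numeric vector
print(coef(fit))

# 95% confidence intervals for both coefficients
print(confint(fit))

# Predict Pressure for a new, hypothetical temperature reading
new_obs <- data.frame(Temperature = 80)
pred <- predict(fit, newdata = new_obs)
print(pred)
```

Note that the column name in `new_obs` must match the predictor name used in the formula, otherwise `predict()` cannot find the variable.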

## Step 5: Visualization

To visualize the relationship between two columns of the dataframe, you can draw a scatter plot using the **plot()** function:

```
# Create scatterplot of Pressure vs. Temperature
plot(df$Pressure, df$Temperature, pch = 16, col = 'steelblue',
     main = 'Pressure vs. Temperature',
     xlab = 'Pressure', ylab = 'Temperature')
```

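Output: a scatter plot with **Pressure** on the x-axis and **Temperature** on the y-axis.

The plot displays the relationship between the **Pressure** and **Temperature** columns of the dataframe. Consistent with the correlation coefficient of about -0.36, the points show only a weak downward trend.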
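To tie the plot back to the regression step, one option is to overlay the fitted line on the scatter plot. The sketch below assumes the axes are swapped relative to the plot above, so that the predictor (Temperature) is on the x-axis and matches the model formula `Pressure ~ Temperature`. The object name `fit` and the line color are illustrative choices.

```r
# Rebuild the two numeric columns so the example runs on its own
df <- data.frame(
  Pressure    = c(12.39, 11.25, 12.15, 13.48, 13.78, 12.89, 12.21, 12.58),
  Temperature = c(78, 89, 85, 84, 81, 79, 77, 85)
)

# Fit the model whose line will be drawn
fit <- lm(Pressure ~ Temperature, data = df)

# Scatter plot with the predictor on the x-axis and the response on the y-axis
plot(df$Temperature, df$Pressure, pch = 16, col = "steelblue",
     main = "Pressure vs. Temperature with fitted line",
     xlab = "Temperature", ylab = "Pressure")

# abline() accepts an lm object and draws its intercept and slope
abline(fit, col = "firebrick", lwd = 2)
```

Drawing the line makes the weak negative slope visible and shows how widely the points scatter around it.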