The iris dataset is a built-in dataset in R. It contains data on 150 iris flowers, with measurements of four features (sepal length, sepal width, petal length, and petal width) and the species of each flower.

In this article we see how to load, explore, summarize, and visualize the iris dataset in R.

Load the Iris Dataset

To load the iris dataset we use the data() function:

# Load the iris dataset
data(iris)

Let's see how to get the first six rows of the iris dataset:

# Get the first six rows of the dataset
head(iris)

The following output shows the first six rows of the iris dataset.

Output:

  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
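head() also accepts an n argument to return a different number of rows, and tail() returns rows from the end of the dataset. The following is a minimal sketch; the variable names first_three and last_three are chosen here only for illustration.

```r
# Load the iris dataset
data(iris)

# First three rows instead of the default six
first_three <- head(iris, n = 3)
print(first_three)

# Last three rows of the dataset
last_three <- tail(iris, n = 3)
print(last_three)
```

Looking at both ends of the dataset is a quick way to confirm that the rows are ordered by species.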

Summarize the Iris Dataset

To summarize the iris dataset we use the summary() function:

# Get summary statistics for each column of the dataset
summary(iris)

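The output below shows a quick summary of each variable in the dataset: the minimum, quartiles, mean, and maximum for the numeric columns, and the count of each level for Species.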

Output:

 Sepal.Length    Sepal.Width     Petal.Length    Petal.Width          Species  
 Min.   :4.300   Min.   :2.000   Min.   :1.000   Min.   :0.100   setosa    :50  
 1st Qu.:5.100   1st Qu.:2.800   1st Qu.:1.600   1st Qu.:0.300   versicolor:50  
 Median :5.800   Median :3.000   Median :4.350   Median :1.300   virginica :50  
 Mean   :5.843   Mean   :3.057   Mean   :3.758   Mean   :1.199                  
 3rd Qu.:6.400   3rd Qu.:3.300   3rd Qu.:5.100   3rd Qu.:1.800                  
 Max.   :7.900   Max.   :4.400   Max.   :6.900   Max.   :2.500  
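summary() describes the dataset as a whole. To compare the species with each other, one possible approach is to compute a statistic per group with aggregate(). The sketch below assumes the mean is the statistic of interest; the variable name species_means is chosen here for illustration.

```r
# Load the iris dataset
data(iris)

# Mean of every numeric column, computed separately for each species.
# The formula ". ~ Species" means "all other columns, grouped by Species".
species_means <- aggregate(. ~ Species, data = iris, FUN = mean)
print(species_means)
```

Replacing mean with another function such as median or sd gives the corresponding per-species statistic.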

Get Dimension of the Iris Dataset

To get the number of rows and columns of the iris dataset we use the dim() function:

# Show the number of rows and columns
dim(iris)

The output below shows the number of rows (150) followed by the number of columns (5) of the iris dataset.

Output:

[1] 150   5
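If only one of the two values is needed, nrow() and ncol() return them individually. A short sketch, with variable names chosen here for illustration:

```r
# Load the iris dataset
data(iris)

# Number of rows (observations) and columns (variables), taken separately
n_rows <- nrow(iris)
n_cols <- ncol(iris)

cat("Rows:", n_rows, "Columns:", n_cols, "\n")
```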

Get Column Names of the Iris Dataset

To get the column names of the iris dataset we use the names() function:

# Show the column names
names(iris)

The following output shows the column names of the iris dataset.

Output:

[1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width"  "Species" 
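names() returns only the column names. To also see the type of each column, str() prints a compact overview and sapply() with class() returns the types as a named vector. This is a minimal sketch; the variable name column_classes is chosen here for illustration.

```r
# Load the iris dataset
data(iris)

# Compact overview: column names, types, and the first few values
str(iris)

# Class of each column as a named character vector
column_classes <- sapply(iris, class)
print(column_classes)
```

The four measurement columns are numeric and Species is a factor, which is why summary() reports counts for Species instead of quartiles.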

Visualize the Iris Dataset

R provides several functions for visualizing a dataset. Let's look at three of them one by one.

Let's create a histogram using the hist() function:

# Plot histogram for values of petal length
hist(iris$Petal.Length,
     col='green',
     main='Histogram',
     xlab='Length',
     ylab='Frequency')

The following plot shows the histogram of the petal length variable.

Output:

Histogram

To create a scatterplot we use the plot() function:

# Create scatterplot of petal width vs. petal length
plot(iris$Petal.Width, iris$Petal.Length,
     col='red',
     main='Scatterplot',
     xlab='Petal Width',
     ylab='Petal Length',
     pch=19)

The plot below shows the scatterplot of petal length against petal width.

Output:

Scatterplot
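The scatterplot above draws every point in the same color. One possible variation is to color the points by species so the three groups can be told apart. The sketch below assumes an arbitrary three-color palette (species_colors is a name chosen here for illustration) and relies on Species being a factor, so as.integer() maps each flower to 1, 2, or 3.

```r
# Load the iris dataset
data(iris)

# Assumed palette: one color per species, in factor-level order
species_colors <- c("red", "steelblue", "darkgreen")

# Scatterplot with each point colored according to its species
plot(iris$Petal.Width, iris$Petal.Length,
     col = species_colors[as.integer(iris$Species)],
     main = "Petal Length vs. Petal Width by Species",
     xlab = "Petal Width",
     ylab = "Petal Length",
     pch = 19)

# Legend mapping each color back to its species name
legend("topleft",
       legend = levels(iris$Species),
       col = species_colors,
       pch = 19)
```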

We can draw a boxplot using the boxplot() function:

# Create boxplot of petal length by species
boxplot(Petal.Length~Species,
        data=iris,
        main='Petal Length by Species',
        xlab='Species',
        ylab='Petal Length',
        col='steelblue',
        border='black')

The output shows the boxplot of petal length grouped by species.

Output:

Boxplot
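boxplot() can also be given several numeric columns at once, which draws one box per column. As a sketch, the following compares the four measurements side by side; it assumes the measurements occupy columns 1 to 4, as in the names() output, and the variable name measurements is chosen here for illustration.

```r
# Load the iris dataset
data(iris)

# Columns 1 to 4 are the numeric measurements; column 5 is Species
measurements <- iris[, 1:4]

# One box per measurement column
boxplot(measurements,
        main = "Iris Measurements",
        ylab = "Value",
        col = "steelblue",
        border = "black")
```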

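Using these functions we can visualize the distribution of a single variable, the relationship between two variables, and the differences between groups in a dataset.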