A data frame is the primary data structure for handling tabular data in R, much like a spreadsheet. It is a two-dimensional structure that displays data as a table of rows and columns. Internally, a data frame is a list of equal-length vectors, one per column, rather than an atomic structure.
Data frames are like matrices, except that the columns are allowed to be of different types. A data frame can therefore hold heterogeneous data: each column has a single type, but different columns can have different types.
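To make the idea of mixed column types concrete, here is a minimal sketch. The data frame `df` and its values are made up for illustration, and `stringsAsFactors = FALSE` is passed explicitly so the result does not depend on the R version.

```r
# A small data frame with one character, one numeric and one logical column
# (names and values are illustrative only)
df <- data.frame(
  name   = c("Tom", "Aron"),
  age    = c(18, 20),
  passed = c(TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Each column keeps its own type
col_types <- sapply(df, class)
print(col_types)

# A data frame is stored as a list of equal-length columns
print(is.list(df))
print(nrow(df))
```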
In this tutorial, we will discuss R data frames: how to create them, how to access their elements, and how to add or remove rows and columns.
Create Data Frame in R
Using the data.frame() function, we can create a data frame in R.
It converts the collection of vectors or a matrix into a data frame.
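As a rough illustration of both inputs, the sketch below passes a matrix and then a pair of vectors to data.frame(). The objects `m`, `df_from_matrix` and `df_from_vectors` are hypothetical names used only for this example.

```r
# A 2 x 3 integer matrix with illustrative column names
m <- matrix(1:6, nrow = 2, dimnames = list(NULL, c("a", "b", "c")))

# data.frame() turns each matrix column into a data frame column
df_from_matrix <- data.frame(m)
str(df_from_matrix)

# A collection of vectors is combined column by column
df_from_vectors <- data.frame(
  x = c(1, 2),
  y = c("u", "v"),
  stringsAsFactors = FALSE
)
str(df_from_vectors)
```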
Creating an empty data frame in R
To create an empty data frame with variable names and types, call the data.frame() function with zero-length vectors. This creates the structure of a data frame without any rows.
# Create the empty data frame
student_data <- data.frame(
Id = numeric(),
Name = character(),
Age = numeric()
)
str(student_data)
# Print the data frame
student_data
In the above R code, the data.frame() function creates an empty data frame.
The output shows the structure of the data frame, followed by the printed (empty) data frame.
'data.frame': 0 obs. of 3 variables:
$ Id : num
$ Name: chr
$ Age : num
[1] Id Name Age
<0 rows> (or 0-length row.names)
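An empty data frame is usually a starting point that gets filled later. The sketch below is one possible way to do that, assuming the first record arrives as a one-row data frame (`first_row` is a hypothetical name) and is appended with rbind().

```r
# Empty data frame with the same columns as above
student_data <- data.frame(
  Id = numeric(),
  Name = character(),
  Age = numeric(),
  stringsAsFactors = FALSE
)

# Zero rows, three columns
empty_dim <- dim(student_data)
print(empty_dim)

# Append a first record (values are illustrative)
first_row <- data.frame(Id = 1, Name = "Tom", Age = 18, stringsAsFactors = FALSE)
student_data <- rbind(student_data, first_row)
print(student_data)
```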
Creating a data frame using data.frame() function in R
Let’s consider an example to create a data frame using the data.frame() function.
student_data <- data.frame(
name = c("Tom", "Aron", "Gary","Jeannie"),
gender = c('Male', 'Male','Male', 'Female'),
age = c(18, 20, 17, 21))
# Print the data frame
student_data
The output of the above R data frame is:
name gender age
1 Tom Male 18
2 Aron Male 20
3 Gary Male 17
4 Jeannie Female 21
To get the structure of the data frame, use the str() function.
str(student_data)
The output of the data frame structure is:
'data.frame': 4 obs. of 3 variables:
$ name : chr "Tom" "Aron" "Gary" "Jeannie"
$ gender: chr "Male" "Male" "Male" "Female"
$ age : num 18 20 17 21
Creating a data frame from vectors in R
A data frame can also be created from vectors. To construct a data frame from vector data, create one vector for each column of the data.
name <- c("Tom", "Aron", "Gary","Jeannie")
gender <- c("Male", "Male", "Male", "Female")
age <- c(18, 20, 17, 21)
Use the data.frame() function to combine the vectors into a data frame.
The result is stored in an object called student_data, whose columns are named after the vectors: name, gender, and age.
student_data <- data.frame(name, gender, age)
# Display the class of the data frame
class(student_data)
[1] "data.frame"
Using the names() function, you can get the names of the variables in the data frame.
# Display the names of the variables in the data frame
names(student_data)
[1] "name" "gender" "age"
Use the str() function to get the structure of the data frame in R.
# Display the structure of a data frame
str(student_data)
'data.frame': 4 obs. of 3 variables:
$ name : chr "Tom" "Aron" "Gary" "Jeannie"
$ gender: chr "Male" "Male" "Male" "Female"
$ age : num 18 20 17 21
Creating a data frame from list() in R
You can also create a data frame from a list: build the list with the list() function, then pass it to data.frame().
Let’s practice!
Consider the above example of student data.
Create a list using the vectors.
name <- c("Tom", "Aron", "Gary","Jeannie")
gender <- c("Male", "Male", "Male", "Female")
age <- c(18, 20, 17, 21)
# Create a list from a vector
student_data.list <- list(
name = name,
gender = gender,
age = age
)
class(student_data.list)
To get the class of the list, use the class() function. The output is:
[1] "list"
Create a data frame from the list
# Create a data frame from list
student_data <- data.frame(student_data.list)
# Display the structure of a data frame
str(student_data)
The output of the data frame structure is:
'data.frame': 4 obs. of 3 variables:
$ name : chr "Tom" "Aron" "Gary" "Jeannie"
$ gender: chr "Male" "Male" "Male" "Female"
$ age : num 18 20 17 21
To get the dimensions of a data frame (rows, then columns), use the dim() function.
# Display dimension of a data frame
dim(student_data)
[1] 4 3
To get attributes of a data frame, use the attributes() function
attributes(student_data)
$names
[1] "name" "gender" "age"
$class
[1] "data.frame"
$row.names
[1] 1 2 3 4
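The dimensions can also be read one at a time. The following sketch uses the base functions nrow() and ncol() on the same student data; the helper variables `n_rows` and `n_cols` are illustrative names.

```r
student_data <- data.frame(
  name = c("Tom", "Aron", "Gary", "Jeannie"),
  gender = c("Male", "Male", "Male", "Female"),
  age = c(18, 20, 17, 21),
  stringsAsFactors = FALSE
)

# Number of rows and number of columns, read separately
n_rows <- nrow(student_data)
n_cols <- ncol(student_data)
print(c(n_rows, n_cols))

# dim() returns the same two numbers together
print(identical(dim(student_data), c(n_rows, n_cols)))

# The attribute names reported by attributes()
print(names(attributes(student_data)))
```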
Summarize the Data Frame in R
Using the summary() function, you can get the summary of a data frame in R. Note that R is case-sensitive, so the function name is written in lowercase.
Let’s consider an example to get the summary of the data frame.
employee_data <- data.frame (
Name = c("Tom", "Andrea", "Aaron"),
Exp = c(8, 15, 10),
Salary = c(100000, 180000, 145000)
)
# Print the data frame
employee_data
# Display the summary of data frame
summary(employee_data)
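In the above R code, the summary() function displays the summary of the data frame. The per-name counts in the Name column appear when Name is stored as a factor, which was the default for strings before R 4.0. In R 4.0 and later, strings stay as character and summary() reports Length, Class, and Mode for that column instead.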
Name Exp Salary
1 Tom 8 100000
2 Andrea 15 180000
3 Aaron 10 145000
Name Exp Salary
Aaron :1 Min. : 8.0 Min. :100000
Andrea:1 1st Qu.: 9.0 1st Qu.:122500
Tom :1 Median :10.0 Median :145000
Mean :11.0 Mean :141667
3rd Qu.:12.5 3rd Qu.:162500
Max. :15.0 Max. :180000
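summary() also works on a single column. The sketch below summarizes the Salary column and then converts Name to a factor explicitly, which gives per-level counts regardless of the R version. The variable `salary_summary` is an illustrative name.

```r
employee_data <- data.frame(
  Name = c("Tom", "Andrea", "Aaron"),
  Exp = c(8, 15, 10),
  Salary = c(100000, 180000, 145000),
  stringsAsFactors = FALSE
)

# Summary of a single numeric column
salary_summary <- summary(employee_data$Salary)
print(salary_summary)

# Individual statistics can be read by name
print(salary_summary[["Median"]])

# Converting Name to a factor makes summary() count each level
employee_data$Name <- factor(employee_data$Name)
name_counts <- summary(employee_data$Name)
print(name_counts)
```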
Access Elements of Dataframe in R
You can access the elements of a data frame by specifying the row or column number. For a data frame df, df[i,] returns row i, df[,j] returns column j, and df[i,j] returns the element in row i and column j.
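The three indexing forms can be tried on a small data frame. This is a minimal sketch with a made-up data frame `df`; the names `row_2`, `col_1` and `cell` are illustrative.

```r
df <- data.frame(
  x = c(10, 20, 30),
  y = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

row_2 <- df[2, ]   # second row, returned as a one-row data frame
col_1 <- df[, 1]   # first column, returned as a vector
cell  <- df[3, 2]  # single element in row 3, column 2

print(row_2)
print(col_1)
print(cell)
```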
Accessing Rows using index
Let’s consider an example to create a sales data frame and access the elements of the sales data.
sales_data <- data.frame (
Name = c("Ebook", "Book", "Video"),
Revenue = c(25000, 15000, 100000),
Profit = c(10000, 7000, 45000)
)
To get the first row of data from the data frame in R, use the following code
# Returns the first row of data
sales_data[1,]
Name Revenue Profit
1 Ebook 25000 10000
Access the dataframe column by index
You can get data frame column data by specifying the column by index.
# Returns the 2nd column of data
sales_data[,2]
[1] 25000 15000 100000
In the above R code, the column is returned as a vector rather than as a data frame.
To keep the result as a one-column data frame, use drop = FALSE.
# Returns the 2nd column as a data frame
sales_data[, 2, drop = FALSE]
Revenue
1 25000
2 15000
3 100000
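The difference between the two results can be checked directly. In this sketch, `as_vector` and `as_frame` are illustrative names for the two forms of the Revenue column.

```r
sales_data <- data.frame(
  Name = c("Ebook", "Book", "Video"),
  Revenue = c(25000, 15000, 100000),
  Profit = c(10000, 7000, 45000),
  stringsAsFactors = FALSE
)

# Without drop = FALSE, a single column collapses to a vector
as_vector <- sales_data[, 2]
print(is.numeric(as_vector))
print(is.null(dim(as_vector)))

# With drop = FALSE, the result stays a data frame with one column
as_frame <- sales_data[, 2, drop = FALSE]
print(is.data.frame(as_frame))
print(dim(as_frame))
```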
Access Row and Column in Dataframe in R
You can get a value from the data frame by specifying both the row and the column number.
Let’s practice!
# Returns value from 2nd row and 3rd column
sales_data[2,3]
[1] 7000
To get the first two rows of data and the third column of the data frame, use the following code.
# Returns the first two rows of data and third column of data frame
sales_data[1:2,3]
[1] 10000 7000
You can extract non-adjacent rows or columns of the data frame by passing a vector of indices, as in the following R code.
# Returns the first and third rows of the sales data frame
sales_data[c(1,3),]
Name Revenue Profit
1 Ebook 25000 10000
3 Video 100000 45000
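The same idea applies to columns, and rows and columns can be selected together. The sketch below picks rows 1 and 3 with columns 1 and 3, first by position and then by column name; `by_index` and `by_name` are illustrative names.

```r
sales_data <- data.frame(
  Name = c("Ebook", "Book", "Video"),
  Revenue = c(25000, 15000, 100000),
  Profit = c(10000, 7000, 45000),
  stringsAsFactors = FALSE
)

# Rows 1 and 3, columns 1 and 3, selected by position
by_index <- sales_data[c(1, 3), c(1, 3)]
print(by_index)

# The same selection, with the columns given by name
by_name <- sales_data[c(1, 3), c("Name", "Profit")]
print(by_name)
```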
Access Dataframe Column by Name
You can access a data frame column by putting the name of the variable in square brackets. This returns a one-column data frame.
# Returns the Revenue column of data frame
sales_data["Revenue"]
Revenue
1 25000
2 15000
3 100000
You can also access the data frame column data using the $ symbol and the name of the variable.
# Retrieve the "Profit" column of data frame
sales_data$Profit
[1] 10000 7000 45000
You can also access a column of the data frame in R using double square brackets, which return the column as a vector.
# Returns Name column of data frame
sales_data[["Name"]]
[1] "Ebook" "Book" "Video"
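The three name-based forms differ mainly in what they return. A short sketch comparing them on the Profit column follows; the variable names are illustrative.

```r
sales_data <- data.frame(
  Name = c("Ebook", "Book", "Video"),
  Revenue = c(25000, 15000, 100000),
  Profit = c(10000, 7000, 45000),
  stringsAsFactors = FALSE
)

single_bracket <- sales_data["Profit"]     # one-column data frame
dollar         <- sales_data$Profit        # plain vector
double_bracket <- sales_data[["Profit"]]   # plain vector

print(is.data.frame(single_bracket))
print(identical(dollar, double_bracket))
```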
Add Rows to Dataframe in R
You can add a row to a data frame using the rbind() function.
Let’s practice with the sales data from the above R code.
sales_data <- data.frame (
Name = c("Ebook", "Book", "Video"),
Revenue = c(25000, 15000, 100000),
Profit = c(10000, 7000, 45000)
)
# Create the new row as a one-row data frame so each column keeps its type
new_row <- data.frame(Name = "Audio", Revenue = 80000, Profit = 55000)
# Add row to data frame using rbind()
sales_data <- rbind(sales_data, new_row)
# Print the data frame
sales_data
Name Revenue Profit
1 Ebook 25000 10000
2 Book 15000 7000
3 Video 100000 45000
4 Audio 80000 55000
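One thing to watch with rbind() is the type of the new row. A row built with c() that mixes text and numbers is a character vector, and binding it turns the numeric columns into character columns, which is why a value such as 100000 can end up displayed as 1e+05. The sketch below compares that with a one-row data frame; `vector_row`, `frame_row`, `with_vector` and `with_frame` are illustrative names.

```r
sales_data <- data.frame(
  Name = c("Ebook", "Book", "Video"),
  Revenue = c(25000, 15000, 100000),
  Profit = c(10000, 7000, 45000),
  stringsAsFactors = FALSE
)

# c() coerces mixed values to a single type, here character
vector_row <- c("Audio", 80000, 55000)
with_vector <- rbind(sales_data, vector_row)
print(class(with_vector$Revenue))

# A one-row data frame keeps a separate type for each column
frame_row <- data.frame(
  Name = "Audio", Revenue = 80000, Profit = 55000,
  stringsAsFactors = FALSE
)
with_frame <- rbind(sales_data, frame_row)
print(class(with_frame$Revenue))
```

Keeping Revenue and Profit numeric matters as soon as you want to sum, sort, or summarize them.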
Removing rows from a data frame
To remove rows from a data frame, use a negative row index. The concatenate function c() lets you list several rows to remove at once.
# Create a data frame
visitor_info <- data.frame(
country = c("US","UK","IN","AUS"),
visitors = c(10000,808,120,340),
social = c(3200,1220,450,120),
direct = c(1500,430,200,100)
)
# Print the data frame
visitor_info
# Remove the second row using a negative index
visitor_info <- visitor_info[-c(2),]
# Print the data frame after removing the row
visitor_info
In the above R code, data.frame() creates a data frame, and the negative index -c(2) removes the second row from it. To remove several rows at once, list their numbers inside c().
The output shows the data frame before and after the row is removed:
country visitors social direct
1 US 10000 3200 1500
2 UK 808 1220 430
3 IN 120 450 200
4 AUS 340 120 100
country visitors social direct
1 US 10000 3200 1500
3 IN 120 450 200
4 AUS 340 120 100
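Notice that the remaining rows keep their original labels after a removal. The sketch below removes two rows at once and then renumbers the rows by setting the row names to NULL; `remaining` and `labels_before` are illustrative names.

```r
visitor_info <- data.frame(
  country = c("US", "UK", "IN", "AUS"),
  visitors = c(10000, 808, 120, 340),
  stringsAsFactors = FALSE
)

# Remove the first and third rows in one step
remaining <- visitor_info[-c(1, 3), ]

# The surviving rows keep their original labels
labels_before <- rownames(remaining)
print(labels_before)

# Reset the row names so they run from 1 again
rownames(remaining) <- NULL
print(remaining)
```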
Add Column to Dataframe in R
You can add a column to the data frame in R using a simple assignment or using the cbind() function.
Let’s consider an example to add a new column to the visitor_info data frame using a simple assignment.
# Create a data frame
visitor_info <- data.frame(
country = c("US","UK","IN","AUS"),
visitors = c(10000,808,120,340),
social = c(3200,1220,450,120),
direct = c(1500,430,200,100)
)
# Print the data frame
visitor_info
# Create new vector refer
refer <- c(800,120,80,60)
# Add refer column to data frame
visitor_info$refer <- refer
# Print the data frame
visitor_info
In the above R code, we create a new vector and assign it to a new column of the data frame using the $ operator.
The output of the above R code is:
country visitors social direct
1 US 10000 3200 1500
2 UK 808 1220 430
3 IN 120 450 200
4 AUS 340 120 100
country visitors social direct refer
1 US 10000 3200 1500 800
2 UK 808 1220 430 120
3 IN 120 450 200 80
4 AUS 340 120 100 60
Another method to add a column to an existing data frame is using the cbind() function.
# Create new vector as linkedin
linkedin <- c(1000,250,100,45)
# Use the cbind() function to add column to existing data frame
visitor_info <- cbind(visitor_info,linkedin)
# Prints the data frame
visitor_info
The output of the above R code, which adds a new column to the existing data frame, is:
country visitors social direct refer linkedin
1 US 10000 3200 1500 800 1000
2 UK 808 1220 430 120 250
3 IN 120 450 200 80 100
4 AUS 340 120 100 60 45
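Both approaches can also set the column name explicitly or build the new column from existing ones. The following is a small sketch on a cut-down version of the visitor data; the `total` column is a hypothetical addition used only for illustration.

```r
visitor_info <- data.frame(
  country = c("US", "UK"),
  social = c(3200, 1220),
  direct = c(1500, 430),
  stringsAsFactors = FALSE
)

# cbind() with a named argument sets the new column's name directly
visitor_info <- cbind(visitor_info, linkedin = c(1000, 250))

# Simple assignment can compute a column from existing columns
visitor_info$total <- visitor_info$social + visitor_info$direct + visitor_info$linkedin

print(visitor_info)
```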
Conclusion
I hope this article on R data frames, covering how to create a data frame in R, access its elements using different methods, and modify it by adding or removing rows and columns, is helpful to you.