The with() and within() functions let you work with data frame columns as if they were variables in your workspace. They’re convenience functions that save typing and make code cleaner: instead of df$column, you can write just column inside the function.
Key Difference:
- with(): Evaluates expressions, returns result (doesn’t modify original)
- within(): Modifies columns, returns modified copy of data frame
When to Use with() and within()
You’ll use these functions when:
- Simplifying column references in calculations
- Avoiding repeated df$ notation
- Creating new columns based on existing ones
- Keeping scope clean without polluting workspace
- Making code more readable
Basic Syntax
# with() - Calculate, don't modify
result <- with(df, column1 + column2)
# within() - Create/modify columns
df_modified <- within(df, {
new_column <- column1 + column2
another_col <- column3 * 2
})
Key Differences
| Feature | with() | within() |
|---|---|---|
| Modifies data frame | No | Yes (returns copy) |
| Return value | Calculation result | Modified data frame |
| Best for | Quick calculations | Creating new columns |
| Use case | Analysis | Data transformation |
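The return-value row of the table can be checked directly. The following is a minimal sketch, assuming a small made-up data frame; the names `weather`, `product`, and `tp` are invented for illustration and are not part of the tutorial's data.

```r
# Hypothetical example data (names are assumptions for this sketch)
weather <- data.frame(Temperature = c(12, 9, 14),
                      Pressure = c(20, 25, 27))

# with() returns whatever the expression evaluates to (here, a numeric vector)
product <- with(weather, Temperature * Pressure)
is_vector <- is.numeric(product) && !is.data.frame(product)

# within() returns a data frame: a copy of the input with the new column added
weather2 <- within(weather, tp <- Temperature * Pressure)
is_df <- is.data.frame(weather2)

# The input data frame still has its original two columns
n_cols_original <- ncol(weather)
n_cols_modified <- ncol(weather2)
```

Inspecting `product` and `weather2` at the console shows the same distinction: one prints as a plain vector, the other as a data frame.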
Setup Example
# Define dataframe
df <- data.frame(Humidity=c(78,79,75,79,74,81,82,73),
Temperature=c(12,9,14,13,18,28,22,23),
Pressure=c(20,25,27,29,30,32,39,40))
# Print data frame
print(df)
Output:
Humidity Temperature Pressure
1 78 12 20
2 79 9 25
3 75 14 27
4 79 13 29
5 74 18 30
6 81 28 32
7 82 22 39
8 73 23 40
The output shows the data frame created with the data.frame() function.
Use with() Function
We can use the with() function to multiply two columns of the data frame:
# Define dataframe
df <- data.frame(Humidity=c(78,79,75,79,74,81,82,73),
Temperature=c(12,9,14,13,18,28,22,23),
Pressure=c(20,25,27,29,30,32,39,40))
# Multiply values between Humidity and Temperature
with(df, Humidity*Temperature)
Output:
[1] 936 711 1050 1027 1332 2268 1804 1679
Here with() multiplies the Humidity and Temperature columns and returns the result as a vector, without affecting the original data frame.
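To make the "saves typing" point concrete, here is a small sketch comparing with() against the equivalent df$ notation. The data frame `weather` and the result names are assumptions made up for this example.

```r
# Hypothetical example data (a shortened version of the tutorial's columns)
weather <- data.frame(Humidity = c(78, 79, 75),
                      Temperature = c(12, 9, 14))

# The same calculation written both ways
via_with <- with(weather, Humidity * Temperature)
via_dollar <- weather$Humidity * weather$Temperature
same_result <- identical(via_with, via_dollar)

# Any expression can go inside with(), not only arithmetic on columns
avg_temp <- with(weather, mean(Temperature))

# The data frame itself keeps its original columns
cols_after <- names(weather)
```

Both forms compute the same vector; with() mainly pays off when an expression mentions several columns.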
Use within() Function
Now use the within() function to multiply two columns of the data frame and assign the result to a new column.
# Define dataframe
df <- data.frame(Humidity=c(78,79,75,79,74,81,82,73),
Temperature=c(12,9,14,13,18,28,22,23),
Pressure=c(20,25,27,29,30,32,39,40))
# Multiply values between Temperature and Pressure
within(df, a <- Temperature*Pressure)
Output:
Humidity Temperature Pressure a
1 78 12 20 240
2 79 9 25 225
3 75 14 27 378
4 79 13 29 377
5 74 18 30 540
6 81 28 32 896
7 82 22 39 858
8 73 23 40 920
This creates a new column in a copy of the original data frame without affecting the original.
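The following sketch illustrates that the original stays untouched until the result is assigned back, and that a braced block can build one new column from another. The data frame `weather` and the column names `a` and `half_a` are hypothetical.

```r
# Hypothetical example data
weather <- data.frame(Temperature = c(12, 9, 14),
                      Pressure = c(20, 25, 27))

# Calling within() without assigning leaves `weather` as it was
preview <- within(weather, a <- Temperature * Pressure)
has_a_before <- "a" %in% names(weather)

# Assigning the result back is what makes the change stick.
# Inside the braces, statements run in order, so `half_a` can use `a`.
weather <- within(weather, {
  a <- Temperature * Pressure
  half_a <- a / 2
})
has_a_after <- "a" %in% names(weather)
```

Column order for several new columns is not something this sketch relies on; if order matters, it is safer to reorder explicitly afterwards.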
Common Mistakes to Avoid
Mistake 1: Confusing with() and within()
# ❌ WRONG - Using with() when you need to modify df
result <- with(df, {
new_col <- Temperature * Pressure
df # Returns original df, not modified!
})
# ✅ CORRECT - Use within() to modify and return df
result <- within(df, {
new_col <- Temperature * Pressure
})
Mistake 2: Not assigning within() result
# ❌ PROBLEM - Modifies nothing
within(df, {
new_col <- Temperature * Pressure
})
# df is unchanged! Must assign the result
# ✅ CORRECT - Assign the result
df <- within(df, {
new_col <- Temperature * Pressure
})
Mistake 3: Trying to modify in with()
# ❌ WRONG - with() doesn't modify original
with(df, new_col <- Temperature * Pressure)
# new_col exists temporarily, df unchanged
# ✅ CORRECT - Use within()
df <- within(df, new_col <- Temperature * Pressure)
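As a quick check of what happens to an assignment made inside with(), consider this sketch. The data frame `weather` and the variable `returned` are invented for the example.

```r
# Hypothetical example data
weather <- data.frame(Temperature = c(12, 9, 14),
                      Pressure = c(20, 25, 27))

# The assignment runs in a temporary environment built from the data frame;
# with() only hands back the value of the expression
returned <- with(weather, new_col <- Temperature * Pressure)

# Neither the data frame nor the workspace gained a `new_col`
in_data_frame <- "new_col" %in% names(weather)
in_workspace <- exists("new_col", inherits = FALSE)
```

The computed values are still available through `returned`, which is why with() suits calculations whose result you want to keep as a separate object.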
Mistake 4: Scoping issues with functions
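# with() looks in df first, then in the calling environment,
# so variables defined outside df are accessible
value <- 100
result <- with(df, Temperature + value) # Works: value comes from the workspace
# ❌ PROBLEM - A workspace variable named like a column is shadowed
Pressure <- 100
result <- with(df, Temperature + Pressure) # Uses df$Pressure, not 100
# ✅ SOLUTION - Use a name that doesn't clash with any column
offset <- 100
result <- with(df, Temperature + offset)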
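Name lookup in with() can be sketched inside a function as well. In base R, with() evaluates the expression using the data frame first and the calling environment second. The helper `add_offset`, its `offset` argument, and the `weather` data are hypothetical names made up for this illustration.

```r
# Hypothetical example data
weather <- data.frame(Temperature = c(12, 9, 14),
                      Pressure = c(20, 25, 27))

# Hypothetical helper: `offset` is not a column, so with() finds it
# in the environment of the function that called with()
add_offset <- function(data, offset) {
  with(data, Temperature + offset)
}
shifted <- add_offset(weather, 100)

# A workspace variable that shares a column's name is shadowed by the column
Pressure <- 1000
summed <- with(weather, Temperature + Pressure)  # uses weather$Pressure
```

Choosing variable names that cannot collide with column names avoids the shadowing case entirely.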
Pro Tips
- Simpler alternative with dplyr:
# Instead of with/within, use mutate()
library(dplyr)
df %>% mutate(new_col = Temperature * Pressure)
- Use within() for multiple operations:
df <- within(df, {
col1 <- Temperature * Pressure
col2 <- Humidity / 100
col3 <- col1 + col2
})
- with() for temporary calculations:
# Calculate something without polluting workspace
avg_temp <- with(df, mean(Temperature, na.rm = TRUE))
- Check what columns are available in scope:
with(df, ls()) # Shows available columns
Comparison: with/within vs dplyr
# BASE R - with/within
df <- within(df, new_col <- Temperature * Pressure)
# DPLYR - mutate() (consistent with other tidyverse verbs)
library(dplyr)
df <- df %>% mutate(new_col = Temperature * Pressure)
# dplyr is often preferred for data transformation in tidyverse-based code