Chapter 17 Scoping & Environments
What You’ll Learn:
- How R finds objects
- Lexical scoping rules
- Environment hierarchy
- Common scoping errors
- Global vs local variables
Key Errors Covered: 12+ scoping errors
Difficulty: ⭐⭐⭐ Advanced
17.1 Introduction
Scoping determines how R finds objects, and it’s often surprising:
x <- 10
my_func <- function() {
x # Where does this x come from?
}
my_func() # Returns 10!
#> [1] 10
Let’s understand scoping to avoid confusion.
17.2 Environment Basics
💡 Key Insight: Environments Are Like Named Lists
# Create an environment
my_env <- new.env()
# Add objects
my_env$x <- 10
my_env$y <- 20
# Access objects
my_env$x
#> [1] 10
# List contents
ls(my_env)
#> [1] "x" "y"
# Every function has an environment
f <- function() {
x <- 5
environment()
}
f_env <- f()
ls(f_env)
#> [1] "x"
# Current environment
environment() # Usually global environment
#> <environment: R_GlobalEnv>
# Parent environment
parent.env(my_env)
#> <environment: R_GlobalEnv>
Key points:
- Environments contain named objects
- Every function call creates a new environment
- Environments have parents
- R searches up the chain
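The last two key points can be checked directly. The sketch below builds a small two-level chain by hand (the names `base_env` and `child_env` are made up for illustration) and shows that a lookup starting in the child falls through to its parent unless inheritance is switched off:

```r
# Hypothetical environments, for illustration only
base_env <- new.env()
child_env <- new.env(parent = base_env)

# Bind x in the parent only
assign("x", 10, envir = base_env)

# child_env itself holds nothing
ls(child_env)
#> character(0)

# Default lookup follows the parent chain
found_with_parents <- exists("x", envir = child_env)
found_with_parents
#> [1] TRUE

# inherits = FALSE restricts the check to child_env itself
found_locally <- exists("x", envir = child_env, inherits = FALSE)
found_locally
#> [1] FALSE

# get() also walks up the chain by default
value_via_chain <- get("x", envir = child_env)
value_via_chain
#> [1] 10
```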
17.3 Error #1: object not found (scoping)
⭐⭐ INTERMEDIATE 🔍 SCOPE
17.3.2 What It Means
The object was created in a function’s environment and isn’t accessible outside.
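A minimal sketch of how this error typically arises, and one way around it. The names `compute_total` and `scoped_total` are hypothetical; `tryCatch()` is used only so the error message can be captured instead of halting the script:

```r
compute_total <- function() {
  scoped_total <- sum(1:10)  # exists only while the function runs
  invisible(NULL)
}
compute_total()

# Looking up the local variable from outside the function fails
err_msg <- tryCatch(
  scoped_total,
  error = function(e) conditionMessage(e)
)
err_msg
#> [1] "object 'scoped_total' not found"

# One fix: return the value and assign it where it is needed
compute_total_fixed <- function() {
  sum(1:10)
}
total <- compute_total_fixed()
total
#> [1] 55
```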
17.3.3 Understanding Scope
# Global scope
x <- 10
func1 <- function() {
# Function scope
y <- 20
# Can see global x
cat("x:", x, "\n")
# Can see local y
cat("y:", y, "\n")
}
func1()
#> x: 10
#> y: 20
# x exists globally
x
#> [1] 10
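# y was local to func1, so a fresh session has no y here
exists("y") # FALSE in a fresh session; TRUE below only because a y is already defined elsewhere in this session
#> [1] TRUE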
# Example 2: Nested functions
outer_func <- function() {
z <- 30
inner_func <- function() {
# Can see z from parent environment
z
}
inner_func()
}
outer_func()
#> [1] 30
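# z was local to outer_func, so a fresh session has no z here
exists("z") # FALSE in a fresh session; TRUE below only because a z is already defined elsewhere in this session
#> [1] TRUE
17.3.4 Common Causes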
17.4 Lexical Scoping
💡 Key Insight: Lexical Scoping
R uses lexical (static) scoping: where a function was defined matters, not where it’s called.
x <- 10
func1 <- function() {
x # Uses x from where function was defined
}
func2 <- function() {
x <- 20 # Different x
func1() # What does this return?
}
func2() # Returns 10 (not 20!)
#> [1] 10
# Why? func1 looks for x where it was defined (global env)
# Not where it was called (inside func2)
# Example 2: Function factories
make_multiplier <- function(n) {
function(x) {
x * n # n from make_multiplier's environment
}
}
times_3 <- make_multiplier(3)
times_5 <- make_multiplier(5)
times_3(10) # 30
#> [1] 30
times_5(10) # 50
#> [1] 50
# Each function remembers its own n!
Key rule: R looks for variables in:
1. Current environment
2. Parent environment (where function was defined)
3. Parent’s parent, etc.
4. Eventually global environment
5. Loaded packages
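The lookup order can be watched in action with a small sketch (all names here are illustrative). The same variable name is bound at two levels, and each function picks up the nearest binding along its chain of defining environments; `sum()` is found last, among the attached packages:

```r
level <- "global"

outer_fn <- function() {
  level <- "outer"
  inner_fn <- function() {
    level   # not local to inner_fn, so found in outer_fn's environment
  }
  inner_fn()
}

global_reader <- function() {
  level     # not local, so found where global_reader was defined
}

package_user <- function() {
  sum(1:3)  # sum is not defined locally; found in package:base
}

outer_fn()
#> [1] "outer"
global_reader()
#> [1] "global"
package_user()
#> [1] 6
```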
17.5 Error #2: Unexpected Value from Outer Scope
⭐⭐⭐ ADVANCED 🔍 SCOPE
17.5.3 Solutions
✅ SOLUTION 1: Explicit Parameters
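A minimal sketch of this solution, with hypothetical names (`tax_rate`, `price_with_tax_implicit`, `price_with_tax`): the first version silently depends on a variable from the outer scope, the second receives everything it needs as arguments.

```r
tax_rate <- 0.2

# Fragile: the result depends on whatever tax_rate is in the outer scope
price_with_tax_implicit <- function(price) {
  price * (1 + tax_rate)
}

# Explicit: every input is a parameter
price_with_tax <- function(price, rate) {
  price * (1 + rate)
}

before_change <- price_with_tax_implicit(100)
before_change
#> [1] 120

tax_rate <- 0.5  # an unrelated change elsewhere in the script

after_change <- price_with_tax_implicit(100)  # same call, different result
after_change
#> [1] 150

explicit_result <- price_with_tax(100, rate = 0.2)  # unaffected by the outer variable
explicit_result
#> [1] 120
```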
✅ SOLUTION 2: Check for Undefined Variables
# Development tool
check_function_variables <- function(func) {
# Get function body
body_expr <- body(func)
# Get all variable names (simplified)
all_vars <- all.names(body_expr)
# Get formal arguments
args <- names(formals(func))
# Find variables not in arguments
external <- setdiff(all_vars, c(args, "function", "{", "+", "-", "*", "/"))
if (length(external) > 0) {
message("Potentially external variables: ",
paste(external, collapse = ", "))
}
}
x <- 999
calculate <- function(a, b) {
a + b + x
}
check_function_variables(calculate) # Warns about x
#> Potentially external variables: x
✅ SOLUTION 3: Use Local Functions
# Keep related functions together
calculator <- local({
# Private variable
default_offset <- 0
# Public functions
list(
add = function(a, b) a + b + default_offset,
set_offset = function(value) default_offset <<- value
)
})
calculator$add(5, 10)
#> [1] 15
calculator$set_offset(100)
calculator$add(5, 10)
#> [1] 115
17.6 Global Assignment: <<-
⚠️ Pitfall: Global Assignment
counter <- 0
# Bad: modifies global
increment_bad <- function() {
counter <<- counter + 1
}
increment_bad()
counter # 1
#> [1] 1
increment_bad()
counter # 2
#> [1] 2
# Problem: Hard to track where changes happen
# Can cause bugs in large programs
# Better: Use closures
make_counter <- function() {
count <- 0
list(
increment = function() {
count <<- count + 1 # OK here (modifying enclosing env)
count
},
get = function() count,
reset = function() count <<- 0
)
}
counter_obj <- make_counter()
counter_obj$increment()
#> [1] 1
counter_obj$increment()
#> [1] 2
counter_obj$get()
#> [1] 2
counter_obj$reset()
counter_obj$get()
#> [1] 0
When to use <<-:
- Inside closures/function factories
- For memoization
- When you truly need state
When NOT to use <<-:
- In regular functions (use return values instead)
- When you can pass arguments
- In package functions (very rarely appropriate)
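To make the "use return values instead" advice concrete, here is a hedged sketch (function and variable names are illustrative) comparing a regular function that updates an outer variable with `<<-` against one that takes the current value as an argument and returns the new one:

```r
# Avoid: a regular function reaching out to modify an outer variable
total_bad <- 0
add_to_total_bad <- function(amount) {
  total_bad <<- total_bad + amount
}
add_to_total_bad(5)  # total_bad is now 5, changed as a hidden side effect

# Prefer: take the current value as an argument and return the new one
add_to_total <- function(total, amount) {
  total + amount
}

total <- 0
total <- add_to_total(total, 5)
total <- add_to_total(total, 10)
total
#> [1] 15
```

The second version can be tested in isolation, since its result depends only on its inputs.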
17.7 Environments in Functions
💡 Key Insight: Function Environments
# Each function call creates new environment
create_accumulator <- function() {
sum <- 0
function(x) {
sum <<- sum + x
sum
}
}
# Create two independent accumulators
acc1 <- create_accumulator()
acc2 <- create_accumulator()
# Each has its own sum!
acc1(5) # 5
#> [1] 5
acc1(10) # 15
#> [1] 15
acc2(3) # 3
#> [1] 3
acc2(7) # 10
#> [1] 10
acc1(0) # 15 (independent)
#> [1] 15
acc2(0) # 10 (independent)
#> [1] 10
# Inspect environments
environment(acc1)
#> <environment: 0x7fead835ed30>
environment(acc2)
#> <environment: 0x7fea747f1a08>
# Different!
# Get value from environment
get("sum", environment(acc1))
#> [1] 15
get("sum", environment(acc2))
#> [1] 10
17.8 Common Scoping Patterns
🎯 Best Practice: Scoping Patterns
# 1. Function factories
make_power <- function(n) {
function(x) {
x ^ n
}
}
square <- make_power(2)
cube <- make_power(3)
square(5)
#> [1] 25
cube(5)
#> [1] 125
# 2. Memoization (caching results)
fib_memo <- local({
cache <- list()
function(n) {
if (n <= 1) return(n)
# Check cache
key <- as.character(n)
if (!is.null(cache[[key]])) {
return(cache[[key]])
}
# Calculate and cache
result <- fib_memo(n - 1) + fib_memo(n - 2)
cache[[key]] <<- result
result
}
})
system.time(fib_memo(30))
#> user system elapsed
#> 0.001 0.000 0.001
system.time(fib_memo(30)) # Much faster (cached)
#> user system elapsed
#> 0.001 0.000 0.000
# 3. Private variables
create_account <- function(initial_balance = 0) {
balance <- initial_balance # Private
list(
deposit = function(amount) {
if (amount <= 0) stop("Amount must be positive")
balance <<- balance + amount
invisible(balance)
},
withdraw = function(amount) {
if (amount > balance) stop("Insufficient funds")
balance <<- balance - amount
invisible(balance)
},
get_balance = function() {
balance
}
)
}
account <- create_account(100)
account$deposit(50)
account$withdraw(30)
account$get_balance()
#> [1] 120
# Can't access balance directly
# account$balance # NULL (not accessible)
# 4. Package-like namespacing
my_package <- local({
# Private function
helper <- function(x) {
x * 2
}
# Public functions
list(
public_func1 = function(x) {
helper(x) + 1
},
public_func2 = function(x) {
helper(x) - 1
}
)
})
my_package$public_func1(5) # Works
#> [1] 11
# my_package$helper # NULL (private); calling my_package$helper(5) would be an error
17.9 Search Path
💡 Key Insight: Search Path
# Where R looks for objects
search()
#> [1] ".GlobalEnv" "package:magick" "package:rsvg"
#> [4] "package:Rcpp" "package:R6" "package:dbplyr"
#> [7] "package:RSQLite" "package:DBI" "package:writexl"
#> [10] "package:readxl" "package:car" "package:carData"
#> [13] "package:lmtest" "package:zoo" "package:ggrepel"
#> [16] "package:patchwork" "package:rlang" "package:assertthat"
#> [19] "package:microbenchmark" "package:ggplot2" "package:glue"
#> [22] "package:stringr" "package:forcats" "package:MASS"
#> [25] "package:tibble" "package:purrr" "package:dplyr"
#> [28] "package:lubridate" "package:readr" "package:tidyr"
#> [31] "tools:rstudio" "package:stats" "package:graphics"
#> [34] "package:grDevices" "package:utils" "package:datasets"
#> [37] "package:methods" "Autoloads" "package:base"
# Order matters!
# 1. Global environment
# 2. Loaded packages (in order)
# 3. Base packages
# Example: name conflicts
library(dplyr)
# Both dplyr and stats have filter()
# Which one gets used?
filter # Shows dplyr::filter
#> function (.data, ..., .by = NULL, .preserve = FALSE)
#> {
#> check_by_typo(...)
#> by <- enquo(.by)
#> if (!quo_is_null(by) && !is_false(.preserve)) {
#> abort("Can't supply both `.by` and `.preserve`.")
#> }
#> UseMethod("filter")
#> }
#> <bytecode: 0x7fea7c722d20>
#> <environment: namespace:dplyr>
# Use package::function to be explicit
stats::filter # Base R version
#> function (x, filter, method = c("convolution", "recursive"),
#> sides = 2L, circular = FALSE, init = NULL)
#> {
#> method <- match.arg(method)
#> x <- as.ts(x)
#> storage.mode(x) <- "double"
#> xtsp <- tsp(x)
#> n <- as.integer(NROW(x))
#> if (is.na(n))
#> stop(gettextf("invalid value of %s", "NROW(x)"), domain = NA)
#> nser <- NCOL(x)
#> filter <- as.double(filter)
#> nfilt <- as.integer(length(filter))
#> if (is.na(nfilt))
#> stop(gettextf("invalid value of %s", "length(filter)"),
#> domain = NA)
#> if (anyNA(filter))
#> stop("missing values in 'filter'")
#> if (method == "convolution") {
#> if (nfilt > n)
#> stop("'filter' is longer than time series")
#> sides <- as.integer(sides)
#> if (is.na(sides) || (sides != 1L && sides != 2L))
#> stop("argument 'sides' must be 1 or 2")
#> circular <- as.logical(circular)
#> if (is.na(circular))
#> stop("'circular' must be logical and not NA")
#> if (is.matrix(x)) {
#> y <- matrix(NA, n, nser)
#> for (i in seq_len(nser)) y[, i] <- .Call(C_cfilter,
#> x[, i], filter, sides, circular)
#> }
#> else y <- .Call(C_cfilter, x, filter, sides, circular)
#> }
#> else {
#> if (missing(init)) {
#> init <- matrix(0, nfilt, nser)
#> }
#> else {
#> ni <- NROW(init)
#> if (ni != nfilt)
#> stop("length of 'init' must equal length of 'filter'")
#> if (NCOL(init) != 1L && NCOL(init) != nser) {
#> stop(sprintf(ngettext(nser, "'init' must have %d column",
#> "'init' must have 1 or %d columns", domain = "R-stats"),
#> nser), domain = NA)
#> }
#> if (!is.matrix(init))
#> dim(init) <- c(nfilt, nser)
#> }
#> ind <- seq_len(nfilt)
#> if (is.matrix(x)) {
#> y <- matrix(NA, n, nser)
#> for (i in seq_len(nser)) y[, i] <- .Call(C_rfilter,
#> x[, i], filter, c(rev(init[, i]), double(n)))[-ind]
#> }
#> else y <- .Call(C_rfilter, x, filter, c(rev(init[, 1L]),
#> double(n)))[-ind]
#> }
#> tsp(y) <- xtsp
#> class(y) <- if (nser > 1L)
#> c("mts", "ts")
#> else "ts"
#> y
#> }
#> <bytecode: 0x7feacc960c78>
#> <environment: namespace:stats>
dplyr::filter # dplyr version
#> function (.data, ..., .by = NULL, .preserve = FALSE)
#> {
#> check_by_typo(...)
#> by <- enquo(.by)
#> if (!quo_is_null(by) && !is_false(.preserve)) {
#> abort("Can't supply both `.by` and `.preserve`.")
#> }
#> UseMethod("filter")
#> }
#> <bytecode: 0x7fea7c722d20>
#> <environment: namespace:dplyr>
# Check where function comes from
find("filter")
#> [1] "package:dplyr" "package:stats"
17.10 Debugging Scope Issues
🎯 Best Practice: Debug Scoping
# 1. Check where you are
debug_env <- function() {
  # Inspect the caller's environment; environment() here would describe
  # debug_env's own (empty) frame instead
  env <- parent.frame()
  cat("Current environment:\n")
  print(env)
  cat("\nParent environment:\n")
  print(parent.env(env))
  cat("\nObjects in current env:\n")
  print(ls(env))
}
my_func <- function(x) {
  y <- 10
  debug_env()
}
my_func(5)
#> Current environment:
#> <environment: 0x7feacf1615a0>
#>
#> Parent environment:
#> <environment: R_GlobalEnv>
#>
#> Objects in current env:
#> [1] "x" "y"
# 2. Trace variable lookups
where_is <- function(name) {
env <- parent.frame()
while (!identical(env, emptyenv())) {
if (exists(name, envir = env, inherits = FALSE)) {
return(environmentName(env))
}
env <- parent.env(env)
}
"Not found"
}
x <- 10
test_func <- function() {
where_is("x")
}
test_func()
#> [1] "R_GlobalEnv"
# 3. List all variables in scope
ls.all <- function() {
# Get all environments in search path
envs <- search()
for (env_name in envs) {
env <- as.environment(env_name)
objs <- ls(env)
if (length(objs) > 0) {
cat("\n", env_name, ":\n", sep = "")
cat(" ", paste(head(objs, 10), collapse = ", "), "\n")
if (length(objs) > 10) {
cat(" ... and", length(objs) - 10, "more\n")
}
}
}
}
# ls.all() # Lists everything
17.11 Summary
Key Takeaways:
- Lexical scoping - Functions use variables from where they’re defined
- Function environments - Each call creates new environment
- Search path - R looks up through parent environments
- Local before global - Local variables shadow global ones
- <<- for parent environment - Use cautiously
- Return values preferred - Better than global modification
- Closures retain environment - Function factories work because of this
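The "local before global" takeaway in a short sketch (names are illustrative): assigning inside a function creates a local variable that shadows the global one for the duration of the call and leaves the global untouched.

```r
label <- "global"

show_label <- function() {
  label <- "local"  # new local binding; the outer label is not modified
  label
}

show_label()
#> [1] "local"
label
#> [1] "global"
```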
Quick Reference:
| Error | Cause | Fix |
|---|---|---|
| object not found | Variable in wrong scope | Return from function or use <<- |
| Unexpected value | Using unintended global | Make parameters explicit |
| Function modifies global | Using <<- unintentionally | Use return values |
| Name conflicts | Same name in multiple packages | Use package::function |
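The last row of the table can be tried without installing anything. In this sketch a user-defined `sd()` (a deliberately bad idea, shown only for illustration) masks the one from the stats package, while `stats::sd()` still reaches the original:

```r
# A user-defined function that masks stats::sd
sd <- function(x) "masked version"

masked_result <- sd(c(1, 2, 3))         # the user-defined sd is found first
masked_result
#> [1] "masked version"

explicit_result <- stats::sd(c(1, 2, 3))  # package::function bypasses the mask
explicit_result
#> [1] 1

rm(sd)                                   # remove the mask

restored_result <- sd(c(1, 2, 3))        # plain sd() reaches stats::sd again
restored_result
#> [1] 1
```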
Scoping Rules:
# R looks for variables in order:
# 1. Current environment
# 2. Parent environment (where defined, not called)
# 3. Parent's parent
# 4. ... up to global environment
# 5. Loaded packages
# 6. Base package
# Assignment
x <- value # Creates or modifies x in the current environment
x <<- value # Modifies x in the nearest enclosing environment that has it, else creates a global x
# Accessing
x # Searches up environments
get("x") # Same as above
exists("x") # Check if exists
Best Practices:
# ✅ Good
function(x, y) { x + y } # Explicit parameters
result <- my_func(data) # Return and assign
make_counter <- function() { } # Closures for state
# ❌ Avoid
function() { global_var + 5 } # Implicit global use
my_func <- function() { x <<- 5 } # Modifying global
# Assuming a variable exists; check with exists() first
17.12 Exercises
📝 Exercise 1: Scope Exploration
Predict the output:
x <- 10
func1 <- function() {
  x <- 20
  func2()
}
func2 <- function() {
  x
}
func1()
📝 Exercise 2: Counter Implementation
Create a counter using closures:
1. increment() - adds 1
2. decrement() - subtracts 1
3. get() - returns current value
4. reset() - sets to 0
📝 Exercise 3: Function Factory
Write make_scaler(center, scale) that returns a function that:
1. Subtracts center from input
2. Divides by scale
3. Use with built-in datasets
📝 Exercise 4: Environment Inspector
Write inspect_scope() that:
1. Shows current environment
2. Lists parent environments
3. Shows variables at each level
4. Identifies potential conflicts
17.13 Exercise Answers
Exercise 1:
x <- 10
func1 <- function() {
x <- 20 # Local x in func1
func2()
}
func2 <- function() {
x # Looks for x where func2 was defined (global)
}
func1() # Returns 10 (not 20!)
#> [1] 10
# Why? Lexical scoping:
# func2 was defined in global environment
# So it looks for x in global environment (x = 10)
# NOT where it was called (inside func1 with x = 20)
Exercise 2:
make_counter <- function(initial = 0) {
count <- initial
list(
increment = function() {
count <<- count + 1
invisible(count)
},
decrement = function() {
count <<- count - 1
invisible(count)
},
get = function() {
count
},
reset = function() {
count <<- initial
invisible(count)
}
)
}
# Test
counter <- make_counter(10)
counter$increment()
counter$increment()
counter$get() # 12
#> [1] 12
counter$decrement()
counter$get() # 11
#> [1] 11
counter$reset()
counter$get() # 10
#> [1] 10
# Create multiple independent counters
counter1 <- make_counter(0)
counter2 <- make_counter(100)
counter1$increment()
counter2$decrement()
counter1$get() # 1
#> [1] 1
counter2$get() # 99
#> [1] 99
Exercise 3:
make_scaler <- function(center = 0, scale = 1) {
# Validate inputs
if (!is.numeric(center) || length(center) != 1) {
stop("center must be a single numeric value")
}
if (!is.numeric(scale) || length(scale) != 1 || scale == 0) {
stop("scale must be a single non-zero numeric value")
}
# Return scaling function
function(x) {
if (!is.numeric(x)) {
stop("x must be numeric")
}
(x - center) / scale
}
}
# Test with mtcars
mpg_mean <- mean(mtcars$mpg)
mpg_sd <- sd(mtcars$mpg)
standardize_mpg <- make_scaler(mpg_mean, mpg_sd)
# Standardize mpg
mpg_scaled <- standardize_mpg(mtcars$mpg)
# Check: should have mean ≈ 0, sd ≈ 1
mean(mpg_scaled)
#> [1] 7.112366e-17
sd(mpg_scaled)
#> [1] 1
# Create different scalers
scale_0_1 <- make_scaler(
center = min(mtcars$hp),
scale = max(mtcars$hp) - min(mtcars$hp)
)
hp_scaled <- scale_0_1(mtcars$hp)
range(hp_scaled) # Should be 0 to 1
#> [1] 0 1
Exercise 4:
inspect_scope <- function() {
# Get calling environment
env <- parent.frame()
cat("=== Environment Inspection ===\n\n")
level <- 0
while (!identical(env, emptyenv())) {
# Environment name
env_name <- environmentName(env)
if (env_name == "") {
env_name <- paste0("<anonymous ", level, ">")
}
cat("Level", level, ":", env_name, "\n")
# List objects
objs <- ls(env, all.names = FALSE)
if (length(objs) > 0) {
cat(" Objects:", paste(head(objs, 5), collapse = ", "))
if (length(objs) > 5) {
cat(" ... +", length(objs) - 5, "more")
}
cat("\n")
# Check for conflicts with parent
parent_env <- parent.env(env)
if (!identical(parent_env, emptyenv())) {
parent_objs <- ls(parent_env, all.names = FALSE)
conflicts <- intersect(objs, parent_objs)
if (length(conflicts) > 0) {
cat(" ⚠ Shadows parent objects:",
paste(conflicts, collapse = ", "), "\n")
}
}
} else {
cat(" (empty)\n")
}
cat("\n")
# Move to parent
env <- parent.env(env)
level <- level + 1
# Stop after a reasonable depth (the walk continues past global into attached packages)
if (level > 10) {
cat("... (stopping after 10 levels)\n")
break
}
}
}
# Test
test_function <- function() {
local_var <- 123
x <- "local x" # Shadows global x if it exists
inspect_scope()
}
x <- "global x"
test_function()
#> === Environment Inspection ===
#>
#> Level 0 : <anonymous 0>
#> Objects: local_var, x
#> ⚠ Shadows parent objects: x
#>
#> Level 1 : R_GlobalEnv
#> Objects: a, A, A_col, A_inv, A_sub ... + 769 more
#>
#> Level 2 : package:magick
#> Objects: %>%, as_EBImage, autoviewer_disable, autoviewer_enable, channel_types ... + 137 more
#>
#> Level 3 : package:rsvg
#> Objects: librsvg_version, rsvg, rsvg_eps, rsvg_nativeraster, rsvg_pdf ... + 5 more
#>
#> Level 4 : package:Rcpp
#> Objects: compileAttributes, cpp_object_dummy, cpp_object_initializer, cppFunction, demangle ... + 20 more
#>
#> Level 5 : package:R6
#> Objects: is.R6, is.R6Class, R6Class
#>
#> Level 6 : package:dbplyr
#> Objects: %>%, as_table_path, as.sql, base_agg, base_no_win ... + 166 more
#>
#> Level 7 : package:RSQLite
#> Objects: datasetsDb, dbAppendTable, dbAppendTableArrow, dbBegin, dbBeginTransaction ... + 70 more
#> ⚠ Shadows parent objects: dbAppendTable, dbAppendTableArrow, dbBegin, dbBind, dbBindArrow, dbCanConnect, dbClearResult, dbColumnInfo, dbCommit, dbConnect, dbCreateTable, dbCreateTableArrow, dbDataType, dbDisconnect, dbDriver, dbExecute, dbExistsTable, dbFetch, dbFetchArrow, dbFetchArrowChunk, dbGetConnectArgs, dbGetException, dbGetInfo, dbGetQuery, dbGetQueryArrow, dbGetRowCount, dbGetRowsAffected, dbGetStatement, dbHasCompleted, dbIsReadOnly, dbIsValid, dbListFields, dbListObjects, dbListResults, dbListTables, dbQuoteIdentifier, dbQuoteLiteral, dbQuoteString, dbReadTable, dbReadTableArrow, dbRemoveTable, dbRollback, dbSendQuery, dbSendQueryArrow, dbSendStatement, dbUnloadDriver, dbUnquoteIdentifier, dbWithTransaction, dbWriteTable, dbWriteTableArrow, fetch, Id, isSQLKeyword, make.db.names, show, sqlData, SQLKeywords
#>
#> Level 8 : package:DBI
#> Objects: ANSI, dbAppendTable, dbAppendTableArrow, dbBegin, dbBind ... + 71 more
#>
#> Level 9 : package:writexl
#> Objects: lxw_version, write_xlsx, xl_formula, xl_hyperlink
#>
#> Level 10 : package:readxl
#> Objects: anchored, cell_cols, cell_limits, cell_rows, excel_format ... + 8 more
#>
#> ... (stopping after 10 levels)