Chapter 38 Iteration Best Practices

What You’ll Learn:

  • When to use loops vs apply vs purrr
  • Vectorization strategies
  • Performance optimization
  • Common pitfalls
  • Design patterns

Key Errors Covered: 12+ iteration errors

Difficulty: ⭐⭐⭐ Advanced

38.1 Introduction

Choosing the right iteration method affects both speed and readability. The examples in this chapter load purrr and dplyr:

library(purrr)
library(dplyr)

38.2 Vectorization First

🎯 Best Practice: Prefer Vectorized Operations

# ❌ Bad: Loop
result <- numeric(length(mtcars$mpg))
for (i in seq_along(mtcars$mpg)) {
  result[i] <- mtcars$mpg[i] * 2
}

# ❌ Bad: sapply (an apply-family call where none is needed)
result <- sapply(mtcars$mpg, function(x) x * 2)

# ✅ Good: Vectorized
result <- mtcars$mpg * 2

# Performance comparison
n <- 10000
x <- 1:n

system.time(sapply(x, sqrt))
#>    user  system elapsed 
#>   0.003   0.000   0.003
system.time(sqrt(x))  # Much faster!
#>    user  system elapsed 
#>       0       0       0
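
Vectorization covers conditional logic as well as arithmetic. The sketch below uses the built-in mtcars data to replace an element-by-element if/else loop with ifelse(). The 20 mpg cutoff and the two labels are arbitrary choices made for this illustration, not values from the chapter.

```r
# Loop version: label each car by fuel economy, one element at a time
mpg <- mtcars$mpg
label_loop <- character(length(mpg))
for (i in seq_along(mpg)) {
  if (mpg[i] > 20) {
    label_loop[i] <- "efficient"
  } else {
    label_loop[i] <- "thirsty"
  }
}

# Vectorized version: ifelse() tests the whole vector in one call
label_vec <- ifelse(mpg > 20, "efficient", "thirsty")

# Both approaches produce the same labels
identical(label_loop, label_vec)
```

The vectorized form is shorter and moves the looping into compiled code, which is usually where the speed-up comes from.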

38.3 When to Use Each

💡 Key Insight: Decision Guide

# Use VECTORIZED operations when possible
x * 2
sqrt(x)
paste0("ID_", x)

# Use FOR LOOPS when:
# - Sequential dependencies
# - Early termination needed
# - Side effects (plotting, writing files)

# Use APPLY family when:
# - Row/column operations on matrices
# - Simple transformations on lists
# - Base R only (no tidyverse)

# Use PURRR when:
# - Type safety matters
# - Complex error handling needed
# - Working with nested lists
# - Modern tidyverse workflows
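
The guide above can be made concrete with one small case per non-vectorized branch. This is a minimal sketch with made-up inputs: the growth factor, the 500 threshold, the matrix, and the `inputs` list are all hypothetical values chosen for illustration.

```r
library(purrr)

# FOR LOOP: each value depends on the previous one (sequential dependency),
# and the loop stops as soon as a threshold is crossed (early termination).
balance <- 100
for (step in 1:10) {
  balance <- balance * 1.5        # hypothetical growth factor
  if (balance > 500) break
}

# APPLY: row-wise and column-wise summaries of a matrix
m <- matrix(1:6, nrow = 2)        # 2 rows, 3 columns
row_totals <- apply(m, 1, sum)    # MARGIN = 1 works across each row
col_maxes  <- apply(m, 2, max)    # MARGIN = 2 works down each column

# PURRR: typed output plus error handling for a list with a bad element
inputs <- list(4, 9, "oops")
safe_sqrt <- possibly(sqrt, otherwise = NA_real_)  # errors become NA
roots <- map_dbl(inputs, safe_sqrt)                # always a double vector
```

Here map_dbl() guarantees a numeric vector or an error, whereas sapply() may return a vector, a matrix, or a list depending on its input.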

38.4 Growing Objects Anti-Pattern

⚠️ Avoid Growing Objects

# ❌ Very bad: Growing vector
n <- 1000
system.time({
  result <- c()
  for (i in 1:n) {
    result <- c(result, i^2)
  }
})
#>    user  system elapsed 
#>   0.004   0.000   0.005

# ✅ Good: Pre-allocate
system.time({
  result <- numeric(n)
  for (i in 1:n) {
    result[i] <- i^2
  }
})
#>    user  system elapsed 
#>   0.003   0.000   0.002

# ✅ Best: Vectorize
system.time({
  result <- (1:n)^2
})
#>    user  system elapsed 
#>       0       0       0

# Growing lists
# ❌ Bad: the list is extended by one element on every iteration
result_list <- list()
for (i in 1:n) {
  result_list[[i]] <- i^2
}

# ✅ Good: Pre-allocate
result_list <- vector("list", n)
for (i in 1:n) {
  result_list[[i]] <- i^2
}

# ✅ Better: Use map (.x is the current element)
result_list <- map(1:n, ~ .x^2)
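
The same anti-pattern applies to data frames: calling rbind() inside a loop copies every accumulated row on each iteration. A common alternative is to collect the pieces in a pre-allocated list and bind once at the end. The sketch below assumes a hypothetical helper, `make_row()`, invented for this example.

```r
# Hypothetical helper: build a one-row data frame for iteration i
make_row <- function(i) {
  data.frame(id = i, square = i^2)
}

n <- 100

# ❌ Bad: rbind() in the loop re-copies all earlier rows each time
grown <- data.frame()
for (i in 1:n) {
  grown <- rbind(grown, make_row(i))
}

# ✅ Good: fill a pre-allocated list, then bind once
pieces <- vector("list", n)
for (i in 1:n) {
  pieces[[i]] <- make_row(i)
}
combined <- do.call(rbind, pieces)

# Both data frames hold the same rows
all.equal(grown, combined, check.attributes = FALSE)
```

In a tidyverse workflow, dplyr::bind_rows(pieces) could stand in for the do.call(rbind, pieces) step.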

38.5 Summary

Decision Tree:

Can it be vectorized?
├─ Yes → Use vectorized operations
└─ No → Is it row/column-wise on matrix?
    ├─ Yes → Use apply()
    └─ No → Working with lists?
        ├─ Yes → Need type safety?
        │   ├─ Yes → Use purrr::map_*()
        │   └─ No → Use lapply/sapply
        └─ No → Sequential dependencies?
            └─ Yes → Use for loop

Quick Reference:

Task               | Best Choice   | Why
Element-wise math  | Vectorized    | Fastest
Row operations     | apply()       | Built-in
List operations    | purrr::map()  | Type-safe
Sequential         | for loop      | Clear logic
Side effects       | for/walk()    | Explicit
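
For the side-effects row, a minimal sketch of the purrr walk family may help. It writes one CSV per cylinder group of mtcars into a temporary directory. The directory name and the file naming scheme are assumptions made for this example.

```r
library(purrr)

# Write into a scratch directory so nothing in the project is touched
out_dir <- file.path(tempdir(), "iteration-demo")
dir.create(out_dir, showWarnings = FALSE)

# One data frame per cylinder count, named "4", "6", "8"
groups <- split(mtcars, mtcars$cyl)

# iwalk() passes each element and its name; it is called only for the
# side effect (writing a file) and returns its input invisibly
iwalk(groups, function(df, cyl) {
  write.csv(df, file.path(out_dir, paste0("cyl_", cyl, ".csv")))
})

list.files(out_dir)
```

A plain for loop over names(groups) would do the same job. The walk family mainly signals that the return value is not the point of the call.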

Best Practices:

✅ Good:

  • Vectorize when possible
  • Pre-allocate in loops
  • Use type-safe functions
  • Consider readability
  • Profile before optimizing

❌ Avoid:

  • Growing objects in loops
  • Unnecessary apply/map
  • Optimizing too early
  • Complex nested iterations
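
To act on "profile before optimizing", it helps to time candidate approaches on realistic input rather than guess. The sketch below uses only base R. `time_it()` is a hypothetical helper written for this example, and the input size and repetition count are arbitrary.

```r
# Hypothetical helper: total elapsed seconds for `reps` calls of f()
time_it <- function(f, reps = 100) {
  system.time(for (r in seq_len(reps)) f())[["elapsed"]]
}

x <- runif(1e5)

# Time two ways of computing the same result
timings <- c(
  sapply_sqrt     = time_it(function() sapply(x, sqrt), reps = 5),
  vectorized_sqrt = time_it(function() sqrt(x), reps = 5)
)
timings

# Confirm the faster version really computes the same thing
all.equal(sapply(x, sqrt), sqrt(x))
```

Timings vary by machine and R version, so compare the numbers with each other, not with figures printed elsewhere.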

38.6 Completion

Part XIII Complete!

You’ve mastered:

  • apply family functions
  • purrr for modern iteration
  • Best practices and patterns
  • Performance considerations

Ready for: Part XIV (Package Development)!