Chapter 7 Subscript & Dimension Errors

What You’ll Learn:

  • How R’s indexing system works
  • Understanding dimensions and subscripts
  • Common indexing mistakes
  • Negative vs positive indexing
  • Logical indexing pitfalls
  • Matrix and array subsetting

Key Errors Covered: 18+ indexing errors

Difficulty: ⭐ Beginner to ⭐⭐ Intermediate

7.1 Introduction

R’s indexing system is powerful but confusing. Reading past the end of a vector, for example, does not always fail:

x <- 1:5
x[10]  # Returns NA, not an error!
#> [1] NA

But this doesn’t:

x[[10]]  # Now it's an error
#> Error in x[[10]]: subscript out of bounds

Understanding R’s indexing is critical for avoiding errors. Let’s master it.

7.2 R’s Indexing Methods

💡 Key Insight: Five Ways to Index

x <- c(10, 20, 30, 40, 50)

# 1. Positive integers (positions)
x[c(1, 3, 5)]
#> [1] 10 30 50

# 2. Negative integers (exclusion)
x[-c(2, 4)]
#> [1] 10 30 50

# 3. Logical vectors
x[c(TRUE, FALSE, TRUE, FALSE, TRUE)]
#> [1] 10 30 50

# 4. Names (if vector has names)
names(x) <- c("a", "b", "c", "d", "e")
x[c("a", "c", "e")]
#>  a  c  e 
#> 10 30 50

# 5. Empty (returns all)
x[]
#>  a  b  c  d  e 
#> 10 20 30 40 50

Each method has different error patterns!

7.3 Error #1: subscript out of bounds

⭐ BEGINNER 📏 DIMENSION

7.3.1 The Error

x <- 1:5
x[[10]]  # Double bracket
#> Error in x[[10]]: subscript out of bounds

🔴 ERROR

Error in x[[10]] : subscript out of bounds

7.3.2 What It Means

You’re trying to access an element beyond the vector’s length using [[]].

7.3.3 Single vs Double Bracket

x <- 1:5

# Single bracket: Returns NA, no error
x[10]
#> [1] NA

# Double bracket: Errors
x[[10]]
#> Error in x[[10]]: subscript out of bounds

Why the difference?

  • [ can return multiple elements or NA
  • [[ must return exactly one element
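
A small sketch of that difference, reusing the vector from above plus a hypothetical list called person (made up for this illustration). Single brackets keep the container and pad with NA, while double brackets pull out exactly one element and fail when it is not there.

```r
x <- 1:5

# Single bracket: one result slot per requested index, NA where nothing exists
padded <- x[c(2, 10)]

# Double bracket: exactly one element, so a missing position is an error
msg <- tryCatch(x[[10]], error = function(e) conditionMessage(e))

# On a list, [ returns a smaller list while [[ returns the element itself
person <- list(name = "Ada", age = 36)  # hypothetical example data
sub_list <- person["age"]
element  <- person[["age"]]

print(padded)           # integer vector containing an NA
print(msg)              # the captured error message
print(class(sub_list))  # still a list
print(class(element))   # the bare value
```

Capturing the message with tryCatch() keeps the script running, which is handy when you want to compare several indexing forms side by side.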

7.3.4 Common Causes

7.3.4.1 Cause 1: Off-by-One Error

scores <- c(85, 90, 95)

# Loop goes too far
for (i in 1:4) {  # Only 3 elements!
  print(scores[[i]])
}
#> [1] 85
#> [1] 90
#> [1] 95
#> Error in scores[[i]]: subscript out of bounds

7.3.4.2 Cause 2: Wrong Length Assumption

data <- c(10, 20, 30)

# Assumed it had 5 elements
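first_five <- data[[1:5]]  # Error: [[ accepts only one index at a time
#> Error in data[[1:5]]: attempt to select more than one element in vectorIndex

The message differs here because [[ rejects a multi-element index on an atomic vector before length matters. Use data[1:5], which pads the missing positions with NA, or index one position at a time.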

7.3.4.3 Cause 3: After Filtering

values <- 1:10
large_values <- values[values > 100]  # Empty!

# Try to access first element
large_values[[1]]  # Out of bounds (length 0)
#> Error in large_values[[1]]: subscript out of bounds

7.3.4.4 Cause 4: List Indexing

my_list <- list(a = 1, b = 2, c = 3)

# Trying to access 4th element
my_list[[4]]  # Only 3 elements
#> Error in my_list[[4]]: subscript out of bounds
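
Name-based lookups behave differently from position-based ones, which is worth knowing when debugging lists. The sketch below assumes a named list and a named atomic vector with the same contents; the variable names are illustrative only.

```r
my_list <- list(a = 1, b = 2, c = 3)
my_vec  <- c(a = 1, b = 2, c = 3)

# Lists: a missing name gives NULL, with no error or warning
by_name_list <- my_list[["d"]]
by_dollar    <- my_list$d

# Atomic vectors: the same lookup by name is an error
by_name_vec <- tryCatch(my_vec[["d"]], error = function(e) conditionMessage(e))

print(is.null(by_name_list))
print(is.null(by_dollar))
print(by_name_vec)
```

Because the list lookup fails silently, an explicit check such as "d" %in% names(my_list) is often the more reliable guard.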

7.3.5 Solutions

SOLUTION 1: Check Length First

x <- 1:5
index <- 10

# Safe access
if (index <= length(x)) {
  x[[index]]
} else {
  message("Index ", index, " is out of bounds")
  NA
}
#> Index 10 is out of bounds
#> [1] NA

SOLUTION 2: Use Single Bracket

x <- 1:5

# Single bracket returns NA instead of error
x[10]  # NA
#> [1] NA

# Good for loops where you want to continue
for (i in 1:10) {
  val <- x[i]
  if (!is.na(val)) {
    print(val)
  }
}
#> [1] 1
#> [1] 2
#> [1] 3
#> [1] 4
#> [1] 5

SOLUTION 3: Safe Indexing Function

safe_extract <- function(x, i, default = NA) {
  if (i < 1 || i > length(x)) {
    return(default)
  }
  return(x[[i]])
}

# Test
x <- 1:5
safe_extract(x, 3)   # 3
#> [1] 3
safe_extract(x, 10)  # NA
#> [1] NA
safe_extract(x, 10, default = 0)  # 0
#> [1] 0

SOLUTION 4: Use seq_along() in Loops

values <- c(10, 20, 30)

# Wrong: hard-codes the length
# for (i in 1:5) {
#   print(values[[i]])  # Errors on the 4th iteration
# }

# Right: uses actual length
for (i in seq_along(values)) {
  print(values[[i]])  # Safe
}
#> [1] 10
#> [1] 20
#> [1] 30

# Even safer: iterate over values directly
for (val in values) {
  print(val)
}
#> [1] 10
#> [1] 20
#> [1] 30
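
One reason seq_along() is preferred over 1:length(x) is the empty-vector case. The following sketch, using an illustrative empty vector, shows what each form produces.

```r
empty <- numeric(0)

bad_idx  <- 1:length(empty)   # 1:0 counts DOWN, giving two indices
good_idx <- seq_along(empty)  # zero-length, so a loop body never runs

# Count how many times a loop over the safe index actually executes
iterations <- 0
for (i in good_idx) {
  iterations <- iterations + 1
}

print(bad_idx)
print(good_idx)
print(iterations)
```

A loop over bad_idx would run twice on a vector with no elements, and the first iteration would hit the out-of-bounds error with [[ ]].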

⚠️ Common Pitfall: Empty Vectors

# Filter returns empty
x <- 1:10
big_numbers <- x[x > 100]  # numeric(0)

length(big_numbers)  # 0
#> [1] 0

# This errors!
big_numbers[[1]]
#> Error in big_numbers[[1]]: subscript out of bounds

# Always check
if (length(big_numbers) > 0) {
  big_numbers[[1]]
} else {
  NA
}
#> [1] NA

7.4 Error #2: undefined columns selected

⭐ BEGINNER 📏 DIMENSION

7.4.1 The Error

df <- data.frame(x = 1:5, y = 6:10)
df[, "z"]  # Column doesn't exist
#> Error in `[.data.frame`(df, , "z"): undefined columns selected

🔴 ERROR

Error in `[.data.frame`(df, , "z") : undefined columns selected

7.4.2 What It Means

You’re trying to select columns that don’t exist in the data frame.

7.4.3 Common Causes

7.4.3.1 Cause 1: Typo in Column Name

df <- data.frame(age = c(25, 30, 35), name = c("A", "B", "C"))

# Typo: "agee" instead of "age"
df[, "agee"]
#> Error in `[.data.frame`(df, , "agee"): undefined columns selected

7.4.3.2 Cause 2: Case Sensitivity

df <- data.frame(Age = c(25, 30, 35))

# Wrong case
df[, "age"]  # Error! It's "Age" not "age"
#> Error in `[.data.frame`(df, , "age"): undefined columns selected

7.4.3.3 Cause 3: Column Doesn’t Exist Yet

df <- data.frame(x = 1:5)

# Trying to select column before creating it
df[, c("x", "y")]  # "y" doesn't exist
#> Error in `[.data.frame`(df, , c("x", "y")): undefined columns selected

7.4.3.4 Cause 4: After Subsetting

df <- data.frame(x = 1:5, y = 6:10, z = 11:15)

# Select some columns
df_subset <- df[, c("x", "y")]

# Try to access z (no longer exists)
df_subset[, "z"]
#> Error in `[.data.frame`(df_subset, , "z"): undefined columns selected
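
Not every access style reports a missing column. As a rough sketch (the misspelled name agee is deliberate), the $ and [[ ]] forms return NULL silently, so the mistake may only surface later.

```r
df <- data.frame(age = c(25, 30, 35))

via_dollar <- df$agee        # NULL, no error
via_double <- df[["agee"]]   # NULL, no error
via_single <- tryCatch(df[, "agee"], error = function(e) conditionMessage(e))

# The silent NULL resurfaces later as a confusing result
mean_age <- suppressWarnings(mean(df$agee))

print(is.null(via_dollar))
print(is.null(via_double))
print(via_single)
print(mean_age)
```

In that sense the error from df[, "agee"] is the friendlier outcome, because it points at the line where the typo lives.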

7.4.4 Solutions

SOLUTION 1: Check Column Names

df <- data.frame(age = c(25, 30, 35), name = c("A", "B", "C"))

# List all columns
names(df)
#> [1] "age"  "name"
colnames(df)
#> [1] "age"  "name"

# Check if column exists
"age" %in% names(df)  # TRUE
#> [1] TRUE
"agee" %in% names(df) # FALSE
#> [1] FALSE

# Safe selection
col_name <- "age"
if (col_name %in% names(df)) {
  df[, col_name]
} else {
  message("Column ", col_name, " not found")
  NULL
}
#> [1] 25 30 35

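SOLUTION 2: Select Only Existing Columns with dplyr

library(dplyr)

df <- data.frame(x = 1:5, y = 6:10)

# Keep only the requested columns that actually exist
cols_to_select <- c("x", "z", "y")
existing_cols <- cols_to_select[cols_to_select %in% names(df)]

# Call dplyr::select() explicitly: if another attached package
# (for example MASS) masks select(), the bare call fails with
# "unused argument (all_of(existing_cols))"
df %>% dplyr::select(all_of(existing_cols))
#>   x  y
#> 1 1  6
#> 2 2  7
#> 3 3  8
#> 4 4  9
#> 5 5 10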

SOLUTION 3: Safe Column Selection Function

safe_select_cols <- function(df, cols) {
  # Check which columns exist
  existing <- cols[cols %in% names(df)]
  missing <- cols[!cols %in% names(df)]
  
  if (length(missing) > 0) {
    warning("Columns not found: ", paste(missing, collapse = ", "))
  }
  
  if (length(existing) == 0) {
    return(data.frame())  # Empty data frame
  }
  
  return(df[, existing, drop = FALSE])
}

# Test
df <- data.frame(x = 1:5, y = 6:10)
safe_select_cols(df, c("x", "z", "y"))
#> Warning in safe_select_cols(df, c("x", "z", "y")): Columns not found: z
#>   x  y
#> 1 1  6
#> 2 2  7
#> 3 3  8
#> 4 4  9
#> 5 5 10

7.5 Error #3: incorrect number of dimensions

⭐⭐ INTERMEDIATE 📏 DIMENSION

7.5.1 The Error

x <- 1:5  # Vector (1D)
x[1, 2]   # Using 2D indexing on 1D object
#> Error in x[1, 2]: incorrect number of dimensions

🔴 ERROR

Error in x[1, 2] : incorrect number of dimensions

7.5.2 What It Means

You’re using the wrong number of indices for the object’s dimensions.

7.5.3 Understanding Dimensions

# Vector: 1 dimension
vec <- 1:5
length(vec)
#> [1] 5
dim(vec)  # NULL
#> NULL

# Matrix: 2 dimensions
mat <- matrix(1:6, nrow = 2, ncol = 3)
dim(mat)  # 2 3
#> [1] 2 3

# Array: 3+ dimensions
arr <- array(1:24, dim = c(2, 3, 4))
dim(arr)  # 2 3 4
#> [1] 2 3 4

# Data frame: 2 dimensions (special)
df <- data.frame(x = 1:3, y = 4:6)
dim(df)  # 3 2
#> [1] 3 2
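
To make the "how many indices does this object take?" question concrete, here is a minimal helper. The name n_dims is hypothetical and not part of base R; it simply treats a missing dim attribute as one dimension.

```r
# Hypothetical helper: number of indices an object expects
n_dims <- function(x) {
  d <- dim(x)
  if (is.null(d)) 1L else length(d)
}

vec <- 1:5
mat <- matrix(1:6, nrow = 2, ncol = 3)
arr <- array(1:24, dim = c(2, 3, 4))
df  <- data.frame(x = 1:3, y = 4:6)

print(n_dims(vec))
print(n_dims(mat))
print(n_dims(arr))
print(n_dims(df))
```

Calling a helper like this before indexing can turn a cryptic dimension error into an explicit check in your own code.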

7.5.4 Common Causes

7.5.4.1 Cause 1: Treating Vector as Matrix

x <- c(10, 20, 30, 40, 50)

# Vector needs 1 index
x[3]  # Correct
#> [1] 30

# Not 2 indices
x[1, 3]  # Error!
#> Error in x[1, 3]: incorrect number of dimensions

7.5.4.2 Cause 2: Treating Matrix as Vector

mat <- matrix(1:6, nrow = 2, ncol = 3)

# Matrix needs 2 indices
mat[1, 2]  # Correct
#> [1] 3

# Or can use 1 index (treats as vector)
mat[5]  # Also works! (column-major order)
#> [1] 5

# But this is confusing:
mat[1]  # First element, not first row
#> [1] 1

7.5.4.3 Cause 3: After Subsetting

mat <- matrix(1:12, nrow = 3, ncol = 4)

# Extract one column (becomes vector!)
col1 <- mat[, 1]
class(col1)  # "integer" (a plain vector, not a matrix)
#> [1] "integer"

# Now 1D, can't use 2D indexing
col1[1, 1]  # Error!
#> Error in col1[1, 1]: incorrect number of dimensions

7.5.4.4 Cause 4: List vs Data Frame Confusion

# List: 1D (use single bracket or [[]])
my_list <- list(a = 1:3, b = 4:6)
my_list[[1]]     # Correct
#> [1] 1 2 3
my_list[1, 2]    # Error!
#> Error in my_list[1, 2]: incorrect number of dimensions

# Data frame: 2D (use row, col)
df <- data.frame(a = 1:3, b = 4:6)
df[1, 2]         # Correct
#> [1] 4
df[[1]]          # Also works (returns column)
#> [1] 1 2 3

7.5.5 Solutions

SOLUTION 1: Check Dimensions Before Indexing

x <- 1:5

# Check what you have
ndims <- length(dim(x))  # 0 for vector

if (is.null(dim(x))) {
  # Vector: use 1 index
  x[3]
} else if (length(dim(x)) == 2) {
  # Matrix/data frame: use 2 indices
  x[1, 3]
}
#> [1] 3

SOLUTION 2: Preserve Dimensions with drop = FALSE

mat <- matrix(1:12, nrow = 3, ncol = 4)

# Default: drops to vector
col1 <- mat[, 1]
class(col1)  # "integer"
#> [1] "integer"

# Preserve matrix structure
col1 <- mat[, 1, drop = FALSE]
class(col1)  # "matrix" "array"
#> [1] "matrix" "array"
dim(col1)    # 3 1
#> [1] 3 1

# Now can still use 2D indexing
col1[1, 1]
#> [1] 1

SOLUTION 3: Use Appropriate Functions

mat <- matrix(1:12, nrow = 3, ncol = 4)

# For vectors: use vector operations
vec <- 1:5
vec[3]
#> [1] 3
vec[c(1, 3, 5)]
#> [1] 1 3 5

# For matrices: use matrix operations
mat[1, ]      # First row
#> [1]  1  4  7 10
mat[, 2]      # Second column
#> [1] 4 5 6
mat[1:2, 3:4] # Submatrix
#>      [,1] [,2]
#> [1,]    7   10
#> [2,]    8   11

# For data frames: mix of both
df <- data.frame(x = 1:5, y = 6:10)
df[, "x"]     # Column (becomes vector)
#> [1] 1 2 3 4 5
df[, "x", drop = FALSE]  # Column (stays data frame)
#>   x
#> 1 1
#> 2 2
#> 3 3
#> 4 4
#> 5 5
df$x          # Column (vector)
#> [1] 1 2 3 4 5

7.6 Error #4: incorrect number of subscripts on matrix

⭐ BEGINNER 📏 DIMENSION

7.6.1 The Error

mat <- matrix(1:6, nrow = 2, ncol = 3)
mat[1, 2, 3]  # Too many indices!
#> Error in mat[1, 2, 3]: incorrect number of dimensions

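🔴 ERROR

Error in mat[1, 2, 3] : incorrect number of dimensions

7.6.2 What It Means

A matrix needs exactly 2 indices (or 1), but you provided a different number. When extracting, R reports this as incorrect number of dimensions, as in the output above. The incorrect number of subscripts wording that gives this section its name typically appears when the same mistake is made in an assignment, such as mat[1, 2, 3] <- 0.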
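
The wording of this error depends on whether you are reading or writing. The sketch below captures the messages instead of assuming them, since exact text can vary between R versions; in recent versions the two assignment forms are expected to mention "subscripts".

```r
mat <- matrix(1:6, nrow = 2, ncol = 3)
vec <- 1:5

# Extraction with too many indices
extract_msg <- tryCatch(
  mat[1, 2, 3],
  error = function(e) conditionMessage(e)
)

# Assignment into a matrix with too many indices
assign_mat_msg <- tryCatch(
  { mat[1, 2, 3] <- 0L; "no error" },
  error = function(e) conditionMessage(e)
)

# Assignment into a plain vector using matrix-style indices
assign_vec_msg <- tryCatch(
  { vec[1, 2] <- 0L; "no error" },
  error = function(e) conditionMessage(e)
)

print(extract_msg)
print(assign_mat_msg)
print(assign_vec_msg)
```

Whichever message you see, the fix is the same: match the number of indices to length(dim(x)).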

7.6.3 Correct Matrix Indexing

mat <- matrix(1:6, nrow = 2, ncol = 3)
mat
#>      [,1] [,2] [,3]
#> [1,]    1    3    5
#> [2,]    2    4    6

# Correct ways:
mat[1, 2]      # Single element
#> [1] 3
mat[1, ]       # Entire row
#> [1] 1 3 5
mat[, 2]       # Entire column
#> [1] 3 4
mat[1:2, 2:3]  # Submatrix
#>      [,1] [,2]
#> [1,]    3    5
#> [2,]    4    6

# Also works (treats as vector):
mat[5]         # 5th element (column-major)
#> [1] 5

# Wrong:
# mat[1, 2, 3]  # Too many indices

7.6.4 Solutions

SOLUTION 1: Use Correct Number of Indices

mat <- matrix(1:6, nrow = 2, ncol = 3)

# For matrix: [row, col]
mat[1, 2]
#> [1] 3

# For array: [dim1, dim2, dim3, ...]
arr <- array(1:24, dim = c(2, 3, 4))
arr[1, 2, 3]
#> [1] 15

SOLUTION 2: Check Object Type First

check_and_subset <- function(x, ...) {
  indices <- list(...)
  
  if (is.matrix(x)) {
    if (length(indices) > 2) {
      stop("Matrix needs 1 or 2 indices, got ", length(indices))
    }
    return(do.call(`[`, c(list(x), indices)))
  } else if (is.array(x)) {
    expected <- length(dim(x))
    if (length(indices) != expected) {
      stop("Array needs ", expected, " indices, got ", length(indices))
    }
    return(do.call(`[`, c(list(x), indices)))
  } else {
    return(x[[indices[[1]]]])
  }
}

7.7 Error #5: only 0's may be mixed with negative subscripts

⭐⭐ INTERMEDIATE 🔤 SYNTAX

7.7.1 The Error

x <- 1:10
x[c(-1, 5)]  # Can't mix negative and positive!
#> Error in x[c(-1, 5)]: only 0's may be mixed with negative subscripts

🔴 ERROR

Error in x[c(-1, 5)] : only 0's may be mixed with negative subscripts

7.7.2 What It Means

R won’t let you mix negative indices (exclusion) with positive indices (selection).

7.7.3 Negative Indexing Rules

x <- 1:10

# Positive: select elements
x[c(1, 3, 5)]
#> [1] 1 3 5

# Negative: exclude elements
x[-c(1, 3, 5)]
#> [1]  2  4  6  7  8  9 10

# Zero: ignored
x[c(0, 1, 3)]  # Same as x[c(1, 3)]
#> [1] 1 3

# Can mix zero with negative
x[c(0, -1, -3)]  # Same as x[-c(1, 3)]
#> [1]  2  4  5  6  7  8  9 10

# CANNOT mix positive and negative
x[c(-1, 5)]  # Error!
#> Error in x[c(-1, 5)]: only 0's may be mixed with negative subscripts
x[c(1, -5)]  # Error!
#> Error in x[c(1, -5)]: only 0's may be mixed with negative subscripts

7.7.4 Why This Rule?

# Ambiguous meaning:
x[c(-1, 5)]

# Does this mean:
# "Select 5th, excluding 1st"?
# "Exclude 1st, but also select 5th"?

# R refuses to guess!

7.7.5 Solutions

SOLUTION 1: Use Only Positive or Only Negative

x <- 1:10

# Want elements 2-10 (exclude 1st)?
# Use negative:
x[-1]
#> [1]  2  3  4  5  6  7  8  9 10

# Want elements except 1 and 3?
# Use negative:
x[-c(1, 3)]
#> [1]  2  4  5  6  7  8  9 10

# Want only 1, 3, 5?
# Use positive:
x[c(1, 3, 5)]
#> [1] 1 3 5

SOLUTION 2: Convert to Logical

x <- 1:10

# Want: "not 1, not 3, but yes 5"
# Create logical vector
indices <- rep(TRUE, length(x))
indices[c(1, 3)] <- FALSE  # Exclude these
indices[5] <- TRUE          # Include this (already TRUE)

x[indices]
#> [1]  2  4  5  6  7  8  9 10

SOLUTION 3: Use setdiff()

x <- 1:10

# Want all except positions 1 and 3
exclude <- c(1, 3)
keep <- setdiff(seq_along(x), exclude)
x[keep]
#> [1]  2  4  5  6  7  8  9 10

# More complex: all except 1 and 3, but must include 5
exclude <- c(1, 3)
include <- 5
keep <- union(setdiff(seq_along(x), exclude), include)
keep <- sort(unique(keep))
x[keep]
#> [1]  2  4  5  6  7  8  9 10
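
A related trap with negative indexing is negating the result of which() when nothing matches. This sketch uses made-up values; the point is that an empty set of negative indices selects nothing rather than everything.

```r
x <- c(5, 12, 8, 20)

drop_idx <- which(x > 100)   # integer(0): nothing to drop

wrong  <- x[-drop_idx]                           # empty: every element is lost
right  <- x[!(x > 100)]                          # logical negation keeps all four
right2 <- x[setdiff(seq_along(x), drop_idx)]     # setdiff() also keeps all four

print(length(wrong))
print(right)
print(right2)
```

Logical negation or setdiff() avoids the problem because neither relies on the sign of an index vector that might be empty.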

7.8 Error #6: negative length vectors are not allowed

⭐ BEGINNER 📏 LENGTH

7.8.1 The Error

n <- -5
x <- numeric(n)  # Can't have negative length!
#> Error in numeric(n): invalid 'length' argument

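🔴 ERROR

Error in numeric(n) : invalid 'length' argument

7.8.2 What It Means

You’re trying to create a vector with negative length, which is impossible. numeric() reports this as invalid 'length' argument; the negative length vectors are not allowed wording in this section’s title is the message some other allocation paths in R use for the same underlying problem.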

7.8.3 Common Causes

7.8.3.1 Cause 1: Calculation Error

n_start <- 5
n_end <- 3

# Calculation gives negative
n <- n_end - n_start  # -2
result <- numeric(n)  # Error!
#> Error in numeric(n): invalid 'length' argument

7.8.3.2 Cause 2: User Input

create_vector <- function(n) {
  numeric(n)
}

create_vector(-5)  # Error!
#> Error in numeric(n): invalid 'length' argument

7.8.3.3 Cause 3: Filtering Gone Wrong

data <- c(10, 20, 30)
threshold <- 50

# No values meet criteria
n_above <- sum(data > threshold)  # 0

# Then you subtract
n_below <- length(data) - n_above - 5  # Negative!

result <- numeric(n_below)  # Error!
#> Error in numeric(n_below): invalid 'length' argument

7.8.4 Solutions

SOLUTION 1: Validate Before Creating

create_vector_safe <- function(n, default_value = 0) {
  if (n < 0) {
    warning("Negative length requested: ", n, ". Using 0.")
    n <- 0
  }
  
  if (n == 0) {
    return(numeric(0))
  }
  
  return(rep(default_value, n))
}

# Test
create_vector_safe(5)
#> [1] 0 0 0 0 0
create_vector_safe(-5)
#> Warning in create_vector_safe(-5): Negative length requested: -5. Using 0.
#> numeric(0)
create_vector_safe(0)
#> numeric(0)

SOLUTION 2: Use max() for Safety

n_start <- 5
n_end <- 3

# Ensure non-negative
n <- max(0, n_end - n_start)
result <- numeric(n)  # Safe
result
#> numeric(0)
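
Clamping with max() hides the bad input. When a negative length signals a real bug upstream, failing early with a clear message may be preferable. A minimal sketch, with make_buffer as a hypothetical function name:

```r
# Hypothetical constructor that validates its length argument
make_buffer <- function(n) {
  if (!is.numeric(n) || length(n) != 1 || is.na(n) || n < 0) {
    stop("`n` must be a single non-negative number, got: ",
         paste(format(n), collapse = ", "))
  }
  numeric(n)
}

ok  <- make_buffer(3)
msg <- tryCatch(make_buffer(-5), error = function(e) conditionMessage(e))

print(ok)
print(msg)
```

Which approach fits depends on whether a negative count is an expected edge case (clamp it) or a symptom of a calculation error (report it).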

7.9 Logical Indexing

💡 Key Insight: Logical Indexing

x <- c(10, 20, 30, 40, 50)

# Logical vector
x > 25
#> [1] FALSE FALSE  TRUE  TRUE  TRUE

# Use for indexing
x[x > 25]
#> [1] 30 40 50

# Multiple conditions
x[x > 25 & x < 45]
#> [1] 30 40

# With which()
which(x > 25)
#> [1] 3 4 5
x[which(x > 25)]
#> [1] 30 40 50

Important: Logical indexing with NA creates NA in result!

x <- c(10, NA, 30, 40)
x > 25  # Has NA
#> [1] FALSE    NA  TRUE  TRUE

# Result includes NA
x[x > 25]
#> [1] NA 30 40

# Remove NA from condition
x[which(x > 25)]  # which() drops NA
#> [1] 30 40
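
Another logical-indexing pitfall is length mismatch. As a short sketch: a logical index shorter than the vector is recycled, and one that is longer produces NA for the extra positions.

```r
x <- c(10, 20, 30, 40, 50)

# Shorter index: c(TRUE, FALSE) is recycled across all five positions
every_other <- x[c(TRUE, FALSE)]

# Longer index: the sixth TRUE has no element to select
too_long <- x[c(TRUE, TRUE, TRUE, TRUE, TRUE, TRUE)]

print(every_other)
print(too_long)
```

Neither case raises an error or warning, so it helps to confirm that length(condition) == length(x) when the condition was built from a different object.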

7.10 Error #7: [ ] with missing values only allowed for atomic vectors

⭐⭐ INTERMEDIATE 🔢 TYPE

7.10.1 The Error

my_list <- list(a = 1, b = 2, c = 3)
indices <- c(1, NA, 3)
my_list[indices]
#> $a
#> [1] 1
#> 
#> $<NA>
#> NULL
#> 
#> $c
#> [1] 3

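⚠️ NO ERROR RAISED

In the R session that produced the output above, the call succeeds. Instead of

Error in my_list[indices] : 
  [ ] with missing values only allowed for atomic vectors

you get a list with a NULL element named <NA>.

7.10.2 What It Means

An NA in the index does not stop R here. An atomic vector returns NA for that position, while a list returns a NULL element named <NA>. That silent placeholder usually causes a confusing failure later, so NA indices on lists and data frames need special handling.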

7.10.3 Atomic vs Non-Atomic

# Atomic vector: NA indexing works
vec <- c(10, 20, 30)
vec[c(1, NA, 3)]  # Returns with NA
#> [1] 10 NA 30

# List: NA indexing silently inserts a NULL element
my_list <- list(a = 1, b = 2, c = 3)
my_list[c(1, NA, 3)]  # No error, but note the $<NA> entry
#> $a
#> [1] 1
#> 
#> $<NA>
#> NULL
#> 
#> $c
#> [1] 3

7.10.4 Solutions

SOLUTION 1: Remove NAs from Indices

my_list <- list(a = 1, b = 2, c = 3)
indices <- c(1, NA, 3)

# Remove NAs
clean_indices <- indices[!is.na(indices)]
my_list[clean_indices]
#> $a
#> [1] 1
#> 
#> $c
#> [1] 3

SOLUTION 2: Use Logical Indexing

my_list <- list(a = 1, b = 2, c = 3)
indices <- c(1, NA, 3)

# Convert to logical
logical_indices <- seq_along(my_list) %in% indices
my_list[logical_indices]
#> $a
#> [1] 1
#> 
#> $c
#> [1] 3
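
If dropping NA indices quietly is not acceptable, a stricter wrapper can refuse them. This is only a sketch; subset_list_strict is a hypothetical name, not an existing function.

```r
# Hypothetical helper: subset a list, refusing NA indices outright
subset_list_strict <- function(lst, idx) {
  if (anyNA(idx)) {
    stop("Index contains NA at position(s): ",
         paste(which(is.na(idx)), collapse = ", "))
  }
  lst[idx]
}

my_list <- list(a = 1, b = 2, c = 3)

ok  <- subset_list_strict(my_list, c(1, 3))
msg <- tryCatch(
  subset_list_strict(my_list, c(1, NA, 3)),
  error = function(e) conditionMessage(e)
)

print(names(ok))
print(msg)
```

Reporting the position of the NA makes it easier to trace back to whatever produced the index vector.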

7.11 Matrix Indexing Special Cases

🎯 Best Practice: Matrix Indexing Mastery

mat <- matrix(1:12, nrow = 3, ncol = 4)
mat
#>      [,1] [,2] [,3] [,4]
#> [1,]    1    4    7   10
#> [2,]    2    5    8   11
#> [3,]    3    6    9   12

# Single index: column-major order
mat[5]  # Row 2, Column 2
#> [1] 5

# Row, column
mat[2, 2]
#> [1] 5

# Entire row
mat[2, ]
#> [1]  2  5  8 11

# Entire column
mat[, 2]
#> [1] 4 5 6

# Multiple rows/columns
mat[1:2, 2:3]
#>      [,1] [,2]
#> [1,]    4    7
#> [2,]    5    8

# Logical indexing (rows)
mat[mat[, 1] > 1, ]
#>      [,1] [,2] [,3] [,4]
#> [1,]    2    5    8   11
#> [2,]    3    6    9   12

# Negative indexing
mat[-1, ]     # All but first row
#>      [,1] [,2] [,3] [,4]
#> [1,]    2    5    8   11
#> [2,]    3    6    9   12
mat[, -c(1, 4)]  # All but first and last column
#>      [,1] [,2]
#> [1,]    4    7
#> [2,]    5    8
#> [3,]    6    9

# Preserve matrix structure
mat[, 2]              # Becomes vector
#> [1] 4 5 6
mat[, 2, drop = FALSE]  # Stays matrix
#>      [,1]
#> [1,]    4
#> [2,]    5
#> [3,]    6
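
One more form worth knowing is indexing a matrix with a two-column matrix of (row, column) pairs, which selects individual cells rather than a rectangular block. The sketch below also uses which() with arr.ind = TRUE to go the other way, from a condition to row and column positions; rc and pos are illustrative names.

```r
mat <- matrix(1:12, nrow = 3, ncol = 4)

# Each row of rc is one (row, column) pair: cells [1, 2] and [3, 4]
rc <- cbind(c(1, 3), c(2, 4))
picked <- mat[rc]

# Find where a value lives, as row/column positions
pos <- which(mat == 8, arr.ind = TRUE)

print(picked)
print(pos)
```

By contrast, mat[c(1, 3), c(2, 4)] would return the full 2 x 2 block where those rows and columns intersect.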

7.12 Summary

Key Takeaways:

  1. [ ] vs [[]]: Single bracket is lenient (returns NA), double bracket is strict (errors)
  2. Check bounds: Always validate indices before using [[]]
  3. Dimensions matter: Vectors (1D), matrices (2D), arrays (3D+)
  4. Negative indexing: Can’t mix with positive (except 0)
  5. Logical indexing: Watch for NA in conditions
  6. drop = FALSE: Preserves matrix structure
  7. seq_along(): Safer than 1:length() for loops

Quick Reference:

Error                                | Cause                       | Fix
subscript out of bounds              | Index > length with [[]]    | Check length first
undefined columns                    | Column name doesn’t exist   | Check with %in% names()
incorrect number of dimensions       | Wrong # of indices          | Match object dimensions
incorrect number of subscripts       | Too many/few indices        | Matrix needs 2, array needs n
only 0’s may be mixed with negative  | Positive + negative indices | Use one or the other
negative length vectors              | Tried length < 0            | Validate with max(0, n)
[ ] with missing values              | NA index on list            | Remove NAs or use logical

Safe Indexing Checklist:

# Before indexing:
length(x)              # Check size
dim(x)                 # Check dimensions
names(x)               # Check names (if using)
seq_along(x)           # Safe iteration
index %in% seq_along(x)  # Validate index

# For conditional indexing:
which(condition)       # Drops NA automatically
x[!is.na(x) & x > 0]  # Handle NA explicitly

7.13 Exercises

📝 Exercise 1: Predict the Outcome

What happens? Error, NA, or value?

# A
x <- 1:5
x[10]

# B
x[[10]]

# C
mat <- matrix(1:6, nrow = 2)
mat[3, 1]

# D
x[c(-1, 5)]

# E
my_list <- list(a = 1, b = 2)
my_list[c(1, NA)]

📝 Exercise 2: Fix the Code

Debug these indexing problems:

# Problem 1
scores <- c(85, 90, 95)
top_score <- scores[[4]]

# Problem 2
df <- data.frame(x = 1:5, y = 6:10)
result <- df[, "z"]

# Problem 3
vec <- 1:10
subset <- vec[c(-1, -2, 5, 6)]

# Problem 4
mat <- matrix(1:6, nrow = 2)
col2 <- mat[, 2]
element <- col2[1, 1]

📝 Exercise 3: Safe Indexing Function

Write safe_index(x, i) that:

  1. Works with vectors, lists, matrices
  2. Never errors on out-of-bounds
  3. Returns NA for invalid indices
  4. Handles both [ ] and [[ ]] style
  5. Reports what went wrong

📝 Exercise 4: Matrix Subsetting

Given a matrix, write functions to:

  1. Get elements on the diagonal
  2. Get upper triangle (above diagonal)
  3. Get lower triangle (below diagonal)
  4. Get border elements (edges only)
  5. Handle any matrix size

7.14 Exercise Answers

Exercise 1:

# A - Returns NA (single bracket is lenient)
x <- 1:5
x[10]
#> [1] NA

# B - Errors (double bracket is strict)
tryCatch(x[[10]], error = function(e) "ERROR")
#> [1] "ERROR"

# C - Errors (only 2 rows)
mat <- matrix(1:6, nrow = 2)
tryCatch(mat[3, 1], error = function(e) "ERROR")
#> [1] "ERROR"

# D - Errors (can't mix positive and negative)
tryCatch(x[c(-1, 5)], error = function(e) "ERROR")
#> [1] "ERROR"

# E - No error: returns a list with a NULL element named <NA>
my_list <- list(a = 1, b = 2)
tryCatch(my_list[c(1, NA)], error = function(e) "ERROR")
#> $a
#> [1] 1
#> 
#> $<NA>
#> NULL

Exercise 2:

# Problem 1 - Out of bounds
scores <- c(85, 90, 95)
# Fix: Check length
if (4 <= length(scores)) {
  top_score <- scores[[4]]
} else {
  top_score <- NA
}

# Problem 2 - Column doesn't exist
df <- data.frame(x = 1:5, y = 6:10)
# Fix: Check column exists
if ("z" %in% names(df)) {
  result <- df[, "z"]
} else {
  result <- NULL
}

# Problem 3 - Mixing positive and negative
vec <- 1:10
# Fix: Use only negative
subset <- vec[-c(1, 2)]
# Or only positive
subset <- vec[c(5, 6)]

# Problem 4 - Vector can't use 2D indexing
mat <- matrix(1:6, nrow = 2)
col2 <- mat[, 2, drop = FALSE]  # Keep as matrix
element <- col2[1, 1]
# Or:
col2 <- mat[, 2]  # Vector
element <- col2[1]  # 1D indexing

Exercise 3:

safe_index <- function(x, i, double_bracket = FALSE) {
  # Handle different object types
  if (is.null(x)) {
    message("Object is NULL")
    return(NULL)
  }
  
  # Get valid range: length() covers vectors, lists, and matrices
  # (a matrix is indexed here as a flat vector)
  max_index <- length(x)
  
  # Check index validity
  if (any(is.na(i))) {
    message("Index contains NA")
    i <- i[!is.na(i)]
  }
  
  if (length(i) == 0) {
    message("No valid indices")
    return(if (double_bracket) NA else x[integer(0)])
  }
  
  if (any(i < 1 | i > max_index)) {
    invalid <- i[i < 1 | i > max_index]
    message("Invalid indices: ", paste(invalid, collapse = ", "))
    i <- i[i >= 1 & i <= max_index]
  }
  
  if (length(i) == 0) {
    return(NA)
  }
  
  # Extract
  if (double_bracket) {
    if (length(i) > 1) {
      message("Double bracket with multiple indices, using first")
      i <- i[1]
    }
    return(x[[i]])
  } else {
    return(x[i])
  }
}

# Test
x <- 1:5
safe_index(x, 3)
#> [1] 3
safe_index(x, 10)
#> Invalid indices: 10
#> [1] NA
safe_index(x, c(1, 10, 3))
#> Invalid indices: 10
#> [1] 1 3

Exercise 4:

# Get diagonal elements
get_diagonal <- function(mat) {
  if (!is.matrix(mat)) stop("Input must be a matrix")
  n <- min(nrow(mat), ncol(mat))
  mat[cbind(1:n, 1:n)]
}

# Get upper triangle
get_upper_tri <- function(mat, include_diag = FALSE) {
  if (!is.matrix(mat)) stop("Input must be a matrix")
  mat[upper.tri(mat, diag = include_diag)]
}

# Get lower triangle
get_lower_tri <- function(mat, include_diag = FALSE) {
  if (!is.matrix(mat)) stop("Input must be a matrix")
  mat[lower.tri(mat, diag = include_diag)]
}

# Get border elements
get_border <- function(mat) {
  if (!is.matrix(mat)) stop("Input must be a matrix")
  nr <- nrow(mat)
  nc <- ncol(mat)
  
  if (nr == 1 || nc == 1) {
    return(as.vector(mat))
  }
  
  # Interior rows; empty when nr == 2, so 2-row matrices don't repeat rows
  inner <- seq_len(nr - 2) + 1

  c(
    mat[1, ],       # Top row
    mat[nr, ],      # Bottom row
    mat[inner, 1],  # Left column (excluding corners)
    mat[inner, nc]  # Right column (excluding corners)
  )
}

# Test
mat <- matrix(1:20, nrow = 4, ncol = 5)
mat
#>      [,1] [,2] [,3] [,4] [,5]
#> [1,]    1    5    9   13   17
#> [2,]    2    6   10   14   18
#> [3,]    3    7   11   15   19
#> [4,]    4    8   12   16   20
get_diagonal(mat)
#> [1]  1  6 11 16
get_upper_tri(mat)
#>  [1]  5  9 10 13 14 15 17 18 19 20
get_lower_tri(mat)
#> [1]  2  3  4  7  8 12
get_border(mat)
#>  [1]  1  5  9 13 17  4  8 12 16 20  2  3 18 19