# Chapter 3 Agent-based modeling

R can be used to simulate agents’ behaviors in virtual environments. In the following, I will show you what this can look like. At the end, you will find further resources with more ABM examples in R as well as some theoretical background.

## 3.1 Building an ABM: Schelling model

When designing an ABM from scratch, we need to take a number of things into account. A structured way to think about this is in terms of a flow chart.

In the first part, we need to create the agents and their world. We need to think about which parameters have to be specified and how they can be implemented in code. Thereafter, the actual agent (inter)actions take place. In the end, we can evaluate the (macro) outcomes, check for robustness, etc. This part of the script is about the first two steps.

When writing an ABM in R, I would advise you to write functions for each step in the flow chart and run the entire thing in a for loop or while loop.

### 3.1.1 Recap: functions, if…else, loops, and matrices

Building blocks for ABMs in R are custom functions (since you want your agents to do things…), flow control (…under certain conditions…), and loops (…several times in a row). Matrices are handy to implement a cellular automaton.

#### 3.1.1.1 Functions

When you define functions in R, you need to follow a certain structure:

function_name <- function(argument_1, argument_2, argument_n) {
  function_body
}

• The function_name is the thing you will call (e.g., mean()). In general, it should be a verb, it should be concise, and it should be in snake_case.
• The arguments are what you need to provide the function with (e.g., mean(1:10)).
• The function body contains the operations that are performed on the arguments. It can contain other functions as well – which need to be defined beforehand (e.g., sum(1:10) / length(1:10)). It is advisable to split the function body up into pieces that are as small as possible.
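A minimal example following this structure (a toy re-implementation for illustration, not base R’s mean()):

```r
# compute the arithmetic mean "by hand": sum of the values divided by their count
compute_mean <- function(x) {
  sum(x) / length(x)
}

compute_mean(1:10) # 5.5
```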

#### 3.1.1.2 Loops

With ABMs you often want to loop until a certain condition is met. This is where while loops come in handy.

The basic structure of while loops is as follows:

while (condition) {
code
}
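For instance, a while loop that runs until a numeric condition is met:

```r
# count how many doublings it takes for a value starting at 1 to exceed 100
value <- 1
steps <- 0
while (value <= 100) {
  value <- value * 2
  steps <- steps + 1
}
steps # 7
```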

Alternatively, you can use a for loop with a pre-defined number of runs and a break condition:

for (i in seq_len(number_of_iterations)) {
  do_something_i_times

  if (break_condition_is_met) break
}

Note that in both cases you need to pre-allocate space for your loop to run smoothly, for instance by pre-defining a list with vector(mode = "list", length = number_of_iterations).
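A small illustration of pre-allocation (with toy contents):

```r
# pre-allocate a list with one slot per iteration, then fill it in the loop
n_iterations <- 5
results <- vector(mode = "list", length = n_iterations)
for (i in seq_len(n_iterations)) {
  results[[i]] <- i^2 # store each iteration's output in its pre-allocated slot
}
unlist(results) # 1 4 9 16 25
```

Growing the object inside the loop (e.g., with c() or append()) forces R to copy it on every iteration, which becomes slow for long simulations.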

#### 3.1.1.3 Flow control

Sometimes you want your code to run only in specific cases. The generalized solution for conditionally running code in R is the if statement. It looks as follows:

if (conditional_statement evaluates to TRUE) {
do_something
}

They also have an extension – if…else:

if (conditional_statement evaluates to TRUE) {
do_something
} else {
do_something_else
}

#### 3.1.1.4 Matrix

A matrix is a two-dimensional vector. We need it for building a cellular automaton like the one used by Schelling (1971). It is created using matrix(values, nrow = number_of_rows, ncol = number_of_columns). By default, the matrix is filled by columns; with byrow = TRUE, it is filled by rows.

matrix(1:10, nrow = 2) # by default, it's filled by columns
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    1    3    5    7    9
## [2,]    2    4    6    8   10
matrix(1:10, nrow = 2, byrow = TRUE) # can be also filled by rows
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    1    2    3    4    5
## [2,]    6    7    8    9   10
example <- matrix(sample(0:2, 10*10, replace = TRUE, prob = c(0.4, 0.3, 0.3)), ncol = 10) # probably more useful for a cellular automaton

A matrix can be visualized by passing it to image().

image(example, col = c("black","red","green"), axes = FALSE)

Note that you can index a matrix using [row, column].
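For example, using the 2x5 matrix from above:

```r
m <- matrix(1:10, nrow = 2) # filled by columns
m[1, 3]  # element in row 1, column 3: 5
m[2, ]   # entire second row: 2 4 6 8 10
m[, 4]   # entire fourth column: 7 8
```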

### 3.1.2 Schelling model

Let’s build a Schelling model together. The inspiration for parts of the code comes from this blog post – I tried to write the code in a cleaner fashion, though.

The steps we will program today are the following:

• Initialization:
  • Create a field with n x n dimensions
  • Create agents with certain attributes and put them on the field
  • Determine the happiness criterion
• Simulation:
  • Determine each agent’s happiness level
  • Move unhappy agents to an empty field
  • Repeat until time is up or all agents are happy
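Put together, the control flow of such a simulation can be sketched with stub functions (the *_stub names and their placeholder logic below are illustrative stand-ins, not the model functions we write in the following sections):

```r
# schematic skeleton of the simulation loop
build_world_stub <- function() sample(0:2, 25, replace = TRUE) # placeholder world
happy_share_stub <- function(world) mean(world != 0)           # placeholder happiness metric
move_agents_stub <- function(world) world                      # placeholder "move" step

world <- build_world_stub()
max_steps <- 10
current_step <- 1
happy_share <- 0

while (current_step <= max_steps && happy_share < 1) {
  happy_share <- happy_share_stub(world) # evaluate the agents
  world <- move_agents_stub(world)       # let the (unhappy) agents act
  current_step <- current_step + 1
}
```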

#### 3.1.2.1 Initialization

1. Build a 25x25 matrix. It should contain 0s (for unoccupied fields), 1s (color 1), and 2s (color 2). Make sure to wrap it in a function with different parameters for the shares of the different colors. Let R print an image() of it.

build_world <- function(dimensions, share_free, share_col_1, share_col_2) {
  if (share_free + share_col_1 + share_col_2 != 1) stop("shares need to add up to 1.")

  matrix(sample(0:2, dimensions^2, replace = TRUE, prob = c(share_free, share_col_1, share_col_2)),
         ncol = dimensions)
}

image(build_world(dimensions = 25, share_free = 0.5, share_col_1 = 0.25, share_col_2 = 0.25), axes = FALSE)

2. Determine the initial happiness criterion as 0.4 – you will need to feed it to the executing functions later. The happiness criterion is the minimum share of an agent’s neighbors that must be of the same color as the agent itself.

happiness_criterion <- 0.4

#### 3.1.2.2 Action

1. Build a function that determines each agent’s happiness (hint: you can use a nested for loop for looping through the matrix).
  a. Write a function to determine one agent’s neighbors (as determined through their coordinates [row, col] – Moore neighborhood, i.e., all 8 neighbors).
  b. Determine whether the agent is happy (is the share of same-value neighbors above the threshold? – hint: exclude the empty fields first and stop the function if a person does not have neighbors – then the person is automatically happy). Moreover, determine the share of same-value agents in each agent’s environment.
  c. Loop over all coordinates to check whether the agents are happy; store the results in a tibble with the following cells: row, column, happiness (NA if free spot, 0 if unhappy, 1 if happy).
  d. Write a function to determine the share of happy agents.
build_world(dimensions = 25, share_free = 0.5, share_col_1 = 0.25, share_col_2 = 0.25)
coordinates <- c(2,1)
dimensions <- 25
# a
get_neighbors <- function(coordinates, dimensions) {
  if (coordinates[1] > dimensions) stop("invalid value for coordinates[1]")
  if (coordinates[2] > dimensions) stop("invalid value for coordinates[2]")

  expand_grid(dim_1 = (coordinates[1] - 1):(coordinates[1] + 1),
              dim_2 = (coordinates[2] - 1):(coordinates[2] + 1)) %>%
    filter(dim_1 > 0,
           dim_2 > 0,
           dim_1 <= dimensions,
           dim_2 <= dimensions,
           !(dim_1 == coordinates[1] & dim_2 == coordinates[2]))
}

# b
determine_same_share <- function(coordinates, field_matrix) {
  own_value <- field_matrix[coordinates[1], coordinates[2]]

  if (own_value == 0) return(NA_real_)

  neighbors <- get_neighbors(coordinates, dimensions = nrow(field_matrix))
  map2(neighbors[[1]], neighbors[[2]], ~field_matrix[.x, .y]) %>%
    reduce(c) %>%
    enframe(name = NULL, value = "neighbor") %>%
    mutate(own_value = own_value) %>%
    filter(neighbor != 0) %>%
    mutate(same_type = case_when(own_value == neighbor ~ TRUE,
                                 TRUE ~ FALSE)) %>%
    pull(same_type) %>%
    mean()
}
# c
coordinate_tbl <- expand_grid(dim_1 = 1:25, dim_2 = 1:25) %>%
  mutate(same_share = NA_real_,
         happiness_ind = NA_real_)

for (i in seq_along(coordinate_tbl$same_share)) {
  coordinate_tbl$same_share[i] <- determine_same_share(coordinates = c(coordinate_tbl$dim_1[i], coordinate_tbl$dim_2[i]),
                                                       field_matrix = field_matrix)
  if (is.nan(coordinate_tbl$same_share[i])) coordinate_tbl$happiness_ind[i] <- 1
  if (is.nan(coordinate_tbl$same_share[i])) next
  if (is.na(coordinate_tbl$same_share[i])) next

  if (coordinate_tbl$same_share[i] >= happiness_criterion) {
    coordinate_tbl$happiness_ind[i] <- 1
  } else {
    coordinate_tbl$happiness_ind[i] <- 0
  }
}

# d
determine_happy_share <- function(coordinate_tbl) {
  mean(coordinate_tbl[["happiness_ind"]], na.rm = TRUE)
}

2. Make the unhappy cells “move” to a free spot. Make sure that the agents are shuffled beforehand so that there is no spatial bias (hint: pingers::shuffle()).

unhappy_ones <- coordinate_tbl %>%
  filter(happiness_ind == 0) %>%
  pingers::shuffle()

free_spots <- coordinate_tbl %>%
  filter(is.na(happiness_ind)) %>%
  pingers::shuffle()

move_agents <- function(unhappy_ones, free_spots, field_matrix) {
  i <- 1
  while (i <= nrow(unhappy_ones)) {
    field_matrix[free_spots$dim_1[i], free_spots$dim_2[i]] <- field_matrix[unhappy_ones$dim_1[i], unhappy_ones$dim_2[i]]
    field_matrix[unhappy_ones$dim_1[i], unhappy_ones$dim_2[i]] <- 0

    if (i == nrow(free_spots)) {
      unhappy_ones %<>% slice(., (i + 1):nrow(.))
      free_spots <- which(field_matrix == 0, arr.ind = TRUE) %>%
        as_tibble() %>%
        select(dim_1 = row, dim_2 = col) %>%
        pingers::shuffle()
      i <- 1
    }
    i <- i + 1
  }
  field_matrix
}

3. Put the functions in a while loop that runs as long as you want or until all agents are satisfied.

get_own_value <- function(coordinate_a, coordinate_b, matrix_name) matrix_name[coordinate_a, coordinate_b]

run_schelling <- function(max_steps = 500, dimensions = 25, share_free = 0.2, share_col_1 = 0.4, share_col_2 = 0.4, happiness_criterion = 0.4) {
  current_step <- 1
  share_unhappy <- 1
  field_matrix <- build_world(dimensions, share_free, share_col_1, share_col_2)

  report_tbl <- tibble(
    step = 1:max_steps,
    unhappy_ones = NA_integer_
  )

  #image_list <- vector(mode = "list", length = max_steps/3 + 1)
  agent_list <- vector(mode = "list", length = max_steps)

  coordinate_tbl <- expand_grid(dim_1 = 1:dimensions, dim_2 = 1:dimensions) %>%
    mutate(happiness_ind = NA_real_,
           same_share = NA_real_)

  while (current_step <= max_steps & share_unhappy > 0) {
    for (i in seq_along(coordinate_tbl$happiness_ind)) {
      coordinate_tbl$same_share[i] <- determine_same_share(coordinates = c(coordinate_tbl$dim_1[i], coordinate_tbl$dim_2[i]),
                                                           field_matrix = field_matrix)
      if (is.nan(coordinate_tbl$same_share[i])) coordinate_tbl$happiness_ind[i] <- 1
      if (is.nan(coordinate_tbl$same_share[i])) next
      if (is.na(coordinate_tbl$same_share[i])) next

      if (coordinate_tbl$same_share[i] >= happiness_criterion) {
        coordinate_tbl$happiness_ind[i] <- 1
      } else {
        coordinate_tbl$happiness_ind[i] <- 0
      }
    }

    agent_list[[current_step]] <- coordinate_tbl %>%
      mutate(own_value = map2_dbl(dim_1, dim_2, get_own_value, field_matrix))

    share_unhappy <- 1 - mean(coordinate_tbl %>% filter(!is.na(same_share)) %>% pull(happiness_ind), na.rm = TRUE)

    unhappy_ones <- coordinate_tbl %>%
      filter(happiness_ind == 0) %>%
      pingers::shuffle()

    free_spots <- coordinate_tbl %>%
      mutate(own_value = map2_dbl(dim_1, dim_2, get_own_value, field_matrix)) %>%
      filter(own_value == 0) %>%
      pingers::shuffle() %>%
      select(1:2)

    field_matrix <- move_agents(unhappy_ones, free_spots, field_matrix)

    # advance to the next step; the loop/function closing below is reconstructed
    current_step <- current_step + 1
  }

  agent_list
}

  distance_df$euclidean_vec[[i]] <- euclidean(actor_list[[distance_df$a[[i]]]], actor_list[[distance_df$b[[i]]]])
}

distance_df %<>% mutate(euclidean_normalized = (euclidean_vec/max(euclidean_vec)) %>% mean())

#### 3.2.1.6 Input data

Do you use empirically observed data to calibrate your model? Those data can affect state variables that might change over time. Parameters and everything else used to initially set up the world need to be specified in step 5, initialization, however.

Input data in Baldassarri and Bearman (2007): No.

#### 3.2.1.7 Submodels

In the final part, you need to go back to the schedule and describe each step and its submodels thoroughly. A thorough description encompasses its “equations, algorithms, parameters, and parameter values” (Grimm et al. 2020, Supp. 1: 27). Also justify why you have made those design decisions. Include robustness checks that take into account scenarios the submodels might encounter.

Submodels in Baldassarri and Bearman (2007): In the following, I include neither robustness checks nor a thorough or complete theoretical motivation. However, I include workable code that can be put in, e.g., a loop to run the model.

The first submodel is the selection of interaction partners: “for each actor at each time period t, we randomly selected from the population a number of potential interlocutors as a function of the actor’s overall level of interest. The number of people selected is proportional to the sum of the squared mean of interest over the four issues … the probability of interaction is proportional to their interest and an inverse function of the perceived ideological distance between the two.” (p. 791)

$$\lambda$$ is the distance between two actors. Initially, it is the demeaned euclidean distance (normalized by division by max(euclidean_distance)) between all actors.
As actors communicate with each other, they learn about each other’s views and update their distance to their actual distance:

$d_{ab}^{(t)}=\frac{\sqrt{\sum\limits_{i=1}^{4} (a_{i}^{(t)}-b_{i}^{(t)})^{2}}}{\max_{\{a,b\}\in N}\left[\sqrt{\sum\limits_{i=1}^{4}(a_{i}^{(t)}-b_{i}^{(t)})^{2}} \right]}$

In code, the initial distance is computed as follows:

euclidean <- function(a, b) sqrt(sum((a - b)^2)) %>% as.double()

The formula for computing the probability of interaction between two actors a and b then looks like this:

$\frac{\eta \times [(\sum\limits_{i=1}^{4}|a_{i}^{(t)}|)^2(\sum\limits_{i=1}^{4}|b_{i}^{(t)}|)^2]}{100} \times (1-\lambda_{ab}^{(t)})$

$$\eta$$ is a scaling factor (.005) that limits the number of interactions to a reasonable range: “In general, at time 1, actors have between 0 and 6 conversations, while at time 500, 0 to 12” (p. 791).

Code-wise, this is implemented in the following manner:

1. Potential discussants are drawn based on their interest (get_avg_interest()) and the distance between the respective agents; the resulting probability (see formula above) is used to draw a value from a Bernoulli distribution (either 0 or 1, purrr::rbernoulli(n = 1, p = formula above)).
2. Agent pairs that talk to each other and the ones that don’t are stored in separate tibbles. The interlocutor tibble is pingers::shuffle()d.

get_avg_interest <- function(vec) vec %>% abs() %>% mean() %>% .^2

distance_df$prob <- NA_real_

draw_discussants <- function(actor_list, distance_df) {
  sum_squared_mean_interest <- map_dbl(actor_list, get_avg_interest)
  for (i in seq_along(distance_df$euclidean_vec)) {
    # parentheses around (1 - euclidean_normalized) so that the (1 - lambda)
    # factor from the formula above is applied as a whole
    distance_df$prob[[i]] <- rbernoulli(1,
                                        (0.005 * (sum_squared_mean_interest[[distance_df$a[[i]]]] + sum_squared_mean_interest[[distance_df$b[[i]]]]) * (1 - distance_df$euclidean_normalized[[i]]))/100)
  }
  list(interlocution = distance_df %>% filter(prob == 1) %>% pingers::shuffle(),
       no_interlocution = distance_df %>% filter(prob != 1))
}

discussion_list <- draw_discussants(actor_list, distance_df)

In the next step, the agents interact. The first step is to determine the topic they are deliberating upon. This is the one which is of greatest cumulative interest to them:

select_issue <- function(a, b, actor_list) {
  actor_list[[a]] %>%
    enframe(name = "id", value = "value_a") %>%
    bind_cols(actor_list[[b]] %>% enframe(name = NULL, value = "value_b")) %>%
    mutate(sum = value_a + value_b) %>%
    filter(sum == max(sum))
}

select_rest <- function(a, b, actor_list) {
  actor_list[[a]] %>%
    enframe(name = "id", value = "value_a") %>%
    bind_cols(actor_list[[b]] %>% enframe(name = NULL, value = "value_b")) %>%
    mutate(sum = value_a + value_b) %>%
    filter(sum != max(sum))
}

Once the topic is determined, they start talking and subsequently adapt their opinions on the topic they are talking about.
The opinion change process is implemented as follows:

• If they share the opinion on the discussed topic (i.e., their values have the same sign), their opinions are reinforced (reinforce_opinion()).
• If they do not have the same signs (read: views) on the discussed topic yet agree on all the other topics, their opinions on the focal topic will move closer together (make_compromise()).
• If neither of the former is the case, their own views on the topic will be reinforced and their overall distance increases (make_conflict()).

reinforce_opinion <- function(actor_list, a, b, issue_id) {
  issue_a <- actor_list[[a]][[issue_id]]
  issue_b <- actor_list[[b]][[issue_id]]
  change_a <- (0.1 * (abs(issue_a) - abs(issue_b))/abs(issue_a)) %>% abs()
  change_b <- (0.1 * (abs(issue_b) - abs(issue_a))/abs(issue_b)) %>% abs()

  return_vec <- vector(mode = "double", length = 2L)
  if (issue_a > 0 & issue_b > 0) {
    return_vec[1] <- issue_a + change_a
    return_vec[2] <- issue_b + change_b
  } else {
    return_vec[1] <- issue_a - change_a
    return_vec[2] <- issue_b - change_b
  }
  return_vec
}

make_compromise <- function(actor_list, a, b, issue_id) {
  issue_a <- actor_list[[a]][[issue_id]]
  issue_b <- actor_list[[b]][[issue_id]]
  change_a <- (0.1 * (abs(issue_a) - abs(issue_b))/abs(issue_a)) %>% abs()
  change_b <- (0.1 * (abs(issue_b) - abs(issue_a))/abs(issue_b)) %>% abs()

  return_vec <- vector(mode = "double", length = 2L)
  if (issue_a > 0) return_vec[1] <- issue_a - change_a
  if (issue_a < 0) return_vec[1] <- issue_a + change_a
  if (issue_b > 0) return_vec[2] <- issue_b - change_b
  if (issue_b < 0) return_vec[2] <- issue_b + change_b
  return_vec
}

make_conflict <- function(actor_list, a, b, issue_id) {
  issue_a <- actor_list[[a]][[issue_id]]
  issue_b <- actor_list[[b]][[issue_id]]
  change_a <- (0.1 * (abs(issue_a) - abs(issue_b))/abs(issue_a)) %>% abs()
  change_b <- (0.1 * (abs(issue_b) - abs(issue_a))/abs(issue_b)) %>% abs()

  return_vec <- vector(mode = "double", length = 2L)
  if (issue_a > 0) return_vec[1] <- issue_a + change_a
  if (issue_a < 0) return_vec[1] <- issue_a - change_a
  if (issue_b > 0) return_vec[2] <- issue_b + change_b
  if (issue_b < 0) return_vec[2] <- issue_b - change_b
  return_vec
}

The rationale behind the reinforcement of views is based on prior research on group polarization: discourse between like-minded people usually leads to an amplification of their pre-existing views (see, e.g., Sunstein, Hastie, and Schkade 2007). In the case of disagreement on the focal issue, actors actively try to reduce dissonance. If they can agree on the remainder of the issues, dissonance reduction implies compromise and the actors will move closer with regard to the focal issue. If not, they will “stand their ground” and become even more convinced and more extreme in terms of their views on the salient issue.

opinion_change <- function(a, b, actor_list) {
  # get issue
  issue <- select_issue(a, b, actor_list) %>%
    mutate(same_sign = case_when(sign(value_a) == sign(value_b) ~ TRUE,
                                 TRUE ~ FALSE))
  # get rest
  rest <- select_rest(a, b, actor_list) %>%
    mutate(same_sign = case_when(sign(value_a) == sign(value_b) ~ TRUE,
                                 TRUE ~ FALSE))
  # communicate
  # disagreement on the focal issue but agreement on all three remaining issues: compromise
  if (issue$same_sign == FALSE & sum(rest$same_sign) == 3) {
    issue$value_a <- make_compromise(actor_list, a, b, issue$id)[1]
    issue$value_b <- make_compromise(actor_list, a, b, issue$id)[2]
  }
  # disagreement on the focal issue and on at least one other issue: conflict
  if (issue$same_sign == FALSE & sum(rest$same_sign) < 3) {
    issue$value_a <- make_conflict(actor_list, a, b, issue$id)[1]
    issue$value_b <- make_conflict(actor_list, a, b, issue$id)[2]
  }
  if (issue$same_sign == TRUE) {
    issue$value_a <- reinforce_opinion(actor_list, a, b, issue$id)[1]
    issue$value_b <- reinforce_opinion(actor_list, a, b, issue$id)[2]
  }

  issue
}

report_opinion_change <- function(discussion_list, actor_list) {
  for (i in seq_along(discussion_list$interlocution$a)) {
    issue <- opinion_change(discussion_list$interlocution$a[i],
                            discussion_list$interlocution$b[i],
                            actor_list)
    # report results
    actor_list[[discussion_list$interlocution$a[i]]][issue$id] <- issue$value_a
    actor_list[[discussion_list$interlocution$b[i]]][issue$id] <- issue$value_b
  }
  for (i in seq_along(discussion_list$interlocution$a)) {
    discussion_list$interlocution$euclidean_vec[[i]] <- euclidean(actor_list[[discussion_list$interlocution$a[[i]]]],
                                                                  actor_list[[discussion_list$interlocution$b[[i]]]])
  }
  maximum_euclidean <- max(c(discussion_list$interlocution$euclidean_vec, discussion_list$no_interlocution$euclidean_vec))
  discussion_list$interlocution %<>% mutate(euclidean_normalized = euclidean_vec/maximum_euclidean)
  return(list(discussion_list %>% bind_rows(.id = "discussion"), actor_list))
}
# test run
test <- report_opinion_change(discussion_list, actor_list)

After each deliberation, the new opinions are calculated and updated. Hence, an agent can change their opinion multiple times in each step, depending on how many interlocutors they have.

When the deliberation is finished, the euclidean distances between all agents are updated and the process can start anew. Each round, the actors’ opinions and discussion lists are saved. The latter contain an indicator of whether actors had contact. Based on those two objects, later analyses can determine how the discussion networks and opinion distributions unfold.
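As a sketch of what such an analysis could look like: assuming each round’s actor opinions were saved as a list of numeric vectors (the opinion_history object and its toy values below are hypothetical, not part of the model code above), one could track mean opinion extremity per round:

```r
# hypothetical post-processing: mean opinion extremity across rounds
# (toy data; in practice this would come from the saved per-round actor lists)
opinion_history <- list(
  list(c(0.2, -0.1, 0.5, 0.3), c(-0.4, 0.2, -0.6, 0.1)), # round 1
  list(c(0.4, -0.2, 0.9, 0.5), c(-0.7, 0.3, -0.9, 0.2))  # round 2
)

mean_extremity <- vapply(
  opinion_history,
  function(round) mean(vapply(round, function(actor) mean(abs(actor)), numeric(1))),
  numeric(1)
)
mean_extremity # one value per round; rising values indicate polarization
```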

### References

Baldassarri, Delia, and Peter S. Bearman. 2007. “Dynamics of Political Polarization.” American Sociological Review 72 (5): 784–811. https://doi.org/10.1177/000312240707200507.
Bruch, Elizabeth, and Jon Atwell. 2015. “Agent-Based Models in Empirical Social Research.” Sociological Methods & Research 44 (2): 186–221. https://doi.org/10.1177/0049124113506405.
Edmonds, Bruce, Christophe Le Page, Mike Bithell, Edmund Chattoe-Brown, Volker Grimm, Ruth Meyer, Cristina Montañola-Sales, Paul Ormerod, Hilton Root, and Flaminio Squazzoni. 2019. “Different Modelling Purposes.” Journal of Artificial Societies and Social Simulation 22 (3): 6. https://doi.org/10.18564/jasss.3993.
Grimm, Volker, Uta Berger, Finn Bastiansen, Sigrunn Eliassen, Vincent Ginot, Jarl Giske, John Goss-Custard, et al. 2006. “A Standard Protocol for Describing Individual-Based and Agent-Based Models.” Ecological Modelling 198 (1-2): 115–26. https://doi.org/10.1016/j.ecolmodel.2006.04.023.
Grimm, Volker, Steven F. Railsback, Christian E. Vincenot, Uta Berger, Cara Gallagher, Donald L. DeAngelis, Bruce Edmonds, et al. 2020. “The ODD Protocol for Describing Agent-Based and Other Simulation Models: A Second Update to Improve Clarity, Replication, and Structural Realism.” Journal of Artificial Societies and Social Simulation 23 (2): 7. https://doi.org/10.18564/jasss.4259.
Schelling, Thomas C. 1971. “Dynamic Models of Segregation.” The Journal of Mathematical Sociology 1 (2): 143–86. https://doi.org/10.1080/0022250X.1971.9989794.
Sunstein, Cass R., Reid Hastie, and David Schkade. 2007. “What Happened on Deliberation Day.” California Law Review 95 (915): 915–40.

1. As you can easily see, this parameter space becomes fairly huge and the endeavor becomes computationally quite expensive. An alternative might be Latin Hypercube Sampling.↩︎