13 Data Manipulation (II)

This chapter focuses on variable transformations, data aggregation, and other tasks operating on rows and columns, including handling duplicates, sorting rows, and ordering columns.

13.1 Transformating variables

Variable transformation is the replacement of a variable by a new value or a function of that variable. The purposes can be multiple: convenience, reducing skewness, producing linear or additive relationships, and more. Standardization and logarithmic transformation are common types of variable transformation.

creating new variables

The simplest way to create a new variable is using the operator $.

AFL <- sp500stocks[sp500stocks$symbol == "AFL",]

AFL$close_lagged1 <- lag(AFL$close, k = 1)
AFL$close_lagged2 <- lag(AFL$close, k = 2)

lag computes a lagged version of a time series, shifting the time base back by a given number of observations. The argument k is the number of lags in units of observations.

Below is the dplyr version. The lag() is not the base R lag, and n is equivalent to k.

library(dplyr)

AFL %>% mutate(close_lagged1 = lag(close, n = 1))
## # A tibble: 20 × 9
##    symbol date        open  high   low close  volume adjusted close_lagged1
##    <chr>  <date>     <dbl> <dbl> <dbl> <dbl>   <dbl>    <dbl>         <dbl>
##  1 AFL    2021-04-12  52.0  52.6  51.8  52.4 2731200     52.4          NA  
##  2 AFL    2021-04-13  52.1  52.5  51.9  52.3 3074600     52.3          52.4
##  3 AFL    2021-04-14  52.2  52.9  52.1  52.9 2255700     52.9          52.3
##  4 AFL    2021-04-15  52.9  52.9  52.5  52.8 2164500     52.8          52.9
##  5 AFL    2021-04-16  53.3  53.6  53.1  53.5 2763400     53.5          52.8
##  6 AFL    2021-04-19  53.5  53.7  53.3  53.6 3330100     53.6          53.5
##  7 AFL    2021-04-20  53.3  53.4  52.2  52.6 2512600     52.6          53.6
##  8 AFL    2021-04-21  52.7  53.6  52.6  53.5 3037700     53.5          52.6
##  9 AFL    2021-04-22  53.1  53.4  52.8  52.9 3874600     52.9          53.5
## 10 AFL    2021-04-23  53.0  53.6  52.8  53.4 2803400     53.4          52.9
## 11 AFL    2021-04-26  53.5  54.1  53.3  53.4 3473500     53.4          53.4
## 12 AFL    2021-04-27  53.4  53.6  53.2  53.5 2466700     53.5          53.4
## 13 AFL    2021-04-28  53.9  54    53.5  53.5 2247400     53.5          53.5
## 14 AFL    2021-04-29  54.1  54.9  53.5  54.2 3253400     54.2          53.5
## 15 AFL    2021-04-30  54.2  54.2  53.4  53.7 4143000     53.7          54.2
## 16 AFL    2021-05-03  54.3  54.9  53.8  54.5 3071000     54.5          53.7
## 17 AFL    2021-05-04  54.4  55.0  54.0  54.9 2948400     54.9          54.5
## 18 AFL    2021-05-05  54.9  55.4  54.9  55.4 2575000     55.4          54.9
## 19 AFL    2021-05-06  55.7  56.2  55.2  56.2 3791500     56.2          55.4
## 20 AFL    2021-05-07  55.4  56.2  55.3  56.1 2597200     56.1          56.2

mutate() can operate on multiple variables.

AFL %>% mutate(close_lagged1 = lag(close, n = 1), close_lagged2 = lag(close, n = 2))
## # A tibble: 20 × 10
##    symbol date        open  high   low close  volume adjusted close_lagged1 close_lagged2
##    <chr>  <date>     <dbl> <dbl> <dbl> <dbl>   <dbl>    <dbl>         <dbl>         <dbl>
##  1 AFL    2021-04-12  52.0  52.6  51.8  52.4 2731200     52.4          NA            NA  
##  2 AFL    2021-04-13  52.1  52.5  51.9  52.3 3074600     52.3          52.4          NA  
##  3 AFL    2021-04-14  52.2  52.9  52.1  52.9 2255700     52.9          52.3          52.4
##  4 AFL    2021-04-15  52.9  52.9  52.5  52.8 2164500     52.8          52.9          52.3
##  5 AFL    2021-04-16  53.3  53.6  53.1  53.5 2763400     53.5          52.8          52.9
##  6 AFL    2021-04-19  53.5  53.7  53.3  53.6 3330100     53.6          53.5          52.8
##  7 AFL    2021-04-20  53.3  53.4  52.2  52.6 2512600     52.6          53.6          53.5
##  8 AFL    2021-04-21  52.7  53.6  52.6  53.5 3037700     53.5          52.6          53.6
##  9 AFL    2021-04-22  53.1  53.4  52.8  52.9 3874600     52.9          53.5          52.6
## 10 AFL    2021-04-23  53.0  53.6  52.8  53.4 2803400     53.4          52.9          53.5
## 11 AFL    2021-04-26  53.5  54.1  53.3  53.4 3473500     53.4          53.4          52.9
## 12 AFL    2021-04-27  53.4  53.6  53.2  53.5 2466700     53.5          53.4          53.4
## 13 AFL    2021-04-28  53.9  54    53.5  53.5 2247400     53.5          53.5          53.4
## 14 AFL    2021-04-29  54.1  54.9  53.5  54.2 3253400     54.2          53.5          53.5
## 15 AFL    2021-04-30  54.2  54.2  53.4  53.7 4143000     53.7          54.2          53.5
## 16 AFL    2021-05-03  54.3  54.9  53.8  54.5 3071000     54.5          53.7          54.2
## 17 AFL    2021-05-04  54.4  55.0  54.0  54.9 2948400     54.9          54.5          53.7
## 18 AFL    2021-05-05  54.9  55.4  54.9  55.4 2575000     55.4          54.9          54.5
## 19 AFL    2021-05-06  55.7  56.2  55.2  56.2 3791500     56.2          55.4          54.9
## 20 AFL    2021-05-07  55.4  56.2  55.3  56.1 2597200     56.1          56.2          55.4

Let’s create some technical indicators to evaluate the market performance of the stocks of the S&P 500 companies.

Simple Moving Average (SMA)

An n-day simple moving average (n-day SMA) is arithmetic average of prices of past n days.

We’ll create a variable that is a 2-day SMA of AFL’s close prices.

mySMA <- function (price, n){
  sma <- c()
  sma[1 : (n - 1)] <- NA
  for (i in n : length(price)){
    sma[i] <- mean(price[(i - n + 1) : i])
  }
  return(sma)
}

mySMA(AFL$close, 2)
##  [1]     NA 52.365 52.580 52.850 53.140 53.505 53.075 53.045 53.220 53.155 53.370 53.450 53.525 53.870 53.975
## [16] 54.125 54.705 55.135 55.780 56.145

We can add that variable to our data frame.

AFL$sma <- mySMA(AFL$close, 2)

The package quantmod provides an SMA() function, which we can directly call to calculate SMA.

Exponential Moving Average (EMA)

An exponential moving average (EMA) is a type of moving average that places a greater weight and significance on the most recent data points.

EMA with n lagged period at time t is defined below by myEMA().

myEMA <- function (price, n){
  ema <- c()
  ema[1 : (n - 1)] <- NA
  ema[n]<- mean(price[1 : n])
  beta <- 2 / (n + 1)
  for (i in (n + 1):length(price)){
    ema[i] <- beta * price[i] + (1 - beta) * ema[i - 1]
  }
  return(ema)
}

myEMA(AFL$close, 3)
##  [1]       NA       NA 52.53333 52.68167 53.06583 53.31292 52.95146 53.22573 53.08286 53.22643 53.29822 53.41411
## [13] 53.46705 53.84353 53.78676 54.15338 54.52169 54.95085 55.56542 55.83771

We can add a new variable that calculates three-day EMA of AFL’s close prices.

AFL$ema <- myEMA(AFL$close, 3)

The quantmod function is EMA().

dropping variables

Assigning NULL to a variable removes that variable from the data frame.

AFL$close_lagged2 <- NULL

Adding - to a variable deselects that variable with dplyr::select().

AFL %>% 
  mutate(close_lagged1 = lag(close, n = 1), 
         close_lagged2 = lag(close, n = 2)) %>%
  select(-close_lagged2)
## # A tibble: 20 × 11
##    symbol date        open  high   low close  volume adjusted   sma   ema close_lagged1
##    <chr>  <date>     <dbl> <dbl> <dbl> <dbl>   <dbl>    <dbl> <dbl> <dbl>         <dbl>
##  1 AFL    2021-04-12  52.0  52.6  51.8  52.4 2731200     52.4  NA    NA            NA  
##  2 AFL    2021-04-13  52.1  52.5  51.9  52.3 3074600     52.3  52.4  NA            52.4
##  3 AFL    2021-04-14  52.2  52.9  52.1  52.9 2255700     52.9  52.6  52.5          52.3
##  4 AFL    2021-04-15  52.9  52.9  52.5  52.8 2164500     52.8  52.9  52.7          52.9
##  5 AFL    2021-04-16  53.3  53.6  53.1  53.5 2763400     53.5  53.1  53.1          52.8
##  6 AFL    2021-04-19  53.5  53.7  53.3  53.6 3330100     53.6  53.5  53.3          53.5
##  7 AFL    2021-04-20  53.3  53.4  52.2  52.6 2512600     52.6  53.1  53.0          53.6
##  8 AFL    2021-04-21  52.7  53.6  52.6  53.5 3037700     53.5  53.0  53.2          52.6
##  9 AFL    2021-04-22  53.1  53.4  52.8  52.9 3874600     52.9  53.2  53.1          53.5
## 10 AFL    2021-04-23  53.0  53.6  52.8  53.4 2803400     53.4  53.2  53.2          52.9
## 11 AFL    2021-04-26  53.5  54.1  53.3  53.4 3473500     53.4  53.4  53.3          53.4
## 12 AFL    2021-04-27  53.4  53.6  53.2  53.5 2466700     53.5  53.4  53.4          53.4
## 13 AFL    2021-04-28  53.9  54    53.5  53.5 2247400     53.5  53.5  53.5          53.5
## 14 AFL    2021-04-29  54.1  54.9  53.5  54.2 3253400     54.2  53.9  53.8          53.5
## 15 AFL    2021-04-30  54.2  54.2  53.4  53.7 4143000     53.7  54.0  53.8          54.2
## 16 AFL    2021-05-03  54.3  54.9  53.8  54.5 3071000     54.5  54.1  54.2          53.7
## 17 AFL    2021-05-04  54.4  55.0  54.0  54.9 2948400     54.9  54.7  54.5          54.5
## 18 AFL    2021-05-05  54.9  55.4  54.9  55.4 2575000     55.4  55.1  55.0          54.9
## 19 AFL    2021-05-06  55.7  56.2  55.2  56.2 3791500     56.2  55.8  55.6          55.4
## 20 AFL    2021-05-07  55.4  56.2  55.3  56.1 2597200     56.1  56.1  55.8          56.2

13.2 Recoding variables

Recode means replacing numeric values based on their position or name.

<-

One way we to recode variables is to use the assignment sign <-.

For instance, below we combine the category “Communication Services” with “Information Technology”, and “Consumer Discretionary” with “Consumer Staples” in the sp500tickers dataset.

table(sp500tickers$`GICS Sector`)
## 
## Communication Services Consumer Discretionary       Consumer Staples                 Energy 
##                     26                     63                     32                     23 
##             Financials            Health Care            Industrials Information Technology 
##                     65                     62                     74                     75 
##              Materials            Real Estate              Utilities 
##                     28                     29                     28
sp500tickers$sector <- sp500tickers$`GICS Sector`
sp500tickers$sector[sp500tickers$sector == "Communication Services"] <- "CS&IT"
sp500tickers$sector[sp500tickers$sector == "Information Technology"] <- "CS&IT"
sp500tickers$sector[sp500tickers$sector == "Consumer Discretionary"] <- "Consumer"
sp500tickers$sector[sp500tickers$sector == "Consumer Staples"] <- "Consumer"

The resulting new category is shown below.

table(sp500tickers$sector)
## 
##    Consumer       CS&IT      Energy  Financials Health Care Industrials   Materials Real Estate   Utilities 
##          95         101          23          65          62          74          28          29          28

dplyr::recode()

The same can be achieved by dplyr function recode().

sp500tickers$sector2 <- recode(sp500tickers$`GICS Sector`,
         `Communication Services` = "CS&IT",  
         `Information Technology` = "CS&IT", 
         `Consumer Discretionary` = "Consumer",
         `Consumer Staples` = "Consumer")

head(sp500tickers$sector2)
## [1] "Industrials" "Health Care" "Health Care" "Health Care" "CS&IT"       "CS&IT"

dplyr::case_when()

case_when() allows us to vectorize multiple if and else if statements.

sp500tickers$sector3 <- case_when(
  sp500tickers$`GICS Sector` %in% c("Communication Services","Information Technology") ~ "CS&IT",
  sp500tickers$`GICS Sector` %in% c("Consumer Discretionary","Consumer Staples") ~ "Consumer")

What is not specified becomes NA.

head(sp500tickers$sector3)
## [1] NA      NA      NA      NA      "CS&IT" "CS&IT"

To correct that:

sp500tickers$sector3 <- case_when(
  sp500tickers$`GICS Sector` %in% c("Communication Services","Information Technology") ~ "CS&IT",
  sp500tickers$`GICS Sector` %in% c("Consumer Discretionary","Consumer Staples") ~ "Consumer",
  TRUE ~ sp500tickers$`GICS Sector`)

head(sp500tickers$sector3)
## [1] "Industrials" "Health Care" "Health Care" "Health Care" "CS&IT"       "CS&IT"

dplyr::mutate(case_when())

mutate(case_when()) creates new variables based on specified conditions.

sp500tickers <- sp500tickers %>% 
  mutate(sector4 = case_when(
    sp500tickers$`GICS Sector` %in% c("Communication Services","Information Technology") ~ "CS&IT",
    sp500tickers$`GICS Sector` %in% c("Consumer Discretionary","Consumer Staples") ~ "Consumer",
    TRUE ~ sp500tickers$`GICS Sector`)
  )

head(sp500tickers$sector4)
## [1] "Industrials" "Health Care" "Health Care" "Health Care" "CS&IT"       "CS&IT"

dplyr::mutate(ifelse())

mutate() can also be used with ifelse().

sp500tickers %>%
  filter(is.na(`Date first added`)) %>%
  mutate(newdate = ifelse(is.na(`Date first added`), Founded, `Date first added`)) %>%
  select(`Date first added`, Founded, newdate)
## # A tibble: 51 × 3
##    `Date first added` Founded newdate
##    <chr>              <chr>   <chr>  
##  1 <NA>               1823    1823   
##  2 <NA>               1851    1851   
##  3 <NA>               1969    1969   
##  4 <NA>               1938    1938   
##  5 <NA>               1983    1983   
##  6 <NA>               1911    1911   
##  7 <NA>               1966    1966   
##  8 <NA>               1858    1858   
##  9 <NA>               1997    1997   
## 10 <NA>               1947    1947   
## # ℹ 41 more rows

13.3 Aggregating data

Data aggregation is the process of combining individual-level data to high-level data.

apply family functions

We discussed apply family functions thoroughly in the chapter Apply Family Functions. One application of these functions is aggregating data.

For instance, we can get the means of open, close, high, and low prices and the volume in the time periods from 2021-04-12 to 2021-05-07 using apply().

apply(AFL[, 3 : 8], 2, mean, na.rm = TRUE)
##         open         high          low        close       volume     adjusted 
##      53.6065      54.0265      53.2685      53.7645 2955745.0000      53.7645

As another example, we can use tapply() to calculate the momentum of the close prices broken out by symbol.

We first create the momentum function and set the default n to be 2.

myMom <- function (price, n = 2){
  mom <- rep(0, n)
  N <- length(price)
  Lprice <- lag(price, n)
  for (i in (n + 1) : N){
    mom[i] <- price[i] - Lprice[i]
  }
  return(mom)
}

We then apply myMom() to the close prices of AFL. Here, symbol is the group variable.

t <- tapply(sp500stocks$close, as.factor(sp500stocks$symbol), myMom)
head(t, 3)
## $A
##  [1]  0.000000  0.000000  0.699997  0.309998  2.610001  0.220001 -2.119995  0.330002  2.069992  2.849991  2.809997
## [12] -0.199997 -2.539993 -2.330002 -1.160004 -0.959992 -2.740005  0.179993  1.500000  0.529999
## 
## $AAL
##  [1]  0.000000  0.000000 -0.539999 -0.430000 -0.340000 -0.580000 -1.660000 -0.539999 -0.300001  0.100001  1.970001
## [12]  0.649999 -0.340000 -0.460001  0.019998  0.650002 -0.299999 -0.380001  0.070000  0.430000
## 
## $AAP
##  [1]  0.000000  0.000000 -0.289994  2.050003  5.149994  3.079986  0.990006  7.330002  6.389999 -0.119995 -3.160004
## [12] -2.470001  0.490006  3.399994  2.529999  1.180008  3.989990  0.949997  0.010010  1.660003

We can also use by() to apply the function myMom() to each symbol. The result is a list.

b <- by(sp500stocks$close, as.factor(sp500stocks$symbol), myMom)
head(b, 3)
## $A
##  [1]  0.000000  0.000000  0.699997  0.309998  2.610001  0.220001 -2.119995  0.330002  2.069992  2.849991  2.809997
## [12] -0.199997 -2.539993 -2.330002 -1.160004 -0.959992 -2.740005  0.179993  1.500000  0.529999
## 
## $AAL
##  [1]  0.000000  0.000000 -0.539999 -0.430000 -0.340000 -0.580000 -1.660000 -0.539999 -0.300001  0.100001  1.970001
## [12]  0.649999 -0.340000 -0.460001  0.019998  0.650002 -0.299999 -0.380001  0.070000  0.430000
## 
## $AAP
##  [1]  0.000000  0.000000 -0.289994  2.050003  5.149994  3.079986  0.990006  7.330002  6.389999 -0.119995 -3.160004
## [12] -2.470001  0.490006  3.399994  2.529999  1.180008  3.989990  0.949997  0.010010  1.660003

dplyr functions

The outputs from the apply functions are not always so controllable as we would like them to be. Sometimes getting our results into a data frame from apply function can take a bit of effort.

This is where the dplyr functions come in handy. We show two quick examples below.

group_by()

group_by() groups a data frame by the specified variable. ungroup() removes the grouping.

group_by() is most useful when used in combination with summarise(), filter(), mutate(), and arrange().

sp500stocks %>% 
  filter(symbol %in% c("AFL", "MMM", "AAPL")) %>%
  group_by(symbol) %>% 
  summarise(sum(volume))
## # A tibble: 3 × 2
##   symbol `sum(volume)`
##   <chr>          <dbl>
## 1 AAPL      1820543300
## 2 AFL         59114900
## 3 MMM         43785600

tally()

tally() counts the number of cases or calls sum() to calculate the total amount.

library(dplyr)
sp500tickers %>% group_by(`GICS Sector`, `GICS Sub-Industry`) %>% tally()
## # A tibble: 124 × 3
## # Groups:   GICS Sector [11]
##    `GICS Sector`          `GICS Sub-Industry`                       n
##    <chr>                  <chr>                                 <int>
##  1 Communication Services Advertising                               2
##  2 Communication Services Alternative Carriers                      1
##  3 Communication Services Broadcasting                              2
##  4 Communication Services Cable & Satellite                         3
##  5 Communication Services Integrated Telecommunication Services     2
##  6 Communication Services Interactive Home Entertainment            3
##  7 Communication Services Interactive Media & Services              4
##  8 Communication Services Movies & Entertainment                    6
##  9 Communication Services Publishing                                2
## 10 Communication Services Wireless Telecommunication Services       1
## # ℹ 114 more rows

tally(sum()) calculates the sum by the specified group.

sp500stocks %>% 
  filter(symbol %in% c("AFL", "MMM", "AAPL")) %>%
  group_by(symbol) %>% 
  tally(sum(volume))
## # A tibble: 3 × 2
##   symbol          n
##   <chr>       <dbl>
## 1 AAPL   1820543300
## 2 AFL      59114900
## 3 MMM      43785600

13.4 Duplicates

keeping unique rows

dplyr::distinct() is equivalent to unique(), which selects unique rows.

We can apply distinct() to the entire data frame to keep unique records using information from all variables.

distinct(sp500stocks)

We can also filter records based on a single variable.

sp500stocks %>% distinct(symbol, .keep_all = TRUE)

In this example, duplicates are removed and all other variables are kept where symbol has no duplicates. Only the first cases of duplicated symbol have been kept. Without argument .keep_all, the other variables will be dropped.

distinct() allows for multiple variables to be included.

sp500stocks %>% distinct(symbol,date, .keep_all = TRUE)

finding duplicates

Using duplicated() from base R, we will get a logical vector indicating which values are duplicates. By default, the first value will be marked as the unique value (FALSE) and the rest as duplicates (TRUE).

duplicated(sp500tickers$Founded)

We may also create a new column to mark the duplicates.

sp500tickers$dup <- duplicated(sp500tickers$Founded)
head(sp500tickers$dup, 20)
##  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE
## [19] FALSE FALSE

In cases where we want to mark all identical elements as duplicates, we can specify the fromLast argument to be TRUE, where duplication would be considered from the reverse side.

sp500tickers$dup2 <- duplicated(sp500tickers$Founded) | duplicated(sp500tickers$Founded, fromLast=TRUE)

finding positions of duplicates

Together with which(), which(duplicated()) returns the positions of duplicate cases.

which(duplicated(sp500tickers$Founded))

13.5 Sorting rows

dplyr::arrange() sorts variables in ascending order. It is the dplyr version of sort().

sp500stocks %>% arrange(date, close)

desc() sorts variables in descending order.

sp500stocks %>% arrange(symbol, desc(close))

Alternatively, we can add - to a variable within arrange().

sp500stocks %>% arrange(symbol, -close)

13.6 Ordering columns

In addition to moving the rows, we can move columns in select().

sp500stocks %>% select(adjusted, symbol, date, open, high, low, close)
## # A tibble: 10,100 × 7
##    adjusted symbol date        open  high   low close
##       <dbl> <chr>  <date>     <dbl> <dbl> <dbl> <dbl>
##  1     198. MMM    2021-04-12  198.  199.  197.  198.
##  2     196. MMM    2021-04-13  196.  197.  195.  196.
##  3     197. MMM    2021-04-14  196.  197.  195.  197.
##  4     197. MMM    2021-04-15  197.  198.  196.  197.
##  5     199. MMM    2021-04-16  199.  200.  198.  199.
##  6     199. MMM    2021-04-19  198.  199.  197.  199.
##  7     198. MMM    2021-04-20  199.  199.  197.  198.
##  8     201. MMM    2021-04-21  198.  202.  198.  201.
##  9     201. MMM    2021-04-22  201.  202.  200.  201.
## 10     202. MMM    2021-04-23  201   203.  201.  202.
## # ℹ 10,090 more rows

We can move a variable to the front with select(var, everything()).

sp500stocks %>% select(adjusted, everything())
## # A tibble: 10,100 × 8
##    adjusted symbol date        open  high   low close  volume
##       <dbl> <chr>  <date>     <dbl> <dbl> <dbl> <dbl>   <dbl>
##  1     198. MMM    2021-04-12  198.  199.  197.  198. 1682100
##  2     196. MMM    2021-04-13  196.  197.  195.  196. 2259700
##  3     197. MMM    2021-04-14  196.  197.  195.  197. 1575000
##  4     197. MMM    2021-04-15  197.  198.  196.  197. 1461600
##  5     199. MMM    2021-04-16  199.  200.  198.  199. 2769500
##  6     199. MMM    2021-04-19  198.  199.  197.  199. 2200500
##  7     198. MMM    2021-04-20  199.  199.  197.  198. 1653100
##  8     201. MMM    2021-04-21  198.  202.  198.  201. 2649000
##  9     201. MMM    2021-04-22  201.  202.  200.  201. 2215800
## 10     202. MMM    2021-04-23  201   203.  201.  202. 1864100
## # ℹ 10,090 more rows

select(-adjusted, everything()) moves the column adjusted to the last.

sp500stocks %>% select(adjusted, everything()) %>% select(-adjusted, everything())
## # A tibble: 10,100 × 8
##    symbol date        open  high   low close  volume adjusted
##    <chr>  <date>     <dbl> <dbl> <dbl> <dbl>   <dbl>    <dbl>
##  1 MMM    2021-04-12  198.  199.  197.  198. 1682100     198.
##  2 MMM    2021-04-13  196.  197.  195.  196. 2259700     196.
##  3 MMM    2021-04-14  196.  197.  195.  197. 1575000     197.
##  4 MMM    2021-04-15  197.  198.  196.  197. 1461600     197.
##  5 MMM    2021-04-16  199.  200.  198.  199. 2769500     199.
##  6 MMM    2021-04-19  198.  199.  197.  199. 2200500     199.
##  7 MMM    2021-04-20  199.  199.  197.  198. 1653100     198.
##  8 MMM    2021-04-21  198.  202.  198.  201. 2649000     201.
##  9 MMM    2021-04-22  201.  202.  200.  201. 2215800     201.
## 10 MMM    2021-04-23  201   203.  201.  202. 1864100     202.
## # ℹ 10,090 more rows