2.3 Comparing 3 different dictionaries

pride_prejudice <- tidy_books %>% 
  filter(book == "Pride & Prejudice")

pride_prejudice
#> # A tibble: 122,204 x 4
#>   book              linenumber chapter word     
#>   <fct>                  <int>   <int> <chr>    
#> 1 Pride & Prejudice          1       0 pride    
#> 2 Pride & Prejudice          1       0 and      
#> 3 Pride & Prejudice          1       0 prejudice
#> 4 Pride & Prejudice          3       0 by       
#> 5 Pride & Prejudice          3       0 jane     
#> 6 Pride & Prejudice          3       0 austen   
#> # ... with 1.222e+05 more rows

afinn <- pride_prejudice %>% 
  inner_join(get_sentiments("afinn")) %>% 
  mutate(index = linenumber %/% 80) %>% 
  count(book, index, wt = value, name = "value") %>%
  mutate(dict = "afinn") %>% 
  select(index, value, dict)

bing <- pride_prejudice %>% 
  inner_join(get_sentiments("bing")) %>% 
  mutate(index = linenumber %/% 80) %>% 
  count(index, sentiment) %>% 
  pivot_wider(names_from = sentiment, values_from = n, values_fill = list(n = 0)) %>% 
  mutate(value = positive - negative,
         dict = "bing") %>%  
  select(index, value, dict)

nrc <- pride_prejudice %>% 
  inner_join(get_sentiments("nrc")) %>%
  filter(sentiment %in% c("positive", "negative")) %>% 
  mutate(index = linenumber %/% 80) %>% 
  count(index, sentiment) %>% 
  pivot_wider(names_from = sentiment, values_from = n, values_fill = list(n = 0)) %>% 
  mutate(value = positive - negative,
         dict = "nrc") %>% 
  select(index, value, dict)

bind_rows(afinn, bing, nrc) %>% 
  ggplot() + 
  geom_col(aes(index, value, fill = dict), show.legend = FALSE) + 
  facet_wrap(~ dict, nrow = 3)

It is natural for score based on these 3 different dictionary to differ in some sense, because for the latter two we are just consider as the sentiment score the number of positive words minus that of negative words. But they shoud all have similar relative trajectories through the novel.

Why is, for example, the result for the NRC lexicon biased so high in sentiment compared to the Bing et al. result? Let’s look briefly at how many positive and negative words are in these lexicons.

get_sentiments("nrc") %>% 
  filter(sentiment %in% c("positive", "negative")) %>% 
  count(sentiment)
#> # A tibble: 2 x 2
#>   sentiment     n
#>   <chr>     <int>
#> 1 negative   3324
#> 2 positive   2312

get_sentiments("bing") %>% 
  count(sentiment)
#> # A tibble: 2 x 2
#>   sentiment     n
#>   <chr>     <int>
#> 1 negative   4781
#> 2 positive   2005

Both lexicons have more negative than positive words, but the ratio of negative to positive words is higher in the Bing lexicon than the NRC lexicon. This will contribute to the effect we see in the plot above, as will any systematic difference in word matches, e.g. if the negative words in the NRC lexicon do not match the words that Jane Austen uses very well.. Whatever the source of these differences, we see similar relative trajectories across the narrative arc, with similar changes in slope, but marked differences in absolute sentiment from lexicon to lexicon. This is all important context to keep in mind when choosing a sentiment lexicon for analysis.