Chapter 4 References

Benoit, Kenneth, Kohei Watanabe, Haiyan Wang, Paul Nulty, Adam Obeng, Stefan Müller, and Akitaka Matsuo. 2018. “Quanteda: An R Package for the Quantitative Analysis of Textual Data.” Journal of Open Source Software 3 (30): 774. https://doi.org/10.21105/joss.00774.

Blei, David, Andrew Ng, and Michael Jordan. 2003. “Latent Dirichlet Allocation.” Journal of Machine Learning Research 3: 993–1022.

Feinerer, Ingo, Kurt Hornik, and David Meyer. 2008. “Text Mining Infrastructure in R.” Journal of Statistical Software 25 (5). https://doi.org/10.18637/jss.v025.i05.

Grimmer, Justin, Margaret Roberts, and Brandon Stewart. 2022. Text as Data: A New Framework for Machine Learning and the Social Sciences. Princeton: Princeton University Press.

Grün, Bettina, Kurt Hornik, David Blei, John Lafferty, Xuan-Hieu Phan, Makoto Matsumoto, Takuji Nishimura, and Shawn Cokus. 2020. “Topicmodels: Topic Models.”

Hvitfeldt, Emil. 2022. “Textrecipes: Extra ‘Recipes’ for Text Processing.”

Kearney, Michael. 2019. “Rtweet: Collecting and Analyzing Twitter Data.” Journal of Open Source Software 4 (42): 1829. https://doi.org/10.21105/joss.01829.

Kuhn, Max, and Hannah Frick. 2022. “Dials: Tools for Creating Tuning Parameter Values.”

Kuhn, Max, Davis Vaughan, and Emil Hvitfeldt. 2022. “Parsnip: A Common API to Modeling and Analysis Functions.”

Kuhn, Max, and Hadley Wickham. 2020. “Tidymodels: A Collection of Packages for Modeling and Machine Learning Using Tidyverse Principles.”

———. 2022. “Recipes: Preprocessing and Feature Engineering Steps for Modeling.”

Manning, Christopher D., Prabhakar Raghavan, and Hinrich Schütze. 2008. Introduction to Information Retrieval. New York: Cambridge University Press.

Monroe, Burt L., Michael P. Colaresi, and Kevin M. Quinn. 2008. “Fightin’ Words: Lexical Feature Selection and Evaluation for Identifying the Content of Political Conflict.” Political Analysis 16 (4): 372–403. https://doi.org/10.1093/pan/mpn018.

Munzert, Simon, Christian Rubba, Peter Meißner, and Dominik Nyhuis. 2014. Automated Data Collection with R: A Practical Guide to Web Scraping and Text Mining. Chichester, West Sussex, United Kingdom: Wiley.

Ooms, Jeroen, Duncan Temple Lang, and Lloyd Hilaiel. 2020. “Jsonlite: A Simple and Robust JSON Parser and Generator for R.”

Perepolkin, Dmytro. 2019. “Polite: Be Nice on the Web.”

Porter, Martin. 2001. “Snowball: A Language for Stemming Algorithms.”

Robinson, David. 2020. “Broom: Convert Statistical Analysis Objects into Tidy Data Frames.”

Silge, Julia, and David Robinson. 2016. “Tidytext: Text Mining and Analysis Using Tidy Data Principles in R.” Journal of Open Source Software 1 (3): 37. https://doi.org/10.21105/joss.00037.

Vaughan, Davis. 2022. “Workflows: Modeling Workflows.”

Wickham, Hadley. 2010. “A Layered Grammar of Graphics.” Journal of Computational and Graphical Statistics 19 (1): 3–28. https://doi.org/10.1198/jcgs.2009.07098.

———. 2019a. “Rvest: Easily Harvest (Scrape) Web Pages.”

———. 2019b. “Stringr: Simple, Consistent Wrappers for Common String Operations.”

———. 2020. “Httr: Tools for Working with URLs and HTTP.”

Wickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy McGowan, Romain François, Garrett Grolemund, et al. 2019. “Welcome to the Tidyverse.” Journal of Open Source Software 4 (43): 1686. https://doi.org/10.21105/joss.01686.

Wickham, Hadley, Jennifer Bryan, Malcolm Barrett, and RStudio. 2021. “Usethis: Automate Package and Project Setup.”

Wickham, Hadley, and Garrett Grolemund. 2016. R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. First edition. Sebastopol, CA: O’Reilly.