Text Mining

Text Mining for Social Sciences (Summer 2024)

by Felix Lennert

2024-04-15

Dear student, if you read this script, you are either participating in one of my courses on digital methods for the social sciences, or at least interested in this topic. If you have any questions or remarks regarding this script, hit me up at felix.lennert@ensae.fr. This script will introduce you to two techniques I regard as elementary for any aspiring (computational) social scientist: the collection of digital trace data via either scraping the web or acquiring data from application programming interfaces (APIs) and the analysis of text in an automated fashion (text mining). …

1

Text Mining with R

by Julia Silge and David Robinson

2024-02-02

A guide to text analysis within the tidy data framework, using the tidytext package and other tidy tools […] This is the website for Text Mining with R! Visit the GitHub repository for this site, find the book at O’Reilly, or buy it on Amazon. This work by Julia Silge and David Robinson is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 United States License. …

2

Toolbox Computational Social Science

by Felix Lennert

2024-01-07

Dear student, if you read this script, you are either participating in one of my courses on digital methods for the social sciences, or at least interested in this topic. If you have any questions or remarks regarding this script, hit me up at felix.lennert@ensae.fr. This script will introduce you to two techniques I regard as elementary for any aspiring (computational) social scientist: the collection of digital trace data via either scraping the web or acquiring data from application programming interfaces (APIs) and the analysis of text in an automated fashion (text mining). …

3

Text Mining for Social Scientists

by Felix Lennert

2023-07-07

This book is supposed to introduce the reader (i.e., you) to a fundamental technique for computational social science research: the quantitative analysis of text. […] Dear student, if you read this script, you are either participating in one of my courses on digital methods for the social sciences, or at least interested in this topic. If you have any questions or remarks regarding this script, hit me up at felix.lennert@ensae.fr. This script will introduce you to the quantitative analysis of text using R. Over the last decades, more and more text has become readily available. Think for …

4

Statistical Inference

by Michael Foley

2023-03-01

Notes cobbled together from books, online classes, etc. to be used as quick reference for common work projects. […] These are notes from books, classes, tutorials, vignettes, etc. They contain mistakes, are poorly organized, and are sloppy on fundamentals. They should improve over time, but that’s all I can say for it. Use at your own risk. The focus of this handbook is statistical inference, including population estimates, group comparisons, and regression modeling. Not included here: probability, supervised ML, unsupervised ML, text mining, time series, survey analysis, or survival …

5

Toolbox CSS

by Felix Lennert

2023-01-16

This book is supposed to introduce the reader (i.e., you) to some fundamental techniques for computational social science research: acquiring online data, agent-based modeling, and text mining. […] Dear student, if you read this script, you are either participating in one of my courses on digital methods for the social sciences, or at least interested in this topic. If you have any questions or remarks regarding this script, hit me up at felix.lennert@ensae.fr. This script will introduce you to three techniques I regard as elementary for any aspiring (computational) social scientist: the …

6

Probability

by Michael Foley

2022-02-12

Notes cobbled together from books, online classes, etc. to be used as quick reference for common work projects. […] These are notes from books, classes, tutorials, vignettes, etc. They contain mistakes, are poorly organized, and are sloppy on fundamentals. They should improve over time, but that’s all I can say for it. Use at your own risk. The focus of this handbook is probability, including random variables and probability distributions. Not included here: statistics, machine learning, text mining, survey analysis, or survival analysis. These subjects frequently arise at work, but are …

7

Notes for “Text Mining with R: A Tidy Approach”

by Qiushi Yan

2020-05-10

Notes for “Text Mining with R: A Tidy Approach” […] This is a notebook concerning Text Mining with R: A Tidy Approach (Silge and Robinson 2017). tidyverse and tidytext are automatically loaded before each chapter. I have defined a simple function, facet_bar(), to meet the frequent need in this book to make a faceted bar plot, with the y variable reordered by x in each facet. As a quick demonstration of this function, we can plot the top 10 most common words in Jane Austen’s six books: …

8