10.1 Text as Data
- Many sources of text data for social scientists:
- open-ended survey responses, social media data, interview transcripts, news articles, official documents (e.g., public records), research publications, etc.
- even if the data of interest do not (yet) exist in textual form, they can often be turned into text: speech recognition, machine translation, crowdworkers, etc.
- previously: text data was often ignored, selectively read, anecdotally used or manually labeled by researchers
- today: wide variety of text-analytic methods (supervised + unsupervised) and increasing adoption of these methods by social scientists (Wilkerson and Casas 2017)
References
Wilkerson, John, and Andreu Casas. 2017. “Large-Scale Computerized Text Analysis in Political Science: Opportunities and Challenges.” Annual Review of Political Science 20 (1): 529–44.