Chapter 5 Converting to and from non-tidy formats

Non-tidy data structures, in particular matrices, are essential in topic modeling, where other R packages for NLP play a major role.

The book has a diagram describing the “glue” role that the functions in this chapter play:

Figure 5.1: Taken from the book, Chapter 5

As shown in the figure, a tidied document-term matrix (DTM) is typically equivalent to a one-token-per-row data frame after counting the tokens in each document.
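
The round trip between the two forms can be sketched as follows. This is a minimal illustration, not code from the book: it assumes the dplyr, tidytext, and tm packages are installed, and the two toy documents and the names `docs`, `counts`, `dtm`, and `tidied` are made up for the example.

```r
library(dplyr)
library(tidytext)

# Two made-up documents, used only for illustration
docs <- tibble(
  document = c("a", "b"),
  text = c("tidy text is tidy", "matrices are not tidy")
)

# Tidy form: one token per row, then counted per document
counts <- docs %>%
  unnest_tokens(word, text) %>%
  count(document, word)

# Tidy -> non-tidy: cast the counts into a DocumentTermMatrix
# (cast_dtm() needs the tm package to be installed)
dtm <- counts %>%
  cast_dtm(document, word, n)

# Non-tidy -> tidy: one row per non-zero document-term pair,
# with columns document, term, and count
tidied <- tidy(dtm)

# The tidied DTM should line up with the counted data frame
# once the columns are renamed and both are sorted the same way
tidied %>% arrange(document, term)
counts %>% rename(term = word, count = n) %>% arrange(document, term)
```

Only non-zero entries come back from `tidy()`, so the tidied result has the same number of rows as the counted data frame. Column types may differ slightly (for example, integer versus double counts), which is why the comparison here is done by eye rather than with a strict equality check.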