AI4PH2022
Preface
1
Prerequisite
1.1
Before the sessions
1.1.1
Install Required R Packages
1.1.2
Twitter Dataset for the Tutorial
2
Text Analytics with R: Classification
2.1
Initial R Setup
2.1.1
Load the R Packages
2.1.2
Load the Twitter Data for Classification
2.2
Data Pre-processing
2.2.1
Regular Expressions
2.2.2
Tokenization
2.2.3
Stop Words
2.2.4
Stemming
2.2.5
Exploratory Data Analysis
2.3
Classification Models
2.3.1
Split the Training and Test Set
2.3.2
Pre-processing with the recipe()
2.3.3
Choose a Classification Model
2.3.4
The workflow() for bundling recipe() and models
2.3.5
Model Evaluation
2.3.6
Evaluate on the Testing Set (Optional)
3
Text Analytics with R: Topic Modelling
3.1
Initial R Setup
3.1.1
Load the R Packages
3.1.2
Load the Twitter Data for Topic Modelling
3.2
Topic Modelling
3.2.1
Latent Dirichlet allocation (LDA)
4
Data Challenge Prep
4.1
N2C2 NLP Research Datasets
4.2
Initial R Setup
4.3
Reading XML files in R
4.4
Preprocessing the text column in the pt_record_raw dataframe
4.5
Descriptive Statistics
4.5.1
Discharge Summary Dataset
4.5.2
The Annotation Dataset
4.6
Your Analyses
4.7
Appendix: Manual Feature Extraction from the Text
5
Relevant Resources
References
AI4PH: Text Analyses with R Tutorial and Data Challenge
References