A Minimal Book Example
1
Hello and Welcome!
2
Life Advice
2.1
Some Potentially Cool Readings…[more to be added in time…]
3
Introduction to Data Mining
3.1
What is Data Mining?
3.1.1
How can big data help?
3.1.2
What are the techniques in data mining?
3.2
Ethics of Data Mining
3.2.1
Bias in Data Mining (more to be added)
3.2.2
Bias in Data Mining, cont.
4
Exploratory Data Analysis
4.1
Looking at data
4.2
Basic EDA
4.2.1
Brief ethics consideration based on point (e)
4.2.2
Quick reminders of definitions for EDA
4.2.3
Variation
4.2.4
Data in Context, part 2
4.2.5
Ethics interjection
4.2.6
Covariation
4.3
Missing Data
4.3.1
NA v NULL in R
4.3.2
So, what do we do?
4.4
Imputations
4.4.1
Definitions
4.4.2
Imputation Examples
4.5
Basic Statistics in R
4.5.1
Descriptive Statistics
4.5.2
Correlations
4.5.3
Visual Correlations using corrplot
4.5.4
t-tests
5
Data Visualization
5.1
An introduction and very brief history
5.2
What makes a good graphic?
5.2.1
The ggplot way
5.2.2
Facets
5.3
More Advanced Data Visualization
5.3.1
Bar Plots
6
Cluster Analysis
6.1
Introduction to Cluster Analysis
6.2
Partitional Clustering
6.2.1
K-Means
6.2.2
K-Medoids Clustering (PAM)
6.2.3
Clustering Large Applications (CLARA)
6.3
Hierarchical Clustering
6.3.1
Agglomerative Hierarchical Clustering
6.3.2
Divisive Hierarchical Clustering
6.4
Density-based Clustering
6.5
Soft Clustering
6.6
Evaluation Techniques
6.6.1
Internal Clustering Validation Methods
6.6.2
External Clustering Validation Method
7
Supervised Learning [holding space]
8
Reinforcement Learning [holding space]
9
Text Mining
9.1
Introduction to Text Mining
9.2
Basic Sentiment Analysis
9.3
Term-Frequency
9.3.1
How are these calculated?
9.3.2
Analyzing TF-IDF
9.4
n-grams
9.4.1
Tokenizing ngrams
9.4.2
How can we analyze bigrams?
9.4.3
Bigrams and Sentiment
9.4.4
Network Visualization of ngrams
9.5
Topic Modelling
9.5.1
Latent Dirichlet Allocation (LDA)
9.5.2
Data Prep and Cleaning
10
Applied Data Analytics: Principal Components Analysis (PCA)
11
Applied Data Analytics: More Advanced Topic Modelling
11.1
Data Prep
11.2
Manually testing different numbers of topics
11.3
Topic Coherence Measures
12
Bonus 2 [holding space]
References
Data Analytics Living Textbook
Chapter 7
Supervised Learning [holding space]