Exploratory Visual Data Analysis
Preface
0.1
Pedagogical Plan
0.2
Data Sources
0.3
Introduction
1
What is Data?
1.1
Attributes of Reliable Data
1.1.1
Collected in “good faith”
1.1.2
Representative
1.1.3
Sufficient
1.2
Categories of Data Quality
1.2.1
Anecdotes & Deceptive Evidence
1.2.2
Ad-Hoc Data
1.2.3
Purposefully Designed Studies
1.3
Examples
1.3.1
A Washington Post / ABC News Survey
1.4
Exercises
2
Data Variability
2.1
Individual Chain Reasoning
2.2
Population Trends Reasoning
2.3
Exercises
3
Data Sources
3.1
Government Collected Data
3.1.1
US Census Bureau
3.1.2
US NOAA
3.1.3
US Bureau of Labor Statistics
3.1.4
Data.gov
3.2
Known but not government collected
3.3
Exercises
4
Data Manipulation
4.1
Introduction
4.1.1
Import
4.1.2
Tidying
4.1.3
Cleaning
4.1.4
Use
4.2
Fundamental Actions
4.2.1
Sorting
4.2.2
Subsetting
4.2.3
New Column from Existing Column(s)
4.2.4
Aggregation
4.2.5
Unions
4.2.6
Joins
4.2.7
Pivots
4.3
Combining Actions
4.3.1
Example: Column Splitting
4.3.2
Example: Calculating Percentages
4.4
Using Software
4.4.1
Excel
4.4.2
Tableau
4.5
Exercises
5
Basic Graphs
5.1
Creating a basic graph
5.2
Light pre-processing and adjusting labels
5.3
Exporting the graph to MS Word
5.4
Selecting EPTs is done using the Marks pane
5.5
Creating Histograms, Boxplots, and Regression Lines
5.6
Exercises
6
Graphing Principles
6.1
Elementary Perception Tasks
6.2
Groupings / Gestalt
6.2.1
Grouping Effects
6.2.2
Grouping Examples
6.2.3
Example: Warpbreaks
6.2.4
Example: Federal Spending over Time
6.3
“Color” Scales
6.4
Examples
6.4.1
RobinHood App
6.4.2
Coffee Varieties & Origins
6.4.3
Trade with Britain
6.5
Exercises
7
A Selection of Graph Examples
7.1
Introduction
7.1.1
Example 1
7.1.2
Example 2
7.1.3
Example 3
7.2
Proportions
7.2.1
Single Set
7.3
Multiple Sets of Proportions
7.3.1
Faceted Bar charts
7.3.2
Side-by-Side Stacked Bar charts
7.3.3
Mosaic plots
7.3.4
Alluvial Plots
7.3.5
Tree graphs
7.4
Exercises
8
Plotting with aggregation
8.1
Univariate
8.1.1
Small samples
8.1.2
Histograms
8.1.3
Density plots
8.1.4
Faceting
8.1.5
Stacking
8.1.6
Overlapping curves
8.2
Bivariate (one continuous, one categorical)
8.2.1
Box plots
8.2.2
Ridge Plots
8.2.3
Violin Plots
8.3
Bivariate (two continuous)
8.3.1
Scatter plots
8.3.2
Pairs plots (All-vs-all scatterplots)
8.3.3
Correlation Plots
8.3.4
Overplotting
8.3.5
Regression Lines
8.4
Plot building process
8.5
Exercises
9
Quantitative Insights
10
Reasoning Across Scales
11
Data Journalism
12
Dashboards
13
Graphs Influence Our Thoughts
14
Malicious Uses of Data
STA 141 - Exploratory Data Analysis and Visualization
Chapter 14
Malicious Uses of Data
This chapter will showcase many examples of people using data and graphics fraudulently.