Exploratory Visual Data Analysis
Preface
0.1
Pedagogical Plan
0.2
Data Sources
0.3
Introduction
1
What is Data?
1.1
Attributes of Reliable Data
1.1.1
Collected in “good faith”
1.1.2
Representative
1.1.3
Sufficient
1.2
Categories of Data Quality
1.2.1
Anecdotes & Deceptive Evidence
1.2.2
Ad-Hoc Data
1.2.3
Purposefully Designed Studies
1.3
Examples
1.3.1
A Washington Post / ABC News Survey
1.4
Exercises
2
Data Variability
2.1
Individual Chain Reasoning
2.2
Population Trends Reasoning
2.3
Exercises
3
Data Sources
3.1
Government Collected Data
3.1.1
US Census Bureau
3.1.2
US NOAA
3.1.3
US Bureau of Labor Statistics
3.1.4
Data.gov
3.2
Known but not government collected
3.3
Exercises
4
Data Manipulation
4.1
Introduction
4.1.1
Import
4.1.2
Tidying
4.1.3
Cleaning
4.1.4
Use
4.2
Fundamental Actions
4.2.1
Sorting
4.2.2
Subsetting
4.2.3
New Column from Existing Column(s)
4.2.4
Aggregation
4.2.5
Unions
4.2.6
Joins
4.2.7
Pivots
4.3
Combining Actions
4.3.1
Example: Column Splitting
4.3.2
Example: Calculating Percentages
4.4
Using Software
4.4.1
Excel
4.4.2
Tableau
4.5
Exercises
Basic Graphs
Creating a basic graph
Light pre-processing and adjusting labels
Exporting the graph to MS Word
Selecting EPTs is done using the Marks pane
Creating Histograms, Boxplots, and Regression Lines
5
Graphing Principles
5.1
Elementary Perception Tasks
5.2
Groupings / Gestalt
5.2.1
Grouping Examples
5.2.2
Example: Warpbreaks
5.3
“Color” Scales
5.4
Exercises
6
A Survey of Graph Types
6.1
Introduction
6.2
Example 1
6.3
Example 2
6.4
Example 3
6.5
Proportions
6.6
Single Set
6.6.1
Pie Charts
6.6.2
Stacked Bar
6.6.3
Side-by-side Barchart
6.7
Multiple Sets of Proportions
6.7.1
Faceted Bar charts
6.7.2
Side-by-Side Stacked Barcharts
6.7.3
Mosaic plots
6.7.4
Alluvial Plots
6.7.5
Tree graphs
6.8
Exercises
7
Quantitative Insights
8
Reasoning Across Scales
9
Data Journalism
10
Dashboards
11
Graphs Influence Our Thoughts
12
Malicious Uses of Data
STA 141 - Exploratory Data Analysis and Visualization
Chapter 10
Dashboards
This chapter introduces dashboards and emphasizes two of their strengths:
They allow the user to look at the data at different resolutions.
They allow the user to do their own exploration.
This allows for an “unguided” tour of the data.