5 Introduction to Exploratory Data Analysis

Exploratory Data Analysis (EDA) is crucial to understanding a dataset before we move into statistical and predictive modeling. Over the next few chapters, we will review data visualizations and common statistical tests, including t.test (t-test), ks.test (Kolmogorov-Smirnov test), chisq.test (chi-squared test), and aov (analysis of variance), for:

  • Chapter 6 - Two Factor Classification with a Single Continuous Feature
  • Chapter 7 - Two Factor Classification with Categorical and Continuous Interactions.

We will explore how the p-values from these statistical tests relate to a feature’s importance in simple logistic regression classification models.