Multivariate Statistical Analysis with R: PCA & Friends making a Hotdog
12/01/2020
1 Introduction
Multivariate Analysis has been developed and refined through many iterations by many different disciplines. Virtually all scientific domains rely on statistical methods under the Multivariate umbrella to analyze data with more than one variable. As a result, Multivariate Analysis has acquired many names and has been customized by many “-metric” disciplines over the years. Overall, Multivariate Analysis explores the relationships between observations and/or variables in a multivariate dataset. The strategies commonly found in a Multivariate Analysis include dimension reduction, projections, and singular value decomposition, coupled with visualization and various re-sampling methods such as permutation (random shuffling), jackknife (leave-one-out, without replacement), or bootstrap (with replacement).
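As a rough illustration of these strategies, the sketch below uses only base R rather than the packages covered later in the book. The built-in iris data, the helper name first_eig, and the choice of 200 resamples are assumptions made for this example, not part of any method's reference implementation.

```r
# Minimal sketch in base R; data set and names are illustrative only.
set.seed(42)
raw <- as.matrix(iris[, 1:4])      # four quantitative variables
X   <- scale(raw)                  # center and scale each column

# Dimension reduction through the singular value decomposition X = U D V'
dec       <- svd(X)
scores    <- dec$u %*% diag(dec$d)        # factor scores of the observations
eig       <- dec$d^2 / (nrow(X) - 1)      # variance carried by each component
explained <- eig / sum(eig)               # proportion of variance per component

# Helper (hypothetical name): first eigenvalue of a centered, scaled matrix
first_eig <- function(M) svd(scale(M))$d[1]^2 / (nrow(M) - 1)
observed  <- first_eig(raw)

# Bootstrap: resample the rows with replacement
boot <- replicate(200, first_eig(raw[sample(nrow(raw), replace = TRUE), ]))

# Permutation: shuffle each column independently to break the
# relationships between variables
perm <- replicate(200, first_eig(apply(raw, 2, sample)))

# Jackknife: leave one observation out at a time
jack <- sapply(seq_len(nrow(raw)), function(i) first_eig(raw[-i, ]))
```

Comparing observed with the spread of perm gives a feel for whether the first component carries more variance than chance alone would produce, while boot and jack indicate how stable that value is when the sample changes.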
In this short book, we will explore eight major Multivariate Methods: Principal Component Analysis (PCA), Barycentric Discriminant Analysis (BADA), Multiple Correspondence Analysis (MCA), Discriminant Correspondence Analysis (DiCA), Partial Least Squares Correlation (PLS-C), Multiple Factor Analysis (MFA), Correspondence Analysis (CA), and DiSTATIS. These methods differ in the type of input dataset (quantitative vs qualitative), the target of analysis (factor scores, distance, variance, covariance, group means), and the intended application (feature selection, correlation studies, prediction).
Special thanks to Dr. Abdi, Ju-chi, Brendon, Luke, and all members of the Fall 2020 class for your dedication, instruction, and tireless efforts in building and debugging the amazing “Position” family of packages. Amid the challenges of the pandemic and virtual learning, I have learned so much from everyone and appreciate every learning moment we have had.
Prerequisites: Readers need only a basic background in linear algebra and programming to follow this book. The book gives a brief overview of the background and mathematical theory, and places more emphasis on application, programming in R, and the practical aspects of each method.
Packages: ExPosition, InPosition, TExPosition, TInPosition, DistatisR, FactoMineR, MExPosition, PTCA4CATA, data4PCCAR, tidyverse, etc.
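Before running the examples, it may help to see which of these packages are already available. The snippet below is a small sketch that only reports what is missing. The vector name pkgs is illustrative, the names are spelled with the capitalization the packages are believed to be published under, and where each package is installed from is deliberately left out.

```r
# Sketch: report which of the listed packages are not yet installed.
pkgs <- c("ExPosition", "InPosition", "TExPosition", "TInPosition",
          "DistatisR", "FactoMineR", "MExPosition", "PTCA4CATA",
          "data4PCCAR", "tidyverse")

# requireNamespace() returns TRUE when a package can be loaded
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

missing_pkgs <- pkgs[!installed]
if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All listed packages are available.")
}
```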