Chapter 11 Exploratory Data Analyses
Phase I of my “statistics” is usually termed “Data Exploration” or “Exploratory Data Analysis”. The goal of this step is to gain insight into the data: to understand what is going on in it, which parts need to be cleaned, what new features can be built, which hypotheses to test during the model creation/validation phase, or even just to learn some fun facts about the data (src).
A few of my favorite packages to get a glimpse of the data are DataExplorer, SmartEDA, and summarytools.
11.1 Creating Report with DataExplorer
The DataExplorer package allows you to get a preliminary look at your data. Among other things, it checks for missing data and writes the results to a single HTML report.
create_report(
  # the name of your dataframe
  df.fa,
  # y = 'heart_disease',
  output_dir = 'output', # where do you want it to be saved relative to your project directory
  output_file = 'data_explorer_fa_report.html', # the filename for the report
  report_title = 'DTI (FA) Data Description' # the Title of your report
)
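To give a feel for what a missing-data check looks like, here is a minimal base R sketch. It is not DataExplorer's implementation; the helper `missing_summary()` and its column names are made up for illustration, and the built-in `airquality` dataset stands in for `df.fa`.

```r
# Hypothetical helper: count and share of missing values per column.
# Base R only; this illustrates the idea, not DataExplorer's own code.
missing_summary <- function(df) {
  n_missing <- colSums(is.na(df))
  data.frame(
    feature = names(df),
    num_missing = as.integer(n_missing),
    pct_missing = as.numeric(n_missing) / nrow(df),
    row.names = NULL
  )
}

# `airquality` ships with R and has NAs in Ozone and Solar.R.
miss <- missing_summary(airquality)

# Show the columns with the most missing values first.
print(miss[order(-miss$num_missing), ])
```

A quick table like this is handy when you only want the missing-value counts and do not need the full HTML report.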
SmartEDA::ExpNumStat(tbl.desc, round = 1)

ExpNumStat(
  tbl.desc,
  by = "GA",
  gp = "Group",
  Qnt = c(.1, .9),
  Outlier = TRUE,
  round = 1
)

ExpNumViz(tbl.desc, target = 'Group')

summarytools::dfSummary(
  tbl.desc,
  varnumbers = FALSE,
  round.digits = 2,
  plain.ascii = FALSE,
  style = "grid",
  graph.magnif = .33,
  valid.col = FALSE,
  tmp.img.dir = "img"
)
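To make the `gp`, `Qnt`, and `Outlier` arguments of the `ExpNumStat()` call above more concrete, here is a rough base R sketch of a per-group numeric summary. It is only an approximation of the idea: the helper `group_num_stat()` is hypothetical, the 1.5 × IQR outlier rule is an assumption (SmartEDA's exact definitions may differ), and the built-in `iris` data stands in for `tbl.desc`.

```r
# Hypothetical helper: per-group n, mean, chosen quantiles, and a count of
# outliers under the (assumed) 1.5 * IQR rule. Base R only.
group_num_stat <- function(x, group, probs = c(0.1, 0.9), digits = 1) {
  stats <- lapply(split(x, group), function(v) {
    v <- v[!is.na(v)]
    q <- quantile(v, probs = c(0.25, 0.75), names = FALSE)
    fence <- 1.5 * (q[2] - q[1])
    c(
      n = length(v),
      mean = round(mean(v), digits),
      round(quantile(v, probs = probs), digits), # columns named "10%", "90%"
      n_outlier = sum(v < q[1] - fence | v > q[2] + fence)
    )
  })
  # One row per group, one column per statistic.
  do.call(rbind, stats)
}

res <- group_num_stat(iris$Sepal.Length, iris$Species)
print(res)
```

The result is a small matrix with one row per group, which is roughly the shape of table to expect when a grouping variable is supplied.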