Exploratory Data Analysis

edav.info, 2nd edition

by Joyce Robbins


Supplemental resources for GR5293 and GR5702 […] This is the brand new start of edav.info 2.0! The first version of edav.info is still available, but will no longer be updated. With this resource, we try to give you a curated collection of tools and references that will make it easier to learn how to work with data in R. Detailed Examples will also be used to show proper exploratory data analysis under different circumstances. This resource is specifically tailored to the GR5293 Statistical Graphics and GR5702 Exploratory Data Analysis and Visualization courses offered at Columbia … Read more →


Introduction to Environmental Data Science

by Jerry Davis, SFSU Institute for Geographic Information Science


Background, methods and exercises for using R for environmental data science. The focus is on applying the R language and various libraries for data abstraction, transformation, data analysis, spatial data/mapping, statistical modeling, and time series, applied to environmental research. Applies exploratory data analysis methods and tidyverse approaches in R, and includes contributed chapters presenting research applications, with associated data and code packages. Read more →


R Exploratory Data Analysis

by Nathan Garrett


R Exploratory Data Analysis […] This site is for the ACCT 426 and BUDA 451 classes in Fall 2023. It has been created by Nathan Garrett (nathan.garrett@mail.wvu.edu) … Read more →



by ggiaever


The bookdown version of these pages is published on my page at intro-to-R and here, RNAseq_analysis. This course will cover several of the statistical concepts and data analytic skills needed to succeed in data-driven life science research as it relates to genomics and the omic sciences. For the bulk of the course we cover topics related to genomics and high-dimensional data. Specifically, we describe multiple testing, error rate controlling procedures, exploratory data analysis for high-throughput data, p-value corrections and the false discovery rate. Here we will cover experimental … Read more →


Cheat sheet til R - 2.0

by Jeppe Aarup Andersen, LIR21


This book is an overview of functions in R that make it possible to answer statistical and data-analytic questions. It is based on the course “Fødevaredataanalyse” (Food Data Analysis) at Københavns Universitet (University of Copenhagen) and is written by Jeppe Aarup Andersen (LIR21). […] This book is an overview of functions in R that make it possible to answer statistical and data-analytic questions. It is primarily based on the courses “Fødevaredataanalyse” and “Exploratory Data Analysis / Chemometrics” offered at Københavns Universitet, but also draws on a number of other courses as well as a strong interest in … Read more →


Painting the Malaysian Covid Public Data

by Azman Hussin and Wan M Hasni


The book is designed primarily for data science and R beginners who want to learn exploratory data analysis (EDA) through visualization in a practical way by working on actual data related to a real problem. We continue to stress these themes in the book: EDA, visualization, actual data, and learning by solving problems (#learnbydoing). We envisage that the book will only have an online version because of the dynamic nature of the problems related to Covid and the increasing data. The Covid pandemic should be of concern to all. Everyone is affected through being infected, constrained by … Read more →


Data Analytics: A Small Data Approach

by Shuai Huang & Houtao Deng


This book is suitable for an introductory course of data analytics to help students understand some main statistical learning models, such as linear regression, logistic regression, tree models and random forests, ensemble learning, sparse learning, principal component analysis, kernel methods including the support vector machine and kernel regression, etc. Data science practice is a process that should be told as a story, rather than a one-time implementation of one single model. This process is a main focus of this book, with many course materials about exploratory data analysis, residual analysis, and flowcharts to develop and validate models and data pipelines. Read more →


STA 141 - Exploratory Data Analysis and Visualization

by Derek L. Sonderegger


STA 141 - Exploratory Data Analysis and Visualization […] The history of advertisement is full of examples of false advertisement. In the United States, the Federal Trade Commission regulates advertisement and can levy fines for deceptive or misleading ads. As a result, the ads typically state true but misleading facts. The shift to people getting information from social media sources has exacerbated the problem. With hundreds of automated accounts on a media site, a disinformation campaign can continually present its information without suffering any penalty. In order to be compelling, … Read more →


Applied Spatio-temporal Statistics

by Trevor Hefley


Course notes for Applied Spatio-temporal Statistics (STAT 764) at Kansas State University […] This document contains the course notes for Applied Spatio-temporal Statistics at Kansas State University (STAT 764). During the semester we will cover construction and analysis of spatial, time series, and spatio-temporal data sets. Topics include data generation using geographic information systems, exploratory data analysis and visualization, and descriptive and dynamic spatio-temporal statistical … Read more →


Exploratory Data Analysis with R

by Roger D. Peng


This book covers the essential exploratory techniques for summarizing data with R. These techniques are typically applied before formal modeling commences and can help inform the development of more complex statistical models. Exploratory techniques are also important for eliminating or sharpening potential hypotheses about the world that can be addressed by the data you have. We will cover in detail the plotting systems in R as well as some of the basic principles of constructing informative data graphics. We will also cover some of the common multivariate statistical techniques used to visualize high-dimensional data. Read more →


An Incomplete Solutions Guide to the NIST/SEMATECH e-Handbook of Statistical Methods

by Ray Hoobler


Analysis of case studies and exercises with a focus on using the tidyverse and ggplot2. This handbook was created using the bookdown package in RStudio. The output format for this example is bookdown::gitbook. […] Exploratory Data Analysis (EDA) is a philosophy on how to work with data, and for many applications, the workflow is better suited for scientists and engineers. As scientists, we are trained to formulate a hypothesis and design a series of experiments that allow us to test the hypothesis effectively. Most data, however, doesn’t come from carefully controlled trials, but from … Read more →


APS 135: Introduction to Exploratory Data Analysis with R

by Dylan Z. Childs


Course book for Introduction to Exploratory Data Analysis with R (APS 135) in the Department of Animal and Plant Sciences, University of Sheffield. […] This is the online course book for the Introduction to Exploratory Data Analysis with R component of APS 135, a module taught by the Department of Animal and Plant Sciences at the University of Sheffield. You can view this book in any modern desktop browser, as well as on your phone or tablet device. Dylan Childs is running the course this year. Please email him if you spot any problems with the course book. You will be introduced to the R … Read more →