# Data Analysis

# R Markdown: The Definitive Guide

## by Yihui Xie, J. J. Allaire, Garrett Grolemund

The first official book authored by the core R Markdown developers that provides a comprehensive and accurate reference to the R Markdown ecosystem. With R Markdown, you can easily create reproducible data analysis reports, presentations, dashboards, interactive applications, books, dissertations, websites, and journal articles, while enjoying the simplicity of Markdown and the great power of R and other languages. Read more →

# Geocomputation with R

## by Robin Lovelace, Jakub Nowosad, Jannes Muenchow

A book on geographic data with R. […] This is the online home of Geocomputation with R, a book on geographic data analysis, visualization and modeling. Note: This book has been published by CRC Press in the R Series. The online version of this book is free to read here. Inspired by bookdown and the Free and Open Source Software for Geospatial (FOSS4G) movement, this book is open source. This ensures its contents are reproducible and publicly accessible for people worldwide. The online version of the book is hosted at geocompr.robinlovelace.net and kept up-to-date by Travis, which provides … Read more →

# Teaching and Learning with Jupyter

## by Lorena A. Barba, Lecia J. Barker, Douglas S. Blank, Jed Brown, Allen B. Downey, Timothy George, Lindsey J. Heagy, Kyle T. Mandli, Jason K. Moore, David Lippert, Kyle E. Niemeyer, Ryan R. Watkins, Richard H. West, Elizabeth Wickes, Carol Willing, and Michael Zingale

A handbook on teaching and learning with Jupyter notebooks. […] This handbook is for any educator teaching a topic that includes data analysis or computation in order to support learning. It is not just for educators teaching courses in engineering or science, but also data journalism, business and quantitative economics, … Read more →

# Lab Manual for the RIPL_Effect Research Team (RIPLRT)

## by The RIPL_Effect Research Team

This book constitutes the lab manual for the RIPL_Effect Research Team (RIPLRT). The output format for this example is bookdown::gitbook. […] It looks like you recently joined the RIPL (Respiratory and Immunology Project) Effect Research Lab at Larkin University College of Biomedical Sciences. That’s great! We’re really glad to have you here, and will do what we can to make your time in the lab amazing. We hope you’ll learn a lot about respiratory health and immunology (also population health), develop new skills (coding, data analysis, writing, giving talks), make new friends, and have a … Read more →

# ntpu-programming-for-data-science.utf8.md

## by tpemartin

Programming for Data Science (I) […] This course builds the foundation for becoming a data scientist who masters both data analysis and data engineering. Two programming languages are taught in the course: R and JavaScript. R serves as the data analysis backend, while JavaScript serves as the communication tool for interacting with cloud services such as Google G Suite. After taking the course, students will be able to create their own data services that automate routine work and enhance their productivity. … Read more →

# Introduction to Data Exploration and Analysis with R

## by Michael Mahoney

This is a course reader for a class that will never be taught. Hopefully it helps you nonetheless. […] This is a course reader for a hypothetical 3-credit undergraduate class, focusing on getting those with no prior exposure to R up to speed in coding and data analysis procedures. This reader is currently being continuously deployed to bookdown.org and GitHub, particularly as new sections are completed or old ones restructured. This is so that I can get feedback from the small group of people who are using this book to learn R themselves, so I can adjust and adapt the text as needed. If … Read more →

# Big data and Social Science

## by Paul C. Bauer

Script for the seminar ‘Big Data and Social Science’ at the University of Bern. […] The present document serves both as slides and script for the workshop/seminar Big Data and Social Science. This seminar is taught by Paul C. Bauer at the University of Bern (Fall Semester 2018). The material was developed by Paul C. Bauer and draws heavily on material developed by Pablo Barberà in courses such as Social Media & Big Data Research, Big Data Analysis in the Social Sciences and Automated Collection of Web and Social Data. Any original material and examples are licensed under a Creative Commons … Read more →

# Introduction to Data Science

## by Rafael A. Irizarry

This book introduces concepts and skills that can help you tackle real-world data analysis challenges. It covers concepts from probability, statistical inference, linear regression and machine learning, and helps you develop skills such as R programming, data wrangling with dplyr, data visualization with ggplot2, file organization with UNIX/Linux shell, version control with GitHub, and reproducible document preparation with R Markdown. Read more →

# Notes for ST463/ST683 Linear Models 1

## by Katarina Domijan, Catherine Hurley

These are the notes for the ST463/ST683 Linear Models 1 course offered by the Mathematics and Statistics Department at Maynooth University. This module is offered as part of the MSc in Data Science and Data Analytics. It is an introductory course for students who have a basic background in statistics, data analysis, R programming and linear algebra (matrices). […] There are many good resources, e.g. Weisberg (2005), Fox (2005), Fox (2016), Ramsey and Schafer (2002), Draper and Smith (1966). We will use Minitab and R (R Core Team 2017). To create this document, I am using the bookdown package … Read more →

# Meta-Workflow

## by Miao YU

This is a workflow for metabolomics studies. […] This is an online handout for data analysis in mass spectrometry-based metabolomics. It covers a full reproducible metabolomics workflow for data analysis and important topics related to metabolomics. Here is a list: This is a book written in Bookdown. You can contribute to it through a pull request on GitHub. R and RStudio are the software needed in this … Read more →

# Bayesian Basics

## by Michael Clark

This document provides an introduction to Bayesian data analysis. It is conceptual in nature, but uses the probabilistic programming language Stan for demonstration (and its implementation in R via rstan). From elementary examples, guidance is provided for data preparation, efficient modeling, diagnostics, and more. […] … Read more →

# Course Notes for IS 6489, Statistics and Predictive Analytics

## by Jeff Webb

Course notes for IS 6489. […] These are the course notes for IS 6489, Statistics and Predictive Analytics, offered through the Information Systems (IS) department in the University of Utah’s David Eccles School of Business. This is an exciting time for data analysis! The field has undergone a revolution in the last 15 years with increases in computing power and the availability of “big data” from web-based systems of data collection. “Data science” is the umbrella term that describes the result of this revolution—a new discipline at the intersection of many traditional fields such as … Read more →

# Scalable Machine Learning and Data Science with Microsoft R Server and Spark

## by Ali Zaidi, Machine Learning and Data Science, Microsoft

These are (tentatively) rough notes showcasing some tips on conducting large-scale data analysis with R, Spark, and Microsoft R Server. The focus is primarily on machine learning with the Azure HDInsight platform, but the notes also review other in-memory, large-scale data analysis platforms, such as R Services with SQL Server 2016, and discuss how to use BI tools such as PowerBI and Shiny for dynamic reporting and report generation. Read more →

# Data Science Live Book

## by Pablo Casas

An intuitive and practical approach to data analysis, data preparation and machine learning, suitable for all ages! […] This book is now available at Amazon. Check it out! 📗 🚀. Link to the black & white version, also available in full color. It can be shipped to over 100 countries. 🌎 The book will facilitate the understanding of common issues that arise when doing data analysis and machine learning. Building a predictive model is as difficult as one line of R code: That’s it. But data is dirty in practice. We need to sculpt it, just like an artist does, to expose its information in order … Read more →