# Data Analysis

# A Guide on Data Analysis

## by Mike Nguyen

This is a guide on how to conduct data analysis in the field of data science, statistics, or machine learning. […]

1. APA (7th edition): Nguyen, M. (2020). A Guide on Data Analysis. Bookdown. https://bookdown.org/mike/data_analysis/
2. MLA (8th edition): Nguyen, Mike. A Guide on Data Analysis. Bookdown, 2020. https://bookdown.org/mike/data_analysis/
3. Chicago (17th edition): Nguyen, Mike. 2020. A Guide on Data Analysis. Bookdown. https://bookdown.org/mike/data_analysis/
4. Harvard: Nguyen, M. (2020) A Guide on Data Analysis. Bookdown. Available at: https://bookdown.org/mike/data_analysis/

… Read more →

# 03-data-visualization

## by Lingxiao HE

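This is a textbook for a quick start on data analysis in R. […] R download link: click any mirror site and, once there, choose the version of R that matches your operating system. RStudio download link. Exercises link. A major reason for R’s popularity is its wide variety of “packages”, which cover many different needs. Although R ships with some powerful packages, we still have to install extra ones when we need them; otherwise, calling the package raises an error. This is where the install.packages() function comes in. For example, to install the tidyverse package, enter install.packages("tidyverse"). To use a package, we first have to load it with the library() function. For example, after installing tidyverse, load it with library(tidyverse). … Read more →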
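The install-then-load pattern in this excerpt can be sketched in R as follows. This is a minimal illustration, not code from the book: the helper name `ensure_package` is hypothetical, and it assumes a CRAN mirror is configured whenever a package actually has to be downloaded.

```r
# Hypothetical helper (not from the book): install a package only if it is
# missing, then attach it with library().
ensure_package <- function(pkg) {
  # requireNamespace() returns FALSE instead of raising an error when the
  # package is not installed.
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)  # needs a configured CRAN mirror and network access
  }
  # character.only = TRUE lets library() accept the package name as a string.
  library(pkg, character.only = TRUE)
  invisible(TRUE)
}

# "stats" ships with R, so this call downloads nothing.
ensure_package("stats")

# The example from the text; the first call would install the package.
# ensure_package("tidyverse")
```

Checking before installing avoids re-downloading a package on every run, while the `library()` call still has to be repeated in each new R session.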

# Introduction to Statistics and Data Analysis – A Case-Based Approach

## by Conrad Ziller, University of Duisburg-Essen

A book created with bookdown. […] Suggested citation: Ziller, Conrad (2024). Introduction to Statistics and Data Analysis – A Case-Based Approach. Available online at https://bookdown.org/conradziller/introstatistics To download the R-Scripts and data used in this book, go HERE. This short book is a complete introduction to statistics and data analysis using R and RStudio. It contains hands-on exercises with real data—mostly from social sciences. In addition, this book presents four key ingredients of statistical data analysis (univariate statistics, bivariate statistics, statistical … Read more →

# Meta-Workflow

## by Miao YU

This is a workflow for metabolomics studies. […] This is an online handout for mass spectrometry based metabolomics data analysis. It covers a full reproducible metabolomics workflow for data analysis and important topics related to metabolomics. Here is a list of topics: This is a book written in Bookdown. You can contribute to it via a pull request on GitHub. A workshop based on this book can be found here. Meanwhile, a Docker image, xcmsrocker, is available for reproducible metabolomics research. R and RStudio are the software needed in this … Read more →

# Introduction to Data Science

## by Hansjörg Neth

This book provides a gentle introduction to data science for students of any discipline with little or no background in data analysis or computer programming. Based on notions of representation, measurement, and modeling, we examine key data types (e.g., logicals, numbers, text) and learn to clean, summarize, transform, and visualize (rectangular) data. By reflecting on the relations between representations, tasks, and tools, the course promotes data literacy and cultivates reproducible research practices that precede and enable practical uses of programming or statistics. This book is still being written and revised. It currently serves as a scaffold for a curriculum that will be filled with content as we go along. Read more →

# The openair book

## by David C Carslaw, Jack Davison

This document has been a long time coming. The openair project started with funding from the UK Natural Environment Research Council (NERC) over 10 years ago. The main aim was to fill a perceived gap in that there was a lack of a dedicated set of easily accessible, open source tools for analysing air quality data. At that time R was becoming increasingly popular but far, far less than it is today. The book is split into broad sections that cover common aspects of air quality data analysis. Data Import Mostly focused on the easy access of UK air quality data across … Read more →

# STAT 521B: Topics in Multivariate Analysis

## by Alexander Sharp

Course notes. […] This Bookdown notebook comprises the notes for the course “Theory of Functional Data Analysis with Applications” taught during the Winter 2024 semester, term 2, at UBC. They follow closely the textbook (Hsing and Eubank … Read more →

# Inferential Reasoning in Data Analysis

## by Ben Prytherch

People who analyze data are usually interested in something other than the data they analyze. A financial analyst might use patterns and anomalies in market data to create an investment strategy for the upcoming year. A physician might reference data from a randomized controlled trial when deciding what drug to prescribe to a patient. A basketball coach might plan player rotations after looking at data collected from their next opponent’s recent matches. Members of a local board of education might look at data from state standardized tests to decide whether to approve a proposed … Read more →

# An Introduction to Political and Social Data Analysis Using R

## by Thomas M. Holbrook

This book has two purposes: Provide students with a comprehensive, accessible overview of important issues related to political and social data analysis, and, at the same time, provide a gentle introduction to using the R programming environment to address those issues. […] … Read more →

# Meta-analysis Shiny Application Guideline

## by Sangyoung Jung

This application supports data analysis for meta-analysis and data visualization, including forest plots and geographical frequency maps. It offers four key benefits: Data Cleaning and Check: The application not only cleans datasets, preparing elements such as author names for meta-analysis, but also assists in identifying missing values and outliers during data checks. Meta-analysis Model Fitting: It is capable of conducting meta-analysis and moderator analysis with detailed statistics and diagnostic plots. Data Visualization: The application supports data visualization, … Read more →

# Analysing Data using Linear Models

## by Stéphanie M. van den Berg

This is the data analysis textbook used for study programmes at the faculty of BMS at the University of Twente. […] This book is for bachelor students in social, behavioural and management sciences who want to learn how to analyse their data, with the specific aim to answer research questions. The book has a practical take on data analysis: how to do it, how to interpret the results, and how to report the results. All techniques are presented within the framework of linear models: this includes simple and multiple regression models, linear mixed models and generalised linear models. This … Read more →

# Psychometrics in Exercises using R and RStudio

## by Anna Brown

This textbook provides a comprehensive set of exercises for practicing all major Psychometric techniques using R and RStudio. Each exercise includes a worked example illustrating data analysis steps and teaching how to interpret results and make analysis decisions, and self-test questions that readers can attempt to check their own understanding. […] This textbook provides a comprehensive set of exercises for practicing all major Psychometric techniques using R and RStudio. The exercises are based on real data from research studies and operational assessments, and provide step-by-step guides that … Read more →

# Introduction to Environmental Data Science

## by Jerry Davis, SFSU Institute for Geographic Information Science

Background, methods and exercises for using R for environmental data science. The focus is on applying the R language and various libraries for data abstraction, transformation, data analysis, spatial data/mapping, statistical modeling, and time series, applied to environmental research. Applies exploratory data analysis methods and tidyverse approaches in R, and includes contributed chapters presenting research applications, with associated data and code packages. Read more →

# Insights and Analyses: A Course Companion

## by Tyler R. Pritchard

Report errors, recommendations, or concerns to trpritchard@grenfell.mun.ca. Latest Updates: Jan 2024 Dec 2023 From the university calendar: PSYC 3950 Research Methods and Data Analysis in Psychology III will cover advanced research methods, including survey methods, and supporting statistical concepts and techniques. Designs will include single factor designs and multi-factor designs with both random and fixed factors. Supporting statistical concepts will include analysis of variance (ANOVA) from a linear model perspective, statistical power, and multiple regression, … Read more →

# R Markdown: The Definitive Guide

## by Yihui Xie, J. J. Allaire, Garrett Grolemund

The first official book authored by the core R Markdown developers that provides a comprehensive and accurate reference to the R Markdown ecosystem. With R Markdown, you can easily create reproducible data analysis reports, presentations, dashboards, interactive applications, books, dissertations, websites, and journal articles, while enjoying the simplicity of Markdown and the great power of R and other languages. Read more →

# Data Analysis with R

## by Joseph Fox

Data Analysis with R […] R is an open-source programming language that is popular among statisticians and data scientists. We’ll be using the software RStudio to write and run R code. There are two ways to access RStudio for free. You can choose either of the following options. Download R and RStudio to your own computer. Visit https://posit.co/download/rstudio-desktop/ and click the buttons to start the two required installations. Access Posit Cloud (formerly RStudio Cloud) online. Visit https://posit.cloud/ and click “Get Started,” then choose the free plan on the next page. You’ll be … Read more →

# An Introduction to R Analytics

## by GT CY

This is a blueprint of an introduction to R. […] Welcome to the world of data analysis! “Introduction to R in Data Analytics” is your friendly guide to understanding how to use the R programming language for playing with data. If you’re new to this, don’t worry - we’ve got you covered. This book takes you step by step, teaching you how to make sense of data using R. We’ll show you how to organize information, create cool charts and graphs, and even predict trends from data. You’ll learn all about the powerful tools that R offers for understanding numbers and patterns in data. But we won’t … Read more →

# Analysing CRISPR Screens with edgeR

## by Göknur Giner

This is a book version to write a book. set in the _output.yml file. The HTML output format for this example is bookdown::gitbook, [...] Welcome to “Analyzing CRISPR Screens with edgeR”. Our aim is to empower researchers like you with the tools and knowledge needed to navigate the complex landscape of CRISPR data analysis. This platform serves as the central hub for a comprehensive guide on leveraging edgeR, one of the most commonly used Bioconductor packages for differential expression analysis, for the analysis of CRISPR screens. Whether you’re delving into CRISPR experiments for the ... Read more →

# Foundations of Statistics

## by Prof Peter Neal and Dr Daniel Cavey

Lecture Notes for Foundations of Statistics […] In this course the fundamental principles and techniques underlying modern statistical and data analysis will be introduced. The course will cover the core foundations of statistical theory consisting of: The course highlights the importance of computers, and in particular, statistical packages, in performing modern statistical analysis. Students will be introduced to the statistical package R as a statistical and programming tool and will gain experience in interpreting and communicating its output. Learning Outcomes A student who completes … Read more →

# STAT 331

## by Ben Prytherch

STAT 331, as the title states, is an “applied” statistics course. It is intended for anyone who has taken at least one introductory level statistics course, and who wants to learn more about the use of statistical methods in quantitative research. It covers many statistical tools that are usually considered too advanced for an introductory level class, but are nonetheless very popular. It also provides guidance on making data analysis decisions. Most assignments will involve looking up a published scientific paper for which the data are available and reproducing the main … Read more →

# Spatial transcriptomics data analysis: theory and practice

## by Eleftherios Zormpas, Dr Simon J. Cockell

This book will guide you through the practical steps of the in-person tutorial IP2 for the ISMB/ECCB 2023 conference in Lyon named: Spatial transcriptomics data analysis: theory and practice. […] This book will guide you through the practical steps of the in-person tutorial IP2 for the ISMB/ECCB 2023 conference in Lyon named: “Spatial transcriptomics data analysis: theory and practice”. Recent technological advances have led to the application of RNA Sequencing in situ. This allows for whole-transcriptome characterisation, at approaching single-cell resolution, while retaining the spatial … Read more →

# Workshop: Interactive Data Analysis with Shiny

## by Paul C. Bauer & Jonas Lieth

This document serves as slides and script for the workshop Interactive Data Analysis with Shiny taught by Paul C. Bauer and Jonas Lieth (Gesis, Mannheim, Online, 5-7th of July 2023). Original material is licensed under an Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license. Where we draw on other authors’ material, other licenses may apply (see references in the syllabus as well as the citations and links in the script). For potential future versions of this material, see the GitHub repository. If you have feedback or discover errors/dead … Read more →

# Applied longitudinal data analysis in brms and the tidyverse

## by A Solomon Kurz

This project is a reworking of Singer and Willett’s classic (2003) text within a contemporary Bayesian framework with an emphasis on the brms and tidyverse packages within the R computational framework. […] This project is based on Singer and Willett’s classic (2003) text, Applied longitudinal data analysis: Modeling change and event occurrence. You can download the data used in the text at http://www.bristol.ac.uk/cmm/learning/support/singer-willett.html and find a wealth of ideas on how to fit the models in the text at https://stats.idre.ucla.edu/other/examples/alda/. My contributions show … Read more →

# Generalized Linear Mixture Model

## by Ying Lu and Marc Scott

This is a minimal example of using the bookdown package to write a book. set in the _output.yml file. The HTML output format for this example is bookdown::gitbook, [...] This is a course in advanced statistical techniques that covers generalized linear models and extensions that are commonly used in health and policy research. Assuming a strong foundation in the general linear model (linear regression and ANOVA) and exposure to the linear mixed model (a.k.a. multilevel models), this course focuses on data analysis that utilizes models for categorical, discrete or limited outcomes, some ... Read more →

# DSCI 335: Inferential Reasoning in Data Analysis

## by Ben Prytherch

DSCI 335: Inferential Reasoning in Data Analysis […] This book is meant to accompany DSCI 335. It is not a complete textbook; you will need to take notes on what you hear in class and what you read throughout the semester. In it, you will find: This book will likely be updated and revised as the semester progresses. Feel free to read ahead, just don’t be surprised if something … Read more →

# Financial Data Science

## by Prof. Dr. Ryan Riordan & Teaching Assistants

This bookdown contains the teaching materials for the project course Financial Data Science at the LMU Munich. The files have been set up by Lisa Kaminski. [...] Here you will find the course pages for the project course Financial Data Science. The course is offered regularly in the summer term and aims at providing in-depth knowledge about the programming language Python and its most important libraries for data analysis. Furthermore, the course introduces the topic of database management and the process of retrieving, aggregating and manipulating data using SQL. Students will learn to ... Read more →

# Doing Bayesian Data Analysis in brms and the tidyverse

## by A Solomon Kurz

This project is an attempt to re-express the code in Kruschke’s (2015) textbook. His models are re-fit in brms, plots are redone with ggplot2, and the general data wrangling code predominantly follows the tidyverse style. […] Kruschke began his text with “This book explains how to actually do Bayesian data analysis, by real people (like you), for realistic data (like yours).” In the same way, this project is designed to help those real people do Bayesian data analysis. My contribution is converting Kruschke’s JAGS and Stan code for use in Bürkner’s brms package (Bürkner, 2017, 2018, 2022g), … Read more →

# Data Analysis in Medicine and Health using R

## by Kamarul Imran, Wan Nor Arifin, Tengku Muhammad Hanis Tengku Mokhtar

Data Analysis in Medicine and Health using R […] We wrote this book to help new R programming users with limited programming and statistical background. We understand the struggles they are going through to move from point-and-click statistical software such as SPSS or MS Excel to more code-centric software such as R and Python. In our experience, frustration sets in early when learning this code-centric software. It often demotivates new users to the extent that they ditch it and return to using point-and-click statistical software. This book will minimise these struggles and gently … Read more →

# tidy[ing] up POL345

## by John Kim

A guide to the tidyverse for POL345 Students. […] POL345 is often Princeton students’ first foray into the programming language R. Through POL345, students gain an introductory overview of R, and programming generally, to conduct basic data analysis on their own. However, many further courses (SML201, SOC306, POL346), along with industry users of R, use the tidyverse instead, a “language” within R to conduct clean, readable data analysis. This book seeks to bridge that gap, revisiting each of the POL345 handouts using the tidyverse to introduce students to this “language within a language”. … Read more →

# R Cookbook in Food Science

## by Zhijun Wang

This is the R Markdown source for an R cookbook in food science, with working code for data analysis using R. […] The book aims to provide a basic overview of data science for statistical analysis in Food Science. This book is intended to spare students and young scientists confusion when they start applying data science. With the development of food science, this major has become a comprehensive discipline including analytical chemistry, biochemistry, nutrition and even basic medicine. The large amount of data greatly increases the difficulty of data analysis and visualization. Many universities and … Read more →

# Programming and Applied Data Visualization with R

## by Dr. Paul C. Bauer (University of Mannheim)

Q: What is your experience with looking at data analysis code you have written 2 years earlier? Comment your code. Use meaningful names! A “new” package, dplyr, written by Hadley Wickham and Romain Francois, replaces many old functions for data management. Functions in dplyr are highly performant (big data!) and consistent. See this page for an excellent overview and the Data Wrangling Cheat Sheet. What could the following functions be used for? Hadley Wickham’s ggplot2 package developed into a powerful alternative to the default plot() function. Its goal is to simplify complex plots (e.g. take care of … Read more →

# An Introduction to ggplot2

## by Ozancan Ozdemir

A ggplot2 Tutorial […] Hi! Data visualization is one of the important steps of the data analysis process. It is not only part of data analysis, but can also be considered an art. The R programming language provides a powerful visualization package, ggplot2. This book aims to show how you can make well-known statistical plots using ggplot2, and also how you can improve or customize them. The book was created from the lab notes for statistical computing (STAT 291-STAT 292) by Ozancan Ozdemir. For your opinions and suggestions, please send me an e-mail to … Read more →

# Elon R Data Camp

## by Adam Aiken

These notes cover our three hours together as we learn about using R for data analysis with RStudio. […] What is R and why are you here? We are to spend our time tonight learning about R, R Markdown, and the developer environment that puts these tools together, RStudio. How do these tools fit together?

- Scriptability, coding, working with our data → R
- Reproducible, literate programming with all of our code, narrative, and formatted output in one place → R Markdown
- A place to do this → RStudio

Our most important goal: Get R and RStudio running on … Read more →

# Data Analysis

## by Chia-Ching Wu

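A book created with bookdown. […] R is an open-source programming language and a tool for data mining, statistical analysis, and graphics. It was developed in the early 1990s by Ross Ihaka and Robert Gentleman, statistics professors at the University of Auckland, and after nearly thirty years of evolution it is now maintained jointly by the members of the R Core Team. Besides R, common statistical software includes SPSS, SAS, Stata, and Minitab; even the Excel spreadsheet in Microsoft Office can be used for simple statistical analysis and plotting. So, with so much software available, what advantages lead so many people to choose R? R’s greatest advantage is that it is free, cross-platform software with a wealth of resources. … Read more →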

# Data Analysis in R

## by Steve Midway

This is a text that covers the principles and practices of handling and analyzing data. … Read more →

# An Introduction to Bayesian Reasoning and Methods

## by Kevin Ross

This textbook presents an introduction to Bayesian reasoning and methods […] Statistics is the science of learning from data. Statistics involves We will assume some familiarity with many of these aspects, and we will focus on the items in italics. That is, we will focus on statistical inference, the process of using data analysis to draw conclusions about a population or process beyond the existing data. “Traditional” hypothesis tests and confidence intervals that you are familiar with are components of “frequentist” statistics. This book will introduce aspects of “Bayesian” statistics. We … Read more →

# Bridging the gap between service extension and cultural facilitation among ASHAs

## by Oskar Burger, Maciej J. Danko, Faiz Hashmi, Palash Singh, Hannah Lunkenheimer, Emily Little, Micah Goldwater, Tracy Johnson, Cristine Legare

This book covers data analysis and synthesis for the major empirical contributions of Project RISE. Project RISE is a mixed-methods project designed to leverage the power of ritual for understanding the motivation and performance of community health workers in Bihar. […] Project RISE is a collaborative and mixed-methods effort with the goal of improving maternal and newborn health in Bihar, India by designing tools to help the motivation and performance of community health workers. This Report covers data analysis and synthesis for the major empirical sections of Project RISE, including … Read more →

# Portfolio, Churn & Customer Value

## by Hugo Cornet, Pierre-Emmanuel Diot, Guillaume Le Halper, Djawed Mancer

This research paper aims at modelling customer portfolio, churn and customer value. […] This paper is being written as part of the final year of our master’s degree in economics. It aims at studying a firm’s most valuable asset, namely its customers. To that end, we adopt a quantitative approach based on econometrics and data analysis with a threefold purpose: After having defined the subject’s key concepts, we apply duration models and machine learning techniques to a Kaggle dataset related to customers of a fictional telecommunications service provider (TSP). Keywords: customer portfolio … Read more →

# Single Cell Multi-Omics Data Analysis

## by Yuting Liu

This book is a collection of pre-processing and visualization scripts for single cell multi-omics data. The data is downsampled from a real dataset. … Read more →

# Using R in Social Work Research

## by Jerry Bean, College of Social Work, The Ohio State University

This is an example of using the bookdown package to write a book. The output format for this example is bookdown::gitbook. […] Our goal for this document is to illustrate the importance of good data analysis practices and how R and companion packages support these practices. We think the R system has many benefits for social work research. R has become the flagship computing environment for many areas of science and has great appeal because it is free and open-access. In addition, free tools like RStudio and R Markdown promote a replication commitment and open science philosophy … Read more →

# R @ Ewha (Sunbok Lee)

## by Sunbok Lee copied by 212AIE40 Jiwon Choi

R @ Ewha (Sunbok Lee) […] “In nonrandomized experiments, it is usually only possible to determine the existence of a relationship between two measurements, but not the underlying mechanism or the reason for it.” It is known that the best way to investigate causal relationships is to conduct randomized experiments. However, unlike in natural science, it is not easy to conduct randomized experiments in social science because of ethical and practical reasons. The fundamental dilemma of data analysis in social science is that we essentially want to make causal statements in the absence of … Read more →

# R for Solving Social Problems

## by Sunbok Lee (Ewha Womans University, 2021-2)

R for Solving Social Problems […] Before talking about R and social problems, let’s talk about the types of data analysis first. Leek and Peng (2015) categorized data analysis into the six types presented in the table below, and emphasized that “mistaking the type of question being considered is the most common error in data analysis.” Their main point is that we should keep in mind the type of question being asked by our own data analysis. In other words, we should say what we can say, not what we want to say. Leek and Peng (2015) present a table showing common mistakes “In … Read more →

# Using R for Educational Research

## by Jerry Bean, College of Education and Human Ecology, The Ohio State University

This is an example of using the bookdown package to write a book […] Our goal for this document is to illustrate the importance of good data analysis practices and how R and companion packages support these practices. We think the R system has many benefits for educational research. R has become the flagship computing environment for many areas of science and has great appeal because it is free and open-access. In addition, free tools like RStudio and R Markdown promote a replication commitment and open science philosophy important to our work. One particular strength of R is that it … Read more →

# The Shape of Polarization: A Topological Data Analysis of Congressional Voting Patterns

## by Aidan Toner-Rodgers

The Shape of Polarization: A Topological Data Analysis of Congressional Voting Patterns […] Polarization is a pervasive feature of modern American politics. But has this always been the case? Understanding trends in polarization has been a topic of intense interest in the social sciences, with researchers taking a variety of approaches. The classic strategy has been to use congressional roll call votes and measure the difference in voting patterns between parties (Theriault, 2008; Ladewig, 2010; Shor, 2018; Moskowitz, 2019). More recent work has used text analysis of congressional speech … Read more →

# Do A Data Science Project in 10 Days

## by Gangmin Li

This is a data science project practice book. It was initially written for my Big Data course to help students run a quick data analytics project and understand the data analytical process, the typical tasks, and the methods, techniques, and algorithms needed to accomplish these tasks. During COVID-19, the university adopted online teaching, so students could not access the university labs and HPC facilities. Gaining experience of doing a data science project became a matter of individual students learning by themselves in isolation. This book aims to help them read through it and follow the instructions to complete the sample project by themselves. It has also been requested by many other students who want to learn about data analytics, machine learning, and particularly practical issues, and to gain experience and confidence in doing data analysis. So it is aimed at beginners who do not have much knowledge of data science. The format for this book is bookdown::gitbook. Read more →

# Using R for Social Work Research

## by Jerry Bean, College of Social Work, The Ohio State University

This is an example of using the bookdown package to write a book […] Our goal for this document is to illustrate the importance of good data analysis practices and how R and companion packages support these practices. We think the R system has many benefits for social work research. R has become the flagship computing environment for many areas of science and has great appeal because it is free and open-access. In addition, free tools like RStudio and R Markdown promote a replication commitment and open science philosophy important to our work. One particular strength of R is that it … Read more →

# COVID Data Analysis

## by Mike Lyons

Analysis of COVID Data from data.ct.gov. […] I am not an epidemiologist, nor am I a professional scientist or proper research professional. I studied Engineering in college quite a few years ago, and work in the cosmetics industry now. I am also a curious citizen and father who wanted to get a sense for the prevalence of COVID where I live, in Redding, CT and the surrounding … Read more →

# Computational Genomics with R

## by Altuna Akalin

A guide to computational genomics using R. The book covers fundamental topics with practical examples for an interdisciplinary audience […] The aim of this book is to provide the fundamentals for data analysis for genomics. We developed this book based on the computational genomics courses we are giving every year. We have had invariably an interdisciplinary audience with backgrounds from physics, biology, medicine, math, computer science or other quantitative fields. We want this book to be a starting point for computational genomics students and a guide for further data analysis in more … Read more →

# R for Fundamental Data Analysis in Market Research

## by Sujata Ramnarayan

Everything you need (and nothing more) to begin to learn R for fundamental data analysis in Market Research […] … Read more →

# DondeRs Group

## by Henrik Eckermann

This bookdown-project contains introductory material to learn the R programming language […] Instructor: My name is Henrik. I am a PhD-candidate in the Developmental Psychobiology lab group at the Donders Institute in Nijmegen. I find that the R programming language is an extremely useful tool for scientists, especially (but not only) for data analysis and visualization. I can help you learn the basics of the R programming language and how to approach learning a programming language so you can advance in learning whatever is needed in your specific field. Target audience: Anyone at … Read more →

# Causal Inference in Education

## by Anthony Schmidt

Causal Inference in Education […] It is an R-based book of data analysis exercises related to the following three causal inference … Read more →

# Interactive web-based data visualization with R, plotly, and shiny

## by Carson Sievert

A useR guide to creating highly interactive graphics for exploratory and expository visualization. […] This is the website for “Interactive web-based data visualization with R, plotly, and shiny”. In this book, you’ll gain insight and practical skills for creating interactive and dynamic web graphics for data analysis from R. It makes heavy use of plotly for rendering graphics, but you’ll also learn about other R packages that augment a data science workflow, such as the tidyverse and shiny. Along the way, you’ll gain insight into best practices for visualization of high-dimensional data, … Read more →

# The Open Quant Live Book

## by OpenQuants.com

The Open Quant Live Book […] The book aims to be an open-source introductory reference on the most important aspects of financial data analysis, algo trading, portfolio selection, econophysics and machine learning in finance, with an emphasis on reproducibility and openness not to be found in most other typical Wall Street-like references. The Book is Open and we welcome co-authors. Feel free to reach out or simply create a pull request with your contribution! See project structure, guidelines and how to contribute here. First published at: openquants.com. Licensed under Attribution-NonCommer … Read more →

# Course Handouts for Bayesian Data Analysis Class

## by Mark Lai

This is a collection of my course handouts for the PSYC 621 class in the 2019 Spring semester. Please contact me (hokchiol@usc.edu) about any errors (as I’m sure there are plenty of them). […] This is a collection of my course handouts for the PSYC 621 class. The materials are based on the book by McElreath (2016), the brms package (Bürkner 2017), and the Stan language. Please contact me about any errors (as I’m sure there are plenty of them). Bürkner, Paul-Christian. 2017. “brms: An R Package for Bayesian Multilevel Models Using Stan.” Journal of Statistical Software 80 (1): 1–28. … Read more →

# Teaching and Learning with Jupyter

## by Lorena A. Barba, Lecia J. Barker, Douglas S. Blank, Jed Brown, Allen B. Downey, Timothy George, Lindsey J. Heagy, Kyle T. Mandli, Jason K. Moore, David Lippert, Kyle E. Niemeyer, Ryan R. Watkins, Richard H. West, Elizabeth Wickes, Carol Willing, and Michael Zingale

A handbook on teaching and learning with Jupyter notebooks. […] This handbook is for any educator teaching a topic that includes data analysis or computation in order to support learning. It is not just for educators teaching courses in engineering or science, but also data journalism, business and quantitative economics, data-based … Read more →

# Data Analysis for Psychology in R (dapR1) - Labs

## by Department of Psychology, University of Edinburgh

This is the page that contains the course lab materials. […] Data Analysis for Psychology in R 1 (dapR1) is your first step on the road to being a data, programming and applied statistics guru! This course provides an introduction to data, R and statistics. It is designed to work slowly through conceptual content that forms the basis of understanding and working with data to perform statistical testing. At the same time, we will be introducing you to basic programming in R, covering the fundamentals of working with data, visualization and simple statistical tests. The overall aim of the … Read more →

# Readings in applied data science

## by Qiushi Yan

Readings in applied data science […] This project is highly motivated and inspired by stats337 at Stanford University offered by Hadley Wickham, and Data Science with R: A Resource Compendium by Martin Monkman. They both provided great reading materials in data analysis with R, or applied data science in general. Here I attempt to finish one or two papers per week, draw a brief summary, and document my personal … Read more →

# Uber Movement dataset : playing with spatial data

## by Clement Lefevre

Using the Uber Movement dataset, we combine it with the OpenStreetMap data for Berlin. […] For some cities, Uber has released datasets of its drivers’ movements. These include the OSM way identifier, the mean speed and the standard deviation of the speed. In order to anonymize them, the data have been aggregated per hour. Let’s have a look at the Berlin data for the month of June 2019, and how they are distributed in space and time. For this, we will combine those data with the OpenStreetMap shapefile for Berlin. Throughout this book, we will use some concepts of data analysis … Read more →

# How to Build a Shiny Application from Scratch

## by Hadrien@rstudio.com

How to Build a Shiny Application from Scratch […] Shiny is a powerful R package that allows you to create interactive web applications using the R programming language. It is particularly useful for creating applications that run on data and include some sort of data analysis or visualization. In addition to leveraging the power of R and its thousands of packages, one of the big benefits of shiny is the ease of developing applications using R only. Although it is possible to incorporate more traditional web design languages such as custom CSS or JavaScript into your shiny application, it … Read more →

# Introduction to Data Exploration and Analysis with R

## by Michael Mahoney

A detailed introduction to coding in R and the process of data analytics. Version 1.0.0 […] Welcome to Introduction to Data Exploration and Analysis in R (IDEAr)! This book is designed as a crash course in coding with R and data analysis, built for people trying to teach themselves the skills needed for most analyst jobs today. You won’t need any past experience with R or data analytics - the aim of the book is to work as a primer for people of all backgrounds. This book is currently being continuously deployed to bookdown.org and GitHub while editing continues. This is so that I can get … Read more →

# Data Analysis and Processing with R based on IBIS data

## by Kevin Donovan

Data Analysis and Processing with R based on IBIS data […] Over the course of my time working with the Carolina Institute for Developmental Disabilities (CIDD) and the Infant Brain Imaging Study (IBIS) network, I have seen a great interest in learning how to do basic statistical analyses and data processing among the trainees. Specifically, there is an interest in learning how to use R, due to its popularity across the sciences and its zero financial cost. As a statistician in training, I feel it is a great benefit for scientists to learn R. It is vital for scientists to understand the … Read more →

# Technical Analysis with R

## by Ko Chiu Yu

This is an introductory textbook that focuses on how to use R to do technical analysis. […] R is widely used in statistical computation. It is well suited to computationally heavy financial analysis, in particular to evaluating the performance of trading rules based on technical indicators. Moreover, R can be a one-stop solution for the whole procedure of data analysis. A standard procedure of financial data analysis is: You can do all of them inside R without using other software. This short book is a brief introduction on how to use R and RStudio to do financial data analysis from the beginning. … Read more →

# Big data and Social Science

## by Paul C. Bauer

Script for the seminar ‘Big Data and Social Science’ at the University of Bern. […] The present document serves both as slides and script for the workshop/seminar Big Data and Social Science. This seminar is taught by Paul C. Bauer at the University of Bern (Fall Semester 2018). The material was developed by Paul C. Bauer and draws heavily on material developed by Pablo Barberà in courses such as Social Media & Big Data Research, Big Data Analysis in the Social Sciences and Automated Collection of Web and Social Data. Any original material and examples are licensed under a Creative Commons … Read more →

# Notes for ST463/ST683 Linear Models 1

## by Katarina Domijan, Catherine Hurley

These are the notes for the ST463/ST683 Linear Models 1 course offered by the Mathematics and Statistics Department at Maynooth University. This module is offered as part of the MSc in Data Science and Data Analytics. It is an introductory course for students who have a basic background in statistics, data analysis, R programming and linear algebra (matrices). […] There are many good resources, e.g. Weisberg (2005), Fox (2005), Fox (2016), Ramsey and Schafer (2002), Draper and Smith (1966). We will use Minitab and R (R Core Team 2017). To create this document, I am using the bookdown package … Read more →

# Course Notes for IS 6489, Statistics and Predictive Analytics

## by Jeff Webb

Course notes for IS 6489. […] These are the course notes for IS 6489, Statistics and Predictive Analytics, offered through the Information Systems (IS) department in the University of Utah’s David Eccles School of Business. This is an exciting time for data analysis! The field has undergone a revolution in the last 15 years with increases in computing power and the availability of “big data” from web-based systems of data collection. “Data science” is the umbrella term that describes the result of this revolution—a new discipline at the intersection of many traditional fields such as … Read more →

# Bayesian Basics

## by Michael Clark m-clark.github.io

This document provides an introduction to Bayesian data analysis. It is conceptual in nature, but uses the probabilistic programming language Stan for demonstration (and its implementation in R via rstan). From elementary examples, guidance is provided for data preparation, efficient modeling, diagnostics, and more. […] … Read more →

# Data Science Live Book

## by Pablo Casas

An intuitive and practical approach to data analysis, data preparation and machine learning, suitable for all ages! […] This book is now available at Amazon. Check it out! 📗 🚀. Link to the black & white version, also available in full color. It can be shipped to over 100 countries. 🌎 The book will facilitate the understanding of common issues when data analysis and machine learning are done. Building a predictive model is as difficult as one line of R code: That’s it. But, data has its dirtiness in practice. We need to sculpt it, just like an artist does, to expose its information in order … Read more →

# Introduction to Data Science

## by Rafael A. Irizarry

This book introduces concepts and skills that can help you tackle real-world data analysis challenges. It covers concepts from probability, statistical inference, linear regression and machine learning and helps you develop skills such as R programming, data wrangling with dplyr, data visualization with ggplot2, file organization with UNIX/Linux shell, version control with GitHub, and reproducible document preparation with R markdown. Read more →

# Modern Data Visualization with R

## by Robert Kabacoff

This is an illustrated guide for creating data visualizations in R. […] This is the online version of “Modern Data Visualization with R”, published by CRC Press. A print version is also available from Amazon. R is an amazing platform for data analysis, capable of creating almost any type of graph. This book helps you create the most popular visualizations - from quick and dirty plots to publication-ready graphs. The text relies heavily on the ggplot2 package for graphics, but other approaches are covered as well. My goal is to make this book as helpful and user-friendly as possible. Any … Read more →

# Tidy tools for supporting fluent workflow in temporal data analysis

## by Earo Wang

This is the website for my PhD thesis at Monash University (Australia), titled “Tidy tools for supporting fluent workflow in temporal data analysis”. … Read more →

# What They Forgot to Teach You About R

## by Jennifer Bryan, Jim Hester, Shannon Pileggi, E. David Aja

This book is a work in progress. This book focuses on content intrinsically related to the infrastructure surrounding data analysis in R, but does not delve into the data analysis itself. A holistic workflow provides guidance on project-oriented workflows that address common sources of friction in data analysis. Personal R administration empowers R users to confidently manage their R programming environment. All is Fail showcases functions, options, and RStudio capabilities for debugging code, facilitating more efficient resolution of … Read more →