Regression and Analysis of Variance

by Trevor Hefley

2024-07-24

Course notes for Regression and Analysis of Variance (STAT 705) at Kansas State University for Summer 2024 […] This document contains the course notes for Regression and Analysis of Variance at Kansas State University (STAT 705). During the semester we will cover the basics such as regression and ANOVA modeling, parameter estimation, model checking, inference, and prediction. We may also cover modern topics such as regularization, random effects, generalized linear models, machine learning approaches, and Bayesian regression and … Read more →

1

Surrogates

by Robert B. Gramacy

2024-07-22

Surrogates: a new graduate level textbook on topics lying at the interface between machine learning, spatial statistics, computer simulation, meta-modeling (i.e., emulation), and design of experiments. Gaussian process emphasis facilitates flexible nonparametric and nonlinear modeling, with applications to uncertainty quantification, sensitivity analysis, calibration of computer models to field data, sequential design and (blackbox) optimization under uncertainty. Presentation targets numerically competent scientists in the engineering, physical, and biological sciences. Treatment includes historical perspective and canonical examples, but primarily concentrates on modern statistical methods, computation and implementation in R at modern scale. Rmarkdown facilitates a fully reproducible tour complete with motivation from, application to, and illustration with, compelling real-data examples. Read more →

2

Ordinary Differential Equation Modelling with ‘ecode’

by Haoran Wu, Chen Peng

2024-07-19

ecode, a novel package for modelling ecological populations and communities using ordinary differential equation systems, designed with a user-friendly framework. By following a three-cycle procedure, users can easily construct ecological models and explore their behaviors through a wide range of graphical, analytical, and numerical techniques. The package incorporates advanced techniques such as grid search methods and simulated annealing algorithms, enabling users to iteratively refine their models and achieve accurate predictions. Notably, ecode minimises external dependencies, ensuring … Read more →

3

Analysing Data using Linear Models

by Stéphanie M. van den Berg

2024-07-02

This is the data analysis textbook used for study programmes at the faculty of BMS at the University of Twente. […] This book is for bachelor students in social, behavioural and management sciences that want to learn how to analyse their data, with the specific aim to answer research questions. The book has a practical take on data analysis: how to do it, how to interpret the results, and how to report the results. All techniques are presented within the framework of linear models: this includes simple and multiple regression models, linear mixed models and generalised linear models. This … Read more →

4

Modelling Space and Time with GAMS: spatially and temporally varying coefficient models

by Lex Comber

2024-06-03

This is a workshop introducing GGP-GAMs as a method for undertaking spatially and temporally varying coefficient models […] GAM (General Additive Models) are emerging as the goto approach for all kinds of data science activities. GAMS perform as well or better than most machine learning models and they are relatively fast. They are powerful and quick but critically they offer a middle ground between overly simple but interpretable standard statistical approaches, and efficient but opaque machine leaning algorithms, where it is difficult to understand how one variable relates to an outcome. … Read more →

5

Interpretable Machine Learning

by Christoph Molnar

2024-05-26

Machine learning algorithms usually operate as black boxes and it is unclear how they derived a certain decision. This book is a guide for practitioners to make machine learning decisions interpretable. […] Machine learning has great potential for improving products, processes and research. But computers usually do not explain their predictions which is a barrier to the adoption of machine learning. This book is about making machine learning models and their decisions interpretable. After exploring the concepts of interpretability, you will learn about simple, interpretable models such as … Read more →

6

AUSTRALIAN CANCER ATLAS 2.0 – Technical report

by the Australian Cancer Atlas team

2024-05-24

This eBook provides further details about the methodology used to develop the Australian Cancer Atlas 2.0 […] The Australian Cancer Atlas 2.0 – Technical Report is designed to provide further details about the methodology used to develop the Australian Cancer Atlas 2.0. This technical report includes information about how the data were obtained for the Atlas, including data sources and ethical and data custodian approvals, how the different statistical measures were defined and the Bayesian statistical models used to calculate them, along with details about the methods used to visualise the … Read more →

7

Contrasting Two Family Models

by José Becerra

2024-05-22

Comparing married dual income with married single income families […] In a Q&A format with AI, this narrative addresses the often undervalued work done within homes when we measure economic activity, such as the … Read more →

8

A More Principled Adventure in Topic Models

by Sarah Urbut

2024-05-17

This is a minimal example of using the bookdown package to write a book. The output format for this example is bookdown::gitbook. […] Please Enjoy my book on modified topic models! We call this AladDyn: A dynamic genetically informed and principled way of calculating conditionally updated transitions that inform alternative diseases. Copyright © 2023 Sarah Urbut All rights … Read more →

9

Spatio-Temporal Statistics - Final Portfolio

by Robert Sholl

2024-05-13

This is a the final portfolio for STAT 764 for Robert Sholl […] The following bookdown document contains the required pieces of the portfolio assignment in Spatio-Temporal Statistics at Kansas State University. The structure of this document will be as follows: Assignment 1 Data Prep Longitude models Latitude models Predictions Assignment 2 Data Prep Models Model Checks Assignment 3 Data Prep Data Exploration Variable Prep Grass Model Fitting Grass Model Checking Grass Graphics Crop Model Fitting Crop Model Checking Crop Graphics Closing Discussion Journal Entries Final … Read more →

10

2024 Introduction to Cognitive Modeling

by Chih-Chung Ting

2024-05-06

Welcome to the “code session”. In this website, we will use R to explore the concepts of cognitive modeling introduced in the first hour using R. [...] This week, you will learn basic R functions, including calculations (e.g., +, -, x, /), loops, and logical operations, commonly used in transforming equations into programming language. In addition to these functions, you need to understand the meaning of each notation in the equation and where to find the values for each notation. Usually, this information is explicitly described in the paper. Let’s set aside cognitive models for now ... Read more →

11

lefko3: a gentle introduction

by Richard P. Shefferson

2024-03-30

This book covers the ins and outs of developing and analyzing matrix projection models and integral projection models in R using the CRAN-based package lefko3. It covers all aspects of building and analyzing these models, from life history model development all the way to the development of replicated, stochastic, density dependent projection simulations. [...] All content copyright 2022 Richard P. Shefferson This book is dedicated to the people of Ukraine, who are teaching the world every day that all people have the inherent human right to self-determination. Richard P. Shefferson ... Read more →

12

A Bayesian Introduction to Fish Population Analysis

by Joseph E. Hightower

2024-02-17

This book is intended to be a bridge between traditional fisheries analytical methods and Bayesian statistics. It is a hands-on introduction to models for estimating abundance, survival, growth, recruitment, and population trends. […] This book is based in large part on material I developed while teaching (1991-2014) at NC State University. My hope is that the book will be a bridge between traditional fisheries analytical methods and Bayesian approaches that offer many advantages in ecological modeling. The book might be useful as an upper-level undergraduate or early graduate text, or for … Read more →

13

Innovation Handbook

by Lucas Nelson

2024-01-04

This is an informal collection of resources dedicated to summarizing, discussing, and advancing innovation in Audit Analytics. [...] It would be an understatement to say there is plenty of buzz surrounding the Audit Analytics practice: adopting Databricks has introduced compute resources that were previously unattainable; the recent class of new hires display a highly technical skill set and equally passionate “innovation-set”; our Data Science neighbors’ work using large language models opens up a whole new realm of revolutionizing audit procedures. With this buzz comes a lot of great ... Read more →

14

Bayes meets the Lifetime

by Sarah Urbut

2024-01-04

This is a minimal example of using the bookdown package to write a book. The HTML output format for this example is bookdown::gitbook, set in the _output.yml file. [...] This is a book written in Markdown. Please contact Sarah Urbut for questions. I’m really excited about these latent dirichlet allocation models, specifically for their flexibility in modeling time dependent changes in comorbidity profiles. However, I think much remains to be discovered. In the following pages, I hope you’ll enjoy: First, we discuss the general framework for topic models, and introduce typical notations ... Read more →

15

Statistical Methods for Environmental Mixtures

by Andrea Bellavia

2023-12-29

Statistical Methods for Environmental Mixtures

Humans are simultaneously exposed to a large number of environmental hazards. To allow a more accurate identification of the risks associated with environmental exposures and developing more targeted public health interventions, it is crucial that population-based studies account for the complexity of such exposures as environmental mixtures. This poses several analytic challenges and often requires the use of extensions of standard regression approaches or more flexible techinques for high-dimensional data. This document presents an extended version of the class material that was used in an introductory two-weeks course on statistical approaches for environmental mixtures. The main challanges and limitations of standard regression techniques are outlined, and recent methodological developments are introduced in a rigorous yet non-theoretical way. The course was designed for students and postdocs in environmental health with basic preliminary knoweldge on linear and logistic regression models. Sources and code examples to conduct a thorough analysis in R are also included. Read more →

16

Multi-level Modeling: Nested and Longitudinal data

by Marc Scott

2023-10-11

Multi-level Modeling: Nested and Longitudinal data

This is a minimal example of using the bookdown package to write a book. set in the _output.yml file. The HTML output format for this example is bookdown::gitbook, [...] This is a course on models for multilevel nested data. These data arise in nested designs, which are quite common to education and applied social, behavioral and policy science. Traditional methods, such as OLS regression, are not appropriate in this setting, as they fail to model the complex correlational structure that is induced by these designs. Proper inference requires that we include aspects of the design in the ... Read more →

17

Behavior Analysis with Machine Learning Using R

by Enrique Garcia Ceja

2023-09-27

Behavior Analysis with Machine Learning Using R teaches you how to train machine learning models in the R programming language to make sense of behavioral data collected with sensors and stored in electronic records. This book introduces machine learning concepts and algorithms applied to a diverse set of behavior analysis problems by focusing on practical aspects. Some of the topics include how to: Build supervised models to predict indoor locations based on Wi-Fi signals, recognize physical activities from smartphone sensors, use unsupervised learning to discover criminal behavioral patterns, build deep learning models to analyze electromyography signals, CNNs to detect smiles in images and much more. Read more →

18

Towards building stock decarbonisation: A streamlined algorithm for Measurement and Verification at scale.

by MEng Student, Jamie Williams., MEng Student, Callum Robertson., PhD Student, Karla Gonzalez., Dr Massimiliano Manfren.

2023-07-27

Towards building stock decarbonisation: A streamlined algorithm for Measurement and Verification at scale. […] This project used Highfield Campus, Southampton, as a case study to demonstrate how the energy modelling process can be streamlined at scale. The algorithm shown is able to individually create 3 parameter change point models for each building in a large dataset simultaneously, whilst uniquely selecting the best fitting balance point for each. This document will explain how the code can be replicated or customised for use in other datasets. The full repository can be found on Github here. … Read more →

19

Simulation Models of Cultural Evolution in R

by Alex Mesoudi

2023-06-24

This tutorial shows how to create very simple simulation or agent-based models of cultural evolution in R [...] This tutorial shows how to create very simple simulation or agent-based models of cultural evolution in R (R Core Team 2021). Currently these are: Each model is contained in a separate RMarkdown (Rmd) file. You can either (i) download each of these Rmd files from https://github.com/amesoudi/cultural_evolution_ABM_tutorial then open them in RStudio or another IDE, executing the code as you read the explanatory text, or (ii) read the online version of the tutorial at ... Read more →

20

Applied longitudinal data analysis in brms and the tidyverse

by A Solomon Kurz

2023-06-12

This project is a reworking of Singer and Willett’s classic (2003) text within a contemporary Bayesian framework with emphasis of the brms and tidyverse packages within the R computational framework. […] This project is based on Singer and Willett’s classic (2003) text, Applied longitudinal data analysis: Modeling change and event occurrence. You can download the data used in the text at http://www.bristol.ac.uk/cmm/learning/support/singer-willett.html and find a wealth of ideas on how to fit the models in the text at https://stats.idre.ucla.edu/other/examples/alda/. My contributions show … Read more →

21

Generalized Linear Mixture Model

by Ying Lu and Marc Scott

2023-05-25

This is a minimal example of using the bookdown package to write a book. set in the _output.yml file. The HTML output format for this example is bookdown::gitbook, [...] This is a course in advanced statistical techniques that covers generalized linear models and extensions that are commonly used in health and policy research. Assuming a strong foundation in the general linear model (linear regression and ANOVA) and exposure to the linear mixed model (a.k.a. multilevel models), this course focuses on data analysis that utilizes models for categorical, discrete or limited outcomes, some ... Read more →

22

Applied Bayesian Modeling and Prediction

by Trevor Hefley

2023-04-26

Course notes for Applied Bayesian Modeling and Prediction (STAT 768) at Kansas State University for Spring 2023 semester […] This document contains the course notes for Applied Bayesian Modeling and Prediction (STAT 768) at Kansas State University. During the semester we will cover the basics such as the Bayesian model development, implementation, checking, and inference/prediction. We will focus on formulating and implementing bespoke Bayesian models that are tailored to answer scientific questions or applied problems ranging from environmental management to … Read more →

23

Statistical Modeling II: SDS383D

by Antonio R. Linero

2023-03-22

These notes cover the second semester in a two-semester sequence on statistical modeling. It focuses on constructing, drawing conclusions from, and critiquing probabilistic models. Planned topics include generalized linear models, the bootstrap, hierarchichal models, nonparametric estimation, and generalized estimating equations. [...] In this collection of notes, we briefly outline the main thrust of this course by discussing (at a high level of generality) the topic of probabilistic modeling. Throughout this course we will focus on the application of probabilistic modeling with an eye ... Read more →

24

A Brief Introduction to Bayesian Inference

by Johnny van Doorn

2023-02-14

A brief introduction to Bayesian concepts, based on a beer-tasting experiment. […] This book is still a work in progress If you encounter any errors/issues, you can reach me here. This booklet offers an introduction to Bayesian inference. We look at how different models make different claims about a parameter, how they learn from observed data, and how we can compare these models to each other. We illustrate these ideas through an informal beer-tasting experiment conducted at the University of Amsterdam.1 A key concept in Bayesian inference is predictive quality: how well did a model, or … Read more →

25

Machine Learning Part II

by Dr. Hailiang Du

2023-02-03

These are the course notes for the Machine Learning module (MATH42815) at Durham University. […] Welcome to the material for the second half of the Machine Learning module (MATH42815) at Durham University. These pages consist of relevant lecture notes that will be updated as the course progresses. I would recommend that you use the HTML version of these notes (they have been designed for use in this way), however, there is also a pdf version of these notes. In this second half of the module, we will first look into the simple yet powerful tree-based models and then dive into the famous yet … Read more →

26

Statistical rethinking with brms, ggplot2, and the tidyverse: Second edition

by A Solomon Kurz

2023-01-26

This book is an attempt to re-express the code in the second edition of McElreath’s textbook, ‘Statistical rethinking.’ His models are re-fit in brms, plots are redone with ggplot2, and the general data wrangling code predominantly follows the tidyverse style. […] This ebook is based on the second edition of Richard McElreath’s (2020a) text, Statistical rethinking: A Bayesian course with examples in R and Stan. My contributions show how to fit the models he covered with Paul Bürkner’s brms package (Bürkner, 2017, 2018, 2022j), which makes it easy to fit Bayesian regression models in R (R … Read more →

27

Doing Bayesian Data Analysis in brms and the tidyverse

by A Solomon Kurz

2023-01-26

This project is an attempt to re-express the code in Kruschke’s (2015) textbook. His models are re-fit in brms, plots are redone with ggplot2, and the general data wrangling code predominantly follows the tidyverse style. […] Kruschke began his text with “This book explains how to actually do Bayesian data analysis, by real people (like you), for realistic data (like yours).” In the same way, this project is designed to help those real people do Bayesian data analysis. My contribution is converting Kruschke’s JAGS and Stan code for use in Bürkner’s brms package (Bürkner, 2017, 2018, 2022g), … Read more →

28

Statistical rethinking with brms, ggplot2, and the tidyverse

by A Solomon Kurz

2023-01-25

This project is an attempt to re-express the code in McElreath’s textbook. His models are re-fit in brms, plots are redone with ggplot2, and the general data wrangling code predominantly follows the tidyverse style. […] I love McElreath’s (2015) Statistical rethinking text. It’s the entry-level textbook for applied researchers I spent years looking for. McElreath’s freely-available lectures on the book are really great, too. However, I prefer using Bürkner’s brms package (Bürkner, 2017, 2018, 2022i) when doing Bayesian regression in R. It’s just spectacular. I also prefer plotting with … Read more →

29

Recoding Introduction to Mediation, Moderation, and Conditional Process Analysis

by A Solomon Kurz

2023-01-25

This ebook is an effort to connect Hayes’s conditional process analysis work with the Bayesian paradigm. Herein I refit his models with my favorite R package for Bayesian regression, Bürkner’s brms, and use the tidyverse for data manipulation and plotting. […] Andrew Hayes’s (2018) text, Introduction to mediation, moderation, and conditional process analysis: A regression-based approach, has become a staple in social science graduate education. Hayes’s work has been from a frequentist OLS perspective. This book is an effort to connect his work with the Bayesian paradigm. Herein I refit his … Read more →

30

Mixed-effects model and its application in neuroscience

by Zhaoxia Yu, Michele Guindani, Steven F. Grieco, Lujia Chen, Todd C. Holmes, Xiangmin Xu

2023-01-17

This is the online supplemental to accomany Yu Z, Guindani M, Grieco SF, Chen L, Holmes TC, Xu X. (2022) Beyond t-Test and ANOVA: applications of mixed-effects models for more rigorous statistical analysis in neuroscience research. Neuron. 110: 21-23. […] This is the online supplemental to accompany Yu Z, Guindani M, Grieco SF, Chen L, Holmes TC, Xu X. (2022) Beyond t-Test and ANOVA: applications of mixed-effects models for more rigorous statistical analysis in neuroscience research. Neuron. 110: 21-23. https://doi.org/10.1016/j.neuron.2021.10.030. The data used in this supplementary can be … Read more →

31

PrioriTree: an Interactive Utility for Improving Geographic Phylodynamic Analyses in BEAST

by Jiansi Gao, Michael R. May, Bruce Rannala, Brian R. Moore

2022-12-20

PrioriTree is an interactive browser utility—distributed as an R package (see https://jsigao.shinyapps.io/prioritree/ for a demo of the utility)—designed to help researchers specify input files for—and process output files from—analyses of biogeographic history performed using the BEAST software package (Drummond et al. 2012; Suchard et al. 2018). The discrete-geographic models implemented in BEAST (Lemey et al. 2009; Edwards et al. 2011) contain many parameters that must be inferred from minimal information (the single geographic area in which each pathogen occurs); inferences under this … Read more →

32

An introduction to using RUV-III-NB: Removing unwanted variation in single cell data

by Alysha M De Livera and Agus Salim

2022-12-06

This is an introduction to using RUV-III-NB (Salim et al. 2022) for removing unwanted variations from single cell RNA sequencing data. […] This is an introduction to using RUV-III-NB for removing unwanted variation from single cell RNA sequencing data (Salim et al. 2022). The first section RUV-III-NB model below briefly explains the statistical methodology behind RUV-III-NB. Readers who are not interested in the methodology can move straight to Preliminary settings. RUV-III-NB models the raw count for gene (g) and cell (c), (y_{gc}), as Negative Binomial (NB), (y_{gc} \sim … Read more →

33

Supervised Machine Learning

by Michael Foley

2022-12-05

These are my personal notes related to supervised machine learning techniques. […] Machine learning (ML) develops algorithms to identify patterns in data (unsupervised ML) or make predictions and inferences (supervised ML). Supervised ML trains the machine to learn from prior examples to predict either a categorical outcome (classification) or a numeric outcome (regression), or to infer the relationships between the outcome and its explanatory variables. Two early forms of supervised ML are linear regression (OLS) and generalized linear models (GLM) (Poisson and logistic regression). These … Read more →

34

An Introduction to Statistical Learning with the tidyverse and tidymodels

by Taylor Dunn

2022-10-24

Working through ISLR with the tidyverse and tidymodels […] I am a data scientist and statistician who is (mostly) self-taught from textbooks and generous people sharing their work online. Inspired by projects like Solomon Kurz’s recoding of Statistical Rethinking and Emil Hvitfeldt’s ISLR tidymodels labs, I decided to publicly document my notes and code as I work through An Introduction to Statistical Learning, 2nd edition by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani. I prefer to work with the tidyverse collection of R packages, and so will be using those to wrangle … Read more →

35

Modern R with the tidyverse

by Bruno Rodrigues

2022-10-24

This book will teach you how to use R to solve your statistical, data science and machine learning problems. Importing data, computing descriptive statistics, running regressions (or more complex machine learning models) and generating reports are some of the topics covered. No previous experience with R is needed. […] I have been working on this on and off for the past 4 years or so. In 2022, I have updated the contents of the book to reflect updates introduced with R 4.1 and in several packages (especially those from the {tidyverse}). I have also cut some content that I think is not that … Read more →

36

Rethinking Companion

by Wade VanderWright

2022-09-30

This is a minimal example of using the bookdown package to write a book. The HTML output format for this example is bookdown::gitbook, set in the _output.yml file. [...] This is a companion book written in Markdown for McElreath’s Statistical Rethinking (2020). You can set up your R console by running: The Golem of Prague and statistical golems (models) are powerful but lack wisdom. As McElreath tells us, there are many kinds of golems and figuring out how to build the one you need to carry out the task at hand can be tricky. Figure 1.1 In addition, novel research often requires novel ... Read more →

37

Model Based Sampling with EMC2 - Extended Models of Choice

by Reilly Innes, Niek Stevenson, Russell Boag, Andrew Heathcote

2022-09-12

This bookdown provides a tutorial on how to use EMC2. […] This project provides a tutorial for learning about cognitive models, namely evidence accumulation models (EAMs), their implementation and their applications. The tutorial was prepared by Andrew Heathcote, Dora Matzke, Russell Boag, Niek Stevenson and Reilly Innes. The GitHub repository for this project can be found here. The accompanying samples for the code provided here can be found at here. Throughout this guide, we’ll use some complex terminology for different aspects of modelling. We know that models exist, and in particular, … Read more →

38

Practical 1 - Exploring relationships and fitting linear models

by Team in Room 420 - Megan Ruffle, Naphon Olley, Jennifer James, William Ryan

2022-08-04

This is an example of using the bookdown package to write a ‘book’ for a lab. The output format for this example is bookdown::gitbook. […] Intended learning outcomes: produce scatterplots of report quality; fit linear regression models by using lm; interpret least squares estimates of model parameters; calculate least squares estimates of model parameters for simple linear regression. In the lectures we have examined relationships between variables using scatterplots. We have considered how to fit a linear model, through estimating the parameters such as (\beta) in the linear model (Y_i = … Read more →

39

Mixed BART Models: maths and discussion

by Bruna Wundervald

2022-08-02

Mixed BART Models: maths and discussion […] This documents works as a summary of our work towards building a “Mixed model” style BART. All the maths in detail will be written here, as well as our simulation … Read more →

40

Tools for human-screen interactions

by cjlortie

2022-05-31

An adaptationist programme for digital work and life […] Screens are a portal to information and to one another. Nearly 5 billion people as of 2022 use the internet. Treating screens and digital time only as a pathology neglects the inherent capacity for screens as a tool to promote higher levels of performance and novel approaches to problem solving. Mental models and the associated cognitive architecture that we frame conceptually to decision making is critical for better choices. The screen adaptation theory (SAT) is proposed herein as a heuristic to enable individuals to use evidence … Read more →

41

Categorical Regression in Stata and R

by Rose Werth

2022-05-19

This website contains lessons and labs to help you code categorical regression models in either Stata or R. […] This website houses all the information you need learn the basics of coding a number of different categorical and count models in Stata and R. It will not contain all the information taught in class, but will allow you to bridge that knowledge into running these models on your own. The Stata labs on this website were adapted from materials by Ewurama Okai. This course will contain 8 labs and an optional review lab at the end of the course. Our lab sessions will alternate between … Read more →

42

Primer on Mathematical Statistics

by Peter K. Dunn

2022-03-31

A primer of mathematical statistics, before reading generalized linear models […] This book is a primer on basic mathematical statistics and matrices. This book was originally prepared for students about to undertake MTH301 Reading in Advanced Mathematics at USC, and reading Generalized Linear Models with Examples in R (by Dunn & … Read more →

43

Data Analytics: A Small Data Approach

by Shuai Huang & Houtao Deng

2022-01-02

This book is suitable for an introductory course of data analytics to help students understand some main statistical learning models, such as linear regression, logistic regression, tree models and random forests, ensemble learning, sparse learning, principal component analysis, kernel methods including the support vector machine and kernel regression, etc. Data science practice is a process that should be told as a story, rather than a one-time implementation of one single model. This process is a main focus of this book, with many course materials about exploratory data analysis, residual analysis, and flowcharts to develop and validate models and data pipelines. Read more →

44

Discovering Structural Equation Modeling Using Stata R, the Tidyverse, and Lavaan

by Nicholas R. Jenkins

2021-12-27

This project replicates the Stata code in Acocks’s (2013) text with R, the tidyverse, and lavaan. […] This project is a guide through Alan Acock’s Discovering Structural Equation Modeling Using Stata in R using the Tidyverse, and lavaan packages. The data needed to replicate the analyses in the book can be found on the book’s website here: Discovering Structural Equation Modeling Using Stata. My goal is to show how to fit these models in R and visualize their results. This is also, ver much, a work in progress. I assume that you have some familiarity with R and the tidyverse and won’t spend … Read more →

45

Step by Step I: Linear Models

by Sérgio Moreira

2021-12-04

My Coding Index assembles and organizes in one place all the relevant code and online resources I have been using to teach RStudio and in my professional practice. This file is a working document and will be regularly updated with reviews and new contents. […] Step by Step I assembles in one place the tutorials I have been using to teach and apply to practice simple, multiple and hierarchical linear models. This file is a working document and will be regularly updated with reviews and new contents. My name is Sérgio. I have a PhD in Social Psychology, I am a consultant on social issues in … Read more →

46

EDM - Dengue and Urbanization

by Gabriel Carrasco-Escobar

2021-12-03

EDM - Dengue and Urbanization […] These codes are based on Sugihara 2012 paper and code repository. Documentation could be found here. References Lorenz chaotic attractor Constructing Empirical Dynamic Models: Taken’ Theorem State Space Reconstruction: Convergent Cross … Read more →

47

Portfolio, Churn & Customer Value

by Hugo Cornet, Pierre-Emmanuel Diot, Guillaume Le Halper, Djawed Mancer

2021-11-28

This research paper aims at modelling customer portfolio, churn and customer value. […] This paper is being realized as part of our last year in master’s degree in economics. It aims at studying a firm’s most valuable asset namely its customers. To that end, we adopt a quantitative approach based on econometrics and data analysis with a threefold purpose to : After having defined the subject’s key concepts, we apply duration models and machine learning techniques to a kaggle dataset related to customers of a fictional telecommunications service provider (TSP). Keywords: customer portfolio … Read more →

48

Time Series Analysis

by Michael Foley

2021-11-06

Time series analysis using R. […] These notes are based on the Time Series with R skill track at DataCamp and Rob Hyndman’s Forecasting: Principles and Practice (Rob J Hyndman 2021). I organized them into a section on working with a tsibble (time series tibble) (chapter 1), a section on data exploration (chapter 2), and then four sections on models. Forecasts aren’t necessarily based on time series models - you can perform a cross-sectional regression analysis of features, possibly including time-related features such as month of year (chapter 3). Time series forecasts are a specific type … Read more →

49

BS0004 Code Annotations

by Kevin, Kevin, and Kevin

2021-11-04

This is a minimal example of using the bookdown package to write a book. The HTML output format for this example is bookdown::gitbook, set in the _output.yml file. [...] This document documents (no pun intended) my code and my workflow for the cancer dataset that Sean found a couple of days ago. I intend to - using the dataset - build several supervised machine learning classifiers to predict the status of cancer patients. I will then evaluate the performance of these models using the content taught in week 9; Furthermore, if possible, I think I will also try to perform GO term ... Read more →

50

The Association Between Travel and Urban Form

by Shen Qu

2021-11-02

This is a field paper using the bookdown package. The output format for this example is bookdown::gitbook. […] This field paper is for discussing the relationship between travel and urban form.1 The initial motivation is curious about how the travel distance is affected by urban densities. Is this relationship existed universally or just context-dependent? Are these models in literature replicable or reproducible? Part I reviews the related literature and tries to cover the main theories and research in this field. Travel patterns or behaviors as the variable of interest, is associated with … Read more →

51

Introduction to R for Econometrics

by Kieran Marray (Tinbergen Institute)

2021-10-25

Introduction to R for Econometrics […] This is a short introduction to R to go with the first year econometrics courses at the Tinbergen Institute. It is aimed at people who are relatively new to R, or programming in general.1 The goal is to give you enough of knowledge of the fundamentals of R to write and adapt code to fit econometric models to data, and to simulate your own data, working alone or with others. You will be able to: read data from csv files, plot it, manipulate it into the form you want, use sets of functions others have built (packages), write your own functions to compute … Read more →

52

Data Skills for Reproducible Science

by psyteachr.github.io

2021-08-22

This course provides an overview of skills needed for reproducible research and open science using the statistical programming language R. Students will learn about data visualisation, data tidying and wrangling, archiving, iteration and functions, probability and data simulations, general linear models, and reproducible workflows. Learning is reinforced through weekly assignments that involve working with different types of data. Read more →

53

A Quick Introduction to bbsBayes

by Adam C. Smith, and Brandon P.M. Edwards

2021-08-11

This is a document to support a ~2hr workshop. […] This is a 2-hour introductory workshop/demonstration of the R-package bbsBayes (https://github.com/BrandonEdwards/bbsBayes). This package allows anyone to apply the hierarchical Bayesian models used to estimate status and trends from the North American Breeding Bird Survey. The package also lets the user generate a suite of alternative metrics using the existing model output from the annual CWS analyses. Everyone is welcome! Some familiarity working with R is required if you’d also like to run the code yourself during the workshop, and the … Read more →

54

Recitation 7 Note

by Yiming Gong

2021-08-05

This is the class note for recitation 7. […] Note that the plot might be different. In this plot, the simulated line is heavily influenced by the extreme value at around ((9,-10)). One way to make linear models more robust is to use a different distance measure. For example, instead of root-mean-squared distance, you could use mean-absolute distance: We can see that the simulated line is less affected by the extreme value. One challenge with performing numerical optimization is that it’s only guaranteed to find one local optimum. What’s the problem with optimizing a three parameter model … Read more →

55

Rasch Measurement Theory Analysis in R: Illustrations and Practical Guidance for Researchers and Practitioners

by Stefanie Wind & Cheng Hua

2021-07-21

The purpose of the proposed book is to illustrate techniques for conducting Rasch measurement theory analyses using existing R packages.The book will include some background information about Rasch models, but the primary objective will be to demonstrate how to apply the models to data using R packagesand interpretthe results. The academic level for this book is graduate students or professionals who have already been exposed to Rasch measurementtheory. Read more →

56

An Introduction to Game Theory

by Yuleng Zeng

2021-07-21

This is an introduction to Game Theory. The project started when I sat in on Tobias Heinrich’s class (POLI 725: International Conflict) in Fall 2019. I was given the opportunity to provide an introduction to basic game theory concepts and methods. Thank again to Toby for the trust and the opportunity. My intention is to build upon the short introduction and potentially expand it into course, with a heavy focus on models used in International Relations. If you have suggestions or find any errors, please do shoot me an … Read more →

57

Lecture 2 Note

by Yiming

2021-07-17

This is the class note for lecture 2. […] There is an extensive range of packages in R. For collecting and analyzing financial time series, some of the packages we will use include: Financial data collection from internet (tidyquant) Time series (xts,zoo) Non-linear volatility models (rugarch) Regime modeling (fxregime) The first package (tidyquant) facilitates collecting financial data from the internet sites: https://fred.stlouisfed.org/(FRED) Interest rates of US Government Bonds/Bills Interest rates of Foreign governments Foreign exchange rates Commodities: West-Texas-Intermediate Crude … Read more →

58

Editable Distributed Hydrological Model

by Kan Lei

2021-04-29

This is a minimal example of using the bookdown package to write a book. The output format for this example is bookdown::gitbook. […] This document is the use guide for EDHM and some other concept about the hydrological models (HM) building. In Chapter 1 explain the basic concept of hydrological cycle and the important concept and idea of EDHM. In chapter 2 show the workflow of using a hydrological model with EDHM and the way to explain a new model. Chapter 3 and 4 show the basic information, e.g. input data, parameters and output data of every module or model. EDHM is a R package for … Read more →

59

Building Pension Models and Actuarial Tools Using R and R Shiny

by Tommy Cornally

2021-04-08

Building Pension Models and Actuarial Tools Using R and R Shiny […] This interactive report has been created to demonstrate the functionality of R Markdown and Bookdown, as an additional tool that actuaries and others can avail of. Each page of the application (with the exception of the combined SORP Calculator and Drawdown Simulation) has been rebuilt using miniUI [1], an R library used to create Shiny Widgets. However, these “mini-apps” do not fully replicate the main application. Similarly, this interactive report does not intend to replace the original - it merely serves to compliment … Read more →

60

Beyond Multiple Linear Regression

by Paul Roback and Julie Legler

2021-01-26

An applied textbook on generalized linear models and multilevel models for advanced undergraduates, featuring many real, unique data sets. It is intended to be accessible to undergraduate students who have successfully completed a regression course. Even though there is no mathematical prerequisite, we still introduce fairly sophisticated topics such as likelihood theory, zero-inflated Poisson, and parametric bootstrapping in an intuitive and applied manner. We believe strongly in case studies featuring real data and real research questions; thus, most of the data in the textbook arises from collaborative research conducted by the authors and their students, or from student projects. Our goal is that, after working through this material, students will develop an expanded toolkit and a greater appreciation for the wider world of data and statistical modeling. Read more →

61

MODELS TO INFORM THE SAFE COLLECTION AND TRANSFUSION OF DONATED BLOOD

by W. Alton Russell

2021-01-18

Doctoral dissertation. […] Copywrite 2021 by W. Alton Russell. All Rights Reserved. Re-distributed by Stanford University under license with the author. Donated blood is a critical component of health systems around the world, but its collection and transfusion involve risk for both donors and recipients. Transfusion-transmitted diseases and non-infectious transfusion-related adverse events pose a risk to transfusion recipients, and repeat blood donation can cause or exacerbate iron deficiency among donors. This dissertation describes four decision-analytic modeling projects that elucidate … Read more →

62

Analysis

by vincentsaudanba

2020-12-15

Analysis […] Does money buy everything? This is a question that we can ask ourselves and especially when we talk about the movie industry where money, success and appearance are often highlighted. Through this project, we want to see if there is a relationship between the budget allocated to a specific movie and the profits generated by it. We will also compare this relationship through several aspects such as movie genres, main characters & production studios. After our analysis, our models showed a significant positive relationship between budget & profits. This relationship differs when … Read more →

63

Explanatory Model Analysis

by Przemyslaw Biecek and Tomasz Burzykowski

2020-12-12

This book introduces unified language for exploration, explanation and examination of predictive machine learning models. […] … Read more →

64

ECO 397 Book Review

by Clayton Engelby

2020-12-09

ECO 397 Book Review […] Cathy O’Neill’s book “Weapons of Math Destruction” is an analysis of big data and its use of machine learning programs that aim to maximize market efficiency. In the process of doing so, she coins the initialism WMDs as logical flaws in the models that skew results in one way or another. Her argument focuses on the fact that more often than not, these failures result in the worsening of ongoing structural violence and only add fuel to the fire for recidivism rates, bankruptcies, mortgage defaults, college dropouts, and health-related deaths. While there is absolutely … Read more →

65

oSCR vignettes

by Chris Sutherland, Dan Linden & Gates Dupont

2020-08-13

This bookdown is where we will collate oSCR vignettes. […] The goal of this bookdown is to provide a complete overview of the theory and methodology of two topics (at this point) within spatial capture-recapture: More broadly, by providing them with a thorough discussion of these advanced topics, we aim to empower our users to apply these tools in their own research. The main function in oSCR performs likelihood analysis of several classes of spatial capture-recapture (SCR) models. There are also a suite of helper functions for formatting and processing data objects. Here are a few of the … Read more →

66

Study design for spatial capture-recapture

by Chris Sutherland

2020-06-17

This book accompanies of the R package oSCR with a specific focus the design of spatial capture-recapture studies and details of the oSCR function scrdesignGA(). […] Why we did this The main function in oSCR does likelihood analysis of several classes of spatial capture-recapture (SCR) models. THere are also a suite of helper fnctions for formatting and processing data objects. Here are a few of the things that motivated our development of the package: So, using this book of course requires that the oSCR package is loaded: But you will also need a few others: If you have any issues or … Read more →

67

MatrixOptim

by Edward J. Xu

2020-05-18

Data-Driven Decision Making under Uncertainty in Matrix. […] Mathematical programmes are among the most widely used models in operational research and management science. (Williams 2013) Williams, H Paul. 2013. Model Building in Mathematical Programming. John Wiley & … Read more →

68

Exploratory Data Analysis with R

by Roger D. Peng

2020-05-01

This book covers the essential exploratory techniques for summarizing data with R. These techniques are typically applied before formal modeling commences and can help inform the development of more complex statistical models. Exploratory techniques are also important for eliminating or sharpening potential hypotheses about the world that can be addressed by the data you have. We will cover in detail the plotting systems in R as well as some of the basic principles of constructing informative data graphics. We will also cover some of the common multivariate statistical techniques used to visualize high-dimensional data. Read more →

69

A notebook Paper Causal Effect Inference with Deep Latent-Variable Models

by Trong-Thang Pham

2020-04-12

This is a notebook I write about CEVAE original paper. […] This is a notebook I write about CEVAE original paper (Louizos et al. 2017). Some grammar mistakes are going to be made. The authors’ source code: https://github.com/AMLab-Amsterdam/CEVAE Here is some background knowledge that will help you to understand the whole paper. cond() is a notation I made up to stand for “is conditioned on”. For example, y is conditioned on x, then I call it y cond(x). Actually using y|x is more elegant but if we talking normally without the y, using | or |x is too short, in my opinion. do() is represented … Read more →

70

Hands-On Machine Learning with R

by Bradley Boehmke & Brandon Greenwell

2020-02-01

A Machine Learning Algorithmic Deep Dive Using R. […] This book is sold by Taylor & Francis Group, who owns the copyright. The physical copies are available at Taylor & Francis and Amazon. Welcome to Hands-On Machine Learning with R. This book provides hands-on modules for many of the most common machine learning methods to include: You will learn how to build and tune these various models with R packages that have been tested and approved due to their ability to scale well. However, our motivation in almost every case is to describe the techniques in a way that helps develop intuition for … Read more →

71

Recoding Introduction to Mediation, Moderation, and Conditional Process Analysis

by A Solomon Kurz

2019-12-21

This project is an effort to connect his Hayes’s conditional process analysis work with the Bayesian paradigm. Herein I refit his models with my favorite R package for Bayesian regression, Bürkner’s brms, and use the tidyverse for data manipulation and plotting. […] Andrew Hayes’s Introduction to Mediation, Moderation, and Conditional Process Analysis text, the second edition of which just came out, has become a staple in social science graduate education. Both editions of his text have been from a frequentist OLS perspective. This project is an effort to connect his work with the Bayesian … Read more →

72

Course Handouts for Bayesian Data Analysis Class

by Mark Lai

2019-12-14

This is a collection of my course handouts for PSYC 621 class in the 2019 Spring semester. Please contact me [mailto:hokchiol@usc.edu] for any errors (as I’m sure there are plenty of them). […] This is a collection of my course handouts for PSYC 621 class. The materials are based on the book by McElreath (2016), the brms package (Bürkner 2017), and the STAN language. Please contact me for any errors (as I’m sure there are plenty of them). Bürkner, Paul-Christian. 2017. “brms: An R Package for Bayesian Multilevel Models Using Stan.” Journal of Statistical Software 80 (1): 1–28. … Read more →

73

Bayesian Hierarchical Models in Ecology

by Steve Midway

2019-10-28

This is a book that is build on lectures from a course of the same name. […] Welcome to Bayesian Hierarchical Models in Ecology. This is an ebook that is also serving as the course materials for a graduate class of the same name. There will be numerous and on-going changes to this book, so please check back. And don’t hesistate to email me if you have questions, comments, or for anything else. To start, let’s calrify the title of this text—it should be Hierarchical Models in Ecology Using Bayesian Inference. A Bayesian Hierarchical Model is more a term of convenience than accuracy, as … Read more →

74

Introduction to Time Series Analysis and Forecasting in R

by Tejendra Pratap Singh

2019-08-19

Scripts from the online course on Time Series and Forecasting in R. […] Selecting the model. Due to seasonality involved, simple models will not be able to capture it. We therefore use the seasonal ARIMA and exponential smoothing models. Exponential smoothing models have seasonality built in it by construction. Complex models like mixed models and neural nets will be an overkill. … Read more →

75

Common statistical tests are linear models: a work through

by Steve Doogue

2019-07-09

This is a minimal example of using the bookdown package to write a book. The output format for this example is bookdown::gitbook. […] This is a reworking of the book Common statistical tests are linear models (or: how to teach stats), written by Jonas Lindeløv. The book beautifully demonstrates how many common statistical tests (such as the t-test, ANOVA and chi-squared) are special cases of the linear model. The book also demonstrates that many non-parametric tests, which are needed when certain test assumptions do not hold, can be approximated by linear models using the rank of values. … Read more →

76

Seeing through the developping lens:

by Paul Langard

2019-05-21

Seeing through the developping lens: […] Through this project, we aim to decipher post-transcriptional regulation network in the developping lens. In the past decades, post-transcriptional gene regulation (PTGR) was shown to be of particular importance in the developping lens. Indeed, the alteration of PTGR network can result in abnormal development of the lens, of the eye. For example, mutations in RNA binding proteins such as Celf1, Stau2, Tdrd7 has been associated to eye’s defects in animal models. mutation in RNA binding protein Tdrd7 was associated with juvenile cataract in human and … Read more →

77

Statistical Rethinking with brms, ggplot2, and the tidyverse

by A Solomon Kurz

2019-05-05

This project is an attempt to re-express the code in McElreath’s textbook. His models are re-fit in brms, plots are redone with ggplot2, and the general data wrangling code predominantly follows the tidyverse style. […] I love McElreath’s Statistical Rethinking text. It’s the entry-level textbook for applied researchers I spent years looking for. McElreath’s freely-available lectures on the book are really great, too. However, I prefer using Bürkner’s brms package when doing Bayeian regression in R. It’s just spectacular. I also prefer plotting with Wickham’s ggplot2, and coding with … Read more →

78

MODELING MELODIC DICTATION

by David John Baker

2019-04-15

This dissertation models both individual and musical features that contribute to processes involved in melodic dictation. […] All students pursuing a Bachelor’s degree in Music from universities accredited by the National Association of Schools of Music must learn to take melodic dictation (NASM 2019, sec. VIII.6.B.2.A). Melodic dictation is a cognitively demanding process that requires students to hear a melody, then without any access to an external reference, transcribe the melody within a limited time frame. As of 2019, there are 643 Schools of Music belonging to National Association of … Read more →

79

BIOL3360 - Analysis and Communication of Biological Data:

by janengelstaedter

2019-04-05

This online textbook contains learning material for the UQ (The University of Queensland) course BIOL3360: Analysis and Communication of Biological Data. This book is organised with each chapter corresponding to lectures from the Mathematical Modelling component of the course. This book contains many code chunks that can be copied and pasted into an R console to create Shiny apps of the models being discussed. Content and figures were created by Jan Engelstädter. Online version including Shiny apps were created by Nicole Fortuna. Read more →

80

UPR-PRISE Data Science Workshop 01/26/2019

by Felix E. Rivera-Mariani, PhD

2019-01-25

This manual is part of data science workshop titled GPS of Data Analytics: Making the Witness (the Data) Confess. The output format for was elaborated with bookdown::gitbook. […] Welcome to the data science workshop titled The GPS of Data Analytics: Making the Witness (the Data) Confess. In this workshop, sponsored by the University of Puerto Rico Ponce Research Initiative for Scientific Enhancement, students will learn and implement different aspects of data science, from establishing a set of tools necessary to carry out data science to deploying statistical models through coding, … Read more →

81

Notes for ST463/ST683 Linear Models 1

by Katarina Domijan, Catherine Hurley

2018-11-12

These are the notes for ST463/ST683 Linear Models 1 course offered by the Mathematics and Statistics Department at Maynooth University. This module is offered at as a part of of MSc in Data Science and Data Analytics. It is an introductory course for students who have basic background in Statistics, Data analysis, R Programming and linear algebra (matrices). […] There are many good resources, e.g. Weisberg (2005), Fox (2005), Fox (2016), Ramsey and Schafer (2002), Draper and Smith (1966). We will use Minitab and R (R Core Team 2017). To create this document, I am using the bookdown package … Read more →

82

Graphical & Latent Variable Modeling

by Michael Clark m-clark.github.io

2018-09-15

This document focuses on structural equation modeling. It is conceptually based, and tries to generalize beyond the standard SEM treatment. It includes special emphasis on the lavaan package. Topics include: graphical models, including path analysis, bayesian networks, and network analysis, mediation, moderation, latent variable models, including principal components analysis and ‘factor analysis’, measurement models, structural equation models, mixture models, growth curves, item response theory, Bayesian nonparametric techniques, latent dirichlet allocation, and more. Read more →

83

Generalized Additive Models

by Michael Clark m-clark.github.io

2024-07-25*

An introduction to generalized additive models (GAMs) is provided, with an emphasis on generalization from familiar linear models. It makes extensive use of the mgcv package in R. Discussion includes common approaches, standard extensions, and relations to other techniques. More technical modeling details are described and demonstrated as well. […] Michael Clark m-clark.github.io … Read more →

84

ISLR tidymodels labs

by Emil Hvitfeldt

2024-07-25*

Emil Hvitfeldt This book aims to be a complement to the 2nd edition An Introduction to Statistical Learning book with translations of the labs into using the tidymodels set of packages. The labs will be mirrored quite closely to stay true to the original material. All listed changes will be relative to the 1st edition. This book was written in RStudio using Quarto. The website is hosted via GitHub Pages, and the complete source is available on GitHub. This version of the book was built with R version 4.3.1 (2023-06-16) and Quarto version 1.4.104 and the following … Read more →

85

Tidy Modeling with R

by Max Kuhn and Julia Silge

2024-07-25*

The tidymodels framework is a collection of R packages for modeling and machine learning using tidyverse principles. This book provides a thorough introduction to how to use tidymodels, and an outline of good methodology and statistical practice for phases of the modeling process. […] Welcome to Tidy Modeling with R! This book is a guide to using a collection of software in the R programming language for model building called tidymodels, and it has two main goals: First and foremost, this book provides a practical introduction to how to use these specific R packages to create models. We … Read more →

86