Chapter 1 Preface

This book accompanies the course I give at Ben-Gurion University, named “Introduction to Data Science”. It is an introductory, hands-on course, designed for students with a basic background in statistics and econometrics but no programming experience. It introduces students to the tools needed for building a data science pipeline, including data processing, analysis, visualization and modeling. The course is taught in the R environment.

Much of the content in this book is taken from BGU’s “R” course, given at the Department of Industrial Engineering and Management.

The chapters in this book are arranged (roughly) in the order of the classes throughout the semester. Students are encouraged to go through the book during the lectures and after class. Note that the contents are likely to be updated during the course, so working with an offline copy is not recommended.

To reproduce my results, you will want to run set.seed(1) first.
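As a minimal sketch of why the seed matters, the snippet below draws random numbers twice with the same seed and checks that the draws match. The variable names first_draw and second_draw, and the choice of rnorm(3), are illustrative assumptions and do not come from the course material.

```r
# Fix the seed of R's random number generator so the draws are repeatable.
set.seed(1)
first_draw <- rnorm(3)   # three draws from a standard normal distribution

# Setting the same seed again restarts the generator from the same state.
set.seed(1)
second_draw <- rnorm(3)

# Both vectors hold exactly the same values.
identical(first_draw, second_draw)
```

Without the second call to set.seed(1), the generator would continue from where it stopped and second_draw would hold different values.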

1.1 Acknowledgements

I would like to thank Dr. Jonathan D. Rosenblatt for allowing me to use the lecture notes from his R course at BGU. Much of the material in this book is taken from Jonathan’s course, so if you want to go beyond this book, you may want to look there.