# Data Scientist Handbook

*2020-07-30*

# Welcome Data Science Lovers

Department of Statistics

Faculty of STEM

Tangerang, Banten

Info: siregarbakti@gmail.com

## Preface

This book starts at an introductory level and is built around hands-on, practical courses. It is designed for `smart people` with a basic background in maths or statistics, econometrics, or computer science, even those without programming experience. It will introduce you to different perspectives on data science, including data processing, analysis, visualization, and modeling. Each chapter is worked through in practice in the RStudio Integrated Development Environment (IDE).

As you may already know, data science has taken the world by storm. Every field of study and area of business has been affected, as people increasingly realize the value of the incredible quantities of data being generated. But to extract value from those data, one needs to be trained in the proper data science skills. The R programming language has become one of the most popular languages for data science. Its flexibility, power, sophistication, and expressiveness have made it an invaluable tool for data scientists around the world.

In this book, you will start from the fundamentals of R programming and learn how to write functions, import data, manipulate and visualize it, carry out exploratory data analysis (EDA), build models, and so on. I highly recommend this book if you want a solid foundation on which to build your data science journey. So, please enjoy your time, and keep learning and practicing!

## About Me

I started using R in 2012, when I was an undergraduate student working on my computational statistics homework at the University of Sumatera Utara. I majored in applied mathematics with a pure math concentration. After college, in October 2013, I joined a Management Trainee (MT) program at Sinar Mas Insurance as an Underwriter for about three months. I then moved to a similar employee training process called the Office Deployment Program (ODP) at MPM Finance as a Credit Analyst for about six months. After these experiences, I realized that I needed to take my studies to the next level, because I did not consider myself good enough to succeed in my career with such limited knowledge. Therefore, while I was going through the training process, I kept applying to several foreign universities.

In September 2014, I was lucky to get the opportunity to pursue my master's degree at National Sun Yat-Sen University (Taiwan) on a scholarship program. At this university I enrolled in the department of applied mathematics with a statistics concentration. I learned a lot on this campus: new perspectives on life and culture, respect for others, always being on time, deeper statistical knowledge, and so on. During my studies, I worked with my professor as a teaching assistant, and I was lucky to get an extra allowance for it. Besides focusing on teaching, I had to attend our lab meeting once or twice a week, help with university events, organize some trips, and join some research with Prof. Mei-Hui Guo and Prof. Chung Chang. They advised me to learn at least the basics of Mandarin to survive in Taiwan, and to keep improving my programming skills in R, Python, SAS, and MATLAB so that I could handle all my homework as well.

In the next step of my career, I joined the department of engineering at PT Andalan Furnindo (Samora Group) as a data analyst. It is one of the best sugar industry companies in Indonesia, and I worked there for at least one year, starting in 2017. Over time I felt that I could not use all of my knowledge in this company, let alone develop it to a higher level. So, taking on a great challenge, I decided to move to Matana University in 2018, where I began to dedicate myself to teaching business statistics and to keep developing in new fields. Here I learn and teach a lot about data science, focusing on data structures and algorithms, database systems, computational statistics, econometrics, time series, calculus, optimization, research methods, and so on. Above all, I am confident in saying that I have expert skills in programming languages such as R, Python, and SQL, and a good working ability with Business Intelligence tools such as Tableau and SAS.

This book comes from my experience teaching R and has been thoroughly adapted through several stages of improvement. Most of the material has been taken from my classes in data structures and algorithms, computational statistics, and database systems. Some of it was collected from Coursera, DataCamp, DataFlair, R-tutorial, and so on.