Data Management Using R
Last updated on 2021-12-12
Welcome to dmur!
This is a book for anyone who is interested in manipulating and processing medical data. It introduces different aspects of data management and shows how to implement them in R using RStudio. While there is a plethora of great R books covering a variety of data management topics, I hope this book will serve as a self-learning guide that helps you avoid roadblocks and frustrations before you become fully comfortable with R. Many beginners set out to develop data management skills in R but lose patience after several months of frustration with R's steep learning curve. If this sounds familiar, this book is for you.
It is intended for a non-technical audience and Stata users, and aims to:
- Serve as a guidebook to R code for data management
- Serve as an R code reference manual for the mStats package
- Provide task-centered examples addressing common data management problems
- Assist people in transitioning to R
Under construction
This book is still UNDER CONSTRUCTION. If you have any comments or suggestions, feel free to contact me at dr.myominnoo@gmail.com. Thank you!
If you have a dataset that you think would be suitable for inclusion in this text (as an example or for an exercise), I would love to hear about it.
Inspirations
Tutorials and books that provided ideas and knowledge for this book are credited within their respective pages. More generally, the following sources provided inspiration for this book:
- Data Management Using Stata: A Practical Handbook
- UCLA’s Stata tutorials
- R for applied epidemiology and public health
- R for Data Science book (R4DS)
- bookdown: Authoring Books and Technical Documents with R Markdown
- Netlify hosts this website
The design of this book is based on code from Peter Higgins. Kudos to Peter!
How to use this handbook
- Browse the pages in the Table of Contents, or use the search box
- Click the “copy” icons to copy code
- You can follow along with the example datasets, which we will download in the next chapter (1.8).
Software versions
The knitr package (Xie 2015) and the bookdown package (Xie 2021) were used to compile this text. The R session information is shown below:
xfun::session_info()
## R version 4.1.0 (2021-05-18)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Big Sur 10.16
##
## Locale: en_US.UTF-8 / en_US.UTF-8 / en_US.UTF-8 / C / en_US.UTF-8 / en_US.UTF-8
##
## Package version:
## base64enc_0.1.3 bookdown_0.24.4 bslib_0.3.1 compiler_4.1.0
## digest_0.6.29 evaluate_0.14 fastmap_1.1.0 fs_1.5.2
## glue_1.5.1 graphics_4.1.0 grDevices_4.1.0 highr_0.9
## htmltools_0.5.2 jquerylib_0.1.4 jsonlite_1.7.2 knitr_1.36
## magrittr_2.0.1 methods_4.1.0 R6_2.5.1 rappdirs_0.3.3
## rlang_0.4.12 rmarkdown_2.11 sass_0.4.0 stats_4.1.0
## stringi_1.7.6 stringr_1.4.0 tinytex_0.35 tools_4.1.0
## utils_4.1.0 xfun_0.28 yaml_2.2.1
packageVersion("tidyverse")
## [1] '1.3.1'
packageVersion("dplyr")
## [1] '1.0.7'
packageVersion("magrittr")
## [1] '2.0.1'
packageVersion("mStats")
## [1] '4.0.0'
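To see how your own setup compares with the versions listed above, something like the following may help. It is only a minimal sketch in base R: `installed_version()` is a hypothetical helper written for this illustration (it is not part of mStats or any other package), and the version numbers are simply copied from the output above.

```r
# Versions reported in this book (copied from the output above).
book_versions <- c(tidyverse = "1.3.1", dplyr = "1.0.7",
                   magrittr = "2.0.1", mStats = "4.0.0")

# Hypothetical helper for this sketch: returns the installed version of a
# package as a string, or NA if the package is not installed.
installed_version <- function(pkg) {
  tryCatch(
    as.character(utils::packageVersion(pkg)),
    error = function(e) NA_character_
  )
}

# One row per package: the book's version next to the installed one.
versions <- data.frame(
  package = names(book_versions),
  book = unname(book_versions),
  installed = vapply(names(book_versions), installed_version,
                     character(1), USE.NAMES = FALSE),
  stringsAsFactors = FALSE
)

# TRUE when the package is installed and at least as new as the book's version.
versions$at_least_book <- mapply(
  function(inst, book) {
    !is.na(inst) && utils::compareVersion(inst, book) >= 0
  },
  versions$installed, versions$book,
  USE.NAMES = FALSE
)

print(versions)
```

A `FALSE` in the last column means the package is either missing or older than the version used here. Newer versions will usually work too, although output may differ slightly from what is printed in the book.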
Terms of Use and Contribution
License
Data Management Using R, 2021
This work is licensed by Applied Epi Incorporated under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Academic courses and training programs are welcome to use this handbook with their students, but please send me an email to let me know. If you have questions about your intended use, email dr.myominnoo@gmail.com.
Citation
Oo, Myo Minn. Data Management Using R. 2021.
Contribution
If you would like to make a content contribution, please contact us first via GitHub issues or by email. We are implementing a schedule for updates and are creating a contributor guide.
Please note that the epiRhandbook project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.