Best Coding Practices for R
Chapter 18 Proxy Server