Chapter 4 Keystrokes

This is a highly controversial topic. Some people regard terse syntax as a disadvantage. I beg to differ: once you understand a syntax well, you never need it explained again. Common examples are lm(), rnorm(), <-, and == from base R. You use them every day without looking up their documentation, because you have learned them well. Yet they are nothing more than plain symbols.

For example, if I asked you to add 3 four times as \(3+3+3+3\), or to multiply 3 by itself four times as \(3\times3\times3\times3\), you would be frustrated, because \(3\times4\) and \(3^4\) are much more concise. The same logic applies to code: a compact notation is far more understandable once you know it. Symbols like \(\sum\), \(\int\), \(\ln\), and \(\pi\) can take some time to learn, but they make complex ideas easier to explain later on.
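The same comparison can be written in R as a quick check. This is only a small sketch; the variable names are made up for illustration.

```r
# Repeated addition versus multiplication
sum_long  <- 3 + 3 + 3 + 3
sum_short <- 3 * 4

# Repeated multiplication versus exponentiation
prod_long  <- 3 * 3 * 3 * 3
prod_short <- 3^4

# Both pairs give the same value; only the number of keystrokes differs.
sum_long == sum_short
prod_long == prod_short
```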

data.table has a very clear and concise syntax. For example:

  1. group by is optional
  2. the select argument is optional
  3. you don’t have to decide argument names
  4. .SD == Subset of Data
  5. .SDcols == columns to be chosen for .SD
  6. .() == list()
  7. := == update or add columns by reference
  8. `:=`() == multiple updates in a single call
  9. Automatic conversion of a list into columns, as in:
## $mpg
## [1] 21 21 22 21 18 18
## 
## $cyl
## [1] 6 6 4 6 8 6
## 
## $disp
## [1] 160 160 108 258 360 225
## 
## $hp
## [1] 110 110  93 110 175 105
## 
## $drat
## [1] 3 3 3 3 3 2
## 
## $wt
## [1] 2 2 2 3 3 3
## 
## $qsec
## [1] 16 17 18 19 17 20
## 
## $vs
## [1] 0 0 1 1 0 1
## 
## $am
## [1] 1 1 1 0 0 0
## 
## $gear
## [1] 4 4 4 3 3 3
## 
## $carb
## [1] 4 4 1 1 2 1

The output above is a list of vectors.

##    mpg cyl disp  hp drat wt qsec vs am gear carb
## 1:  21   6  160 110    3  2   16  0  1    4    4
## 2:  21   6  160 110    3  2   17  0  1    4    4
## 3:  22   4  108  93    3  2   18  1  1    4    1
## 4:  21   6  258 110    3  3   19  1  0    3    1
## 5:  18   8  360 175    3  3   17  0  0    3    2
## 6:  18   6  225 105    2  3   20  1  0    3    1

The second output, by contrast, is a data.table (which is also a data.frame), as we needed. We will explain each of these points later in the book. For the time being, just try to understand that even fewer keystrokes can help you a lot in EDA (exploratory data analysis).
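The calls behind the two outputs are not shown in this chapter. The sketch below is one way to get the same values, assuming the data is head(mtcars) and the columns are truncated with as.integer(); both are assumptions. It also touches a few of the listed points. All object and column names (cars, as_list, as_table, means, picked, mpg_int, hp_per_wt, heavy) are made up for illustration.

```r
library(data.table)

cars <- head(mtcars)  # first six rows of the built-in mtcars data

# Outside data.table, lapply() over the columns returns a plain list of vectors.
as_list <- lapply(cars, as.integer)

# Inside j, the list returned by lapply() becomes the columns of a data.table.
dt <- as.data.table(cars)
as_table <- dt[, lapply(.SD, as.integer)]

# .SDcols limits .SD to the chosen columns; by is optional and used here.
means <- dt[, lapply(.SD, mean), by = cyl, .SDcols = c("mpg", "hp")]

# .() is shorthand for list(); naming an element renames the column.
picked <- dt[, .(mpg, power = hp)]

# := adds (or updates) one column by reference.
dt[, mpg_int := as.integer(mpg)]

# `:=`() in functional form adds or updates several columns in one call.
dt[, `:=`(hp_per_wt = hp / wt, heavy = wt > 3)]
```

Printing as_list and as_table should show the list form and the table form of the same truncated values. Treat this as an illustration, not as the original code.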

Let’s start exploring the package together.