Chapter 5 General Queries

Here we will make a few queries to the database that are general in nature. General queries are ones where we want to select particular columns and also where we want to remove or delete items from the database.

5.1 Distinct Queries

Here we will use distinct to look at the distinct values in a particular column. This allows us to get a high-level overview of what our data looks like.

Teachers

Table 5.1: 2 records
school
Myers Middle School
F.D. Roosevelt HS

Student

Table 5.2: 2 records
school
MS
GP

Next we can order our data in a particular way as well.

Notice that there is something very peculiar about this SQL statement. If we write it out as a normal SQL statement it will not work.

## Error in postgresqlExecStatement(conn, statement, ...) : 
##   RS-DBI driver: (could not Retrieve the result : ERROR:  column "medu" does not exist
## LINE 1: select Medu from student;
##                ^
## HINT:  Perhaps you meant to reference the column "student.Medu" or the column "student.Fedu".
## )
## Warning in postgresqlQuickSQL(conn, statement, ...): Could not create execute:
## select Medu from student;
## NULL

Instead we are required to utilize quotes around the column names since it is Camelbacked: e.g. Medu vs. medu. We also are required to utilize the **** to escapte the quotes around the column name as well.

##   Medu
## 1    0
## 2    1
## 3    3
## 4    2
## 5    4
##    school G3
## 1      GP 20
## 2      GP 19
## 3      GP 19
## 4      MS 19
## 5      GP 19
## 6      GP 19
## 7      GP 18
## 8      GP 18
## 9      GP 18
## 10     GP 18

5.2 WHERE Queries

Where Clause in SQL

Here we are select the schools and the G3 grade where G3 is greater than 15.

##    school G3
## 1      GP 19
## 2      GP 16
## 3      GP 16
## 4      GP 17
## 5      GP 16
## 6      GP 18
## 7      GP 18
## 8      GP 20
## 9      GP 16
## 10     GP 16

Here we are select the schools and the G3 grade where G3 is equal to 4. Utilizing the operators >,<,= allows us to filter our data and retrieve the data we want to look at.

5.3 AND, OR, NOT Queries

SQL Queries with And, OR, NOT.

AND

Here we only return one row where G3 = 4 and Medu = 4.

Table 5.3: 1 records
school sex age address famsize Pstatus Medu Fedu Mjob Fjob reason guardian traveltime studytime failures schoolsup famsup paid activities nursery higher internet romantic famrel freetime goout Dalc Walc health absences G1 G2 G3
GP F 17 U GT3 T 4 3 other other reputation mother 1 2 2 no no yes no yes yes yes yes 3 4 5 2 4 1 22 6 6 4
school sex age address famsize Pstatus Medu Fedu Mjob Fjob reason guardian traveltime studytime failures schoolsup famsup paid activities nursery higher internet romantic famrel freetime goout Dalc Walc health absences G1 G2 G3
GP F 17 U GT3 T 4 3 other other reputation mother 1 2 2 no no yes no yes yes yes yes 3 4 5 2 4 1 22 6 6 4

OR

Here we return 10 rows where G3 = 4 or Mother’s Education = 4. This helps us to filter and sort data when we want to find something in particular.

Table 5.4: Displaying records 1 - 10
school sex age address famsize Pstatus Medu Fedu Mjob Fjob reason guardian traveltime studytime failures schoolsup famsup paid activities nursery higher internet romantic famrel freetime goout Dalc Walc health absences G1 G2 G3
GP F 18 U GT3 A 4 4 at_home teacher course mother 2 2 0 yes no no no yes yes no no 4 3 4 1 1 3 6 5 6 6
GP F 15 U GT3 T 4 2 health services home mother 1 3 0 no yes yes yes yes yes yes yes 3 2 2 1 1 5 2 15 14 15
GP M 16 U LE3 T 4 3 services other reputation mother 1 2 0 no yes yes yes yes yes yes no 5 4 2 1 2 5 10 15 15 15
GP F 17 U GT3 A 4 4 other teacher home mother 2 2 0 yes yes no no yes yes no no 4 1 4 1 1 1 6 6 5 6
GP F 15 U GT3 T 4 4 teacher health reputation mother 1 2 0 no yes yes no yes yes yes no 3 3 3 1 2 2 0 10 8 9
GP M 15 U LE3 T 4 4 health services course father 1 1 0 no yes yes yes yes yes yes no 4 3 3 1 3 5 2 14 14 14
GP M 15 U GT3 T 4 3 teacher other course mother 2 2 0 no yes yes no yes yes yes no 5 4 3 1 2 3 2 10 10 11
GP F 16 U GT3 T 4 4 health other home mother 1 1 0 no yes no no yes yes yes no 4 4 4 1 2 2 4 14 14 14
GP F 16 U GT3 T 4 4 services services reputation mother 1 3 0 no yes yes yes yes yes yes no 3 2 3 1 2 2 6 13 14 14
GP M 16 U LE3 T 4 3 health other home father 1 1 0 no no yes yes yes yes yes no 3 1 3 1 3 5 4 8 10 10
##    school sex age address famsize Pstatus Medu Fedu     Mjob     Fjob
## 1      GP   F  18       U     GT3       A    4    4  at_home  teacher
## 2      GP   F  15       U     GT3       T    4    2   health services
## 3      GP   M  16       U     LE3       T    4    3 services    other
## 4      GP   F  17       U     GT3       A    4    4    other  teacher
## 5      GP   F  15       U     GT3       T    4    4  teacher   health
## 6      GP   M  15       U     LE3       T    4    4   health services
## 7      GP   M  15       U     GT3       T    4    3  teacher    other
## 8      GP   F  16       U     GT3       T    4    4   health    other
## 9      GP   F  16       U     GT3       T    4    4 services services
## 10     GP   M  16       U     LE3       T    4    3   health    other
##        reason guardian traveltime studytime failures schoolsup famsup paid
## 1      course   mother          2         2        0       yes     no   no
## 2        home   mother          1         3        0        no    yes  yes
## 3  reputation   mother          1         2        0        no    yes  yes
## 4        home   mother          2         2        0       yes    yes   no
## 5  reputation   mother          1         2        0        no    yes  yes
## 6      course   father          1         1        0        no    yes  yes
## 7      course   mother          2         2        0        no    yes  yes
## 8        home   mother          1         1        0        no    yes   no
## 9  reputation   mother          1         3        0        no    yes  yes
## 10       home   father          1         1        0        no     no  yes
##    activities nursery higher internet romantic famrel freetime goout Dalc Walc
## 1          no     yes    yes       no       no      4        3     4    1    1
## 2         yes     yes    yes      yes      yes      3        2     2    1    1
## 3         yes     yes    yes      yes       no      5        4     2    1    2
## 4          no     yes    yes       no       no      4        1     4    1    1
## 5          no     yes    yes      yes       no      3        3     3    1    2
## 6         yes     yes    yes      yes       no      4        3     3    1    3
## 7          no     yes    yes      yes       no      5        4     3    1    2
## 8          no     yes    yes      yes       no      4        4     4    1    2
## 9         yes     yes    yes      yes       no      3        2     3    1    2
## 10        yes     yes    yes      yes       no      3        1     3    1    3
##    health absences G1 G2 G3
## 1       3        6  5  6  6
## 2       5        2 15 14 15
## 3       5       10 15 15 15
## 4       1        6  6  5  6
## 5       2        0 10  8  9
## 6       5        2 14 14 14
## 7       3        2 10 10 11
## 8       2        4 14 14 14
## 9       2        6 13 14 14
## 10      5        4  8 10 10

NOT

Here we are looking at results where Fathers Education (Fedu) does not equal 4.

Table 5.5: Displaying records 1 - 10
school sex age address famsize Pstatus Medu Fedu Mjob Fjob reason guardian traveltime studytime failures schoolsup famsup paid activities nursery higher internet romantic famrel freetime goout Dalc Walc health absences G1 G2 G3
GP F 17 U GT3 T 1 1 at_home other course father 1 2 0 no yes no no no yes yes no 5 3 3 1 1 3 4 5 5 6
GP F 15 U LE3 T 1 1 at_home other other mother 1 2 3 yes no yes no yes yes yes no 4 3 2 2 3 3 10 7 8 10
GP F 15 U GT3 T 4 2 health services home mother 1 3 0 no yes yes yes yes yes yes yes 3 2 2 1 1 5 2 15 14 15
GP F 16 U GT3 T 3 3 other other home father 1 2 0 no yes yes no yes yes no no 4 3 2 1 2 5 4 6 10 10
GP M 16 U LE3 T 4 3 services other reputation mother 1 2 0 no yes yes yes yes yes yes no 5 4 2 1 2 5 10 15 15 15
GP M 16 U LE3 T 2 2 other other home mother 1 2 0 no no no no yes yes yes no 4 4 4 1 1 3 0 12 12 11
GP M 15 U LE3 A 3 2 services other home mother 1 2 0 no yes yes no yes yes yes no 4 2 2 1 1 1 0 16 18 19
GP F 15 U GT3 T 2 1 services other reputation father 3 3 0 no yes no yes yes yes yes no 5 2 2 1 1 4 4 10 12 12
GP M 15 U GT3 T 4 3 teacher other course mother 2 2 0 no yes yes no yes yes yes no 5 4 3 1 2 3 2 10 10 11
GP M 15 U GT3 A 2 2 other other home other 1 3 0 no yes no no yes yes yes yes 4 5 2 1 1 3 0 14 16 16

Combining AND, OR, NOT

Table 5.6: Displaying records 1 - 10
school sex age address famsize Pstatus Medu Fedu Mjob Fjob reason guardian traveltime studytime failures schoolsup famsup paid activities nursery higher internet romantic famrel freetime goout Dalc Walc health absences G1 G2 G3
GP F 18 U GT3 A 4 4 at_home teacher course mother 2 2 0 yes no no no yes yes no no 4 3 4 1 1 3 6 5 6 6
GP F 15 U LE3 T 1 1 at_home other other mother 1 2 3 yes no yes no yes yes yes no 4 3 2 2 3 3 10 7 8 10
GP F 15 U GT3 T 4 2 health services home mother 1 3 0 no yes yes yes yes yes yes yes 3 2 2 1 1 5 2 15 14 15
GP F 16 U GT3 T 3 3 other other home father 1 2 0 no yes yes no yes yes no no 4 3 2 1 2 5 4 6 10 10
GP M 16 U LE3 T 4 3 services other reputation mother 1 2 0 no yes yes yes yes yes yes no 5 4 2 1 2 5 10 15 15 15
GP M 16 U LE3 T 2 2 other other home mother 1 2 0 no no no no yes yes yes no 4 4 4 1 1 3 0 12 12 11
GP F 17 U GT3 A 4 4 other teacher home mother 2 2 0 yes yes no no yes yes no no 4 1 4 1 1 1 6 6 5 6
GP M 15 U LE3 A 3 2 services other home mother 1 2 0 no yes yes no yes yes yes no 4 2 2 1 1 1 0 16 18 19
GP M 15 U GT3 T 3 4 other other home mother 1 2 0 no yes yes yes yes yes yes no 5 5 1 1 1 5 0 14 15 15
GP F 15 U GT3 T 4 4 teacher health reputation mother 1 2 0 no yes yes no yes yes yes no 3 3 3 1 2 2 0 10 8 9

The AND OR

Here we tell SQL that we want all the G3 grades that are > 10 and also that the school should be GP OR Fedu should equal 4. The backslashes allow us to escape the single quotes that are necessary when using RPostgresql syntax.

Table 5.7: Displaying records 1 - 10
G3 school Fedu
6 GP 4
10 GP 1
15 GP 2
10 GP 3
15 GP 3
11 GP 2
6 GP 4
19 GP 2
15 GP 4
9 GP 4

The double NOT or NOT NOT

Here we tell SQL that we want to return all values where Fedu and Medu are not equal to 4.

##    school sex age address famsize Pstatus Medu Fedu     Mjob    Fjob     reason
## 1      GP   F  18       U     GT3       A    4    4  at_home teacher     course
## 2      GP   F  17       U     GT3       T    1    1  at_home   other     course
## 3      GP   F  15       U     LE3       T    1    1  at_home   other      other
## 4      GP   F  16       U     GT3       T    3    3    other   other       home
## 5      GP   M  16       U     LE3       T    2    2    other   other       home
## 6      GP   F  17       U     GT3       A    4    4    other teacher       home
## 7      GP   M  15       U     LE3       A    3    2 services   other       home
## 8      GP   M  15       U     GT3       T    3    4    other   other       home
## 9      GP   F  15       U     GT3       T    4    4  teacher  health reputation
## 10     GP   F  15       U     GT3       T    2    1 services   other reputation
##    guardian traveltime studytime failures schoolsup famsup paid activities
## 1    mother          2         2        0       yes     no   no         no
## 2    father          1         2        0        no    yes   no         no
## 3    mother          1         2        3       yes     no  yes         no
## 4    father          1         2        0        no    yes  yes         no
## 5    mother          1         2        0        no     no   no         no
## 6    mother          2         2        0       yes    yes   no         no
## 7    mother          1         2        0        no    yes  yes         no
## 8    mother          1         2        0        no    yes  yes        yes
## 9    mother          1         2        0        no    yes  yes         no
## 10   father          3         3        0        no    yes   no        yes
##    nursery higher internet romantic famrel freetime goout Dalc Walc health
## 1      yes    yes       no       no      4        3     4    1    1      3
## 2       no    yes      yes       no      5        3     3    1    1      3
## 3      yes    yes      yes       no      4        3     2    2    3      3
## 4      yes    yes       no       no      4        3     2    1    2      5
## 5      yes    yes      yes       no      4        4     4    1    1      3
## 6      yes    yes       no       no      4        1     4    1    1      1
## 7      yes    yes      yes       no      4        2     2    1    1      1
## 8      yes    yes      yes       no      5        5     1    1    1      5
## 9      yes    yes      yes       no      3        3     3    1    2      2
## 10     yes    yes      yes       no      5        2     2    1    1      4
##    absences G1 G2 G3
## 1         6  5  6  6
## 2         4  5  5  6
## 3        10  7  8 10
## 4         4  6 10 10
## 5         0 12 12 11
## 6         6  6  5  6
## 7         0 16 18 19
## 8         0 14 15 15
## 9         0 10  8  9
## 10        4 10 12 12

5.4 Insert Queries

Insert into PostgreSQL using RPostgreSQL

  • Inserting a single list of values into PostgreSQL.

Ok, now lets INSERT some data into our PostgreSQL database. We will want to develop a query string and send this to the database via dbSendQuery() from the RPostgreSQL package.

Then when we call the new data we can see that we have updated the row (observations) to 14 and have added the data in our query.

##    Year FTP UEMP   MAN     LIC      GR CLEAR     WM  NMAN   GOV   HE     WE
## 1  1961 320 11.0 455.5  178.50  215.98  93.4 558724 538.1 133.9 2.98 117.18
## 2  1962 320  7.0 480.2  156.41  180.48  88.5 538584 547.6 137.6 3.09 134.02
## 3  1963 320  5.2 506.1  198.02  209.57  94.4 519171 562.8 143.6 3.23 141.68
## 4  1964 320  4.3 535.8  222.10  231.67  92.0 500457 591.0 150.3 3.33 147.98
## 5  1965 320  3.5 576.0  301.92  297.65  91.0 482418 626.1 164.3 3.46 159.85
## 6  1966 320  3.2 601.7  391.22  367.62  87.4 465029 659.8 179.5 3.60 157.19
## 7  1967 320  4.1 577.3  665.56  616.54  88.3 448267 686.2 187.5 3.73 155.29
## 8  1968 320  3.9 596.9 1131.21 1029.75  86.1 432109 699.6 195.4 2.91 131.75
## 9  1969 320  3.6 613.5  837.60  786.23  79.0 416533 729.9 210.3 4.25 178.74
## 10 1970 320  7.1 569.3  794.90  713.77  73.9 401518 757.8 223.8 4.47 178.30
## 11 1971 320  8.4 548.8  817.74  750.43  63.4 387046 755.3 227.7 5.04 209.54
## 12 1972 320  7.7 563.4  583.17 1027.38  62.5 373095 787.0 230.9 5.47 240.05
## 13 1973 320  6.3 609.3  709.59  666.50  58.9 359647 819.8 230.2 5.76 258.05
##      HOM   ACC    ASR
## 1   8.60 39.17 306.18
## 2   8.90 40.27 315.16
## 3   8.52 45.31 277.53
## 4   8.89 49.51 234.07
## 5  13.07 55.05 230.84
## 6  14.57 53.90 217.99
## 7  21.36 50.62 286.11
## 8  28.03 51.47 291.59
## 9  31.49 49.16 320.39
## 10 37.39 45.80 323.03
## 11 46.26 44.54 357.38
## 12 47.24 41.03 422.07
## 13 52.33 44.17 473.01

5.5 Update Queries

The UPDATE command allows us to update the records in a table to new data. We may want to do this in our Shiny Applications or from the Rstudio console itself. Here we will update the last column that we inserted into the Detroit table as we had an error in column 2.

## data frame with 0 columns and 0 rows

Now let’s check our results.

##    Year FTP UEMP   MAN     LIC      GR CLEAR     WM  NMAN   GOV   HE     WE
## 1  1961 320 11.0 455.5  178.50  215.98  93.4 558724 538.1 133.9 2.98 117.18
## 2  1962 320  7.0 480.2  156.41  180.48  88.5 538584 547.6 137.6 3.09 134.02
## 3  1963 320  5.2 506.1  198.02  209.57  94.4 519171 562.8 143.6 3.23 141.68
## 4  1964 320  4.3 535.8  222.10  231.67  92.0 500457 591.0 150.3 3.33 147.98
## 5  1965 320  3.5 576.0  301.92  297.65  91.0 482418 626.1 164.3 3.46 159.85
## 6  1966 320  3.2 601.7  391.22  367.62  87.4 465029 659.8 179.5 3.60 157.19
## 7  1967 320  4.1 577.3  665.56  616.54  88.3 448267 686.2 187.5 3.73 155.29
## 8  1968 320  3.9 596.9 1131.21 1029.75  86.1 432109 699.6 195.4 2.91 131.75
## 9  1969 320  3.6 613.5  837.60  786.23  79.0 416533 729.9 210.3 4.25 178.74
## 10 1970 320  7.1 569.3  794.90  713.77  73.9 401518 757.8 223.8 4.47 178.30
## 11 1971 320  8.4 548.8  817.74  750.43  63.4 387046 755.3 227.7 5.04 209.54
## 12 1972 320  7.7 563.4  583.17 1027.38  62.5 373095 787.0 230.9 5.47 240.05
## 13 1973 320  6.3 609.3  709.59  666.50  58.9 359647 819.8 230.2 5.76 258.05
##      HOM   ACC    ASR
## 1   8.60 39.17 306.18
## 2   8.90 40.27 315.16
## 3   8.52 45.31 277.53
## 4   8.89 49.51 234.07
## 5  13.07 55.05 230.84
## 6  14.57 53.90 217.99
## 7  21.36 50.62 286.11
## 8  28.03 51.47 291.59
## 9  31.49 49.16 320.39
## 10 37.39 45.80 323.03
## 11 46.26 44.54 357.38
## 12 47.24 41.03 422.07
## 13 52.33 44.17 473.01

5.6 Delete Queries

We can also delete specific rows based on the data of one or more columns. This will allow us to remove mistakes we have made in our tables without having to DROP the entire table itself.

5.7 SQL Count, Avg, & Sum

We can count how many distinct opponents there are that the Dodgers played against in the table.

##   count
## 1    17

We can average the temp across all of the games played. We can see that the average temp for all games played is around 73°F

##        avg
## 1 73.14815

Here we can get the average temp from all games played in the month of April.

##   avg
## 1  63

Here we can also get the average temp of all games played in October. Looks like the average temp was 83°F

##        avg
## 1 83.66667

Seems close to 84°F if we round up. So, let’s go ahead and do that by adding the ROUND() argument to our SQL statement.

##   round
## 1    84

5.8 LIKE

5.8.1 WILDCARDS


Symbol Description Example
% zero or more characters bl% finds bl, black, blue, and blob
_ single character h_t finds hot, hat, and hit
[] any single character within brackets h[oa]t finds hot and hat
^ any character not in brackets h[^oa]t finds hit
- a range of characters c[a-b]t finds cat and cbt

We can utilize the Wildcards to look up different results in a table. Here we are utilzing the dodgers table.

Here we can look at all of the opponents who start with a P in the dodgers table by utilizing the following syntax. This looks for any opponent who starts with a capital P.

##    opponent
## 1   Pirates
## 2   Pirates
## 3   Pirates
## 4    Padres
## 5    Padres
## 6    Padres
## 7    Padres
## 8    Padres
## 9    Padres
## 10 Phillies
## 11 Phillies
## 12 Phillies
## 13   Padres
## 14   Padres
## 15   Padres

Or we can look at all of the opponents who end with an s. Which looks to be nearly eveyrone in the team. For brevity sake we will only look at the head of the table.

##   opponent
## 1  Pirates
## 2  Pirates
## 3  Pirates
## 4   Padres
## 5   Padres
## 6   Padres

5.9 NOT LIKE

Here we look at all of the opponents who do not end in a s. Which seems to eliminate nearly everyone but one team: The White Sox.

##    opponent
## 1 White Sox
## 2 White Sox
## 3 White Sox

5.10 LIKE any position

Here we look at the teams that have nal in any position in their name. Of course our results return the Cardinals & Nationals.

##     opponent
## 1  Nationals
## 2  Nationals
## 3  Nationals
## 4  Cardinals
## 5  Cardinals
## 6  Cardinals
## 7  Cardinals
## 8  Cardinals
## 9  Cardinals
## 10 Cardinals

Here we can use the **_** to find any opponent that starts with any letter but has et and any character after it. Which will return Mets for us.

##   opponent
## 1     Mets
## 2     Mets
## 3     Mets
## 4     Mets
## data frame with 0 columns and 0 rows

5.11 Between

Here we can select all of when the temperature was between 60 and 70°F.

##   temp
## 1   67
## 2   65
## 3   60
## 4   63
## 5   64
## 6   66

We can also utilize this to subset other columns where the criteria matches the temp. We are only selecting the opponents and temp that was between 60 and 70°F.

##    opponent temp
## 1   Pirates   67
## 2    Padres   65
## 3    Braves   60
## 4    Braves   63
## 5    Braves   64
## 6 Nationals   66