7 SQL: Retrieve data from multiple tables
In this chapter, we will discuss how we can combine (join) multiple tables in order to retrieve data. We will again use the schema music_marketplace
that describes a hypothetical marketplace for music. Recall that the schema includes the following tables:
songs
: stores information about songs.label
: stores information about music labels.artist
: stores information about artists.artist_songs
: stores which songs are produced by which artists.charts
: stores the maximum rank that a songs achieves in the charts (the hypothetical marketplace has top-selling charts).
Recall that the Entity Relationship diagram of this schema is as follows:
First, we will connect to the database (Section 4.1):
library(DBI)
library(odbc)
library(RODBC)
library(tidyverse)
= DBI::dbConnect(odbc::odbc(),
con Driver= "MySQL",
Server='mysql-isys3350.bc.edu',
UID='your_username',
PWD= 'your_password',
Port= 3306)
Joins allow us to combine columns from two or more tables into a single result. There are various types of joins, including:
- Inner join: retrieves only rows that satisfy the join condition
- Left outer join: retrieves only rows that satisfy the join condition plus rows that do not satisfy the join condition from the left table.
- Right outer join: retrieves only rows that satisfy the join condition plus rows that do not satisfy the join condition from the right table.
We discuss these joins next.
7.1 Inner join
In general, to join data from two tables, we need to have the same column in both tables. For instance, in Figure 5.1, we can join tables charts
and songs
on their common column trackid
. So if we want to get the song information of all tracks that are included in the charts, we can write the following query:
SELECT *
FROM music_marketplace.charts INNER JOIN music_marketplace.songs
ON music_marketplace.charts.trackid = music_marketplace.songs.trackid;
trackId | maxRank | durationInCharts | firstDateInCharts | lastDateInCharts | title | subtitle | trackid | label | genre | released | labelId | price |
---|---|---|---|---|---|---|---|---|---|---|---|---|
887832 | 95 | 0 | 2016-01-17 16:38:32 | 2016-01-17 16:38:32 | K&G Beat | Original Mix | 887832 | Planet Mu | Dubstep | 2015-10-15 | 7397 | 1.49 |
907738 | 42 | 20 | 2015-08-16 12:21:35 | 2015-09-05 15:43:31 | Sodium | Original Mix | 907738 | Dangerous New Age | Drum & Bass | 2015-08-10 | 13562 | 1.49 |
3711813 | 93 | 3 | 2015-12-27 16:38:57 | 2015-12-30 16:39:06 | Nightmares in Reality feat. Maksim | Original Mix | 3711813 | Inspected Records | Glitch Hop | 2015-07-03 | 18995 | 1.49 |
4423896 | 56 | 301 | 2015-06-27 16:23:18 | 2016-04-23 16:39:36 | Strandbar | Disko | 4423896 | Olsen Records | Indie Dance / Nu Disco | 2015-09-11 | 30474 | 1.49 |
4927722 | 57 | 266 | 2015-03-29 16:22:17 | 2015-12-20 16:38:10 | The Fifth Hand | Original Mix | 4927722 | Delicieuse Records | Electronica / Downtempo | 2015-09-28 | 28282 | 1.49 |
4927724 | 20 | 276 | 2015-03-08 00:13:31 | 2015-12-09 16:38:08 | True Thoughts feat. Anna Naklab | Original Mix | 4927724 | Delicieuse Records | Electronica / Downtempo | 2015-09-28 | 28282 | 1.49 |
4969569 | 62 | 48 | 2016-02-12 16:40:14 | 2016-03-31 16:39:52 | Business | Original Mix | 4969569 | VP Records | Reggae / Dancehall / Dub | 2016-02-01 | 18027 | 1.49 |
4969576 | 60 | 6 | 2016-03-25 16:40:03 | 2016-03-31 16:39:52 | Bruk It | Original Mix | 4969576 | VP Records | Reggae / Dancehall / Dub | 2016-02-01 | 18027 | 1.49 |
6065153 | 60 | 7 | 2015-03-14 16:22:47 | 2015-03-21 16:22:52 | Music Is My Life | Original Mix | 6065153 | Mr. Carter | Deep House | 2015-03-09 | 26002 | 1.49 |
6079233 | 35 | 7 | 2015-04-11 16:26:24 | 2015-04-18 16:26:23 | Basis | Original Mix | 6079233 | Material | Tech House | 2015-04-06 | 5066 | 1.49 |
Note that the above join returns all columns from both tables!
=
operator because we wanted to retrieve matching rows. Other logical operators can also be used.
trackid
column in table charts is a foreign key that references the trackid
column in table songs
. Joining on foreign keys is not required; we can join on any other columns that make sense even if their relationship is not defined in the database.
One cumbersome feature of the previous example is that we need to identify the table every time we access the joined column. To simplify, we can use table aliases as follows:
SELECT *
FROM music_marketplace.charts t1 INNER JOIN music_marketplace.songs t2
ON t1.trackid = t2.trackid;
trackId | maxRank | durationInCharts | firstDateInCharts | lastDateInCharts | title | subtitle | trackid | label | genre | released | labelId | price |
---|---|---|---|---|---|---|---|---|---|---|---|---|
887832 | 95 | 0 | 2016-01-17 16:38:32 | 2016-01-17 16:38:32 | K&G Beat | Original Mix | 887832 | Planet Mu | Dubstep | 2015-10-15 | 7397 | 1.49 |
907738 | 42 | 20 | 2015-08-16 12:21:35 | 2015-09-05 15:43:31 | Sodium | Original Mix | 907738 | Dangerous New Age | Drum & Bass | 2015-08-10 | 13562 | 1.49 |
3711813 | 93 | 3 | 2015-12-27 16:38:57 | 2015-12-30 16:39:06 | Nightmares in Reality feat. Maksim | Original Mix | 3711813 | Inspected Records | Glitch Hop | 2015-07-03 | 18995 | 1.49 |
4423896 | 56 | 301 | 2015-06-27 16:23:18 | 2016-04-23 16:39:36 | Strandbar | Disko | 4423896 | Olsen Records | Indie Dance / Nu Disco | 2015-09-11 | 30474 | 1.49 |
4927722 | 57 | 266 | 2015-03-29 16:22:17 | 2015-12-20 16:38:10 | The Fifth Hand | Original Mix | 4927722 | Delicieuse Records | Electronica / Downtempo | 2015-09-28 | 28282 | 1.49 |
4927724 | 20 | 276 | 2015-03-08 00:13:31 | 2015-12-09 16:38:08 | True Thoughts feat. Anna Naklab | Original Mix | 4927724 | Delicieuse Records | Electronica / Downtempo | 2015-09-28 | 28282 | 1.49 |
4969569 | 62 | 48 | 2016-02-12 16:40:14 | 2016-03-31 16:39:52 | Business | Original Mix | 4969569 | VP Records | Reggae / Dancehall / Dub | 2016-02-01 | 18027 | 1.49 |
4969576 | 60 | 6 | 2016-03-25 16:40:03 | 2016-03-31 16:39:52 | Bruk It | Original Mix | 4969576 | VP Records | Reggae / Dancehall / Dub | 2016-02-01 | 18027 | 1.49 |
6065153 | 60 | 7 | 2015-03-14 16:22:47 | 2015-03-21 16:22:52 | Music Is My Life | Original Mix | 6065153 | Mr. Carter | Deep House | 2015-03-09 | 26002 | 1.49 |
6079233 | 35 | 7 | 2015-04-11 16:26:24 | 2015-04-18 16:26:23 | Basis | Original Mix | 6079233 | Material | Tech House | 2015-04-06 | 5066 | 1.49 |
Of course, the join condition can be extended:
SELECT *
FROM music_marketplace.charts t1 INNER JOIN music_marketplace.songs t2
ON t1.trackid = t2.trackid AND t1.maxRank > 95;
trackId | maxRank | durationInCharts | firstDateInCharts | lastDateInCharts | title | subtitle | trackid | label | genre | released | labelId | price |
---|---|---|---|---|---|---|---|---|---|---|---|---|
6191715 | 96 | 0 | 2015-03-28 16:24:43 | 2015-03-28 16:24:43 | A Woman Alone | Original Mix | 6191715 | Robsoul Recordings | House | 2015-03-20 | 2204 | 1.49 |
6274556 | 96 | 1 | 2015-03-15 16:22:09 | 2015-03-16 16:22:13 | Disgustingly Beautiful feat. Isa Kurz | Original Mix | 6274556 | Tru Thoughts | Electronica / Downtempo | 2015-03-09 | 7505 | 1.49 |
6306434 | 99 | 1 | 2015-04-19 16:26:20 | 2015-04-20 16:25:52 | So Me Born | Original Mix | 6306434 | H2O Records | Reggae / Dancehall / Dub | 2015-03-10 | 6254 | 1.49 |
6318110 | 99 | 1 | 2015-03-15 16:26:28 | 2015-03-16 16:26:32 | Ready For Sunshine | Original Mix | 6318110 | Statement! | Trance | 2015-03-09 | 3175 | 1.49 |
6334749 | 97 | 0 | 2015-03-30 16:26:35 | 2015-03-30 16:26:35 | Debug | Original Mix | 6334749 | Tronic | Techno | 2015-03-23 | 12576 | 1.49 |
6348920 | 100 | 0 | 2015-03-14 16:24:06 | 2015-03-14 16:24:06 | The Blood Ties | Original Mix | 6348920 | Rebirth Society | Hardcore / Hard Techno | 2015-03-09 | 26914 | 1.49 |
6360895 | 97 | 0 | 2015-03-28 16:24:22 | 2015-03-28 16:24:22 | Wasted Youth | Original Mix | 6360895 | Hexagon Recordings | Hardcore / Hard Techno | 2015-03-21 | 40248 | 1.49 |
6361054 | 99 | 0 | 2015-03-29 16:24:29 | 2015-03-29 16:24:29 | Boyah Bass | Original Mix | 6361054 | Out Of This World Records | Hardcore / Hard Techno | 2015-03-23 | 46737 | 1.49 |
6362843 | 98 | 0 | 2015-03-16 16:23:16 | 2015-03-16 16:23:16 | Precious | Original Mix | 6362843 | Wolf + Lamb Records | Electronica / Downtempo | 2015-03-09 | 12379 | 1.49 |
6370452 | 96 | 0 | 2015-04-19 16:25:27 | 2015-04-19 16:25:27 | Railway | Original Mix | 6370452 | Inwave | Minimal / Deep Tech | 2015-04-13 | 42452 | 1.49 |
And finally, we can join more than two tables on different columns:
SELECT t1.trackid, maxRank, title, remixer, name
FROM music_marketplace.charts t1 INNER JOIN music_marketplace.songs t2
ON t1.trackid = t2.trackid AND t1.maxRank > 95
INNER JOIN music_marketplace.artist_songs t3 ON
= t3.trackid
t2.trackid INNER JOIN music_marketplace.artist t4 ON
= t4.artistId; t3.artistId
trackid | maxRank | title | remixer | name |
---|---|---|---|---|
6191715 | 96 | A Woman Alone | False | Philip Weeks |
6274556 | 96 | Disgustingly Beautiful feat. Isa Kurz | False | Manu Delago |
6306434 | 99 | So Me Born | False | Assassin Agent Sasco |
6318110 | 99 | Ready For Sunshine | False | Same K |
6334749 | 97 | Debug | False | Raffaele Rizzi |
6348920 | 100 | The Blood Ties | False | Mental Crush |
6360895 | 97 | Wasted Youth | False | Fresh Otis |
6361054 | 99 | Boyah Bass | False | DJ Punch |
6362843 | 98 | Precious | False | Cameo Culture |
6370452 | 96 | Railway | False | Vinyl Speed Adjust |
t1
to specify from which table we want to retrive column trackid
, as trackid
appears in multiple tables. We do not need to use the table name for the other four columns as they are unuiquely defined.
7.2 Outer joins
Inner joins retrieve only rows that satisfy the join condition. But what if we want to keep all rows from a table and match information from another table whenever such information exists?
7.2.1 Left joins
For instance, table label
is incomplete: it does not store information about all labels that appear in the songs
table. We can use a left join to combine tables songs
and label
such that we keep all the information we want from songs
and add the link
from table label
whenever that is available:
SELECT title, t1.label, link
FROM music_marketplace.songs t1 LEFT JOIN music_marketplace.label t2
ON t1.label = t2.label
order by title;
title | label | link |
---|---|---|
#015 | Faut Section | NA |
#DACKSQUAD | Rottun Recordings | /label/rottun-recordings/8587 |
#Ecstasy | Indeependence Recordings | NA |
#Frei | Auftakt Records | NA |
#Hey | Beat Service Audio (RazNitzanMusic) | /label/beat-service-audio-raznitzanmusic/23431 |
#Hey | RNM Bundles | NA |
#MGTOW | The Sektorz Electro | NA |
#MGTOW | The Sektorz Electro | NA |
#SuperHero | Mondo Records | NA |
#TAVLD | Evolution Records | NA |
label
the LEFT JOIN will return the information of the left table and add NULL
(NA
) values in the column link.
label
is not a foreign key. The two columns have an undefinded relationship according to the ERD of Figure 5.1; but conceptually, we know that they store the same values, and as a result, we can use them to join the two tables.
7.2.2 Right joins
The right outer joins are the same as left outer joins, with the only difference that we keep all rows from the right table:
SELECT title, t1.label, link
FROM music_marketplace.label t2 RIGHT JOIN music_marketplace.songs t1
ON t1.label = t2.label
order by title;
title | label | link |
---|---|---|
#015 | Faut Section | NA |
#DACKSQUAD | Rottun Recordings | /label/rottun-recordings/8587 |
#Ecstasy | Indeependence Recordings | NA |
#Frei | Auftakt Records | NA |
#Hey | Beat Service Audio (RazNitzanMusic) | /label/beat-service-audio-raznitzanmusic/23431 |
#Hey | RNM Bundles | NA |
#MGTOW | The Sektorz Electro | NA |
#MGTOW | The Sektorz Electro | NA |
#SuperHero | Mondo Records | NA |
#TAVLD | Evolution Records | NA |
label
and song
in the query, and as a result, I got the same result as with the left join.
7.3 Unions
Similar to joins, unions combine data from multiple tables. But instead of combining columns, unions combine rows (similar to bind_rows
in R—but note that joins are different than bind_columns
).
Here is an example of a union between tables songs
and label
:
SELECT label
FROM music_marketplace.songs
UNION
SELECT link as label
FROM music_marketplace.label
ORDER BY label;
label |
---|
Inc |
Inc. |
#138 |
#Goldrush Recordings (Armada) |
& |
+Mas Label |
/label/100-percent-pure/1470 |
/label/1605/10775 |
/label/17-steps/42211 |
/label/2-owls-records/33454 |
For comments, suggestions, errors, and typos, please email me at: kokkodis@bc.edu