10 SQL: Advanced functions
In this chapter, we will present some advanced MySQL functions. We will again use the schema music_marketplace that describes a hypothetical marketplace for music. Recall that the schema includes the following tables:
songs: stores information about songs.label: stores information about music labels.artist: stores information about artists.artist_songs: stores which songs are produced by which artists.charts: stores the maximum rank that a songs achieves in the charts (the hypothetical marketplace has top-selling charts).
Recall that the Entity Relationship diagram of this schema is as follows:
Figure 10.1: The Entity Relationship diagram of schema music_marketplace.
First, we will connect to the database (Section 4.1):
library(DBI)
library(odbc)
library(RODBC)
library(tidyverse)con = DBI::dbConnect(odbc::odbc(),
Driver= "MySQL",
Server='mysql-isys3350.bc.edu',
UID='your_username',
PWD= 'your_password',
Port= 3306)
10.1 CAST
Often we need to change the data type of a column (e.g., from a string to a number). The function CAST allows us to change the type as follows:
SELECT CAST("10" AS DECIMAL);| CAST(“10” AS DECIMAL) |
|---|
| 10 |
CAST values into DATE, DECIMAL, and CHAR
SELECT CAST(10 AS CHAR);| CAST(10 AS CHAR) |
|---|
| 10 |
Here is another example where we adjust the datatype of column released from table songs to DATE:
SELECT released, CAST(released AS DATE)
FROM music_marketplace.songs;| released | CAST(released AS DATE) |
|---|---|
| 2016-01-18 | 2016-01-18 |
| 2016-02-23 | 2016-02-23 |
| 2016-02-23 | 2016-02-23 |
| 2016-02-23 | 2016-02-23 |
| 2016-01-25 | 2016-01-25 |
| 2015-10-12 | 2015-10-12 |
| 2015-09-25 | 2015-09-25 |
| 2015-09-30 | 2015-09-30 |
| 2015-09-30 | 2015-09-30 |
| 2015-09-30 | 2015-09-30 |
In the previous, we can compare the data type of the original column (
POSIXct) and the datatype of the adjusted one (date).
10.2 ROUND, ABS, SQRT
SQL offers functions that work with numerical data. For instance:
SELECT ROUND(price,1), SQRT(maxRank), ABS(firstDateInCharts - lastDateInCharts),firstDateInCharts - lastDateInCharts
FROM music_marketplace.songs s INNER JOIN
music_marketplace.charts c on s.trackid = c.trackId
LIMIT 10;| ROUND(price,1) | SQRT(maxRank) | ABS(firstDateInCharts - lastDateInCharts) | firstDateInCharts - lastDateInCharts |
|---|---|---|---|
| 1.5 | 9.746794 | 0 | 0 |
| 1.5 | 6.480741 | 89032196 | -89032196 |
| 1.5 | 9.643651 | 3000049 | -3000049 |
| 1.5 | 7.483315 | 9796001618 | -9796001618 |
| 1.5 | 7.549834 | 891001593 | -891001593 |
| 1.5 | 4.472136 | 901162477 | -901162477 |
| 1.5 | 7.874008 | 118999938 | -118999938 |
| 1.5 | 7.745967 | 5999949 | -5999949 |
| 1.5 | 7.745967 | 7000005 | -7000005 |
| 1.5 | 5.916080 | 6999999 | -6999999 |
In the previous,
ROUND(, l)keepsldecimal places.
10.3 NOW, CURDATE, MONTH, YEAR, DATEDIFF
Similarly, we can use date-specific functions:
SELECT NOW(), CURDATE(), MONTH(CURDATE())| NOW() | CURDATE() | MONTH(CURDATE()) |
|---|---|---|
| 2021-11-06 16:41:49 | 2021-11-06 | 11 |
Or, we can calculate the difference between dates:
SELECT DATEDIFF(CURDATE(), released), released, CURDATE()
FROM music_marketplace.songs;| DATEDIFF(CURDATE(), released) | released | CURDATE() |
|---|---|---|
| 2119 | 2016-01-18 | 2021-11-06 |
| 2083 | 2016-02-23 | 2021-11-06 |
| 2083 | 2016-02-23 | 2021-11-06 |
| 2083 | 2016-02-23 | 2021-11-06 |
| 2112 | 2016-01-25 | 2021-11-06 |
| 2217 | 2015-10-12 | 2021-11-06 |
| 2234 | 2015-09-25 | 2021-11-06 |
| 2229 | 2015-09-30 | 2021-11-06 |
| 2229 | 2015-09-30 | 2021-11-06 |
| 2229 | 2015-09-30 | 2021-11-06 |
10.4 CASE, IF
SQL allows us to implement logical comparisons through the CASE, IF, and IFNULL functions. The CASE function tests multiple logical expressions listed in WHEN clauses as follows:
SELECT released,
CASE
WHEN DATEDIFF(CURDATE(), released) < 1000 THEN "Very recent"
WHEN DATEDIFF(CURDATE(), released) < 2000 THEN "Recent"
ELSE "Not recent"
END AS Recent
FROM music_marketplace.songs;| released | Recent |
|---|---|
| 2016-01-18 | Not recent |
| 2016-02-23 | Not recent |
| 2016-02-23 | Not recent |
| 2016-02-23 | Not recent |
| 2016-01-25 | Not recent |
| 2015-10-12 | Not recent |
| 2015-09-25 | Not recent |
| 2015-09-30 | Not recent |
| 2015-09-30 | Not recent |
| 2015-09-30 | Not recent |
WHEN, THEN, ELSE, and END AS that are necessary for the CASE clause to work.
The IF function is very similar with the ifelse() function we have seen in R:
SELECT released, IF(DATEDIFF(CURDATE(), released) < 1000, "Recent", "Not recent") as Recent
FROM music_marketplace.songs;| released | Recent |
|---|---|
| 2016-01-18 | Not recent |
| 2016-02-23 | Not recent |
| 2016-02-23 | Not recent |
| 2016-02-23 | Not recent |
| 2016-01-25 | Not recent |
| 2015-10-12 | Not recent |
| 2015-09-25 | Not recent |
| 2015-09-30 | Not recent |
| 2015-09-30 | Not recent |
| 2015-09-30 | Not recent |
10.5 CONCAT_WS, LOWER, UPPER, LENGTH
SQL also offers various functions that operate on strings. For instance, we have already seen function CONCAT_WS which works similarly to the paste() function in R. The LOWER and UPPER functions adjust a string to lowercase and uppercase respectively. Finally the function LENGTH returns the number of characters of a string, similar to the nchar function in R:
SELECT CONCAT_WS(" ", "Title:", LOWER(title)) as concat_title, UPPER(genre), LENGTH(label)
FROM music_marketplace.songs;| concat_title | UPPER(genre) | LENGTH(label) |
|---|---|---|
| Title: if i want you | BREAKS | 11 |
| Title: suki | MINIMAL / DEEP TECH | 17 |
| Title: goofy's nightmare | MINIMAL / DEEP TECH | 17 |
| Title: suki | MINIMAL / DEEP TECH | 17 |
| Title: golden nightfall | HOUSE | 14 |
| Title: hope for the right | HOUSE | 14 |
| Title: the night's already started | HOUSE | 14 |
| Title: yungle | DRUM & BASS | 16 |
| Title: sail | ELECTRONICA / DOWNTEMPO | 16 |
| Title: chromatik [orbzone] | ELECTRONICA / DOWNTEMPO | 16 |
For comments, suggestions, errors, and typos, please email me at: kokkodis@bc.edu