10 SQL: Advanced functions
In this chapter, we will present some advanced MySQL functions. We will again use the schema music_marketplace
that describes a hypothetical marketplace for music. Recall that the schema includes the following tables:
songs
: stores information about songs.label
: stores information about music labels.artist
: stores information about artists.artist_songs
: stores which songs are produced by which artists.charts
: stores the maximum rank that a songs achieves in the charts (the hypothetical marketplace has top-selling charts).
Recall that the Entity Relationship diagram of this schema is as follows:
First, we will connect to the database (Section 4.1):
library(DBI)
library(odbc)
library(RODBC)
library(tidyverse)
= DBI::dbConnect(odbc::odbc(),
con Driver= "MySQL",
Server='mysql-isys3350.bc.edu',
UID='your_username',
PWD= 'your_password',
Port= 3306)
10.1 CAST
Often we need to change the data type of a column (e.g., from a string to a number). The function CAST
allows us to change the type as follows:
SELECT CAST("10" AS DECIMAL);
CAST(“10” AS DECIMAL) |
---|
10 |
CAST
values into DATE
, DECIMAL
, and CHAR
SELECT CAST(10 AS CHAR);
CAST(10 AS CHAR) |
---|
10 |
Here is another example where we adjust the datatype of column released
from table songs
to DATE
:
SELECT released, CAST(released AS DATE)
FROM music_marketplace.songs;
released | CAST(released AS DATE) |
---|---|
2016-01-18 | 2016-01-18 |
2016-02-23 | 2016-02-23 |
2016-02-23 | 2016-02-23 |
2016-02-23 | 2016-02-23 |
2016-01-25 | 2016-01-25 |
2015-10-12 | 2015-10-12 |
2015-09-25 | 2015-09-25 |
2015-09-30 | 2015-09-30 |
2015-09-30 | 2015-09-30 |
2015-09-30 | 2015-09-30 |
In the previous, we can compare the data type of the original column (
POSIXct
) and the datatype of the adjusted one (date
).
10.2 ROUND
, ABS
, SQRT
SQL offers functions that work with numerical data. For instance:
SELECT ROUND(price,1), SQRT(maxRank), ABS(firstDateInCharts - lastDateInCharts),firstDateInCharts - lastDateInCharts
FROM music_marketplace.songs s INNER JOIN
on s.trackid = c.trackId
music_marketplace.charts c LIMIT 10;
ROUND(price,1) | SQRT(maxRank) | ABS(firstDateInCharts - lastDateInCharts) | firstDateInCharts - lastDateInCharts |
---|---|---|---|
1.5 | 9.746794 | 0 | 0 |
1.5 | 6.480741 | 89032196 | -89032196 |
1.5 | 9.643651 | 3000049 | -3000049 |
1.5 | 7.483315 | 9796001618 | -9796001618 |
1.5 | 7.549834 | 891001593 | -891001593 |
1.5 | 4.472136 | 901162477 | -901162477 |
1.5 | 7.874008 | 118999938 | -118999938 |
1.5 | 7.745967 | 5999949 | -5999949 |
1.5 | 7.745967 | 7000005 | -7000005 |
1.5 | 5.916080 | 6999999 | -6999999 |
In the previous,
ROUND(, l)
keepsl
decimal places.
10.3 NOW
, CURDATE
, MONTH
, YEAR
, DATEDIFF
Similarly, we can use date-specific functions:
SELECT NOW(), CURDATE(), MONTH(CURDATE())
NOW() | CURDATE() | MONTH(CURDATE()) |
---|---|---|
2021-11-06 16:41:49 | 2021-11-06 | 11 |
Or, we can calculate the difference between dates:
SELECT DATEDIFF(CURDATE(), released), released, CURDATE()
FROM music_marketplace.songs;
DATEDIFF(CURDATE(), released) | released | CURDATE() |
---|---|---|
2119 | 2016-01-18 | 2021-11-06 |
2083 | 2016-02-23 | 2021-11-06 |
2083 | 2016-02-23 | 2021-11-06 |
2083 | 2016-02-23 | 2021-11-06 |
2112 | 2016-01-25 | 2021-11-06 |
2217 | 2015-10-12 | 2021-11-06 |
2234 | 2015-09-25 | 2021-11-06 |
2229 | 2015-09-30 | 2021-11-06 |
2229 | 2015-09-30 | 2021-11-06 |
2229 | 2015-09-30 | 2021-11-06 |
10.4 CASE
, IF
SQL allows us to implement logical comparisons through the CASE
, IF
, and IFNULL
functions. The CASE
function tests multiple logical expressions listed in WHEN
clauses as follows:
SELECT released,
CASE
WHEN DATEDIFF(CURDATE(), released) < 1000 THEN "Very recent"
WHEN DATEDIFF(CURDATE(), released) < 2000 THEN "Recent"
ELSE "Not recent"
END AS Recent
FROM music_marketplace.songs;
released | Recent |
---|---|
2016-01-18 | Not recent |
2016-02-23 | Not recent |
2016-02-23 | Not recent |
2016-02-23 | Not recent |
2016-01-25 | Not recent |
2015-10-12 | Not recent |
2015-09-25 | Not recent |
2015-09-30 | Not recent |
2015-09-30 | Not recent |
2015-09-30 | Not recent |
WHEN
, THEN
, ELSE
, and END AS
that are necessary for the CASE
clause to work.
The IF
function is very similar with the ifelse()
function we have seen in R:
SELECT released, IF(DATEDIFF(CURDATE(), released) < 1000, "Recent", "Not recent") as Recent
FROM music_marketplace.songs;
released | Recent |
---|---|
2016-01-18 | Not recent |
2016-02-23 | Not recent |
2016-02-23 | Not recent |
2016-02-23 | Not recent |
2016-01-25 | Not recent |
2015-10-12 | Not recent |
2015-09-25 | Not recent |
2015-09-30 | Not recent |
2015-09-30 | Not recent |
2015-09-30 | Not recent |
10.5 CONCAT_WS
, LOWER
, UPPER
, LENGTH
SQL also offers various functions that operate on strings. For instance, we have already seen function CONCAT_WS
which works similarly to the paste()
function in R. The LOWER
and UPPER
functions adjust a string to lowercase and uppercase respectively. Finally the function LENGTH
returns the number of characters of a string, similar to the nchar
function in R:
SELECT CONCAT_WS(" ", "Title:", LOWER(title)) as concat_title, UPPER(genre), LENGTH(label)
FROM music_marketplace.songs;
concat_title | UPPER(genre) | LENGTH(label) |
---|---|---|
Title: if i want you | BREAKS | 11 |
Title: suki | MINIMAL / DEEP TECH | 17 |
Title: goofy's nightmare | MINIMAL / DEEP TECH | 17 |
Title: suki | MINIMAL / DEEP TECH | 17 |
Title: golden nightfall | HOUSE | 14 |
Title: hope for the right | HOUSE | 14 |
Title: the night's already started | HOUSE | 14 |
Title: yungle | DRUM & BASS | 16 |
Title: sail | ELECTRONICA / DOWNTEMPO | 16 |
Title: chromatik [orbzone] | ELECTRONICA / DOWNTEMPO | 16 |
For comments, suggestions, errors, and typos, please email me at: kokkodis@bc.edu