10 SQL: Advanced functions

In this chapter, we will present some advanced MySQL functions. We will again use the schema music_marketplace that describes a hypothetical marketplace for music. Recall that the schema includes the following tables:

  • songs: stores information about songs.
  • label: stores information about music labels.
  • artist: stores information about artists.
  • artist_songs: stores which songs are produced by which artists.
  • charts: stores the maximum rank that a songs achieves in the charts (the hypothetical marketplace has top-selling charts).

Recall that the Entity Relationship diagram of this schema is as follows:

The Entity Relationship diagram of schema `music_marketplace`.

Figure 10.1: The Entity Relationship diagram of schema music_marketplace.

 

First, we will connect to the database (Section 4.1):

library(DBI)
library(odbc)
library(RODBC)
library(tidyverse)
con = DBI::dbConnect(odbc::odbc(), 
                    Driver= "MySQL", 
                    Server='mysql-isys3350.bc.edu', 
                    UID='your_username', 
                    PWD= 'your_password',
                    Port= 3306)

 

10.1 CAST

Often we need to change the data type of a column (e.g., from a string to a number). The function CAST allows us to change the type as follows:

SELECT  CAST("10" AS DECIMAL);
Table 10.1: 1 records
CAST(“10” AS DECIMAL)
10


We can CAST values into DATE, DECIMAL, and CHAR


SELECT  CAST(10 AS CHAR);
Table 10.2: 1 records
CAST(10 AS CHAR)
10

Here is another example where we adjust the datatype of column released from table songs to DATE:

SELECT  released, CAST(released AS DATE) 
FROM music_marketplace.songs;
Table 10.3: Displaying records 1 - 10
released CAST(released AS DATE)
2016-01-18 2016-01-18
2016-02-23 2016-02-23
2016-02-23 2016-02-23
2016-02-23 2016-02-23
2016-01-25 2016-01-25
2015-10-12 2015-10-12
2015-09-25 2015-09-25
2015-09-30 2015-09-30
2015-09-30 2015-09-30
2015-09-30 2015-09-30


In the previous, we can compare the data type of the original column (POSIXct) and the datatype of the adjusted one (date).

10.2 ROUND, ABS, SQRT

SQL offers functions that work with numerical data. For instance:

SELECT  ROUND(price,1), SQRT(maxRank), ABS(firstDateInCharts - lastDateInCharts),firstDateInCharts - lastDateInCharts  
FROM music_marketplace.songs s INNER JOIN 
music_marketplace.charts c on s.trackid = c.trackId
LIMIT 10;
Table 10.4: Displaying records 1 - 10
ROUND(price,1) SQRT(maxRank) ABS(firstDateInCharts - lastDateInCharts) firstDateInCharts - lastDateInCharts
1.5 9.746794 0 0
1.5 6.480741 89032196 -89032196
1.5 9.643651 3000049 -3000049
1.5 7.483315 9796001618 -9796001618
1.5 7.549834 891001593 -891001593
1.5 4.472136 901162477 -901162477
1.5 7.874008 118999938 -118999938
1.5 7.745967 5999949 -5999949
1.5 7.745967 7000005 -7000005
1.5 5.916080 6999999 -6999999


In the previous, ROUND(, l) keeps l decimal places.

10.3 NOW, CURDATE, MONTH, YEAR, DATEDIFF

Similarly, we can use date-specific functions:

SELECT NOW(), CURDATE(), MONTH(CURDATE())
Table 10.5: 1 records
NOW() CURDATE() MONTH(CURDATE())
2021-11-06 16:41:49 2021-11-06 11

Or, we can calculate the difference between dates:

SELECT DATEDIFF(CURDATE(), released), released, CURDATE() 
FROM music_marketplace.songs;
Table 10.6: Displaying records 1 - 10
DATEDIFF(CURDATE(), released) released CURDATE()
2119 2016-01-18 2021-11-06
2083 2016-02-23 2021-11-06
2083 2016-02-23 2021-11-06
2083 2016-02-23 2021-11-06
2112 2016-01-25 2021-11-06
2217 2015-10-12 2021-11-06
2234 2015-09-25 2021-11-06
2229 2015-09-30 2021-11-06
2229 2015-09-30 2021-11-06
2229 2015-09-30 2021-11-06

10.4 CASE, IF

SQL allows us to implement logical comparisons through the CASE, IF, and IFNULL functions. The CASE function tests multiple logical expressions listed in WHEN clauses as follows:

SELECT released,
CASE 
  WHEN DATEDIFF(CURDATE(), released) < 1000 THEN "Very recent"
  WHEN DATEDIFF(CURDATE(), released) < 2000 THEN "Recent"
  ELSE "Not recent"
END AS Recent
FROM music_marketplace.songs;
Table 10.7: Displaying records 1 - 10
released Recent
2016-01-18 Not recent
2016-02-23 Not recent
2016-02-23 Not recent
2016-02-23 Not recent
2016-01-25 Not recent
2015-10-12 Not recent
2015-09-25 Not recent
2015-09-30 Not recent
2015-09-30 Not recent
2015-09-30 Not recent


Note the special keywords WHEN, THEN, ELSE, and END AS that are necessary for the CASE clause to work.


The IF function is very similar with the ifelse() function we have seen in R:

SELECT released, IF(DATEDIFF(CURDATE(), released) < 1000, "Recent", "Not recent") as Recent
FROM music_marketplace.songs;
Table 10.8: Displaying records 1 - 10
released Recent
2016-01-18 Not recent
2016-02-23 Not recent
2016-02-23 Not recent
2016-02-23 Not recent
2016-01-25 Not recent
2015-10-12 Not recent
2015-09-25 Not recent
2015-09-30 Not recent
2015-09-30 Not recent
2015-09-30 Not recent

10.5 CONCAT_WS, LOWER, UPPER, LENGTH

SQL also offers various functions that operate on strings. For instance, we have already seen function CONCAT_WS which works similarly to the paste() function in R. The LOWER and UPPER functions adjust a string to lowercase and uppercase respectively. Finally the function LENGTH returns the number of characters of a string, similar to the nchar function in R:

SELECT CONCAT_WS(" ", "Title:", LOWER(title)) as concat_title, UPPER(genre), LENGTH(label)
FROM music_marketplace.songs;
Table 10.9: Displaying records 1 - 10
concat_title UPPER(genre) LENGTH(label)
Title: if i want you BREAKS 11
Title: suki MINIMAL / DEEP TECH 17
Title: goofy's nightmare MINIMAL / DEEP TECH 17
Title: suki MINIMAL / DEEP TECH 17
Title: golden nightfall HOUSE 14
Title: hope for the right HOUSE 14
Title: the night's already started HOUSE 14
Title: yungle DRUM & BASS 16
Title: sail ELECTRONICA / DOWNTEMPO 16
Title: chromatik [orbzone] ELECTRONICA / DOWNTEMPO 16

For comments, suggestions, errors, and typos, please email me at: