3.8 SQL: Intro

  • SQL (pronounced S-Q-L or SEQUEL) is a language designed to query relational databases
  • Used by most financial and commercial companies
  • The result of an SQL query (a SELECT statement) is always a table, even if it has only one row or one column
  • It’s a nonprocedural (declarative) language: you describe the result you want; how the statement is executed is left to the query optimizer
  • How long an SQL query takes depends on optimization that is opaque to the user (which is great!)
  • SQL works with many commercial and non-commercial products:
    • Oracle Database, Microsoft SQL Server, MySQL, PostgreSQL, SQLite (the last three are open-source), Google BigQuery, Amazon Redshift…
    • Performance varies by product, but for large data sets it is generally faster than standard data frame manipulation in R (and much more scalable)
  • dplyr is gradually being extended to work with databases (through the dbplyr package)
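
To make the points above concrete, here is a minimal sketch using Python's built-in sqlite3 module and an in-memory SQLite database. The `trades` table, its columns, and its rows are made up for illustration and do not come from the course material. The query states only which result is wanted; the result comes back as a table (named columns plus rows), and `EXPLAIN QUERY PLAN` (a SQLite-specific statement) shows the strategy the optimizer picked.

```python
import sqlite3

# Hypothetical example data: the table name, columns, and rows are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trades (ticker TEXT, qty INTEGER, price REAL)")
conn.executemany(
    "INSERT INTO trades VALUES (?, ?, ?)",
    [("AAA", 10, 5.0), ("BBB", 20, 2.5), ("AAA", 5, 6.0)],
)

# Declarative: we say WHAT we want (total quantity per ticker),
# not HOW to loop over the rows or in which order to process them.
query = """
    SELECT ticker, SUM(qty) AS total_qty
    FROM trades
    GROUP BY ticker
    ORDER BY ticker
"""

# The result is itself a table: column names plus a list of rows.
cursor = conn.execute(query)
columns = [d[0] for d in cursor.description]
rows = cursor.fetchall()
print(columns)
for row in rows:
    print(row)

# Ask SQLite how it decided to execute the query (SQLite-specific syntax;
# the exact wording of the plan differs between SQLite versions).
plan = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
for step in plan:
    print(step)

conn.close()
```

The same SELECT statement would run largely unchanged on the other products listed above; only the connection code and the way to inspect the execution plan differ between them.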