Data Analytics Module
Lecturer: Hans van der Zwan
Lab 02
Topic: summary statistics
Use the HP_LONDON_JAN19.xlsx file to generate Table 1.1
Table 1
Summary statistics properties sold in London in January 2019
TYPE | COUNT | AVERAGE | MEDIAN | SD | PERC_ABOVE_1mln |
FLAT | 782 | 592,700 | 445,000 | 612,700 | 9.1 |
DETACHED | 37 | 1,498,800 | 1,250,000 | 873,100 | 62.2 |
SEMI-DETACHED | 125 | 982,200 | 805,000 | 642,400 | 32.8 |
TERRACED | 442 | 876,100 | 630,000 | 919,700 | 23.5 |
Note. Average, median and standard deviation (SD) are in GB Pounds. The last column gives the percentages of properties sold with a selling price above 1 mln GPB. Underlying data are retrieved from http://landregistry.data.gov.uk/app/ppd/.
The file hp_london_January16171819.xlsx contains information about the properties sold in London in January 2016, January 2017, January 2018 and January 2019.
Analyze the data in this file:
Gather the GDP per capita data per country from the World Bank open data website. Copy the data for the year 2017 to a new worksheet.
Gather the life expectancy data from the same website. Copy the data for the year 2017 to a new worksheet.
Gather the population data from the same website. Copy the data for the year 2017 in a new worksheet.
Join the GDP per capita data for 2017 with the life expectancy data and with the population data.
Use the file 2019_PM10_NL.csv.
This file contains hourly PM10 levels in 2019 for … stations in The Netherlands; data source: https://luchtmeetnet.nl.
Choose one of these four locations: (1) Amsterdam-Stadhouderskade, (2) Rotterdam-Schiedamsevest, (3) Den Haag-Amsterdamse Veerkade or (4) Utrecht-Kardinaal de Jong weg. COpy paste the data for this loation to a new worksheet in Excel.
one way to calculate the median per category in MS Excel is by the use of a so called array function. See this website for an example.↩