Section 19 Enrichment Overview
To discover which variables known to influence property prices were missing from the Original data set, I conducted an online literature search. The Literature Review subsection details the results.
The literature search identified potential gaps in the Original data set, so I carried out a substantial data enrichment process. This process is documented in the Enrichment Process subsection, where I walk through the custom R code I wrote to extract data from online data providers.
The Enriched data set includes local area information from the Google Maps and Google Radar Search API services. For financial information such as rental values, the Enriched data set contains data from Zillow, the Seattle-based real estate and rental marketplace. In the Results subsection, I present and evaluate the Enriched data set using geospatial plots.
Literature Review
The original data set included information only on the intrinsic characteristics of a property. For example, there were no variables pertaining to the neighbourhood of a property or to macroeconomic conditions (e.g. the state of the mortgage market, unemployment levels, national and local government policies, demographics, etc.).
Using only the original data could result in biased parameter estimates or in the significance of a variable being overstated. For example, the data cover the period from May 2014 to May 2015, when the external environment in the US was relatively stable. This means that the level of noise in the data could be artificially low and the explanatory power of micro variables such as bedroom number overstated.
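To make the bias concrete, the standard omitted-variable result can be sketched for a two-regressor case. The symbols below are illustrative assumptions rather than the model fitted in this report: $y$ is the sale price, $x_1$ is an included micro variable such as bedroom number, and $x_2$ is an omitted neighbourhood or macroeconomic variable.

```latex
% Assumed true model (illustrative, not the report's fitted model)
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon,
\qquad \mathbb{E}[\varepsilon \mid x_1, x_2] = 0

% Regressing y on x_1 alone gives a slope estimate \tilde{\beta}_1 with
\operatorname{plim}\, \tilde{\beta}_1
  = \beta_1 + \beta_2 \, \frac{\operatorname{Cov}(x_1, x_2)}{\operatorname{Var}(x_1)}
```

Under these assumptions, the coefficient on $x_1$ absorbs part of the effect of $x_2$ whenever $\beta_2 \neq 0$ and the two variables are correlated; the bias term vanishes only if the omitted variable has no effect or is uncorrelated with the included one.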
An online literature search helped to mitigate this risk by identifying variables that previous research studies found to be significant predictors of property price. The table below shows the variables identified, the reference article and the matching field in the original data set. Article 1 is (Galati, Teppa, and Alessie 2011), Article 2 is (Ezgi CANDAS and YOMRALIOGLU 2015) and Website 1 is (RightMove 2017).
Micro-variable | Reference | Original Data-Set Field |
---|---|---|
Year of Construction | Article 1 | Construction Year |
Size of Living Room | Article 1 | Living Space |
Presence of Garage | Article 1 | Missing |
Presence of Garden | Article 1 | Lot Size |
Type of House | Article 1 | Number of Floors |
Large City vs not large | Article 1 | Missing |
Degree of Urbanization | Article 1 | Missing |
Floor No | Article 2 | Floors |
Heating System | Article 2 | Renovation Year |
Earthquake Zone | Article 2 | Not relevant |
Rental Value | Article 2 | Missing |
Land Value | Article 2 | Missing |
Parcel Area | Article 2 | Total Area |
Zoning | Article 2 | Zipcode |
Proximity to Amenities | Website 1 | Missing |
Number of Bedrooms | Website 1 | Bedrooms |
Number of Bathrooms | Website 1 | Bathrooms |
Condition of Interior | Website 1 | Condition/Grade |
References
Galati, Gabriele, Federica Teppa, and Rob JM Alessie. 2011. “Macro and Micro Drivers of House Price Dynamics: An Application to Dutch Data.”
Ezgi CANDAS, Seda BAGDATLI KALKAN, and Tahsin YOMRALIOGLU. 2015. “Determining the Factors Affecting Housing Prices.”
RightMove. 2017. “Positive and Negative Impacts on House Prices.” http://www.rightmove.co.uk/what-affects-house-prices.html.