Preface

My name is Sergio Berdiales and I am a Data Analyst with more than ten years experience in Customer Experience and Quality areas. If you want to know more about me or contact me you can visit my Linkedin profile or my Twitter account.

This is my final project for the Kschool Master on Data Science (8th edition). The main objective of this project is to show I can apply the acquired knowledge during the master’s course in a practical way .

The Master on Data Science of Kschool is a 230-hour course which includes Python and R programming, Statistics, Machine Learning methods, Visualization tools, a Deep Learning introduction and much more (use of Git / Github, linux command line, Jupyter and Google Collab notebooks, etc). And all of these in a very practical and useful way. If you are interested in becoming a Data Scientist this course could be your first step.

  • Structure of this document

This document is divided in two basic parts.

    1. Project Memory. The memory of the project spans from the “Preface” to the “References” section. The purpose of this part is to briefly explain what the objectives of this project are, which methodology I used in order to achieve these objectives, and what my final conclusions are.
    1. R and Python scripts. In this part, I included all the R code used (this R code is saved too as rmarkdown files in the Github project repository).
      The Python code is not included in this document but you have the links to the Google Collab notebooks at the Python scripts section. All the Python notebooks are in the Github repository project too as Jupyter notebooks.

This is a work in progress and my intention is for it to be just the start of something bigger. The next step is to improve the forecasts of pollutants levels including the weather forecasts in the models. So, I think this models would give more accurate predictions. Then, I would put the models in production in order to give real time predictions via Twitter.

I am learning. So, if you see something wrong in my code, my reasoning or anything else you think I could improve or fix, please, tell me. I would really appreciate your help. You can contact me via Linkedin profile, Twitter account or Gmail.

If you are interested in learning about air pollution my advice is to start visiting any of these web sites:

And if your interest is specifically about the situation of air quality in the city of Gijón I highly recommend you read these reports.

  • “Plan de mejora de la calidad del aire en la aglomeración área de Gijón.” (pdf here).

  • “Calidad del Aire y Salud en Asturias. Informe Epidemiológico 2016.” (pdf here).

  • “Informe de calidad del aire en Asturias.”(pdf here).

  • “Estudio de contribución de fuentes en las partículas PM10 en suspensión en la aglomeración área de Gijón y en la zona de Avilés.” (pdf here).

  • “Modelización de la contaminación por partículas PM10 en la aglomeración de Gijón.” (pdf here).