Machine Learning (ML) techniques are currently applied in a wide range of fields to extract knowledge from data. One prominent example today is the coronavirus disease (COVID-19), whose emergence has affected every aspect of society. For this reason, this second part of the Advanced Regression and Prediction course project applies the techniques learned in the practical and theoretical sessions to predict the number of new COVID-19 cases in the province of Valencia.

This work continues the first part, which dealt with statistical tools. The ML algorithms are therefore applied to COVID-19 data that was already cleaned in that part, so the database has no missing values and contains only the most relevant variables (for more information on the treatment applied to the database and the variables selected as relevant, see Part I: Statistical tools).

This second part of the project is organized as follows:

  1. Procedure followed in the creation, construction and development of ML models.

  2. ML models: analysis and visualization of the results obtained.

  3. Project conclusions and future work.

This work was carried out with the free software R, and the Bookdown format was chosen for its presentation. The main reason for this choice is that Bookdown combines the dynamism and interactivity of, for example, a Shiny application with the format of a report.

Finally, before moving on to the project itself, it should be noted that this work draws on the theoretical sessions and practical cases of the course, taught by Nogales (2021).


Nogales, Javier. 2021. “Advanced Regression and Prediction.” Universidad Carlos III de Madrid 1: 129.