Notes for Predictive Modeling
MSc in Big Data Analytics at Carlos III University of Madrid
Last updated: 2024-04-28, v5.10.1
Preface
Welcome
Welcome to the notes for Predictive Modeling. The course is part of the MSc in Big Data Analytics from Carlos III University of Madrid.
The course is designed to have, roughly, one session per main topic in the syllabus. The schedule is tight due to time constraints, which will inevitably make the treatment of certain methods somewhat superficial. Nevertheless, the course will hopefully give you a respectable panoramic view of the different statistical methods available for predictive modeling. A broad view of the syllabus and its planning is:
- Introduction (first session)
- Linear models I (first/second session)
- Linear models II (second/third session)
- Linear models III (third/fourth session)
- Generalized linear models (fifth/sixth session)
- Nonparametric regression (sixth/seventh session)
Some logistics for the development of the course follow:
- Office hours are described in the Aula Global (right panel).
- Questions and comments during lectures are most welcome, particularly if they are clarifications, comments, or alternative perspectives that may help the rest of the class. So just go ahead and fire away!
- Detailed course evaluation guidelines can be found in the Aula Global. Recall that participation in lessons is positively evaluated.
Main references and credits
Several great reference books have been used for preparing these notes. The following list presents the books that have been consulted:
- Chacón and Duong (2018) (Section 6.1.4)
- DasGupta (2008) (Section 3.5.2)
- Durbán (2017) (Section 5.2.2)
- Fan and Gijbels (1996) (Sections 6.2, 6.2.3, and 6.2.4)
- Hastie, Tibshirani, and Friedman (2009) (Section 4.1)
- James et al. (2013) (Sections 2.2 – 2.7, 3.1, 3.5, 3.6.3, and 4.1)
- Kuhn and Johnson (2013) (Section 1.2)
- Li and Racine (2007) (Section 6.3)
- Loader (1999) (Section 6.5)
- McCullagh and Nelder (1983) (Sections 5.2 – 5.6)
- Peña (2002) (Sections 2.2 – 2.7, 3.5, and 5.2.1)
- Seber and Lee (2003) (Section 4.2)
- Seber (1984) (Section 4.3)
- Wand and Jones (1995) (Sections 6.1.2, 6.1.3, and 6.2.4)
- Wasserman (2004) (Section 6.5)
- Wasserman (2006) (Section 6.2.4)
- Wood (2006) (Sections 5.2.2 and 5.7)
These notes are possible thanks to the incredible pieces of software by Xie (2016), Xie (2020), Allaire et al. (2020), Xie and Allaire (2020), and R Core Team (2020). Also, certain hacks to improve the design layout have been possible thanks to the outstanding work of Úcar (2018). The icons used in the notes were designed by madebyoliver, freepik, and roundicons from Flaticon.
Last but not least, the notes have benefited from contributions from the following people:
- Ainara Apezteguía García (fixed a typo)
- Katherine Botz (performed a thorough proofreading of the course materials, fixing a large number of typos)
- Jorge Caballero Cárdenas (fixed five typos)
- Antonio Carrera Maestro (fixed two bugs and three typos)
- Marcos José Castillo Estévez (fixed two typos)
- Luis Cerdán Pedraza (performed an outstanding proofreading of the course materials, fixing more than fifty typos and style issues)
- Frederik Chettouh (fixed a typo and two bugs)
- Gulnur Demir (fixed two typos)
- Andrés Escalante Ariza (fixed a typo)
- José Ángel Fernández (fixed several typos)
- Celia García Ramírez (fixed two bugs)
- Trinidad González Berzal (fixed a typo)
- David González González (fixed two typos)
- Antonio Marín Abril (fixed two bugs)
- Andrés Modet Álamo (performed an excellent review of the course materials, detecting and fixing more than thirty typos and four bugs)
- Santiago Palmero Muñoz (fixed a typo and a bug)
- Federico Petraccaro (fixed three typos)
- Enrique Ramírez Díaz (fixed a typo)
- Pavel Razgovorov (fixed a typo)
- Cristina Rodríguez Beltrán (fixed a typo and two bugs)
- Manuel Rodríguez Ramírez (fixed two typos)
- Celia Romero González (fixed a typo)
- Carlota Royo Ruiz (fixed a bug)
- Leonardo Stincone (fixed a typo and a bug)
Contributions
Contributions, reporting of typos, and feedback on the notes are very welcome. Just send an email to edgarcia@est-econ.uc3m.es and give me a good reason for writing your name in the list of contributors!
License
All the material in these notes is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International Public License (CC BY-NC-ND 4.0). You may not use this material except in compliance with the aforementioned license. The human-readable summary of the license states that:
- You are free to:
  - Share – Copy and redistribute the material in any medium or format.
- Under the following terms:
  - Attribution – You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
  - NonCommercial – You may not use the material for commercial purposes.
  - NoDerivatives – If you remix, transform, or build upon the material, you may not distribute the modified material.
Citation
You may use the following BibTeX entry when citing these notes:
@book{Garcia-Portugues2024,
  title = {Notes for Predictive Modeling},
  author = {Garc\'ia-Portugu\'es, E.},
  year = {2024},
  note = {Version 5.10.1. ISBN 978-84-09-29679-8},
  url = {https://bookdown.org/egarpor/PM-UC3M/}
}
You may also want to use the following template:
García-Portugués, E. (2024). Notes for Predictive Modeling. Version 5.10.1. ISBN 978-84-09-29679-8. Available at https://bookdown.org/egarpor/PM-UC3M/.