Notes for Nonparametric Statistics
MSc in Statistics for Data Science at Carlos III University of Madrid
The course is designed to have, roughly, one lesson per main topic in the syllabus. The schedule is tight due to time constraints, which will inevitably make the treatment of certain methods somewhat superficial compared with what would be optimal. Nevertheless, the course will hopefully give you a respectable panoramic view of the different available methods in nonparametric statistics. A broad view of the syllabus and its planning is:
- Introduction (first lesson)
- Kernel density estimation I (first/second lesson)
- Kernel density estimation II (second/third lesson)
- Kernel regression estimation I (fourth/fifth lesson)
- Kernel regression estimation II (fifth/sixth lesson)
- Nonparametric tests (seventh lesson)
Some logistics for the development of the course follow:
- The office hours are Tuesdays from 9:30 to 10:30, in the classroom in which the session is to take place. Please send me an email reasonably in advance indicating that you intend to attend the office hours, as otherwise I may not be around.
- Questions and comments during lectures are most welcome. So just go ahead and fire! Particularly if these are clarifications, comments, or alternative perspectives that may help the rest of the class.
- Detailed course evaluation guidelines can be found here.
Main references and credits
Several great reference books have been used for preparing these notes. The following list presents the books that have been consulted¹:
- D’Agostino and Stephens (1986) (Section 6.1).
- Chacón and Duong (2018) (Sections 3.1, 3.2, 3.3, and 3.4).
- Fan and Gijbels (1996) (Sections 4.1, 4.2, and 4.3).
- DasGupta (2008) (Sections 1.1, 1.3, 1.4, and 6.1).
- Li and Racine (2007) (Section 5.1).
- Loader (1999) (Section 5.4).
- Scott (2015) (Sections 2.1, 2.2, and 2.4).
- Silverman (1986) (Sections 2.2 and 2.4).
- van der Vaart (1998) (Sections 1.3 and 1.4).
- Wand and Jones (1995) (Sections 2.2, 2.4, 2.5, and 5.4).
- Wasserman (2004) (Sections 1.3, 5.4, and B).
- Wasserman (2006) (Sections 1.1, 4.3, and A).
These notes are possible due to the existence of the incredible pieces of software by Xie (2016a), Xie (2016b), Allaire et al. (2017), Xie and Allaire (2018), and R Core Team (2018). Also, certain hacks to improve the design layout have been possible due to the wonderful work of Úcar (2018). The icons used in the notes were designed by madebyoliver, freepik, and roundicons from Flaticon.
Contributions, reporting of typos, and feedback on the notes are very welcome. Either send an email to email@example.com or, if you are willing to provide several contributions, ask for access to the GitHub repository, so you can open a pull request and submit your modifications for approval. Give me a reason for writing your name in the list of contributors!
List of contributors:
- Manuel García Corbí (pointed towards several typos in the code)
- Rafael Monsalve Roquero (indicated a notational problem)
- Miguel Novillo Arana (indicated a typo in the code)
- María del Carmen Paternina Die (pointed out a missing factor in one formula; indicated a notational collision)
- Raquel Parra Suazo (pointed out a missing factor in one formula; indicated a notational collision)
- Adrián Torres Núñez (indicated and provided fixes for several typos)
- Jaime Ugarte Abollado (indicated a language typo)
- Shenbin Zheng (pointed out a missing factor in one formula)
All the material in these notes is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International Public License (CC BY-NC-ND 4.0). You may not use this material except in compliance with the aforementioned license. The human-readable summary of the license states that:
- You are free to:
  - Share – Copy and redistribute the material in any medium or format.
- Under the following terms:
  - Attribution – You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
  - NonCommercial – You may not use the material for commercial purposes.
  - NoDerivatives – If you remix, transform, or build upon the material, you may not distribute the modified material.
D’Agostino, R. B., and M. A. Stephens, eds. 1986. Goodness-of-Fit Techniques. Vol. 68. Statistics: Textbooks and Monographs. New York: Marcel Dekker, Inc.
Chacón, J. E., and T. Duong. 2018. Multivariate Kernel Smoothing and Its Applications. Vol. 160. Monographs on Statistics and Applied Probability. Boca Raton, FL: CRC Press. doi:10.1201/9780429485572.
Fan, J., and I. Gijbels. 1996. Local Polynomial Modelling and Its Applications. Vol. 66. Monographs on Statistics and Applied Probability. London: Chapman & Hall. doi:10.2307/2670134.
DasGupta, A. 2008. Asymptotic Theory of Statistics and Probability. Springer Texts in Statistics. New York: Springer. doi:10.1007/978-0-387-75971-5.
Li, Q., and J. S. Racine. 2007. Nonparametric Econometrics. Princeton, NJ: Princeton University Press.
Loader, C. 1999. Local Regression and Likelihood. Statistics and Computing. New York: Springer-Verlag. doi:10.2307/1270956.
Scott, D. W. 2015. Multivariate Density Estimation. 2nd ed. Wiley Series in Probability and Statistics. Hoboken: John Wiley & Sons, Inc. doi:10.1002/9781118575574.
Silverman, B. W. 1986. Density Estimation for Statistics and Data Analysis. Monographs on Statistics and Applied Probability. London: Chapman & Hall. doi:10.1007/978-1-4899-3324-9.
van der Vaart, A. W. 1998. Asymptotic Statistics. Vol. 3. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge: Cambridge University Press. doi:10.1017/CBO9780511802256.
Wand, M. P., and M. C. Jones. 1995. Kernel Smoothing. Vol. 60. Monographs on Statistics and Applied Probability. London: Chapman & Hall, Ltd. doi:10.1007/978-1-4899-4493-1.
Wasserman, L. 2004. All of Statistics. Springer Texts in Statistics. New York: Springer-Verlag. doi:10.1007/978-0-387-21736-9.
Wasserman, L. 2006. All of Nonparametric Statistics. Springer Texts in Statistics. New York: Springer-Verlag. doi:10.1007/0-387-30623-4.
Xie, Y. 2016a. Bookdown: Authoring Books and Technical Documents with R Markdown. The R Series. Boca Raton: Chapman & Hall/CRC. doi:10.1201/9781315204963.
Xie, Y. 2016b. knitr: A General-Purpose Package for Dynamic Report Generation in R. https://CRAN.R-project.org/package=knitr.
Allaire, J. J., J. Cheng, Y. Xie, J. McPherson, W. Chang, J. Allen, H. Wickham, A. Atkins, R. Hyndman, and R. Arslan. 2017. rmarkdown: Dynamic Documents for R. https://CRAN.R-project.org/package=rmarkdown.
Xie, Y., and J. J. Allaire. 2018. tufte: Tufte’s Styles for R Markdown Documents. https://CRAN.R-project.org/package=tufte.
R Core Team. 2018. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.
Úcar, I. 2018. “Energy Efficiency in Wireless Communications for Mobile User Devices.” PhD thesis, Universidad Carlos III de Madrid. https://enchufa2.github.io/thesis/.
¹ List to be made more specific for certain references.