Chapter 4 Kernel regression estimation I
The relation between two random variables X and Y can be completely characterized by their joint cdf F or, equivalently, by their joint pdf f if (X,Y) is continuous. In the regression setting, we are interested in predicting/explaining the response Y by means of the predictor X from a sample (X1,Y1),…,(Xn,Yn). The role of the variables is not symmetric: X is used to predict/explain Y. See Section B.1 for a quick review on the relevant concepts of linear regression used in this chapter.
We first consider the simplest situation:119 a single continuous predictor X to predict a response Y.120 In this case, recall that the complete knowledge about Y when X=x is given by the conditional pdf f_{Y|X=x}(y) = f(x,y)/f_X(x). While this pdf provides full knowledge about Y|X=x, estimating it is also challenging: for each x we have to estimate a different curve! A simpler approach, yet still challenging, is to estimate the conditional mean (a scalar) for each x through the so-called regression function
m(x) := E[Y | X = x] = ∫ y dF_{Y|X=x}(y) = ∫ y f_{Y|X=x}(y) dy.
As we will see, this density-based view of the regression function is very useful to motivate estimators.
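To make the density-based view concrete, the following sketch (not from the text; all function names and the choice of a standard bivariate normal are illustrative assumptions) recovers m(x) numerically: it forms the conditional pdf f_{Y|X=x}(y) = f(x,y)/f_X(x) and integrates y against it. For a standard bivariate normal with correlation rho, the true regression function is linear, m(x) = rho * x, which serves as a check.

```python
import numpy as np

# Illustrative sketch: standard bivariate normal (zero means, unit
# variances) with correlation rho, for which m(x) = rho * x exactly.
rho = 0.7

def joint_pdf(x, y):
    # Joint density f(x, y) of the standard bivariate normal
    det = 1 - rho**2
    q = (x**2 - 2 * rho * x * y + y**2) / det
    return np.exp(-q / 2) / (2 * np.pi * np.sqrt(det))

def marginal_pdf_x(x):
    # Marginal of X, which is N(0, 1)
    return np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)

def m(x, y_grid=np.linspace(-8, 8, 2001)):
    # Conditional pdf f_{Y|X=x}(y) = f(x, y) / f_X(x), then
    # m(x) = ∫ y f_{Y|X=x}(y) dy via the trapezoidal rule
    cond = joint_pdf(x, y_grid) / marginal_pdf_x(x)
    return np.trapz(y_grid * cond, y_grid)

for x in (-1.0, 0.0, 2.0):
    print(x, m(x))  # should be close to rho * x
```

This is only a sanity check for a fully known joint density; the point of the chapter is precisely that f is unknown and m must be estimated from the sample.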
For the sake of introducing the main concepts in kernel regression estimation, we focus first on this simple setting. In Chapter 5 we will see more general situations with several predictors, possibly non-continuous.↩︎
Formally, the response Y does not need to be continuous. We implicitly assume Y is continuous to use (4.2) as a motivation for (4.5), but the subsequent derivations in the chapter are also valid for non-continuous responses.↩︎