# Chapter 4 Kernel regression estimation I

The relation between two random variables $$X$$ and $$Y$$ can be completely characterized by their joint cdf $$F$$ or, equivalently, by their joint pdf $$f$$ if $$(X,Y)$$ is continuous. In the regression setting, we are interested in predicting/explaining the response $$Y$$ by means of the predictor $$X$$ from a sample $$(X_1,Y_1),\ldots,(X_n,Y_n)$$. The role of the variables is not symmetric: $$X$$ is used to predict/explain $$Y$$.

We first consider the simplest situation¹: a single continuous predictor $$X$$ to predict a response $$Y$$.² In this case, recall that the complete knowledge about $$Y$$ when $$X=x$$ is given by the conditional pdf $$f_{Y| X=x}(y)=\frac{f(x,y)}{f_X(x)}$$. While this pdf provides full knowledge about $$Y| X=x$$, estimating it is also challenging: for each $$x$$ we have to estimate a different curve! A simpler approach, yet still challenging, is to estimate the conditional mean (a scalar) for each $$x$$ through the so-called regression function
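To make the identity $$f_{Y|X=x}(y)=\frac{f(x,y)}{f_X(x)}$$ concrete, the following sketch (not from the text) uses a standard bivariate normal with an assumed correlation $$\rho$$, for which the conditional distribution is known in closed form: $$Y|X=x\sim\mathcal{N}(\rho x, 1-\rho^2)$$. It checks numerically that the ratio of the joint pdf to the marginal pdf recovers this conditional pdf:

```python
import math

rho = 0.6  # assumed correlation, for illustration only

def phi(z, mean=0.0, sd=1.0):
    # Univariate normal pdf
    return math.exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def f_joint(x, y, rho):
    # Standard bivariate normal pdf with correlation rho
    q = (x * x - 2 * rho * x * y + y * y) / (1 - rho * rho)
    return math.exp(-0.5 * q) / (2 * math.pi * math.sqrt(1 - rho * rho))

x, y = 0.5, -0.3
# Conditional pdf via the ratio f(x, y) / f_X(x)
ratio = f_joint(x, y, rho) / phi(x)
# Closed-form conditional: Y | X = x ~ N(rho * x, 1 - rho^2)
closed = phi(y, mean=rho * x, sd=math.sqrt(1 - rho * rho))
print(ratio, closed)  # the two values agree
```

The same ratio construction underlies the curve-per-$$x$$ difficulty mentioned above: each value of $$x$$ yields a different conditional pdf in $$y$$.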

\begin{align} m(x):=\mathbb{E}[Y\vert X=x]=\int y\mathrm{d}F_{Y\vert X=x}(y)=\int yf_{Y\vert X=x}(y)\mathrm{d}y.\tag{4.1} \end{align}

As we will see, this density-based view of the regression function is very useful to motivate estimators.
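As a first numerical illustration of (4.1), consider a crude precursor of the kernel estimators developed in this chapter: approximate $$m(x_0)=\mathbb{E}[Y|X=x_0]$$ by averaging the responses whose predictors fall in a small window about $$x_0$$. This is a sketch under an assumed model $$Y=X^2+\varepsilon$$ with $$\varepsilon\sim\mathcal{N}(0,0.5^2)$$ (chosen for illustration, not taken from the text), so that $$m(x)=x^2$$:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000
# Assumed illustrative model: m(x) = x^2, i.e., Y = X^2 + noise
X = rng.normal(size=n)
Y = X ** 2 + rng.normal(scale=0.5, size=n)

x0, h = 1.0, 0.1
# Naive estimate of m(x0) = E[Y | X = x0]: average the Y_i with X_i near x0
window = np.abs(X - x0) < h
m_hat = Y[window].mean()
print(m_hat)  # close to m(1) = 1
```

Shrinking the window width $$h$$ reduces the bias of this local average but leaves fewer observations inside the window, increasing its variance; kernel regression estimators formalize and refine exactly this trade-off.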

1. For the sake of introducing the main concepts in kernel regression estimation. In Chapter 5 we will see more general situations with several predictors, possibly non-continuous.↩︎

2. Formally, the response $$Y$$ does not need to be continuous. We implicitly assume $$Y$$ is continuous to use (4.2) as a motivation for (4.5), but the subsequent derivations in the chapter are also valid for non-continuous responses.↩︎