Chapter 14 Quantile Regression

For an academic review of quantile regression, see (Yu, Lu, and Stander 2003)

Linear regression is based on the conditional mean function \(E(y|x)\).

Quantile regression, in contrast, lets us examine any point in the conditional distribution of y: it estimates the conditional median or any other conditional quantile of y.

When we are interested in the 50th percentile, quantile regression becomes median regression, also known as least-absolute-deviations (LAD) regression, which minimizes \(\sum_{i}|e_i|\).
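As a concrete illustration, here is a minimal sketch (assuming Python with statsmodels; the data are simulated and the variable names are illustrative) fitting median regression next to OLS:

```python
# Median (LAD) regression vs. OLS on simulated heavy-tailed data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 500
x = rng.uniform(0, 10, n)
# Heavy-tailed noise (t with 2 df): the median is more robust than the mean
y = 1.0 + 2.0 * x + rng.standard_t(df=2, size=n)
df = pd.DataFrame({"y": y, "x": x})

ols = smf.ols("y ~ x", data=df).fit()            # conditional mean
lad = smf.quantreg("y ~ x", data=df).fit(q=0.5)  # q = 0.5 -> conditional median

print(ols.params)
print(lad.params)
```

With noise this heavy-tailed, the LAD slope tends to sit closer to the true value of 2 than the OLS slope does.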

Properties of the estimator \(\hat{\beta}_q\)

  • Asymptotically normally distributed

Advantages

  • More robust to outliers than OLS
  • When the dependent variable has a bimodal or multimodal distribution (multiple modes), quantile regression can be extremely useful.
  • Avoids parametric assumptions about the error process. In other words, it makes no assumption about the distribution of the error term.
  • Gives a better characterization of the data (not just its conditional mean)
  • Is invariant to monotonic transformations (such as log) while OLS is not: for monotone \(g\), \(Q_q(g(y)) = g(Q_q(y))\), whereas in general \(E(g(y)) \ne g(E(y))\) (see the numerical check after this list).
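A quick numerical check of the last point, sketched in Python (numpy only; the lognormal sample is purely illustrative): the median commutes with log, while the mean does not.

```python
# Quantiles commute with monotone transforms; means do not.
import numpy as np

rng = np.random.default_rng(1)
y = rng.lognormal(mean=0.0, sigma=1.0, size=100_000)

print(np.quantile(np.log(y), 0.5), np.log(np.quantile(y, 0.5)))  # ~ equal
print(np.mean(np.log(y)), np.log(np.mean(y)))                    # clearly differ
```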

Disadvantages

  • The dependent variable needs to be continuous, with no point masses (such as a spike at zero) and not too many repeated values.

The linear model for the qth conditional quantile is

\[ y_i = x_i'\beta_q + e_i \]

Let \(e(x) = y -\hat{y}(x)\), then \(L(e(x)) = L(y -\hat{y}(x))\) is the loss function of the error term.

If \(L(e) = |e|\) (the absolute-error loss function), then \(\hat{\beta}\) can be estimated by minimizing \(\sum_{i}|y_i-x_i'\beta|\).

More specifically, the objective function is

\[ Q(\beta_q)=\sum_{i:\, y_i \ge x_i'\beta_q} q\,|y_i - x_i'\beta_q| + \sum_{i:\, y_i < x_i'\beta_q} (1-q)\,|y_i-x_i'\beta_q| \]

where \(0<q<1\)

The sum penalizes under-prediction (\(y_i \ge x_i'\beta_q\)) by \(q|e_i|\) and over-prediction (\(y_i < x_i'\beta_q\)) by \((1-q)|e_i|\)
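Written directly in code, the objective is only a few lines; a minimal sketch (assuming Python/numpy, with a hypothetical helper name `quantile_loss`):

```python
# Check-loss objective Q(beta_q): weight under-predictions by q
# and over-predictions by (1 - q).
import numpy as np

def quantile_loss(beta, X, y, q):
    """Objective for quantile q; X is (n, k), beta is (k,), 0 < q < 1."""
    e = y - X @ beta                  # residuals e_i = y_i - x_i' beta
    return np.sum(np.where(e >= 0, q * e, (q - 1) * e))
```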

Because this objective is non-differentiable, there is no analytical solution; it is minimized with linear-programming techniques such as the simplex method. Standard errors can be estimated by bootstrap.
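One common choice is the pair (case) bootstrap: resample rows, refit, and take the standard deviation of the coefficient draws. A sketch under the same statsmodels assumption as above (`B`, `q`, and the formula are illustrative choices):

```python
# Pair-bootstrap standard errors for the q-th quantile coefficients.
import numpy as np
import statsmodels.formula.api as smf

def bootstrap_se(df, formula="y ~ x", q=0.9, B=200, seed=0):
    rng = np.random.default_rng(seed)
    n = len(df)
    draws = []
    for _ in range(B):
        rows = rng.integers(0, n, size=n)  # resample rows with replacement
        fit = smf.quantreg(formula, data=df.iloc[rows]).fit(q=q)
        draws.append(fit.params.to_numpy())
    return np.std(np.vstack(draws), axis=0, ddof=1)  # SE per coefficient
```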

The absolute-error loss function (\(q = 0.5\)) is symmetric; for any other quantile, the check loss above is asymmetric.

Interpretation

For the jth regressor \(x_j\), the marginal effect on the qth conditional quantile is the coefficient \(\beta_{qj}\):

\[ \frac{\partial Q_q(y|x)}{\partial x_j} = \beta_{qj} \]

That is, at the qth conditional quantile of y, a one-unit change in \(x_j\) shifts y by \(\beta_{qj}\) units, holding the other regressors fixed.
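Fitting at several quantiles makes this concrete; a short illustrative loop (reusing the simulated `df` and statsmodels setup from the first sketch):

```python
# The slope can differ across the conditional distribution of y:
# each fit reports beta_q for that quantile.
import statsmodels.formula.api as smf

for q in (0.1, 0.25, 0.5, 0.75, 0.9):
    fit = smf.quantreg("y ~ x", data=df).fit(q=q)
    print(f"q={q:.2f}: slope on x = {fit.params['x']:.3f}")
```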