Parallel Lines

Next, we consider model 2, which assumes two regression lines with equal slopes but different intercept parameters, as illustrated in the plot below.

\(\newline\) Within the parallel lines model, at any point \(x\), the two lines take the values

\[\begin{aligned} \alpha_1&+\beta(x-\bar{x}_{1.})\\ \alpha_2&+\beta(x-\bar{x}_{2.})\end{aligned}\]

\(\newline\) and so the difference between the lines is:

\[\begin{aligned} \alpha_1+\beta(x-\bar{x}_{1.})-(\alpha_2+\beta(x-\bar{x}_{2.})) & = \alpha_1+\beta x-\beta\bar{x}_{1.}-\alpha_2-\beta x+\beta\bar{x}_{2.}\\ & = \alpha_1-\alpha_2+\beta\bar{x}_{2.}-\beta\bar{x}_{1.}\\ & = \alpha_1-\alpha_2+\beta(\bar{x}_{2.}-\bar{x}_{1.})\end{aligned}\]

\(\newline\) This is simply the vertical distance between the two regression lines, which is the same at every value of \(x\) because the lines are parallel (for example, the distance between the blue and black lines in the figure above).
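The following is a minimal sketch that checks the algebra numerically. The parameter values are made up purely for illustration and do not come from any data set in these notes; the helper name `line` is hypothetical.

```python
def line(alpha, beta, xbar, x):
    """Value of the regression line alpha + beta * (x - xbar) at the point x."""
    return alpha + beta * (x - xbar)

# Illustrative (made-up) parameter values: common slope, different intercepts.
alpha1, alpha2, beta = 5.0, 3.0, 0.8
xbar1, xbar2 = 10.0, 12.0

# Closed-form difference derived above: alpha1 - alpha2 + beta * (xbar2 - xbar1).
gap = alpha1 - alpha2 + beta * (xbar2 - xbar1)

# Difference between the two lines evaluated at several arbitrary x values.
diffs = [line(alpha1, beta, xbar1, x) - line(alpha2, beta, xbar2, x)
         for x in (0.0, 7.5, 20.0)]

for d in diffs:
    print(round(d, 10), round(gap, 10))
```

Each evaluated difference should agree with the closed-form expression, whatever value of \(x\) is chosen.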

We can assess whether a single straight line, with no differences between the groups, model 3, is a suitable model for the data by constructing a C.I. for \(\alpha_1-\alpha_2+\beta(\bar{x}_{2.}-\bar{x}_{1.})\) and examining whether this interval contains 0.

95% Confidence interval for parallel lines model

\(\newline\) Data: \((y_{ij}, x_{ij}); \quad i=1,2; \quad j=1, \dots,n_i\).

\(\newline\) Model : \(E(y_{ij}) = \alpha_i+\beta(x_{ij}-\bar{x}_{i.})\)

\(\newline\) Calculate a 95% confidence interval (CI) for \(\alpha_1-\alpha_2+\beta(\bar{x}_{2.}-\bar{x}_{1.})\). The C.I. has the form

\[\mathbf{b}^T\boldsymbol{\hat{\beta}} \pm t(n_1+n_2-3; 0.975)\sqrt{\frac{RSS}{n_1+n_2-3}\mathbf{b}^T(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{b}}\]

where \(\mathbf{b} =\left( \begin{array}{c} 1 \\ -1 \\ \bar{x}_{2.}-\bar{x}_{1.} \\ \end{array} \right)\)

\(\newline\) For this model,

\[E(\mathbf{Y}) = \mathbf{X}\boldsymbol{\beta}\]

\[\begin{aligned} \mathbf{X}& = \left( \begin{array}{ccc} 1 & 0 & (x_{11}-\bar{x}_{1.}) \\ \vdots & \vdots & \vdots \\ 1 & 0 & (x_{1n_1}-\bar{x}_{1.}) \\ 0 & 1 & (x_{21}-\bar{x}_{2.}) \\ \vdots & \vdots & \vdots \\ 0 & 1 & (x_{2n_2}-\bar{x}_{2.}) \\ \end{array} \right), \quad \quad \boldsymbol{\beta} = \left( \begin{array}{c} \alpha_1 \\ \alpha_2 \\ \beta \\ \end{array} \right), \quad \quad (\mathbf{X}^T\mathbf{X})^{-1} = \left( \begin{array}{ccc} \frac{1}{n_1} & 0 & 0 \\ 0 & \frac{1}{n_2} & 0 \\ 0 & 0 & \frac{1}{S_{x_1x_1}+S_{x_2x_2}} \\ \end{array} \right).\end{aligned}\]
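As a rough numerical check of why \((\mathbf{X}^T\mathbf{X})^{-1}\) is diagonal, the sketch below builds the design matrix for some made-up covariate values and forms \(\mathbf{X}^T\mathbf{X}\) directly. It assumes \(S_{x_ix_i}=\sum_j (x_{ij}-\bar{x}_{i.})^2\), as the matrix above implies; the function names `design_matrix` and `gram` are hypothetical helpers, not part of any library.

```python
def design_matrix(x1, x2):
    """Rows (1, 0, x - mean(x1)) for group 1 and (0, 1, x - mean(x2)) for group 2."""
    m1 = sum(x1) / len(x1)
    m2 = sum(x2) / len(x2)
    rows = [[1.0, 0.0, x - m1] for x in x1]
    rows += [[0.0, 1.0, x - m2] for x in x2]
    return rows

def gram(X):
    """Return X^T X as a list of lists."""
    p = len(X[0])
    return [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]

# Made-up covariate values, for illustration only.
x1 = [1.0, 2.0, 4.0, 5.0]
x2 = [2.0, 3.0, 7.0]

XtX = gram(design_matrix(x1, x2))
for row in XtX:
    print([round(v, 10) for v in row])
```

Because the covariate is centred within each group, the off-diagonal entries sum to zero, leaving \(n_1\), \(n_2\) and \(S_{x_1x_1}+S_{x_2x_2}\) on the diagonal. The inverse is then just the matrix of reciprocals of those diagonal entries.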

\(\newline\) For our linear combination of interest, \(\mathbf{b}^T\boldsymbol{\beta}\), note that

\[\mathbf{b} = \left( \begin{array}{c} 1 \\ -1 \\ (\bar{x}_{2.}-\bar{x}_{1.})\\ \end{array} \right)\]

A 95% C.I. for \(\mathbf{b}^T\boldsymbol{\beta}\) is

\[\begin{aligned} \mathbf{b}^T\boldsymbol{\hat{\beta}} &\pm t(n-p; 0.975)\sqrt{\frac{RSS}{n-p}\mathbf{b}^T(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{b}}\\ \hat{\alpha}_1-\hat{\alpha}_2+\hat{\beta}(\bar{x}_{2.}-\bar{x}_{1.}) &\pm t(n_1+n_2-3; 0.975)\sqrt{\frac{RSS}{n_1+n_2-3}\mathbf{b}^T(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{b}}, \end{aligned}\] where

\[\begin{aligned} \mathbf{b}^T(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{b} & = \left( \begin{array}{ccc} 1 & -1 & \{\bar{x}_{2.}-\bar{x}_{1.}\} \\ \end{array} \right)\left( \begin{array}{ccc} \frac{1}{n_1} & 0 & 0 \\ 0 & \frac{1}{n_2} & 0 \\ 0 & 0 & \frac{1}{S_{x_1x_1}+S_{x_2x_2}} \\ \end{array} \right)\left( \begin{array}{c} 1 \\ -1 \\ \{\bar{x}_{2.}-\bar{x}_{1.}\}\\ \end{array} \right)\\ & = \left( \begin{array}{ccc} \frac{1}{n_1} & -\frac{1}{n_2} & \frac{\{\bar{x}_{2.}-\bar{x}_{1.}\}}{S_{x_1x_1}+S_{x_2x_2}} \\ \end{array} \right)\left( \begin{array}{c} 1 \\ -1 \\ \{\bar{x}_{2.}-\bar{x}_{1.}\}\\ \end{array} \right)\\ & = \frac{1}{n_1}+\frac{1}{n_2}+\frac{(\bar{x}_{2.}-\bar{x}_{1.})^2}{S_{x_1x_1}+S_{x_2x_2}}. \end{aligned}\]

A 95% C.I. for \(\mathbf{b}^T\boldsymbol{\beta}\) is therefore

\[\hat{\alpha}_1-\hat{\alpha}_2+\hat{\beta}(\bar{x}_{2.}-\bar{x}_{1.}) \pm t(n-p,0.975) \sqrt{\left(\frac{RSS}{n-p}\right)\left(\frac{1}{n_1}+\frac{1}{n_2}+\frac{(\bar{x}_{2.}-\bar{x}_{1.})^2}{S_{x_1x_1}+S_{x_2x_2}}\right)}\]

Here \(n = n_1+n_2\) and \(p=3\), so \(n-p = n_1+n_2-3\).
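The interval can be computed directly from the data, as in the minimal sketch below. It relies on the diagonal form of \((\mathbf{X}^T\mathbf{X})^{-1}\) above, which gives \(\hat{\alpha}_i=\bar{y}_{i.}\) and \(\hat{\beta}=(S_{x_1y_1}+S_{x_2y_2})/(S_{x_1x_1}+S_{x_2x_2})\), assuming \(S_{x_iy_i}=\sum_j (x_{ij}-\bar{x}_{i.})(y_{ij}-\bar{y}_{i.})\). The function name `parallel_lines_ci` and the data are hypothetical, and the \(t\) quantile is passed in by the user (for example from statistical tables) rather than computed.

```python
import math

def parallel_lines_ci(x1, y1, x2, y2, t_crit):
    """CI for alpha1 - alpha2 + beta * (xbar2 - xbar1) in the parallel lines model.

    t_crit is the t(n1 + n2 - 3; 0.975) quantile, supplied by the caller.
    Returns (lower, upper).
    """
    n1, n2 = len(x1), len(x2)
    xbar1, xbar2 = sum(x1) / n1, sum(x2) / n2
    ybar1, ybar2 = sum(y1) / n1, sum(y2) / n2

    # Pooled within-group sums of squares and cross-products.
    sxx = (sum((x - xbar1) ** 2 for x in x1)
           + sum((x - xbar2) ** 2 for x in x2))
    sxy = (sum((x - xbar1) * (y - ybar1) for x, y in zip(x1, y1))
           + sum((x - xbar2) * (y - ybar2) for x, y in zip(x2, y2)))

    # Least squares estimates under the centred parameterisation.
    a1, a2 = ybar1, ybar2
    beta = sxy / sxx

    # Residual sum of squares on n1 + n2 - 3 degrees of freedom.
    rss = (sum((y - a1 - beta * (x - xbar1)) ** 2 for x, y in zip(x1, y1))
           + sum((y - a2 - beta * (x - xbar2)) ** 2 for x, y in zip(x2, y2)))
    df = n1 + n2 - 3

    d = xbar2 - xbar1
    estimate = a1 - a2 + beta * d
    se = math.sqrt(rss / df * (1 / n1 + 1 / n2 + d ** 2 / sxx))
    return estimate - t_crit * se, estimate + t_crit * se

# Made-up data for illustration; n1 + n2 - 3 = 4, and t(4; 0.975) is about 2.776.
x1, y1 = [1.0, 2.0, 4.0, 5.0], [2.1, 2.9, 5.2, 5.8]
x2, y2 = [2.0, 3.0, 7.0], [4.0, 5.1, 8.9]
lower, upper = parallel_lines_ci(x1, y1, x2, y2, t_crit=2.776)
print(lower, upper)
```

Passing the quantile as an argument keeps the sketch free of external dependencies; in practice it would be read from tables or obtained from statistical software.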

Interpretation of Confidence Interval

If this confidence interval contains 0, we cannot reject the single straight line model, model 3, and thus we stay with the single line model instead of the parallel lines model. An illustrative example is given below. Again we have two lines in black and blue, but we believe these lines to have the same slope and intercept parameters.