Respiratory Distress Syndrome
Premature babies often suffer from a variety of problems, and respiratory distress syndrome (RDS) is a common and serious lung problem. It is thought that the occurrence of this syndrome might be related to a property of the blood called red cell deformability, which refers to the ability of red cells to change shape in order to pass through small pores. The rate of blood flow through a set of 3 \(\mu\)m pores (Lrate, recorded on a log scale) is available for two groups of babies: those who suffer from respiratory distress syndrome (RDS) and those who do not (No RDS). The gestational age (GA), in weeks, of each baby is also recorded. These data were kindly provided by Queen Mother’s Hospital, Glasgow.
We proposed that the relationship between Lrate and GA does not differ according to whether or not the baby suffered from respiratory distress syndrome (RDS). To examine this formally we can use the following summary numbers for the data, in which \(y\) denotes Lrate, \(x\) denotes GA, subscript 1 refers to the No RDS group and subscript 2 to the RDS group.
\[\begin{aligned} \bar{y}_{1.} & = 0.395833 & \bar{y}_{2.} & = -0.2618182\\ \bar{x}_{1.} & = 32.333 & \bar{x}_{2.} & = 28.54545\\ S_{y_1y_1} & = 8.117292 & S_{y_2y_2} & = 3.806964\\ S_{x_1x_1} & = 64.66667 & S_{x_2x_2} & = 102.7273\\ S_{x_1y_1} & = 11.13667 & S_{x_2y_2} & = 13.06091\\ n_1 & = 12 & n_2 & = 11\end{aligned}\]
For the “different lines” model we have
\[\begin{aligned} \hat{\beta_1} & = 0.1722\\ \\ \hat{\beta_2} & = 0.1271\\ \\ RSS& = 6.199374+2.146379=8.345753\end{aligned}\]
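As a check on these values, the slope in each group is the ratio of the corrected sum of products to the corrected sum of squares of GA. This worked step assumes the usual definitions \(S_{x_ix_i}=\sum_j (x_{ij}-\bar{x}_{i.})^2\) and \(S_{x_iy_i}=\sum_j (x_{ij}-\bar{x}_{i.})(y_{ij}-\bar{y}_{i.})\), which are not spelled out in this section.

\[\begin{aligned} \hat{\beta_i} & = \frac{S_{x_iy_i}}{S_{x_ix_i}}, \qquad RSS_i = S_{y_iy_i} - \frac{S_{x_iy_i}^2}{S_{x_ix_i}}, \qquad i = 1, 2\\ \hat{\beta_1} & = \frac{11.13667}{64.66667} = 0.1722\\ \hat{\beta_2} & = \frac{13.06091}{102.7273} = 0.1271\\ \hat{\beta_1} - \hat{\beta_2} & = 0.0451\end{aligned}\]

The model has two intercepts and two slopes, so the residual degrees of freedom are \(n_1 + n_2 - 4 = 19\).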
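A 95% confidence interval for the difference in slopes, \(\beta_1 - \beta_2\), based on the \(t\) distribution with \(23 - 4 = 19\) degrees of freedom, is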
\[0.0451 \pm 2.09 \sqrt{\frac{8.345753}{19}\left(\frac{1}{64.66667}+\frac{1}{102.7273}\right)}\]
i.e. \((0.0451 \pm 0.22)\), and hence \((-0.17, 0.27)\).
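The arithmetic behind this interval can be checked from the summary numbers alone. The following is a minimal R sketch rather than part of the original analysis: the variable names are illustrative, and it uses the exact quantile from `qt()` instead of the rounded value 2.09, so the limits may differ from those above in the second decimal place.

```r
# Summary numbers copied from the text (group 1 has n = 12, group 2 has n = 11).
Sxx <- c(64.66667, 102.7273)   # S_{x_1 x_1}, S_{x_2 x_2}
Sxy <- c(11.13667, 13.06091)   # S_{x_1 y_1}, S_{x_2 y_2}
n   <- c(12, 11)
rss <- 8.345753                # RSS of the "different lines" model

slopes     <- Sxy / Sxx                  # least-squares slope in each group
df_resid   <- sum(n) - 4                 # two intercepts and two slopes are estimated
sigma2_hat <- rss / df_resid             # estimate of the error variance
slope_diff <- slopes[1] - slopes[2]
se_diff    <- sqrt(sigma2_hat * sum(1 / Sxx))
t_crit     <- qt(0.975, df_resid)        # exact quantile; the text rounds this to 2.09

ci_slope_diff <- slope_diff + c(-1, 1) * t_crit * se_diff
print(round(c(estimate = slope_diff, se = se_diff,
              lower = ci_slope_diff[1], upper = ci_slope_diff[2]), 4))
```

The standard error computed here should agree with the `Std. Error` reported for the interaction term when the same model is fitted with `lm()`.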
Fitting a model in R for the log rate of blood flow (Lrate), with a continuous covariate for gestational age (GA), a factor for RDS group (RDS) and an interaction between gestational age and RDS group, gives the following output: the model summary, followed by the analysis of variance table.
##
## Call:
## lm(formula = Lrate ~ GA * RDS, data = rds)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -0.87508 -0.43260  0.00379  0.20733  1.43714
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) -5.17250    2.67167  -1.936   0.0679 .
## GA           0.17222    0.08242   2.090   0.0503 .
## RDSRDS       1.28137    3.26526   0.392   0.6991
## GA:RDSRDS   -0.04507    0.10521  -0.428   0.6731
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.6628 on 19 degrees of freedom
## Multiple R-squared: 0.4207, Adjusted R-squared: 0.3292
## F-statistic: 4.599 on 3 and 19 DF, p-value: 0.01392
## Analysis of Variance Table
##
## Response: Lrate
##           Df Sum Sq Mean Sq F value   Pr(>F)
## GA         1 5.9335  5.9335 13.5081 0.001608 **
## RDS        1 0.0466  0.0466  0.1062 0.748129
## GA:RDS     1 0.0806  0.0806  0.1836 0.673147
## Residuals 19 8.3458  0.4393
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Neither the confidence interval for the difference in slopes, which contains zero, nor the p-value for the interaction term (0.673) provides evidence that the slopes differ. There is therefore no evidence that the two separate regressions model is needed, so we instead fit the simpler “parallel lines” model to the data.
We now have
\[\begin{aligned} \hat{\beta} & = 0.1446\\ \\ RSS& = 8.426383\end{aligned}\]
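The common slope can be recovered from the summary numbers by pooling the two groups. This worked step assumes the standard least-squares result for two parallel lines, which is not derived in this section.

\[\begin{aligned} \hat{\beta} & = \frac{S_{x_1y_1}+S_{x_2y_2}}{S_{x_1x_1}+S_{x_2x_2}}\\ & = \frac{11.13667+13.06091}{64.66667+102.7273}\\ & = \frac{24.19758}{167.39397} = 0.1446\end{aligned}\]

The model has two intercepts and one slope, so the residual degrees of freedom are \(n_1 + n_2 - 3 = 20\).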
A 95% confidence interval for the vertical separation between the regression lines, based on the \(t\) distribution with \(23 - 3 = 20\) degrees of freedom, is
\[0.110 \pm 2.09 \sqrt{\frac{8.426383}{20}\left(\frac{1}{12}+\frac{1}{11}+\frac{3.7879^2}{64.66667+102.7273}\right)}\]
i.e. \((0.110 \pm 0.692)\), and hence \((-0.58, 0.80)\).
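The estimate 0.110 and its interval can likewise be reproduced from the summary numbers. The sketch below is illustrative only: the variable names are not from the original analysis, the separation is taken as group 1 minus group 2 (so its sign is opposite to that of an `lm()` coefficient that measures group 2 relative to group 1), and `qt()` is used in place of the rounded 2.09.

```r
# Summary numbers copied from the text (group 1 has n = 12, group 2 has n = 11).
ybar <- c(0.395833, -0.2618182)   # group means of Lrate
xbar <- c(32.333, 28.54545)       # group means of GA
Sxx  <- c(64.66667, 102.7273)
Sxy  <- c(11.13667, 13.06091)
n    <- c(12, 11)
rss_parallel <- 8.426383          # RSS of the "parallel lines" model

beta_pooled <- sum(Sxy) / sum(Sxx)            # common slope
x_gap       <- xbar[1] - xbar[2]
separation  <- (ybar[1] - ybar[2]) - beta_pooled * x_gap   # group 1 minus group 2

df_resid <- sum(n) - 3                        # two intercepts and one slope
se_sep   <- sqrt(rss_parallel / df_resid * (sum(1 / n) + x_gap^2 / sum(Sxx)))
ci_sep   <- separation + c(-1, 1) * qt(0.975, df_resid) * se_sep

print(round(c(slope = beta_pooled, separation = separation, se = se_sep,
              lower = ci_sep[1], upper = ci_sep[2]), 3))
```

Because the interval is symmetric about the estimate, reversing the direction of the comparison only changes the signs of the estimate and the limits; the conclusion about whether zero lies inside the interval is unaffected.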
Fitting a model in R with only a continuous covariate for gestational age and a factor for RDS group provides the following output:
##
## Call:
## lm(formula = Lrate ~ GA + RDS, data = rds)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -0.89309 -0.40617 -0.03309  0.26879  1.48324
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept) -4.27810    1.63292  -2.620  0.01640 *
## GA           0.14455    0.05017   2.881  0.00923 **
## RDSRDS      -0.11010    0.33095  -0.333  0.74284
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.6491 on 20 degrees of freedom
## Multiple R-squared: 0.4151, Adjusted R-squared: 0.3566
## F-statistic: 7.097 on 2 and 20 DF, p-value: 0.004686
## Analysis of Variance Table
##
## Response: Lrate
##           Df Sum Sq Mean Sq F value   Pr(>F)
## GA         1 5.9335  5.9335 14.0830 0.001253 **
## RDS        1 0.0466  0.0466  0.1107 0.742844
## Residuals 20 8.4264  0.4213
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
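There is therefore no evidence, from either the confidence interval \((-0.58, 0.80)\), which contains zero, or the p-value for RDS (0.743), of a vertical separation between the regression lines. We conclude that, after adjusting for gestational age, there is insufficient evidence of a difference in the rate of blood flow through the pores between babies who suffer from RDS and those who do not, and that a single regression line of Lrate on GA is an adequate model for the data. The sequence of nested model comparisons is summarised in the following analysis of variance table.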
## Analysis of Variance Table
##
## Model 1: Lrate ~ GA * RDS
## Model 2: Lrate ~ GA + RDS
## Model 3: Lrate ~ GA
##   Res.Df    RSS Df Sum of Sq      F Pr(>F)
## 1     19 8.3458
## 2     20 8.4264 -1 -0.080630 0.1836 0.6731
## 3     21 8.4730 -1 -0.046627 0.1062 0.7481
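The F statistics in this table can be reproduced by hand from the residual sums of squares. The sketch below is illustrative, with made-up variable names. It assumes, as the printed values suggest, that each comparison divides the change in RSS by the residual mean square of the largest model (Model 1), and it works with the magnitudes of the sums of squares, which the table prints as negative.

```r
# Values copied from the text and from the table above.
rss_full <- 8.345753       # RSS of Model 1 (Lrate ~ GA * RDS)
df_full  <- 19             # residual degrees of freedom of Model 1

# Increase in RSS when each term is dropped (magnitudes of "Sum of Sq").
ss_drop <- c("GA:RDS" = 0.080630,   # Model 1 -> Model 2
             "RDS"    = 0.046627)   # Model 2 -> Model 3

ms_resid <- rss_full / df_full              # residual mean square of Model 1
F_stat   <- (ss_drop / 1) / ms_resid        # each comparison drops 1 parameter
p_value  <- pf(F_stat, df1 = 1, df2 = df_full, lower.tail = FALSE)

print(round(cbind(F = F_stat, p = p_value), 4))
```

Both p-values are large, which is consistent with dropping first the interaction and then the RDS term, leaving the single-line model.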