Respiratory Distress Syndrome

Premature babies often suffer from a variety of problems, and respiratory distress syndrome (RDS) is a common and serious lung problem among them. It is thought that the occurrence of this syndrome might be related to a property of the blood called red cell deformability, which is the ability of red cells to change shape in order to pass through small pores. The rate of blood flow through a set of 3 \(\mu\)m pores (Lrate, on a log scale) is recorded for two groups of babies: those who suffer from RDS (RDS) and those who do not (No RDS). The gestational age (GA) of each baby, in weeks, is also recorded. These data were kindly provided by Queen Mother’s Hospital, Glasgow.

We proposed that the relationship between Lrate and GA does not differ according to whether or not the baby suffered from RDS. To examine this formally, we can use the following summary statistics for the data. Subscript 1 denotes the No RDS group and subscript 2 the RDS group, which is consistent with the R output given later.

\[\begin{aligned} \bar{y}_{1.}& = 0.395833 \quad \bar{y}_{2.}=-0.2618182\\ \bar{x}_{1.}& = 32.333 \quad \bar{x}_{2.}=28.54545\\ S_{y_1y_1} & = 8.117929 \quad S_{y_2y_2}=3.806964\\ S_{x_1x_1} & = 64.66667 \quad S_{x_2x_2} = 102.7273\\ S_{x_1y_1}& = 11.13667\quad S_{x_2y_2}=13.06091\\ n_1& = 12 \quad n_2 =11\end{aligned}\]

For the “different lines” model we have

\[\begin{aligned} \hat{\beta}_1 & = 0.1722\\ \\ \hat{\beta}_2 & = 0.1271\\ \\ RSS& = 6.199374+2.146379=8.345753\end{aligned}\]
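The slopes and residual sum of squares above follow from the summary numbers through the standard simple-regression identities \(\hat{\beta}_i = S_{x_iy_i}/S_{x_ix_i}\) and \(RSS_i = S_{y_iy_i} - S_{x_iy_i}^2/S_{x_ix_i}\). The R sketch below reproduces them. The variable names are ours, not part of the original analysis, and because the inputs are the rounded values printed above, the RSS agrees with the text only to about three decimal places.

```r
# Summary numbers copied from the text (element 1 = group 1, element 2 = group 2)
Sxx <- c(64.66667, 102.7273)
Sxy <- c(11.13667, 13.06091)
Syy <- c(8.117929, 3.806964)

# Least-squares slope within each group: S_xy / S_xx
beta_hat <- Sxy / Sxx

# Residual sum of squares within each group: S_yy - S_xy^2 / S_xx
rss_group <- Syy - Sxy^2 / Sxx

# RSS of the "different lines" model is the sum over the two groups
rss_separate <- sum(rss_group)

print(round(beta_hat, 4))
print(rss_separate)
```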

A 95% confidence interval for the difference in slopes, \(\beta_1 - \beta_2\), is

\[0.0451 \pm 2.09 \sqrt{\frac{8.345753}{19}\left(\frac{1}{64.66667}+\frac{1}{102.7273}\right)}\]

i.e. \(0.0451 \pm 0.22\), giving the interval \((-0.17, 0.27)\).
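As a check on the arithmetic, the interval can be recomputed in R. This is a sketch using the rounded figures quoted above; the names `se_diff` and `ci_slopes` are illustrative only. It uses `qt()` for the exact 97.5% point of the \(t_{19}\) distribution rather than the rounded 2.09.

```r
# Quantities quoted in the text
rss_separate <- 8.345753   # RSS of the "different lines" model
df_separate  <- 19         # n1 + n2 - 4 = 12 + 11 - 4
Sxx <- c(64.66667, 102.7273)
diff_slopes <- 0.1722 - 0.1271

# Estimated standard error of the difference in slopes
se_diff <- sqrt(rss_separate / df_separate * sum(1 / Sxx))

# 97.5% point of the t distribution on 19 degrees of freedom (about 2.09)
t_crit <- qt(0.975, df_separate)

# 95% confidence interval for beta_1 - beta_2
ci_slopes <- diff_slopes + c(-1, 1) * t_crit * se_diff
print(round(ci_slopes, 2))
```

The standard error computed here should match the `Std. Error` that R reports for the interaction term in the fitted model.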

Fitting a model in R with the log rate of blood flow (Lrate) as the response, a continuous covariate for gestational age (GA), a factor for RDS group (RDS) and an interaction between gestational age and RDS group provides the following output:

resp.separate <- lm(Lrate~GA*RDS,data=rds)
summary(resp.separate)
## 
## Call:
## lm(formula = Lrate ~ GA * RDS, data = rds)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.87508 -0.43260  0.00379  0.20733  1.43714 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)  
## (Intercept) -5.17250    2.67167  -1.936   0.0679 .
## GA           0.17222    0.08242   2.090   0.0503 .
## RDSRDS       1.28137    3.26526   0.392   0.6991  
## GA:RDSRDS   -0.04507    0.10521  -0.428   0.6731  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.6628 on 19 degrees of freedom
## Multiple R-squared:  0.4207, Adjusted R-squared:  0.3292 
## F-statistic: 4.599 on 3 and 19 DF,  p-value: 0.01392
print(anova(resp.separate))
## Analysis of Variance Table
## 
## Response: Lrate
##           Df Sum Sq Mean Sq F value   Pr(>F)   
## GA         1 5.9335  5.9335 13.5081 0.001608 **
## RDS        1 0.0466  0.0466  0.1062 0.748129   
## GA:RDS     1 0.0806  0.0806  0.1836 0.673147   
## Residuals 19 8.3458  0.4393                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
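The coefficient table can be tied back to the summary numbers. Under R's default treatment coding the first factor level is the baseline, so, assuming (as the output suggests) that No RDS is the baseline and corresponds to group 1, the `RDSRDS` and `GA:RDSRDS` rows are the differences in intercept and slope for RDS relative to No RDS. The sketch below rebuilds the four coefficients from the summary numbers; the object names are ours, and agreement is limited by the rounding of the printed inputs.

```r
# Summary numbers from the text (element 1 = No RDS, element 2 = RDS; assumed)
ybar <- c(0.395833, -0.2618182)
xbar <- c(32.333, 28.54545)
Sxx  <- c(64.66667, 102.7273)
Sxy  <- c(11.13667, 13.06091)

# Separate least-squares line in each group
slope     <- Sxy / Sxx
intercept <- ybar - slope * xbar

# Re-express in R's treatment-contrast parameterisation
coef_rebuilt <- c(
  "(Intercept)" = intercept[1],                 # baseline intercept
  "GA"          = slope[1],                     # baseline slope
  "RDSRDS"      = intercept[2] - intercept[1],  # intercept difference
  "GA:RDSRDS"   = slope[2] - slope[1]           # slope difference
)
print(round(coef_rebuilt, 3))
```

Note the sign: R reports the slope difference as group 2 minus group 1, whereas the hand calculation uses group 1 minus group 2.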

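Neither the confidence interval, which contains zero, nor the p-value for the interaction term (0.673) provides evidence that the two separate regressions model is needed. Note that R reports the slope difference as RDS minus No RDS (-0.04507), which is why its sign is opposite to that of the hand calculation. We will therefore fit the “parallel lines” model to the data instead.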

We now have

\[\begin{aligned} \hat{\beta} & = 0.1446\\ \\ RSS& = 8.426383\end{aligned}\]
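For the parallel lines model the common slope pools the within-group sums, \(\hat{\beta} = (S_{x_1y_1}+S_{x_2y_2})/(S_{x_1x_1}+S_{x_2x_2})\), and the residual sum of squares is \(S_{y_1y_1}+S_{y_2y_2}\) minus the squared pooled \(S_{xy}\) over the pooled \(S_{xx}\). A short R sketch of that calculation follows; the names are illustrative, and since the inputs are the rounded summary numbers the RSS matches the quoted value only to about three decimal places.

```r
# Within-group sums from the summary numbers in the text
Sxx <- c(64.66667, 102.7273)
Sxy <- c(11.13667, 13.06091)
Syy <- c(8.117929, 3.806964)

# Common slope: pooled S_xy over pooled S_xx
beta_pooled <- sum(Sxy) / sum(Sxx)

# RSS of the parallel lines model
rss_parallel <- sum(Syy) - sum(Sxy)^2 / sum(Sxx)

print(round(beta_pooled, 4))
print(rss_parallel)
```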

A 95% confidence interval for the vertical separation between the regression lines, \(\alpha_1 - \alpha_2\), is

\[0.110 \pm 2.09 \sqrt{\frac{8.426383}{20}\left(\frac{1}{12}+\frac{1}{11}+\frac{3.7879^2}{64.66667+102.7273}\right)}\]

i.e. \(0.110 \pm 0.692\), giving the interval \((-0.58, 0.80)\).
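The point estimate of the vertical separation is \((\bar{y}_{1.}-\bar{y}_{2.}) - \hat{\beta}(\bar{x}_{1.}-\bar{x}_{2.})\). The sketch below recomputes it and the interval in R from the rounded numbers in the text. The object names are ours, and `qt()` supplies the exact \(t_{20}\) quantile (about 2.086) in place of the rounded 2.09, so the limits may differ slightly in the third decimal place.

```r
# Summary numbers from the text
ybar <- c(0.395833, -0.2618182)
xbar <- c(32.333, 28.54545)
n    <- c(12, 11)
Sxx  <- c(64.66667, 102.7273)

# Parallel lines fit, as quoted in the text
beta_pooled  <- 0.1446
rss_parallel <- 8.426383
df_parallel  <- sum(n) - 3   # two intercepts and one common slope

# Vertical separation between the lines (group 1 minus group 2)
dx  <- xbar[1] - xbar[2]
sep <- (ybar[1] - ybar[2]) - beta_pooled * dx

# Estimated standard error of the separation
se_sep <- sqrt(rss_parallel / df_parallel * (sum(1 / n) + dx^2 / sum(Sxx)))

# 95% confidence interval
ci_sep <- sep + c(-1, 1) * qt(0.975, df_parallel) * se_sep
print(round(c(estimate = sep, se = se_sep), 3))
print(round(ci_sep, 2))
```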

Fitting a model in R with only a continuous covariate of gestational age and a factor for RDS group provides the following output:

resp.parallel <- lm(Lrate~GA+RDS,data=rds)
summary(resp.parallel)
## 
## Call:
## lm(formula = Lrate ~ GA + RDS, data = rds)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.89309 -0.40617 -0.03309  0.26879  1.48324 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)   
## (Intercept) -4.27810    1.63292  -2.620  0.01640 * 
## GA           0.14455    0.05017   2.881  0.00923 **
## RDSRDS      -0.11010    0.33095  -0.333  0.74284   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.6491 on 20 degrees of freedom
## Multiple R-squared:  0.4151, Adjusted R-squared:  0.3566 
## F-statistic: 7.097 on 2 and 20 DF,  p-value: 0.004686
print(anova(resp.parallel))
## Analysis of Variance Table
## 
## Response: Lrate
##           Df Sum Sq Mean Sq F value   Pr(>F)   
## GA         1 5.9335  5.9335 14.0830 0.001253 **
## RDS        1 0.0466  0.0466  0.1107 0.742844   
## Residuals 20 8.4264  0.4213                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Both the confidence interval (-0.58, 0.80), which contains zero, and the p-value for RDS (0.743) show no evidence of a vertical separation between the regression lines. We conclude that, after adjusting for gestational age, there is insufficient evidence of a difference in the rate of blood flow through pores between babies who suffer from RDS and those who do not, and so we adopt a single line as the model for the data. The three nested models can be compared directly:

resp.single <- lm(Lrate~GA,data=rds)
print(anova(resp.separate,resp.parallel,resp.single))
## Analysis of Variance Table
## 
## Model 1: Lrate ~ GA * RDS
## Model 2: Lrate ~ GA + RDS
## Model 3: Lrate ~ GA
##   Res.Df    RSS Df Sum of Sq      F Pr(>F)
## 1     19 8.3458                           
## 2     20 8.4264 -1 -0.080630 0.1836 0.6731
## 3     21 8.4730 -1 -0.046627 0.1062 0.7481
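The F statistics in this table can be reproduced from the residual sums of squares alone. Each row divides the increase in RSS (per degree of freedom dropped) by the residual mean square of the largest model, here Model 1 on 19 degrees of freedom, which is why the values coincide with those in the sequential ANOVA table for the separate lines model. A sketch using the rounded RSS values printed above, with names that are ours:

```r
# Residual sums of squares and degrees of freedom from the table above
rss    <- c(separate = 8.3458, parallel = 8.4264, single = 8.4730)
df_res <- c(separate = 19, parallel = 20, single = 21)

# Residual mean square of the largest model is the common denominator
sigma2_hat <- unname(rss["separate"] / df_res["separate"])

# F statistics: change in RSS per degree of freedom, over sigma2_hat
f_interaction <- unname((rss["parallel"] - rss["separate"]) / 1 / sigma2_hat)
f_rds         <- unname((rss["single"] - rss["parallel"]) / 1 / sigma2_hat)

# Upper-tail p-values on (1, 19) degrees of freedom
p_interaction <- pf(f_interaction, 1, 19, lower.tail = FALSE)
p_rds         <- pf(f_rds, 1, 19, lower.tail = FALSE)

print(round(c(F = f_interaction, p = p_interaction), 4))
print(round(c(F = f_rds, p = p_rds), 4))
```

Both p-values are well above 0.05, in line with the conclusion that neither the interaction nor the RDS main effect is needed.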