20 Case B: Lung Function in 6 to 10 Year Old Children

Do smokers suffer reduced pulmonary function?

Several authors have reported analyses of a study that assessed children’s pulmonary function in relation to cigarette smoking and to passive smoke exposure from at least one parent. These papers are among the earliest attempts to document systematically the signs of reduced pulmonary function associated with smoking and with exposure to second-hand smoke.

In this problem, interest centers on the relationship between smoking and forced expiratory volume (FEV), a measure of how much air a person can forcibly exhale from the lungs.

Essentially, FEV is the amount of air an individual can exhale in the first second of a forceful breath. The data come from a sample of \(654\) youths, aged \(3\) to \(19\), recorded in the East Boston area during the middle to late 1970s. The dataset includes the following variables: FEV (liters), AGE (years), HEIGHT (inches), GENDER (M/F), SMOKE (Y/N).

The data file is Lung.xlsx. The variables are:

Lung: Data Labels

| Label | Description |
|-------|-------------|
| age   | discrete measure, positive integer (years) |
| fev   | continuous measure (liters) |
| ht    | continuous measure (inches) |
| sex   | discrete/nominal (Female coded 0, Male coded 1) |
| smoke | discrete/nominal (Nonsmoker coded 0, Smoker coded 1) |


Block I:

  1. Perform a detailed regression analysis on Model 2.
  2. Is the relationship between dependent variable and \(\textrm{Age}\) linear?
  3. Interpret the coefficients of the regression model.
  4. Test whether each parameter is significantly different from 0.
  5. Predict the dependent variable for Age 9, 10, and 11, for both smoke=0 and smoke=1.
  6. Interpret the \(F\)-test.
  7. Interpret the \(R^2\).
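
As a rough illustration of the computations behind Block I, the following minimal sketch fits Model 2 by ordinary least squares, predicts FEV at ages 9, 10 and 11 for both smoking codes, and computes \(R^2\) and the overall \(F\) statistic. It uses only the Python standard library and solves the normal equations directly; a statistics package would normally be used instead. The function names (`solve_linear`, `fit_model2`, `predict_model2`, `r2_and_f`) are hypothetical, and the `toy_*` values are invented placeholders, not rows from Lung.xlsx.

```python
from itertools import product


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are left untouched.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_model2(age, smoke, fev):
    """Least-squares estimates [b0, b1, b2] for fev = b0 + b1*age + b2*smoke."""
    rows = [[1.0, a, s] for a, s in zip(age, smoke)]
    k = 3
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, fev)) for i in range(k)]
    return solve_linear(xtx, xty)


def predict_model2(beta, age, smoke):
    """Fitted FEV for one (age, smoke) combination."""
    return beta[0] + beta[1] * age + beta[2] * smoke


def r2_and_f(beta, age, smoke, fev):
    """R^2 and overall F statistic for Model 2 (p = 2 predictors)."""
    n, p = len(fev), 2
    mean = sum(fev) / n
    fitted = [predict_model2(beta, a, s) for a, s in zip(age, smoke)]
    sse = sum((y - f) ** 2 for y, f in zip(fev, fitted))
    sst = sum((y - mean) ** 2 for y in fev)
    r2 = 1 - sse / sst
    f_stat = ((sst - sse) / p) / (sse / (n - p - 1))
    return r2, f_stat


# Invented placeholder values, NOT taken from Lung.xlsx.
toy_age = [6, 7, 8, 9, 10, 11, 12, 13]
toy_smoke = [0, 0, 0, 0, 1, 0, 1, 1]
toy_fev = [1.5, 1.8, 2.0, 2.3, 2.4, 2.9, 3.0, 3.1]

beta = fit_model2(toy_age, toy_smoke, toy_fev)
for a, s in product((9, 10, 11), (0, 1)):
    print(f"age={a} smoke={s} predicted fev={predict_model2(beta, a, s):.3f}")
print("R^2 and F:", r2_and_f(beta, toy_age, toy_smoke, toy_fev))
```

With the real data, the three lists would be replaced by the `age`, `smoke` and `fev` columns of Lung.xlsx; the \(F\) statistic would then be compared with an \(F\) distribution on \(2\) and \(n-3\) degrees of freedom.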

Block II:

  1. Analyze the dependent variable. Is it normally distributed?
  2. Analyze the independent variables and describe their relationship with FEV.
  3. Fit and interpret Models 1 and 2. Do Age and/or Smoke affect FEV?
  4. Fit and interpret Models 3 and 4. Which model do you recommend to forecast FEV? Why?
  5. Fit and interpret Model 5. Do you think this model is the best one? Why?
  6. Create a new variable based on Age. The new variable (Age_Group) is 1 if \(Age > 9\) and 0 otherwise.
  7. Fit and interpret the “best” model per Age Group. Does using these models improve the accuracy of the FEV forecast?
  • Model 1: \[\text{fev}_i=\beta_0+\beta_1\text{Age}_i+\epsilon_i\]

  • Model 2: \[\text{fev}_i=\beta_0+\beta_1\text{Age}_i+\beta_2\text{Smoke}_i+\epsilon_i\]

  • Model 3: \[\text{fev}_i=\beta_0+\beta_1\text{Age}_i+\beta_2\text{Smoke}_i+\beta_3\text{Sex}_i+\epsilon_i\]

  • Model 4: \[\text{fev}_i=\beta_0+\beta_1\text{Age}_i+\beta_2\text{Smoke}_i+\beta_3\text{Sex}_i+\beta_4\text{ht}_i+\epsilon_i\]

  • Model 5: \[\text{fev}_i=\beta_0+\beta_1\text{Age}_i+\beta_2\text{Smoke}_i+\beta_3\text{Sex}_i+\beta_4\text{ht}_i+\beta_5\text{ht}^2_i+\beta_6\text{Age}_i\times\text{Smoke}_i+\epsilon_i\]
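
The five models differ only in which regressors enter the design matrix, so one way to organize the work is to build each row of regressors from the raw variables. The sketch below is a minimal illustration under that assumption: it constructs the regressor row for Models 1 to 5 (including the squared height and the Age × Smoke interaction of Model 5) and derives the Age_Group indicator from Block II. The helper names (`age_group`, `design_row`, `split_by_age_group`) and the dictionary record layout are hypothetical, and the example record is an invented placeholder rather than a row from Lung.xlsx.

```python
def age_group(age):
    """Age_Group indicator: 1 if age > 9, otherwise 0."""
    return 1 if age > 9 else 0


def design_row(model, age, smoke, sex, ht):
    """Regressor values (intercept first) for Model 1, 2, 3, 4 or 5."""
    if model not in (1, 2, 3, 4, 5):
        raise ValueError("model must be an integer from 1 to 5")
    row = [1.0, age]                  # Model 1: intercept and Age
    if model >= 2:
        row.append(smoke)             # Model 2 adds Smoke
    if model >= 3:
        row.append(sex)               # Model 3 adds Sex
    if model >= 4:
        row.append(ht)                # Model 4 adds ht
    if model >= 5:
        row.append(ht ** 2)           # Model 5 adds ht squared ...
        row.append(age * smoke)       # ... and the Age x Smoke interaction
    return row


def split_by_age_group(records):
    """Partition records (dicts with an 'age' key) by Age_Group."""
    groups = {0: [], 1: []}
    for rec in records:
        groups[age_group(rec["age"])].append(rec)
    return groups


# Invented placeholder record, NOT taken from Lung.xlsx.
example = {"age": 10, "smoke": 1, "sex": 0, "ht": 60.0}
print(design_row(5, example["age"], example["smoke"], example["sex"], example["ht"]))
print(age_group(example["age"]))
```

Fitting the chosen model separately on the two lists returned by `split_by_age_group` corresponds to the per-group analysis in question 7 of Block II.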


References: