D.1 Data sets

Adapted from here

D.1.1 Hospital Infection Risk

The hospital infection risk dataset consists of a sample of 113 hospitals in four regions of the U.S. The response variable is \(y\) = infection risk (percent of patients who get an infection) and the predictor variable is \(x\) = average length of stay (in days).

D.1.2 Skin Cancer Mortality

The dataset is used to study the relationship between the response “skin cancer mortality” and the predictor “latitude”.

The response variable y is the mortality due to skin cancer (number of deaths per 10 million people), and the predictor variable x is the latitude (degrees North) at the center of each of 49 states in the U.S. (skincancer.txt). The data were compiled in the 1950s, so Alaska and Hawaii were not yet states, and Washington, D.C. is included in the data set even though it is not technically a state.
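As a rough illustration of how a response/predictor file such as skincancer.txt might be read and summarized, here is a minimal Python sketch. It assumes a whitespace-delimited layout with a header row and no spaces inside any field; the column names `Lat` and `Mort`, the helper names `read_xy` and `least_squares_line`, and the demo rows are all hypothetical and are not taken from the actual file.

```python
import io


def read_xy(handle, x_col, y_col):
    """Read two numeric columns from a whitespace-delimited text stream.

    Assumes the first line is a header and that no field contains spaces.
    """
    header = handle.readline().split()
    xi, yi = header.index(x_col), header.index(y_col)
    xs, ys = [], []
    for line in handle:
        fields = line.split()
        if not fields:  # skip blank lines
            continue
        xs.append(float(fields[xi]))
        ys.append(float(fields[yi]))
    return xs, ys


def least_squares_line(xs, ys):
    """Return (intercept, slope) of the ordinary least-squares line of y on x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Synthetic placeholder rows for demonstration only; NOT values from skincancer.txt.
demo = io.StringIO("State Lat Mort\nA 30.0 200.0\nB 40.0 150.0\nC 50.0 100.0\n")
lat, mort = read_xy(demo, "Lat", "Mort")
intercept, slope = least_squares_line(lat, mort)
```

With the real file, the in-memory `demo` stream would be replaced by something like `open("skincancer.txt")`, and the column names would need to be adjusted to match the file's actual header.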

D.1.3 Hand and Height

The data consist of y = handspan (cm) and x = height (inches) for n = 167 students.

D.1.4 Old Faithful geyser

The variables are y = time to the next eruption and x = duration of the last eruption of the Old Faithful geyser.

D.1.5 Real Estate

The variables are y = sale price of a home and x = square footage of the home (realestate.txt).

D.1.6 Teen Birth Rate and Poverty Level Data

This dataset of size \(n = 51\) covers the \(50\) states and the District of Columbia in the United States. The variables are y = the 2002 birth rate per 1000 females 15 to 17 years old and x = the poverty rate, which is the percent of the state’s population living in households with incomes below the federally defined poverty level. (Data source: Mind On Statistics, 3rd edition, Utts and Heckard.)

D.1.7 Lung Function in 6 to 10 Year Old Children

The data are from n = 345 children between 6 and 10 years old. The variables are y = forced exhalation volume (FEV), a measure of how much air somebody can forcibly exhale from their lungs, and x = age in years. (Data source: The data here are part of the dataset given in Kahn, Michael (2005). “An Exhalent Problem for Teaching Statistics”, Journal of Statistics Education, 13(2).)