You might recall that we talked about linear regression a couple of months ago. A simple example is given by this look at our CDC data relating height and weight:
As we mentioned before, the correlation of about \(0.42\) is a quantitative assessment of the relationship between the variables, and the formula \(W = 4.87H - 154.24\) yields an estimate of the weight \(W\) in terms of the height \(H\).
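To see a concrete number, we can plug a height into that formula (heights here are in inches and weights in pounds, as in the rest of our CDC data; the value of 70 inches is just an illustrative choice):

H = 70
4.87 * H - 154.24  # predicted weight in pounds for a 70 inch tall man
## [1] 186.66

That is, the model estimates a weight of about 187 pounds for a man who is 70 inches tall.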
Here’s the thing, though: if you look back at those previous notes, you’ll find slightly different numbers. The reason is that the data is a random sample of the \(20,000\) men in the study; if we take a different random sample, we’ll get different numbers. Thus, the coefficients in linear regression can be treated as sample statistics, so they come with standard errors and \(p\)-values.
Let’s discuss how we might interpret the following:
set.seed(1)  # for a reproducible random sample
cdc = read.csv("https://www.marksmath.org/data/cdc.csv")
men = subset(cdc, gender == 'm')
subset = men[sample(1:length(men$height), 50), ]  # random sample of 50 men
cdc_fit = lm(subset$weight ~ subset$height)       # regress weight on height
summary(cdc_fit)
##
## Call:
## lm(formula = subset$weight ~ subset$height)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -73.120 -23.716  -8.848  17.896  93.392
##
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)
## (Intercept)   -122.182    144.958  -0.843   0.4035
## subset$height    4.504      2.043   2.205   0.0323 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 38.3 on 48 degrees of freedom
## Multiple R-squared: 0.09198, Adjusted R-squared: 0.07307
## F-statistic: 4.862 on 1 and 48 DF, p-value: 0.03227
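The slope estimate of \(4.504\) comes with a standard error of about \(2.04\) and a \(p\)-value of about \(0.032\), which quantify exactly the sample-to-sample variability described above. As a supplementary sketch (not part of the original analysis), we can redraw a few random samples of 50 men and refit the line to watch the slope estimate bounce around, and then use R's confint to turn the standard error into an interval estimate for the slope of our original fit:

# Refit the regression on several fresh random samples of 50 men;
# the estimated slope changes from sample to sample.
slopes = replicate(5, {
  s = men[sample(1:nrow(men), 50), ]
  coef(lm(weight ~ height, data = s))[2]
})
slopes

# 95% confidence intervals for the coefficients of the original fit
confint(cdc_fit)

For the slope, the interval works out to roughly \(4.504 \pm 2.01 \times 2.043\) (using the \(t^*\) value for 48 degrees of freedom), or about \((0.4, 8.6)\). Since this interval excludes zero (equivalently, since the \(p\)-value is below \(0.05\)), the sample provides evidence of a positive relationship between height and weight.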