Stat 185 - Practice sheet for the final exam

We'll have an in-class exam next week. Here's some key info:

  1. You should bring a calculator
  2. You'll be able to look up probabilities using both the normal and $T$-distributions on this table.
  3. You can look up your first name here to find which day you are scheduled to take the exam.

The problems will be very much like the ones below.


  1. Suppose a random sample of 105 competitors in the 2012 Boston Marathon had an average time of 253.77 minutes with a standard deviation of 45.72 minutes. We wish to write down a $98\%$ confidence interval for the average time based on this data.
    1. Find the standard error associated with this sample.
    2. Use a normal table to find the $z^*$ value that corresponds to a $98\%$ confidence interval.
    3. Write down the $98\%$ confidence interval.
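If you'd like to check your arithmetic on this problem, here is a minimal Python sketch. It assumes the usual large-sample recipe (standard error $s/\sqrt{n}$, interval $\bar{x} \pm z^* \cdot SE$) and uses the standard library's `NormalDist` in place of the printed normal table. The variable names are my own.

```python
from math import sqrt
from statistics import NormalDist

n = 105          # sample size
x_bar = 253.77   # sample mean (minutes)
s = 45.72        # sample standard deviation (minutes)
confidence = 0.98

# Standard error of the sample mean
se = s / sqrt(n)

# Two-sided interval: half of the leftover probability goes in each tail
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

lower = x_bar - z_star * se
upper = x_bar + z_star * se
print(f"SE = {se:.3f}, z* = {z_star:.3f}, CI = ({lower:.2f}, {upper:.2f})")
```

On the exam you would read $z^*$ from the table rather than compute it, so expect small rounding differences.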
  2. Suppose we randomly select 4 runners from the 2012 Boston marathon and find their times in minutes to be
    273.8, 203.5, 259.4, 246.1
    1. Write down a formula showing that the mean of these times is $245.7$.
    2. Write down a formula showing that the standard deviation of these times is approximately $26.26$.
    3. Find the standard error associated with this sample.
    4. Write down a $95\%$ confidence interval for the average time of Boston Marathon runners based on this data.
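Here is a sketch for checking the small-sample computation. Note one assumption: the value $26.26$ quoted above is what you get when the sum of squared deviations is divided by $n$ (`statistics.pstdev`); dividing by $n-1$ (`statistics.stdev`) would give about $30.32$ instead. The sketch follows the value on this sheet. The multiplier `t_star` is hard-coded from a standard $T$-table (3 degrees of freedom, $95\%$ two-sided) rather than computed.

```python
from math import sqrt
from statistics import mean, pstdev

times = [273.8, 203.5, 259.4, 246.1]  # minutes
n = len(times)

x_bar = mean(times)
s = pstdev(times)   # divides by n; this reproduces the 26.26 on the sheet
se = s / sqrt(n)

# ASSUMPTION: t* read from a T-table, df = n - 1 = 3, 95% two-sided
t_star = 3.182

lower = x_bar - t_star * se
upper = x_bar + t_star * se
print(f"mean = {x_bar:.1f}, sd = {s:.2f}, SE = {se:.2f}")
print(f"95% CI = ({lower:.2f}, {upper:.2f})")
```

With only four observations the $T$-distribution matters a lot: the multiplier is noticeably larger than the normal value of about $1.96$.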
  3. A random sample of 1200 runners in the 2012 Boston Marathon found that 507 of them were women. Run a hypothesis test to check the null hypothesis that half of marathon runners are women against the alternative hypothesis that less than half of marathon runners are women. Be sure to
    1. Clearly state your hypothesis in terms of differences,
    2. compute the standard error,
    3. compute the test-statistic, and
    4. state the conclusion.
    You might or might not feel the need to compute a $p$-value.
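A sketch of the one-sample proportion test, in case you want to verify your numbers. It assumes the standard error is computed under the null hypothesis, $\sqrt{p_0(1-p_0)/n}$ with $p_0 = 0.5$, and that the normal approximation applies. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n = 1200
women = 507
p0 = 0.5                 # proportion under the null hypothesis

p_hat = women / n        # observed proportion

# Hypotheses in terms of a difference:
#   H0: p - 0.5 = 0    vs    HA: p - 0.5 < 0
se = sqrt(p0 * (1 - p0) / n)   # standard error assuming H0 is true
z = (p_hat - p0) / se          # test statistic

# One-sided (left-tail) p-value
p_value = NormalDist().cdf(z)
print(f"p_hat = {p_hat:.4f}, SE = {se:.4f}, z = {z:.2f}, p-value = {p_value:.2e}")
```

A test statistic this far below zero is off the edge of most printed tables, which is why you may not need an explicit $p$-value to state the conclusion.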
  4. In the 2012 Boston Marathon, there were
    • 7217 runners in their 40s with an average time of 255.2 minutes and a standard deviation of 43.7 minutes, and
    • 4156 runners in their 50s with an average time of 270.8 minutes and a standard deviation of 44.7 minutes.
    Use this information to test the null hypothesis that $\mu_{40}=\mu_{50}$ against the alternative hypothesis that $\mu_{40} < \mu_{50}$, where $\mu_{40}$ denotes the average time of runners in their forties and $\mu_{50}$ denotes the average time of runners in their fifties. Be sure to
    1. Clearly state your hypothesis in terms of differences,
    2. compute the standard error,
    3. compute the test-statistic, and
    4. state the conclusion.
    You might or might not feel the need to compute a $p$-value.
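The two-sample comparison can be checked with the sketch below. It assumes the unpooled standard error $\sqrt{s_{40}^2/n_{40} + s_{50}^2/n_{50}}$ and, because both groups are large, a normal reference distribution. The variable names are my own.

```python
from math import sqrt
from statistics import NormalDist

n40, mean40, sd40 = 7217, 255.2, 43.7   # runners in their 40s
n50, mean50, sd50 = 4156, 270.8, 44.7   # runners in their 50s

# Hypotheses in terms of a difference:
#   H0: mu40 - mu50 = 0    vs    HA: mu40 - mu50 < 0
diff = mean40 - mean50

# Unpooled standard error of the difference of two sample means
se = sqrt(sd40**2 / n40 + sd50**2 / n50)
z = diff / se

# Left-tail p-value; it is so small that it may print as 0.0
p_value = NormalDist().cdf(z)
print(f"diff = {diff:.1f}, SE = {se:.4f}, z = {z:.2f}, p-value = {p_value:.2e}")
```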
  5. In the 2012 Boston Marathon, there were 59 runners under the age of 40 who had also run the Boston Marathon in 2002 when they were under the age of 30. I computed the pairwise difference of those runners' times in 2012 minus their times in 2002 and found a mean of 26.1 minutes with a standard deviation of 32.7 minutes. Let's use this data to run a hypothesis test to see if runners slow down over this age range. Specifically, let $\mu_1$ denote their mean time in 2002 and let $\mu_2$ denote their mean time in 2012. Test the null hypothesis that $\mu_1 = \mu_2$ vs the alternative hypothesis that $\mu_1<\mu_2$ at the $99\%$ confidence level. Be sure to
    1. Clearly state your hypothesis in terms of differences,
    2. compute the standard error,
    3. compute the test-statistic, and
    4. state the conclusion.
    You might or might not feel the need to compute a $p$-value.
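For the paired data, the test reduces to a one-sample test on the differences. The sketch below makes one simplifying assumption: since $n = 59$ is fairly large, it compares the test statistic to the normal critical value for a one-sided $1\%$ test instead of the $T$-table value with 58 degrees of freedom. The $T$ value is slightly larger, but the test statistic is far beyond either cutoff, so the conclusion is the same.

```python
from math import sqrt
from statistics import NormalDist

n = 59
mean_diff = 26.1   # mean of (2012 time - 2002 time), minutes
sd_diff = 32.7     # standard deviation of those differences

# Hypotheses in terms of a difference:
#   H0: mu2 - mu1 = 0    vs    HA: mu2 - mu1 > 0
se = sd_diff / sqrt(n)
t_stat = mean_diff / se

# ASSUMPTION: normal approximation to the critical value (one-sided, 1%)
critical = NormalDist().inv_cdf(0.99)
reject_null = t_stat > critical
print(f"SE = {se:.3f}, test statistic = {t_stat:.2f}, "
      f"critical value = {critical:.3f}, reject H0: {reject_null}")
```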
  6. The picture below shows a scatter plot for a random sample of 1200 runners in the Boston Marathon. The $x$-coordinate of each point corresponds to the runner's age and the $y$-coordinate corresponds to the runner's time in minutes. The regression line for the data is also shown and has formula $$y = 0.719x + 227.6.$$
    1. What time does this regression model predict for a 56 year old runner?
    2. Which of the following could be a reasonable value for the correlation between age and time: 0.9, 0.2, -0.2, or -0.9?
    3. Suppose I run a linear regression test on this data and I get results like the following:
      LinregressResult(slope=0.71868458889427973, intercept=227.5951853339659, rvalue=0.258752071669993, pvalue=8.263097704087485e-20, stderr=0.13124270715070266)
      Can I conclude at the 99% level of confidence that there is a linear relationship between age and speed?
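A short sketch for reading the regression output. It only plugs into the rounded line given above and compares the reported `pvalue` to the $1\%$ threshold that goes with a $99\%$ confidence level. The numbers are copied from the output shown and nothing is refit.

```python
# Rounded regression line from the problem statement
slope, intercept = 0.719, 227.6

def predicted_time(age):
    """Predicted marathon time in minutes for a runner of the given age."""
    return slope * age + intercept

prediction = predicted_time(56)

# Values copied from the LinregressResult shown above
rvalue = 0.258752071669993
pvalue = 8.263097704087485e-20

alpha = 0.01                  # 99% confidence level
significant = pvalue < alpha  # True means we reject "slope = 0"

print(f"predicted time at age 56: {prediction:.1f} minutes")
print(f"r = {rvalue:.3f}, significant at the 99% level: {significant}")
```

The positive slope and the modest `rvalue` are also the clues for choosing among the four candidate correlations.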