A problem

The following data records the average mercury content in dolphin muscle of 19 Risso’s dolphins from the Taiji area in Japan:

d = c(2.57,4.43,2.09,7.68,4.77,2.12,5.13,5.71,5.33,3.31,7.49,4.91,2.58,1.08,6.60,3.91,3.97,6.18,5.90)

While it might be important for researchers to assess the threat of mercury in the ocean, they do not want to go kill more dolphins to get that data. What can they conclude from this data set, even though it’s a bit too small to use a normal distribution?

The basics

The normal distribution, as awesome as it is, requires that we work with large sample sizes.

The \(t\)-distribution is similar but better suited to small sample sizes.

Just as with the normal distribution, there’s not just one \(t\)-distribution but, rather, a family of distributions.

Just as there’s a formula for the normal distribution, there’s a formula for the \(t\)-distribution. It’s a bit more complicated, though.

Like all continuous distributions, we compute probabilities with the \(t\)-distribution by computing the area under a curve. We do so using either a computer or a table.

The degree to which a \(t\)-distribution deviates from the normal is determined by its “degrees of freedom” parameter \(df\), which is one less than the sample size.

The mean of the \(t\)-distribution is zero and, for \(df>2\), its variance is related to the degrees of freedom \(df\) by \[\sigma^2 = \frac{df}{df-2}.\]
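As a quick sanity check, we can verify this variance formula numerically (a sketch: we simply integrate \(x^2\) against the density, since the mean is zero):

```r
# The variance of a t-distribution with df = 18 should be 18/(18-2) = 1.125.
# Check by numerically integrating x^2 against the density dt(x, 18):
integrate(function(x) x^2 * dt(x, 18), -Inf, Inf)$value  # approximately 1.125
18 / (18 - 2)                                            # the formula: 1.125
```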

Unlike the normal distribution, there’s no easy way to translate from an arbitrary \(t\)-distribution to a single standard one. As a result, it’s less common to use tables and more common to use software than it is with the normal.

Given a particular number of degrees of freedom, however, there is a standard way to derive a \(t\)-score that’s analogous to the \(z\)-score for the normal distribution. This \(t\)-score is a crucial thing that you need to know when using tables for the \(t\)-distribution.

The \(t\)-test for one sample mean

Our main uses of the \(t\)-distribution will be to construct confidence intervals and perform hypothesis tests for relatively small data sets. The main tool for doing so is the so called \(t\)-test. The \(t\)-test is to the \(t\)-distribution as the \(z\)-test is to the normal distribution.

The \(t\)-test is a little tricky due to the lack of a “standard” version of the \(t\)-distribution. R has a built-in command, called t.test, that automates the procedure, though. While t.test uses the \(t\)-distribution to deal with small sample sizes, it actually works well with samples of all sizes. The reason is that, as the degrees of freedom grow, the corresponding \(t\)-distribution approaches the normal distribution.
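We can see this convergence directly. Here’s a small sketch comparing a fixed cumulative probability across increasing degrees of freedom:

```r
# As df grows, probabilities computed with the t-distribution approach
# those computed with the standard normal distribution:
sapply(c(5, 30, 100, 1000), function(df) pt(1.96, df))
pnorm(1.96)  # approximately 0.975
```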

Example

Here’s how we can examine that data set on mercury levels in Risso’s dolphins using a \(t\)-test in R:

d = c(2.57,4.43,2.09,7.68,4.77,2.12,5.13,5.71,5.33,3.31,7.49,4.91,2.58,1.08,6.60,3.91,3.97,6.18,5.90)
t.test(d, mu=3.6)
## 
##  One Sample t-test
## 
## data:  d
## t = 2.1177, df = 18, p-value = 0.04838
## alternative hypothesis: true mean is not equal to 3.6
## 95 percent confidence interval:
##  3.607245 5.420123
## sample estimates:
## mean of x 
##  4.513684

There’s quite a bit going on here that we need to parse. First, there’s the command itself:

t.test(d, mu=3.6)

I guess it’s pretty clear that we’re applying t.test to our data set d. The mu=3.6 option specifies the assumed null value that we are comparing the data against. Thus, in the output we see the lines:

## t = 2.1177, df = 18, p-value = 0.04838
## alternative hypothesis: true mean is not equal to 3.6

Note that the value of \(3.6\) appears in the statement of the alternative hypothesis; the associated \(p\)-value is, evidently, \(0.04838\). Since this is less than \(0.05\), we can reject the null hypothesis that the true mean is \(3.6\) at the \(5\%\) level of significance.

The computation of the \(p\)-value is based on the \(t\)-score in much the same way that the \(p\)-value for the normal distribution is based on the \(z\)-score. In the output above, the \(t\)-score is indicated by t = 2.1177.

Finally, the \(95\%\) confidence interval is just what you’d probably think - we have a \(95\%\) level of confidence that the interval contains the actual mean. This confidence interval is constructed in exactly the same way that we’ve constructed confidence intervals before - it has the form \[[\bar{x} - ME, \bar{x} + ME].\]

For this particular problem, we might be interested in a one-sided hypothesis test to check, for example, if the mercury concentration is too high. Thus, we might run the command

t.test(d, mu=3.6, alternative = "greater")
## 
##  One Sample t-test
## 
## data:  d
## t = 2.1177, df = 18, p-value = 0.02419
## alternative hypothesis: true mean is greater than 3.6
## 95 percent confidence interval:
##  3.765526      Inf
## sample estimates:
## mean of x 
##  4.513684

Not surprisingly, the \(p\)-value is exactly half of the previous \(p\)-value and the confidence interval is a semi-infinite interval whose left endpoint lies to the right of the null value \(3.6\).

Using the distribution itself

Direct use of the t.test command is the easiest way to apply a \(t\)-test and often the most convenient way to use the \(t\)-distribution. However, we can gain a deeper understanding of the \(t\)-distribution and its relationship to the normal distribution by using the fundamental distribution functions:

  • dt - the density function
  • pt - the cumulative probability function
  • qt - the inverse of pt known as the quantile function

These are all analogous to the corresponding functions dnorm, pnorm, and qnorm that we’ve used when exploring the normal distribution.

In fact, every distribution in R has functions that follow this naming scheme.

dt

This function defines the density for the \(t\)-distribution. Thus, we can plot it like so:

f = function(x){dt(x,5)}
plot(f, -3,3)
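To visualize how this density deviates from the normal, we might overlay the standard normal density as a dashed curve (a sketch; note the lower peak and heavier tails of the \(t\)-distribution):

```r
# Plot the t density with 5 degrees of freedom together with the
# standard normal density (dashed) for comparison:
f = function(x){dt(x,5)}
plot(f, -3, 3)
curve(dnorm(x), -3, 3, lty = 2, add = TRUE)
```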

pt

The cumulative probability distribution function for the \(t\)-distribution

Given a random variable \(X\) with a \(t\)-distribution with \(df\) degrees of freedom and a real number \(x\), pt(x,df) computes the area under the graph of the density from \(-\infty\) to \(x\).

For example, the following command computes the probability that a random variable \(X\) generated by the \(t\)-distribution with \(18\) degrees of freedom is less than \(-1\):

pt(-1,18)
## [1] 0.1652825


qt

The quantiles for the \(t\)-distribution

This is an inverse to pt. Thus, given a probability \(p\) and a number \(df\) of degrees of freedom, qt(p,df) computes the real number \(x\) such that pt(x,df) produces the probability \(p\). Symbolically, \[\texttt{pt}(\texttt{qt}(p,df),df) = p.\] Referring again to the computation above, we saw that \(\texttt{pt}(-1,18)\approx0.165\); thus, we should have \(\texttt{qt}(0.165,18)\approx-1\) (up to roundoff):

qt(0.165,18)
## [1] -1.001201

The interpretation is that just over \(16.5\%\) of the population lies below \(x=-1\) in this \(t\)-distribution.

Note that the qt command is quite handy for finding \(t^*\) multiplier values, as we’ll see below.

The \(t\)-score and the \(p\)-value

The \(t\)-score for a \(t\)-distribution is computed in much the same way that a \(z\)-score is computed for a normal distribution. That is, given a sample of size \(n\) with mean \(\bar{x}\) and standard deviation \(s\), the \(t\)-score relative to an assumed mean \(\mu_0\) is \[T = \frac{\bar{x}-\mu_0}{s/\sqrt{n}}.\] In our dolphin example above, the assumed mean is \(\mu_0=3.6\) and the observed mean is \(\bar{x}=4.513684\). We can compute the standard error easily enough:

n = length(d)
se = sd(d)/sqrt(n)
se
## [1] 0.4314481

Thus, the \(t\)-score is

t_score = (mean(d)-3.6)/se
t_score
## [1] 2.117715

This agrees with the first line of output from our application of t.test.

The \(p\)-value is then computed from the \(t\)-score in the same way that the normal \(p\)-value is computed from the \(z\)-score. In this case, the observed mean is larger than the assumed mean. Thus, the probability that the mean might be this large or larger, assuming the null value, is

1 - pt(t_score, 18)
## [1] 0.0241896

Of course, the alternative hypothesis, as stated in the output of t.test, is that the true mean is unequal to \(3.6\); thus, we should double this number to obtain the \(p\)-value in that output.
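Putting those two steps together, the two-sided \(p\)-value reported by t.test can be reproduced directly:

```r
# Double the one-sided tail area to get the two-sided p-value:
t_score = 2.117715
2 * (1 - pt(t_score, 18))  # approximately 0.04838, matching t.test
```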

Computing confidence intervals

Computing a confidence interval by hand is fairly easy, if you have the right information. The general form is

\[[\bar{x} - ME, \bar{x} + ME],\] where the margin of error \(ME\) has the form \[ME = t^* \times SE.\]

Getting \(t^*\)

We get \(t^*\) from the \(t\)-distribution in the same way that we get \(z^*\) from the normal distribution. To find a \(95\%\) confidence interval, we need the two points that cut off \(2.5\%\) of the area in each tail, so that the two tails together contain \(5\%\) of the area under the curve. These endpoints can be computed as follows:

c(qt(0.025, 18), qt(0.975,18))
## [1] -2.100922  2.100922

Of course, you should adjust the 18 to whatever value of \(df\) you need.

Note that we can also find that \(t^*=2.1\) in the \(t\)-table on our webpage, where we see something that looks like so:

          one tail    0.100   0.050   0.025   0.010   0.005
          two tails   0.200   0.100   0.050   0.020   0.010
  df   1              3.08    6.31   12.71   31.82   63.66
       2              1.89    2.92    4.30    6.96    9.92
       ⋮
      18              1.33    1.73    2.10    2.55    2.88

The entries in this table are called critical \(t^*\) values. The columns correspond to several common tail probabilities, labeled both as one-tail and as two-tail probabilities; the rows correspond to degrees of freedom. Looking in the row where \(df=18\) and the column where the two-tail probability is 0.05, we see that \(t^*=2.10\).

Our confidence interval

With \(t^*\) in hand, we can compute the margin of error:

m = mean(d)
me = 2.1*sd(d)/sqrt(19)
c(m-me, m+me)
## [1] 3.607643 5.419725

Again, this is in agreement with the results of t.test.

Another example

Chips Ahoy!

In 1998, Nabisco announced its 1000 chip challenge, claiming that every bag of Chips Ahoy cookies contained at least 1000 chips. A group of students counted the chips in 16 bags of cookies and found the following data:

chips = c(1219,1214,1087,1200,1419,1121,1325,1345,1244,1258,1356,1132,1191,1270,1295,1135)
length(chips)
## [1] 16
  • Create a \(99\%\) confidence interval for the number of chips in the bag.
  • What is the \(p\)-value that the average number of chips is larger than 1000?

Solutions

There are a couple of approaches that we should understand.

Solution 1: The easiest approach is to just use the t.test.

For the first part a simple call to t.test with the conf.level parameter set to \(0.99\) should do it.

t.test(chips, conf.level=0.99)
## 
##  One Sample t-test
## 
## data:  chips
## t = 52.531, df = 15, p-value < 2.2e-16
## alternative hypothesis: true mean is not equal to 0
## 99 percent confidence interval:
##  1168.732 1307.643
## sample estimates:
## mean of x 
##  1238.188

I guess the confidence interval is \([1168.73, 1307.64]\). The \(p\)-value that we see doesn’t really help because we haven’t set the null value or alternative hypothesis. We can do this like so:

t.test(chips, conf.level=0.99, alternative = "greater", mu=1000)
## 
##  One Sample t-test
## 
## data:  chips
## t = 10.105, df = 15, p-value = 2.176e-08
## alternative hypothesis: true mean is greater than 1000
## 99 percent confidence interval:
##  1176.846      Inf
## sample estimates:
## mean of x 
##  1238.188

It looks like Nabisco is safe!

Solution 2: In order to do the HW, you’ll need to understand this from a slightly different perspective. From that perspective, all you really need to know is the mean and standard deviation of the data, rather than the whole data set. Of course, we can compute this with R:

c(mean(chips), sd(chips))
## [1] 1238.188   94.282

In the 2nd HW problem, that’s all you’re given. As before, the \(99\%\) confidence interval should have the form \[[1238.188 - ME, 1238.188 + ME].\] The margin of error \(ME\) has the form \[ME = t^* \times SE,\] where \(SE = s/\sqrt{n} = 94.282/\sqrt{16} \approx 23.5705\) and \(t^*\) is chosen from the \(t\)-distribution with 15 degrees of freedom so that the two tails outside \([-t^*,t^*]\) together have probability \(0.01\). We can do this in R as follows:

qt(0.005,15)
## [1] -2.946713

We can also look this up in a table to see that the value is about \(2.95\). Putting this all together, we find that our confidence interval is \[[1238.188 - 2.95\times23.5705, 1238.188 + 2.95\times23.5705] \approx [1168.655,1307.721].\]
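If we keep the exact \(t^*\) from qt rather than the rounded value \(2.95\), the hand computation reproduces the t.test interval from the first solution (a sketch using only the summary statistics):

```r
# Recompute the 99% confidence interval from the summary statistics alone:
xbar  = 1238.188
s     = 94.282
n     = 16
tstar = qt(0.995, n - 1)     # about 2.946713
me    = tstar * s / sqrt(n)
c(xbar - me, xbar + me)      # approximately [1168.73, 1307.64]
```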

To compute the \(p\)-value for the alternative hypothesis that the number of the chips in the bag is greater than 1000, we first compute the \(t\)-statistic:

\[\frac{\bar{x}-\mu_0}{s/\sqrt{n}} = \frac{1238.188 - 1000}{94.282/\sqrt{16}} \approx 10.10534.\]

Finally, note that 1-pt(10.10534, 15) returns the following \(p\)-value, which is so small that we must reject the null hypothesis that the mean number of chips is only 1000:

1-pt(10.10534,15)
## [1] 2.176297e-08