A retrospective look

Here's the big picture of what we've done so far this summer term:

A prospective look

So what kinds of things might we still do?

Intro to the \(t\)-distribution

A \(t\)-distribution is often used in place of a normal distribution when the sample size is too small for the normal distribution to be appropriate.

A problem

The following data set records the average mercury content in the muscle of 19 Risso’s dolphins from the Taiji area in Japan:

d = c(2.57,4.43,2.09,7.68,4.77,2.12,5.13,5.71,5.33,3.31,7.49,4.91,2.58,1.08,6.60,3.91,3.97,6.18,5.90)

While it might be important for researchers to assess the threat of mercury in the ocean, they do not want to go kill more dolphins to get that data. What can they conclude from this data set, even though it’s a bit too small to use a normal distribution?

The basics

The normal distribution, as awesome as it is, requires that we work with large sample sizes.

The \(t\)-distribution is similar but better suited to small sample sizes.

Just as with the normal distribution, there’s not just one \(t\)-distribution but, rather, a family of distributions.

Just as there’s a formula for the normal distribution, there’s a formula for the \(t\)-distribution. It’s a bit more complicated, though.
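For the curious, here it is: the \(t\)-distribution with \(df\) degrees of freedom (a parameter we’ll say more about in a moment) has density \[f(t) = \frac{\Gamma\!\left(\frac{df+1}{2}\right)}{\sqrt{df\,\pi}\;\Gamma\!\left(\frac{df}{2}\right)}\left(1 + \frac{t^2}{df}\right)^{-\frac{df+1}{2}},\] where \(\Gamma\) denotes the gamma function.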

Like all continuous distributions, we compute probabilities with the \(t\)-distribution by computing the area under a curve. We do so using either a computer or a table.
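In R, for example, the built-in pt function computes exactly these areas; a minimal example:

pt(1.5, df = 10)  # P(T < 1.5) for a t-distribution with 10 degrees of freedom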

The degree to which a \(t\)-distribution deviates from the normal is determined by its “degrees of freedom” parameter \(df\), which is one less than the sample size. For our sample of 19 dolphins, for example, \(df = 19 - 1 = 18\).

The mean of the \(t\)-distribution is zero and, for \(df > 2\), its variance is related to the degrees of freedom \(df\) by \[\sigma^2 = \frac{df}{df-2}.\]
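As a quick numerical check, here’s a little sketch using R’s built-in rt simulator; with \(df = 18\), the variance should be close to \(18/16 = 1.125\):

# Simulate one million draws from a t-distribution with 18 degrees of freedom
var(rt(10^6, df = 18))  # should be close to 18/(18-2) = 1.125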

Unlike the normal distribution, there’s no easy way to translate from one \(t\)-distribution to a single standard one; each value of \(df\) yields a genuinely different curve. As a result, it’s less common to use tables and more common to use software than it is with the normal.

Given a particular number of degrees of freedom, however, there is a standard way to derive a \(t\)-score that’s analogous to the \(z\)-score for the normal distribution. This \(t\)-score is a crucial thing that you need to know when using tables for the \(t\)-distribution.

The \(t\)-test on the computer

Our main uses of the \(t\)-distribution will be to construct confidence intervals and perform hypothesis tests for relatively small data sets. The main tool for doing so is the so-called \(t\)-test. The \(t\)-test is to the \(t\)-distribution as the \(z\)-test is to the normal distribution.

The \(t\)-test is a little tricky due to the lack of a “standard” version of the \(t\)-distribution. R has a built-in command called t.test that automates the procedure, though. While t.test uses the \(t\)-distribution to deal with small sample sizes, it actually works well with samples of all sizes. The reason is that, as the number of degrees of freedom grows, the corresponding \(t\)-distribution approaches the normal distribution.
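We can see that convergence directly in R by comparing cumulative probabilities; as \(df\) grows, the values returned by pt get closer and closer to the value returned by pnorm:

pt(2, df = c(5, 30, 1000))  # t-distribution probabilities for increasing df
pnorm(2)                    # the normal probability they approach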

Example

Here’s how we can examine that data set on mercury levels in Risso’s dolphins using a \(t\)-test in R:

d = c(2.57,4.43,2.09,7.68,4.77,2.12,5.13,5.71,5.33,3.31,7.49,4.91,2.58,1.08,6.60,3.91,3.97,6.18,5.90)
t.test(d, mu=3.6)
## 
##  One Sample t-test
## 
## data:  d
## t = 2.1177, df = 18, p-value = 0.04838
## alternative hypothesis: true mean is not equal to 3.6
## 95 percent confidence interval:
##  3.607245 5.420123
## sample estimates:
## mean of x 
##  4.513684

There’s quite a bit going on here that we need to parse. First, there’s the command itself:

t.test(d, mu=3.6)

I guess it’s pretty clear that we’re applying t.test to our data set d. The mu=3.6 option specifies the assumed null value against which we compare the data. Thus, in the output we see the lines:

## t = 2.1177, df = 18, p-value = 0.04838
## alternative hypothesis: true mean is not equal to 3.6

Note that the value of \(3.6\) appears in the statement of the alternative hypothesis; the associated \(p\)-value is, evidently, \(0.04838\). Since that’s less than \(0.05\), we can reject the null hypothesis that the true mean is \(3.6\).

The computation of the \(p\)-value is based on the \(t\)-score in much the same way that the \(p\)-value for the normal distribution is based on the \(z\)-score. In the output above, the \(t\)-score is indicated by t = 2.1177.
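In fact, we can recover the two-sided \(p\)-value from that \(t\)-score using R’s pt function; a quick check:

2*pt(2.1177, df = 18, lower.tail = FALSE)  # about 0.04838, matching the output above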

Finally, the \(95\%\) confidence interval is just what you’d probably think - we have a \(95\%\) level of confidence that the interval contains the actual mean. This confidence interval is constructed in exactly the same way that we’ve constructed confidence intervals before - it has the form \[[\bar{x} - ME, \bar{x} + ME],\] where the margin of error is \(ME = t^* \times s/\sqrt{n}\).
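We can rebuild that interval by hand, using R’s qt function to find the critical value \(t^*\):

t_star = qt(0.975, df = 18)          # critical t* for 95% confidence with df = 18
me = t_star*sd(d)/sqrt(length(d))    # margin of error
mean(d) + c(-me, me)                 # reproduces the interval reported by t.test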

For this particular problem, we might be interested in a one-sided hypothesis test to check, for example, if the mercury concentration is too high. Thus, we might run the command

t.test(d, mu=3.6, alternative = "greater")
## 
##  One Sample t-test
## 
## data:  d
## t = 2.1177, df = 18, p-value = 0.02419
## alternative hypothesis: true mean is greater than 3.6
## 95 percent confidence interval:
##  3.765526      Inf
## sample estimates:
## mean of x 
##  4.513684

Not surprisingly, the \(p\)-value is exactly half of the previous \(p\)-value and the confidence interval is now a semi-infinite interval; it provides only a lower bound for the mean, and that bound lies above the null value of \(3.6\).
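Both facts are easy to verify from the \(t\)-score and the standard error:

pt(2.1177, df = 18, lower.tail = FALSE)            # one-sided p-value, about 0.02419
mean(d) - qt(0.95, df = 18)*sd(d)/sqrt(length(d))  # lower bound, about 3.765526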

The \(t\)-score and the \(t\)-table

Computing a \(t\)-score

The \(t\)-score for a \(t\)-distribution is computed in much the same way that a \(z\)-score is computed for a normal distribution. That is, given a sample of size \(n\) with mean \(\bar{x}\) and standard deviation \(s\), the \(t\)-score for comparing against an assumed mean \(\mu\) is \[T = \frac{\bar{x}-\mu}{s/\sqrt{n}}.\] In our dolphin example above, the assumed mean is \(\mu = 3.6\) and the observed mean is \(\bar{x} = 4.513684\). We can compute the standard error easily enough:

n = length(d)
se = sd(d)/sqrt(n)
se
## [1] 0.4314481

Thus, the \(t\)-score is

t_score = (mean(d)-3.6)/se
t_score
## [1] 2.117715

This agrees with the first line of output from our application of t.test.

Critical values in the \(t\)-table

The typical \(t\)-table doesn’t contain nearly as much information as a normal table; it just contains critical cutoffs for several common choices of confidence level. Thus, we don’t usually compute \(p\)-values directly from a table, but we can still typically assess whether or not we should reject the null hypothesis.

We can also find the critical value \(t^*=2.10\) in the \(t\)-table on our webpage, where we see something that looks like so:

one tail    0.100  0.050  0.025  0.010  0.005
two tails   0.200  0.100  0.050  0.020  0.010
df = 1       3.08   6.31  12.71  31.82  63.66
df = 2       1.89   2.92   4.30   6.96   9.92
...
df = 18      1.33   1.73   2.10   2.55   2.88

The entries in this table are the critical \(t^*\) values. The columns correspond to several common significance levels, labeled both for one-tailed and for two-tailed tests. The rows correspond to degrees of freedom.

Now, look in the row where \(df=18\) and the column where the two-tailed level is \(0.05\); we see that \(t^*=2.10\). Since the \(t\)-score in our dolphin example is \(2.1177 > 2.10\), we can reject the hypothesis that the average mercury content is \(\mu=3.6\) at the \(95\%\) confidence level.
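If a table isn’t handy, R’s qt function produces the same critical value:

qt(0.975, df = 18)  # about 2.10, the two-tailed 0.05 cutoff for df = 18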