Thu, Mar 28, 2024
Suppose we have the following data on the level of mercury in dolphin muscle arising from 19 Risso’s dolphins from the Taiji area in Japan:
2.57 | 4.43 | 2.09 | 7.68 | 4.77 | 2.12 | 5.13 | 5.71 | 5.33 | 3.31 |
7.49 | 4.91 | 2.58 | 1.08 | 6.6 | 3.91 | 3.97 | 6.18 | 5.9 |
The units are micrograms of mercury per gram of dolphin muscle. We now ask: what kinds of conclusions can we draw from this data?
The data we have is one of many possible samples and, as such, its mean may be considered to be a random variable.
The basic idea in statistics is to first determine the distribution of these types of random variables and then model the data based on that distribution.
To this point, we’ve used the normal distribution to model our data and with good reason - the central limit theorem tells us that this process should work well for means provided that the sample size is sufficiently large.
A bare minimum for “sufficiently large” is 30, and even that assumes the underlying data is not too far from normally distributed; otherwise, 100 or more is better.
We now discuss a new distribution (called the \(t\)-distribution) that’s specifically designed to account for the extra variability inherent in smaller samples.
The \(t\)-distribution is closely related to the standard normal distribution but has heavier tails to account for the extra variability inherent in small sample sizes. As the degrees of freedom increases, the corresponding \(t\)-distribution gets closer and closer to the standard normal.
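We can illustrate this convergence numerically with SciPy: as the degrees of freedom grow, the 97.5th percentile of the \(t\)-distribution shrinks toward the corresponding percentile of the standard normal.

```python
from scipy.stats import t, norm

# 97.5th percentile of the t-distribution for increasing degrees of freedom
for df in [5, 30, 100, 1000]:
    print(df, t.ppf(0.975, df))

# ...compared with the 97.5th percentile of the standard normal
print(norm.ppf(0.975))  # about 1.96
```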
There’s even a formula:
\[f(t) = \frac{\left(\frac{\nu-1}{2}\right)!} {\sqrt{\nu\pi}\,\left(\frac{\nu-2}{2}\right)!} \left(1+\frac{t^2}{\nu} \right)^{\!-\frac{\nu+1}{2}}\]
You can check this on Desmos.
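We can also check the formula numerically against SciPy's built-in density. Writing the factorials via the gamma function (\(x! = \Gamma(x+1)\)), a direct translation might look like:

```python
from math import gamma, pi, sqrt
from scipy.stats import t

def t_pdf(x, nu):
    """Density of the t-distribution with nu degrees of freedom,
    computed directly from the formula above."""
    return (gamma((nu + 1) / 2)
            / (sqrt(nu * pi) * gamma(nu / 2))
            * (1 + x**2 / nu) ** (-(nu + 1) / 2))

# Compare with SciPy's built-in density at a few points
for x in [-2.0, 0.0, 1.5]:
    print(t_pdf(x, 18), t.pdf(x, 18))
```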
Finding a confidence interval using a \(t\)-distribution is a lot like finding one using the normal. It’ll have the form \[ [\overline{x}-ME, \overline{x}+ME], \] where the margin of error \(ME\) is \[ ME = t^* \frac{s}{\sqrt{n}}. \] Here, \(s\) is the sample standard deviation. Note that the familiar \(z^*\) multiplier has been replaced with a \(t^*\)-multiplier; it plays the exact same role but it comes from the \(t\)-distribution, rather than the normal distribution.
Recall that we have the following actual data recording the mercury content in dolphin muscle of 19 dolphins:
We’ve now typed the data into a Python list. Thus, we can use it, together with NumPy and SciPy tools, to compute a 95% confidence interval for the average mercury content in Risso’s dolphins.
Here’s how we would use Python to compute the mean and standard deviation for our data:
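A sketch of that computation (the list name `data` is my choice):

```python
import numpy as np

# Mercury concentrations (micrograms per gram of dolphin muscle)
data = [2.57, 4.43, 2.09, 7.68, 4.77, 2.12, 5.13, 5.71, 5.33, 3.31,
        7.49, 4.91, 2.58, 1.08, 6.6, 3.91, 3.97, 6.18, 5.9]

m = np.mean(data)
s = np.std(data, ddof=1)  # ddof=1 gives the *sample* standard deviation
[m, s]
```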
[4.513684210526317, 1.8806388114630852]
Note that we specify `ddof=1` so that `np.std` computes the sample standard deviation, rather than the population standard deviation; i.e., it uses an \(n-1\) in the denominator, rather than an \(n\). That’s particularly important when dealing with small sample sizes!
The standard error, of course, is the underlying standard deviation divided by the square root of the sample size:
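Continuing with the values computed above (hard-coded here so the snippet stands on its own):

```python
import numpy as np

s = 1.8806388114630852  # sample standard deviation from above
n = 19                  # sample size

se = s / np.sqrt(n)     # the standard error
se
```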
Next, the multiplier \(t^*\) can be computed using `t.ppf` from the `scipy.stats` module:
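For example:

```python
from scipy.stats import t

# 95% confidence with n = 19 means 18 degrees of freedom;
# we want the 97.5th percentile (2.5% in each tail)
tstar = t.ppf(0.975, df=18)
tstar
```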
Note that \(t^*>2\), whereas \(2\) (more precisely, \(1.96\)) would be the multiplier for the normal distribution. This makes sense, because the \(t\)-distribution is more spread out than the normal.
Finally, the confidence interval is:
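Putting the pieces together, a self-contained sketch (with the mean and standard deviation hard-coded from above) might look like:

```python
import numpy as np
from scipy.stats import t

m = 4.513684210526317   # sample mean from above
s = 1.8806388114630852  # sample standard deviation from above
n = 19

se = s / np.sqrt(n)           # standard error
tstar = t.ppf(0.975, df=n - 1)  # t* multiplier for 95% confidence
me = tstar * se               # margin of error
[m - me, m + me]              # the confidence interval
```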
Suppose that scientists have determined that we really need to keep the mercury concentration down to 3.2 micrograms per gram of dolphin muscle. We can ask:
Does the data support the conclusion that the average dolphin has more than 3.2 micrograms of mercury per gram of muscle?
We’d like to work at a 90% level of confidence.
We should clearly state our hypothesis test. That is, if \(\mu\) represents the actual average concentration, then our hypotheses are
\[\begin{align} H_0 &: \mu=3.2 \longleftarrow \text{sometimes written } \mu\leq3.2 \\ H_A &: \mu > 3.2. \end{align}\]
Since we are working at the 90% level of confidence, we specify a significance level of \(\alpha=0.1\). Thus, a \(p\)-value less than this indicates that we reject the null.
The dots show the actual measurements; we’d like to stay below the red line.
Pictorially, it doesn’t look so good. :(
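A picture like the one described could be sketched with matplotlib; the styling choices here are mine:

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen
import matplotlib.pyplot as plt

# Mercury concentrations (micrograms per gram of dolphin muscle)
data = [2.57, 4.43, 2.09, 7.68, 4.77, 2.12, 5.13, 5.71, 5.33, 3.31,
        7.49, 4.91, 2.58, 1.08, 6.6, 3.91, 3.97, 6.18, 5.9]

fig, ax = plt.subplots(figsize=(8, 3))
ax.scatter(range(len(data)), data, zorder=3)  # the measurements
ax.axhline(3.2, color="red")                  # the 3.2 threshold
ax.set_ylabel("mercury (micrograms/gram)")
fig.savefig("dolphins.png")
```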
We’ve still got the mean of the data `m`, the assumed or desired mean `m0 = 3.2`, and the standard error:
Thus, we can compute a test statistic:
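A sketch, again hard-coding the values computed above so the snippet stands alone:

```python
m = 4.513684210526317   # sample mean
m0 = 3.2                # hypothesized mean
se = 0.43144814         # standard error from above

test_stat = (m - m0) / se  # how many standard errors m lies above m0
test_stat
```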
Finally, we compute the \(p\)-value:
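Since the alternative hypothesis is one-sided, the \(p\)-value is the probability of a test statistic at least this large under the \(t\)-distribution with 18 degrees of freedom, which `t.sf` (the survival function) computes directly:

```python
from scipy.stats import t

test_stat = 3.0448          # test statistic from above
p = t.sf(test_stat, df=18)  # one-sided p-value: P(T >= test_stat)
p
```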
Doesn’t look so good!
More specifically, since our \(p\)-value is less than \(\alpha=0.1\), we reject the null hypothesis that the average concentration doesn’t exceed 3.2 in favor of the alternative hypothesis that it does.