Mon, Sep 16, 2024
To this point, we’ve covered a lot of apparently disparate things.
Today, we’re going to start to tie them together. In particular, we’re going to learn how the central limit theorem connects sums of random variables to the normal distribution.
This is mostly the beginnings of section 5.1 in our text.
Here’s a problem that comes right off of our review sheet:
I’ve got an unfair coin that comes up heads \(90\%\) of the time. Suppose I flip the coin and write down a \(1\) if it comes up heads or a \(0\) if it comes up tails. Let’s denote that numerical value by the random variable \(X\).
What are the expectation and variance of this random variable?
Recall that expectation is \[E(X) = p = 0.9\] and that the variance is \[\sigma^2(X) = p(1-p) = 0.9\times0.1 = 0.09.\]
The problem continues to ask: suppose I flip the coin 1000 times and count the number of heads that I get. We’ll call that numerical value \(S\). What are the expectation and variance of this new random variable \(S\)?
It’s just a matter of multiplying by the number of coin flips to get \[E(S) = 1000\times0.9 = 900\] and \[\sigma^2(S) = 1000\times0.09 = 90.\]
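We can double-check that multiplication with a quick simulation sketch; the seed and number of runs here are arbitrary choices of mine:

```python
import numpy as np

# Simulate 100,000 runs of 1000 biased coin flips and compare the
# sample mean and variance of the head count to E(S) = 900, var(S) = 90.
rng = np.random.default_rng(0)
counts = rng.binomial(n=1000, p=0.9, size=100_000)
print(counts.mean(), counts.var())
```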
Finally, the problem asks us to estimate \(P(S < 888)\).
And this is where it gets a little funkier.
In principle, we can solve this problem with the binomial distribution:
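The computation might look like the following with scipy’s binom (a sketch; since \(P(S < 888) = P(S \leq 887)\), we evaluate the CDF at 887):

```python
from scipy.stats import binom

# P(S < 888) = P(S <= 887) for a binomial with n = 1000, p = 0.9
print(binom.cdf(887, 1000, 0.9))
```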
0.09540800627807665
Effectively, this computes \[\sum_{k=0}^{887} \binom{1000}{k} (0.9)^{k}(0.1)^{1000-k}.\]
There are issues with that approach, though.
Here’s an alternative that yields a good estimate:
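Presumably, this uses the normal approximation with mean \(900\) and variance \(90\); here’s a sketch using a continuity correction at \(887.5\) (my choice of correction, which accounts for \(S\) being integer-valued):

```python
import numpy as np
from scipy.stats import norm

# Normal approximation to P(S < 888): S is approximately N(900, 90),
# and the half-integer cutoff is a continuity correction.
print(norm.cdf(887.5, 900, np.sqrt(90)))
```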
Effectively, this computes the shaded area in this picture:
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import binom, norm

# Dots: the binomial pmf for 1000 flips with p = 0.9.
# Curve: the approximating normal density with mean 900 and variance 90.
xs = np.arange(860, 940)
ys = [norm.pdf(x - 0.5, 900, np.sqrt(90)) for x in xs]
ypts = [binom.pmf(k, 1000, 0.9) for k in xs]
plt.plot(xs, ypts, '.')
plt.plot(xs, ys, '-')

# Shade the area corresponding to P(S < 888).
xs2 = np.arange(860, 888)
ax = plt.gca()
ax.fill_between(xs2, 0, norm.pdf(xs2, 900, np.sqrt(90)))
ax.set_aspect(600)
plt.show()
Suppose I roll a fair six-sided die one million times, add up the numbers, and call the result \(S\). What’s \[P(S < 3,501,000)?\]
First, we need to know the mean and variance for 1 roll:
\[E(X) = \frac{1+2+3+4+5+6}{6} = \frac{7}{2} = 3.5\] and \[\sigma^2(X) = \frac{\left(1-\frac{7}{2}\right)^2 + \left(2-\frac{7}{2}\right)^2 + \left(3-\frac{7}{2}\right)^2 + \left(4-\frac{7}{2}\right)^2 + \left(5-\frac{7}{2}\right)^2 + \left(6-\frac{7}{2}\right)^2}{6} = \frac{35}{12}.\]
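These two values are easy to verify numerically:

```python
import numpy as np

faces = np.arange(1, 7)
mu = faces.mean()                  # 7/2 = 3.5
var = ((faces - mu) ** 2).mean()  # 35/12
print(mu, var)
```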
To get the mean and variance for 1,000,000 rolls we simply multiply by 1,000,000.
\[E(S) = 3,500,000\] and \[\sigma^2(S) = \frac{35,000,000}{12}.\]
We use norm.cdf to compute the probability:
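A sketch of that computation, with the mean and variance we just found:

```python
import numpy as np
from scipy.stats import norm

# S is approximately normal with mean 3,500,000 and variance 35,000,000/12
print(norm.cdf(3_501_000, 3_500_000, np.sqrt(35_000_000 / 12)))
```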
Why does the normal distribution give such good estimates here? Because of the Central Limit Theorem, of course!
The central limit theorem is the theoretical explanation of why the normal distribution appears as the limit of binomials above and, therefore, so often in practice. Suppose that \(X\) is a random variable which we evaluate independently a bunch of times to produce a sequence of numbers: \[X_1, X_2, \ldots, X_n.\] We then compute the sum of those values to produce a new value \(S\) defined by \[S = X_1 + X_2 + \cdots + X_n.\] The central limit theorem asserts that, for large \(n\), the random variable \(S\) is approximately normally distributed. Furthermore, if \(X\) has mean \(\mu\) and standard deviation \(\sigma\), then the mean and standard deviation of \(S\) are \(n\times\mu\) and \(\sqrt{n} \times \sigma\).
Since an average is just a sum divided by \(n\), we can do the same thing with averages.
That is, define \(\bar{X}\) by \[\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n}.\] The central limit theorem also asserts that, for large \(n\), the random variable \(\bar{X}\) is approximately normally distributed. Furthermore, if \(X\) has mean \(\mu\) and standard deviation \(\sigma\), then the mean and standard deviation of \(\bar X\) are \(\mu\) and \(\sigma/\sqrt{n}\).
Note that all of this is true regardless of the distribution of \(X\)!
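For instance, here’s a simulation sketch where \(X\) is exponentially distributed, which is quite skewed; the sums still behave just as the theorem predicts (the scale, seed, and sample counts are my arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
# Each row is the sum of n draws from an exponential with mu = sigma = 2
sums = rng.exponential(scale=2.0, size=(50_000, n)).sum(axis=1)

# Theory: mean = n*mu = 2000, standard deviation = sqrt(n)*sigma ~ 63.25
print(sums.mean(), sums.std())
```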
The process of computing a statistic based on a random sample can be thought of as a random variable in the following sense: suppose we draw a sample of the population and compute some statistic. If we repeat that process several times, we’ll surely get different results.
Since sampling produces a random variable, that random variable has some distribution; we call that distribution the sampling distribution.
Suppose we’d like to estimate the average height of individuals in a population. We could do so by selecting a random sample of 100 folks and finding their average height. Probably, this is pretty close to the actual average height for the whole population. If we do this again, though, we’ll surely get a different value.
Thus, the process of sampling is itself a random variable.
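We can watch the sampling distribution of the average height emerge in a simulation; the population parameters below are hypothetical, purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical population of a million heights (inches): mean 67, sd 4
population = rng.normal(67, 4, size=1_000_000)

# Repeatedly sample 100 folks and record the average height of each sample
means = [rng.choice(population, size=100).mean() for _ in range(2000)]

# The sample means cluster near 67 with spread about 4/sqrt(100) = 0.4
print(np.mean(means), np.std(means))
```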
As we move forward, we’ll think of sampling as a random process that we try to model with the normal distribution!