The first fundamental tool for inference is the confidence interval. Here is the basic idea.
The standard error is simply the standard deviation associated with a sampling distribution. Generally, if the population standard deviation is \(\sigma\) and we form a random variable \(S\) by averaging a random sample of size \(n\) from the population, then the standard deviation of \(S\) is \(\sigma/\sqrt{n}\). Thus, as expected, the standard error decreases as the sample size increases.
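The \(\sigma/\sqrt{n}\) rule can be checked with a quick simulation. The sketch below is only an illustration under stated assumptions: it takes \(S\) to be the sample mean, draws from a normal population with \(\sigma = 1\), and uses a hypothetical helper named `simulated_se`; none of these choices come from the text above.

```python
import math
import random
import statistics


def simulated_se(n, trials=2000, mu=0.0, sigma=1.0, seed=0):
    """Estimate the standard deviation of the sample mean by simulation.

    Assumes a normal population with mean mu and standard deviation sigma.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(trials)
    ]
    # The spread of the simulated sample means approximates the standard error.
    return statistics.stdev(means)


# Compare the simulated value with sigma / sqrt(n) for a few sample sizes.
for n in (25, 100, 400):
    print(n, round(simulated_se(n), 4), round(1.0 / math.sqrt(n), 4))
```

The two printed columns should roughly agree, and both shrink as \(n\) grows.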
Suppose we draw a random sample of 132 people and find that 16 of them have blue eyes. Use this data to write down a 95% confidence interval for the proportion of people with blue eyes.
Solution: We have \(\hat{p}=16/132 \approx 0.1212\) and \[SE(\hat{p}) = \sqrt{(16/132)\times(116/132)/132} \approx 0.02840718.\] Thus, our confidence interval is \[0.1212 \pm 2\times0.0284 = [0.0644, 0.178].\]
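The arithmetic in this solution can be reproduced in a few lines of Python. This is a minimal sketch, not a library routine: the function name `proportion_ci` is made up for illustration, and it uses the same \(z^* = 2\) as the solution above.

```python
import math


def proportion_ci(successes, n, z_star=2.0):
    """Return (p_hat, standard error, (lower, upper)) for a sample proportion."""
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z_star * se
    return p_hat, se, (p_hat - margin, p_hat + margin)


# The blue-eyes example: 16 out of 132 people.
p_hat, se, (lower, upper) = proportion_ci(16, 132)
print(f"p_hat = {p_hat:.4f}, SE = {se:.4f}, CI = [{lower:.4f}, {upper:.4f}]")
```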
If you read the details of political surveys, you’re likely to come across the term “margin of error” at some point. FiveThirtyEight, for example, maintains a running Trump approval rating page. The page also points to poll details for a slew of polls. Check out the first one, namely the Gallup poll. There, we read “Daily results are based on telephone interviews with approximately 1,400 national adults; Margin of error is \(\pm 3\) percentage points”. What’s that mean?
When we write a confidence interval as \[s \pm z^* \times SE,\] the quantity \(z^* \times SE\) is the margin of error. Geometrically, it’s the distance that the interval extends in either direction from the measured statistic \(s\).
So, where’s the \(\pm 3\) come from?
Suppose we’re writing down a confidence interval for a proportion; in this case, the proportion who approve. If the actual proportion is \(p\) and our sample size is \(n\), then the standard error is \[\sqrt{\frac{p(1-p)}{n}}.\] In our case, we take \(n\approx 1500\). Furthermore, the biggest that \(p(1-p)\) can be is \(1/4\). You can see this by graphing \(p(1-p)\) for \(0 \leq p \leq 1\): it is a downward-opening parabola whose peak sits at \(p = 1/2\).
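The bound on \(p(1-p)\) can also be confirmed algebraically by completing the square: \[p(1-p) = \frac{1}{4} - \left(p - \frac{1}{2}\right)^2 \leq \frac{1}{4},\] with equality exactly when \(p = 1/2\).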
Thus, our standard error is at most \[SE \leq \sqrt{\frac{1/4}{1500}} \approx 0.01290994.\] Now, for a \(95\%\) confidence interval, we take \(z^* = 2\) so that our margin of error is at most \[ME \leq 2\times0.01290994 \approx 0.026,\] which is rounded up to 3 percentage points.
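To see how this worst-case margin of error depends on the sample size, here is a short sketch. The function name `max_margin_of_error` is hypothetical, and it assumes the same \(z^* = 2\) and the bound \(p(1-p) \leq 1/4\) used above.

```python
import math


def max_margin_of_error(n, z_star=2.0):
    """Worst-case margin of error for a proportion, using p(1-p) <= 1/4."""
    return z_star * math.sqrt(0.25 / n)


# Print the worst-case margin, in percentage points, for several sample sizes.
for n in (500, 1000, 1500, 2000):
    print(n, round(100 * max_margin_of_error(n), 2), "percentage points")
```

Because the margin shrinks like \(1/\sqrt{n}\), quadrupling the sample size only halves the margin of error.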
A margin of error of about 3 percentage points is a common target in political polls, which is why you often see sample sizes close to 1500.
In a political contest, where there are two candidates and victory requires a simple majority, a candidate likes to be more than 3 percentage points above 50%. Can you see why?
In early November of last year, FiveThirtyEight reported that Donald Trump was only 3.3 percentage points behind Hillary Clinton - almost within the margin of error. In fact, Clinton ended up winning the popular vote by 2.1%.
A recent poll by the Brookings Institution asks the following question of 1500 college students: “Is hate speech constitutionally protected?”
Here are the results:
| | | Political Affiliation | | | Type of College | | Gender | |
|---|---|---|---|---|---|---|---|---|
| | All | Dem | Rep | Ind | Public | Private | Female | Male |
| Yes | 39 | 39 | 44 | 40 | 38 | 43 | 31 | 51 |
| No | 44 | 41 | 39 | 44 | 44 | 44 | 49 | 38 |
| Don’t know | 16 | 15 | 17 | 17 | 17 | 13 | 21 | 11 |
Use this to write down a confidence interval for the percentage of students who believe that hate speech is not constitutionally protected.