A heavy hypothesis test
(5 pt)
Our CDC data set has just over 1000 men who are 5'9''. We can pack them into a data frame and take a sample of size 100 from that data frame as follows:
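import pandas as pd
# Load the CDC data and keep only the men.
df = pd.read_csv('https://www.marksmath.org/data/cdc.csv')
df_men = df[df.gender=='m']
# Heights are recorded in inches, so 5'9'' is 69.
five9 = df_men[df_men.height==69]
# Draw 100 rows; random_state makes the draw reproducible.
sample = five9.sample(100, random_state=5)
print(len(five9))
sample.head()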
| | genhlth | exerany | hlthplan | smoke100 | height | weight | wtdesire | age | gender |
|---|---|---|---|---|---|---|---|---|---|
| 838 | excellent | 1 | 1 | 1 | 69 | 160 | 160 | 31 | m |
| 13979 | good | 1 | 1 | 1 | 69 | 150 | 165 | 52 | m |
| 16025 | fair | 1 | 1 | 1 | 69 | 190 | 150 | 53 | m |
| 14550 | excellent | 1 | 1 | 0 | 69 | 200 | 180 | 29 | m |
| 4765 | fair | 0 | 1 | 1 | 69 | 235 | 180 | 56 | m |
The CDC recommends that the average weight of men at this height be 165 pounds, but we suspect that it might be more.
Your exercise is to use the code above (with your special number as the random_state) to grab a sample of size 100 and test the null hypothesis that the average weight of men at 5'9'' is 165 vs the alternative hypothesis that the actual mean is larger.
Thus, my alternative hypothesis is:
%H_A: mu > 165%
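The test needs three numbers from the sample: its mean, its standard deviation, and the standard error. As a rough sketch of that arithmetic, the block below uses only the five weights printed by sample.head() above, so it illustrates the computation and is not a valid test on its own (the exercise uses all 100 rows). The helper name `summarize` is hypothetical, and the standard-library `statistics` module stands in for `sample.weight.mean()` and `sample.weight.std()`; both `statistics.stdev` and the pandas `std` method divide by n - 1 by default.

```python
import math
import statistics

def summarize(weights):
    """Return (mean, sample standard deviation, standard error)."""
    n = len(weights)
    m = statistics.mean(weights)
    sd = statistics.stdev(weights)   # sample sd, n - 1 denominator
    se = sd / math.sqrt(n)           # standard error of the mean
    return m, sd, se

# The five weights (in pounds) shown by sample.head() above.
weights = [160, 150, 190, 200, 235]
m, sd, se = summarize(weights)
print(m, sd, se)
```

With the full sample, the same three values would come from the `weight` column of `sample` and n = 100.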
Comments
%H_0: mu = 165%
%H_A: mu > 165%
At a 95% level of confidence, the significance level is alpha = 0.05.
Mean, standard deviation, standard error:
[176.33, 23.76486456680929, 2.376486456680929]
Z score:
4.767542423037337
P value:
9.324336787626066e-07
At a 95% confidence level
9.324336787626066e-07 < 0.05
Therefore, we reject the null hypothesis
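The numbers in this comment can be checked with a short sketch like the one below. It assumes the reported mean and standard error and computes the upper-tail probability with `math.erfc` from the standard library; the function name `upper_tail_z_test` is made up for illustration and is not part of the original submission.

```python
import math

def upper_tail_z_test(sample_mean, se, mu0):
    """One-sided z-test of H_0: mu = mu0 against H_A: mu > mu0."""
    z = (sample_mean - mu0) / se
    # P(Z > z) for a standard normal, via the complementary error function.
    p = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p

# Mean and standard error as reported in the comment above.
z, p = upper_tail_z_test(176.33, 2.376486456680929, 165)
reject = p < 0.05   # compare the p-value with the significance level
print(z, p, reject)
```

Using `erfc` directly, instead of subtracting a CDF value from 1, avoids losing precision when the tail probability is this small.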
Hypotheses:
%H_0: mu=165%
%H_A: mu>165%
Data Set:
Mean:
=169.68295
Standard Deviation:
=40.080969967120254
Standard Error:
=4.008096996712025
Z Score:
=1.1683724230829704
P Value:
%P(Z > 1.16) = 0.123%
P value (0.123) > significance level (0.05)
So, we fail to reject %H_0%
mean:
=169.68296
standard deviation
=40.080969967120254
standard error
=4.008
Z score
=1.167
%H_0:mu=165%
%H_A:mu>165%
P value:
P(Z > 1.16) = 0.12
P value (0.12) > significance level (0.05)
we fail to reject %H_0%
%H_0:mu=165%
%H_A:mu>165%
Mean, Standard Deviation, Population
[169.68295, 40.080969967120254, 20000]
Standard Error
4.008096996712025
Z Score
1.1683724230829704
%P(Z > 1.17) = 0.1210%
P value (0.1210) > significance level (0.05)
So, we fail to reject %H_0%
With a 95% level of confidence, our alpha value equals 0.05
mean=164
standard deviation=10.198039027185569
standard error=1.0198039027185568
z-score=-0.9805806756909203
P(Z < -0.98) = 0.1635
p-value = P(Z > -0.98) = 1 - 0.1635 = 0.8365 > significance level (0.05)
Therefore, we fail to reject the null hypothesis.
%H_0: mu = 165%
%H_A: mu > 165%
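The two probabilities quoted in this comment, 0.1635 and 0.8365, are the lower and upper tails of the standard normal at the same z-score, and which one is the p-value depends on the direction of the alternative. The sketch below shows both, assuming the reported z-score and using `statistics.NormalDist` from the standard library.

```python
from statistics import NormalDist

z = -0.9805806756909203          # z-score reported in the comment above
std_normal = NormalDist()        # mean 0, standard deviation 1

lower_tail = std_normal.cdf(z)   # P(Z < z): p-value if H_A is mu < 165
upper_tail = 1 - lower_tail      # P(Z > z): p-value if H_A is mu > 165
print(round(lower_tail, 4), round(upper_tail, 4))
```

Since the exercise asks whether the mean is larger than 165, the upper tail is the relevant p-value here, and a sample mean below 165 can never produce a small one.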
%H_0: mu=165%
%H_A: mu>165%
Data Set imported from:
The Mean and Standard Deviation are:
[169.68295, 40.080969967120254]
The Standard Error is:
4.008096996712025
The Z score is: 1.168 with a Probability of 0.8790.
P value is: (1-0.8790)= 0.121
P = 0.121 > 0.05, therefore I fail to reject the null hypothesis.
import pandas as pd
Mean:
m = sample.weight.mean()
m
%178.98%
Standard Deviation:
sd = sample.weight.std()
sd
%25.36599967267088%
Standard Error:
se=2.53659
se
%2.53659%
z score:
z=(m-165)/se
z
%5.511336085059072%
%H_A:mu>165%
%H_0:mu=165%
Thus, I reject the null hypothesis.
Hypothesis:
%H_0:mu=165%
%H_A:mu>165%
Data set:
Mean:
=169.68295
Standard Deviation:
=40.080969967120254
Sample Mean, Standard Error:
=[180.77, 2.9344300635693807]
Margin of Error
=5.8688601271387615
Z score:
=5.374127056488
The Z-score is 5.374127056488, which is well above the one-sided critical value of about 1.645 at the 0.05 level; therefore we can reject the null hypothesis.
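Rejecting on the z-score alone amounts to comparing it with a critical value instead of computing a p-value. A minimal sketch of that comparison follows, assuming a significance level of 0.05 (the level used in the other comments) and taking the critical value from `statistics.NormalDist.inv_cdf`.

```python
from statistics import NormalDist

alpha = 0.05
# One-sided critical value: the z with P(Z > z) = alpha.
z_crit = NormalDist().inv_cdf(1 - alpha)

z = 5.374127056488               # z-score reported in the comment above
reject = z > z_crit              # reject H_0 when z lies beyond the cutoff
print(z_crit, reject)
```

This gives the same decision as checking whether the p-value is below alpha, since both describe the same upper-tail region.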