An archived instance of Mark's Discourse site as of Tuesday July 18, 2017.

A confidence interval for random heights

mark

(5 pts)

I set up a little tool to generate 100000 random height measurements. You can access it and store the values in a variable called heights like so:

dd <- read.csv('https://www.marksmath.org/cgi-bin/random_heights.csv?id=YOUR_STUDENT_ID#')
heights = dd$heights
length(heights)

## Out:
#  100000

Here's your assignment with this data:

Grab your data
Compute the mean of your data
Grab a sample of size 30
Compute sample mean
Check to see if your sample mean lies within 1 standard error of the population mean
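
The five steps can be sketched end to end as follows. This is only a minimal sketch, not the required solution: since the real URL needs a student ID, it substitutes simulated data (`rnorm` with mean 70 and standard deviation 2, both assumptions chosen to resemble the outputs in this thread), and the names `n`, `se`, and `within_one_se` are illustrative.

```r
# Stand-in for the downloaded data (assumption: the real values would come
# from read.csv(...)$heights; simulated here so the sketch runs offline).
set.seed(1)
heights <- rnorm(100000, mean = 70, sd = 2)

# Steps 1-2: the data and its (population) mean
pop_mean <- mean(heights)

# Steps 3-4: a sample of size 30 and its mean
n <- 30
ss <- sample(heights, n)
sample_mean <- mean(ss)

# Step 5: standard error of the mean, using the population standard deviation,
# and a check on the absolute difference between the two means
se <- sd(heights) / sqrt(n)
within_one_se <- abs(sample_mean - pop_mean) <= se

within_one_se
```

The final value is `TRUE` or `FALSE`, which corresponds to the YES or NO answer the assignment asks for.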

audrey

Grabbing the data and computing the mean

dd <- read.csv('https://www.marksmath.org/cgi-bin/random_heights.csv?id=MY_ID')
heights = dd$heights
pop_mean = mean(heights)
pop_mean

## Out:
# 70.00831

Grabbing the sample and computing the mean

ss = sample(heights, 30)
mean(ss)

## Out:
# 70.23533

Check if within one standard error

My population standard error is

sd(heights)/sqrt(30)

## Out:
# 0.364658

And the difference between my population mean and sample mean is

70.23533 - 70.00831 = 0.227

So YES!
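
The same comparison can be done in R instead of by hand. A small sketch using the three values printed in this reply; the variable names `sample_mean`, `se`, and `diff` are illustrative, and `abs` is used so the sign of the difference does not matter.

```r
# Values copied from the outputs in this reply
pop_mean <- 70.00831
sample_mean <- 70.23533
se <- 0.364658

# Absolute difference between the sample mean and the population mean
diff <- abs(sample_mean - pop_mean)
diff

# Is the sample mean within one standard error of the population mean?
diff <= se
```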

wolfpack77

Grabbing the data and computing the mean:

dd <- read.csv('https://www.marksmath.org/cgi-bin/random_heights.csv?id=MY_ID')
heights = dd$heights
pop_mean = mean(heights)
pop_mean

## Out:
# 69.99156

Grabbing the sample and computing the mean:

ss = sample(heights, 30)
mean(ss)

## Out:
# 69.302

Check if sample mean lies within 1 standard error of the population mean:

My population standard error is:

sd(heights)/sqrt(30)

## Out:
# 0.3644408

And the difference between my population mean and sample is:

abs(69.302-69.99156) = 0.68956

So, No!

kd95

Mean:

mean(heights)
[1] 70.00617

Sample Mean:

> sampless = sample(heights, 30)
> mean(sampless)
[1] 70.40867

Standard Deviation:

> sd(heights)
[1] 1.995

Difference between Means:

> mean(heights) - mean(sampless)
[1] -0.4024989

Standard Error:

> sd(heights)/sqrt(30)
[1] 0.3642354

My sample mean does not lie within one standard error of the population mean.

Sarcasticswimmer

dd <- read.csv('https://www.marksmath.org/cgi-bin/random_heights.csv?id=#########')

heights = dd$heights
length(heights)
[1] 100000

Mean of the population

pop_mean = mean(heights)
pop_mean
[1] 69.99943

head(dd)
heights
1 68.60
2 68.95
3 69.02
4 67.18
5 73.15
6 65.29

Sample of 30

sample<- sample(heights, 30)
sample
[1] 70.85 71.15 68.94 68.35 70.58 67.03 71.12 67.99 67.97 68.33 71.91 72.40
[13] 72.03 71.74 71.98 70.09 72.87 73.40 70.13 69.42 70.16 72.92 70.04 72.95
[25] 70.96 71.92 69.01 73.31 68.88 68.87

Mean of sample

mean(sample)
[1] 70.57667
meansamp<-mean(sample)

Standard Deviation of sample

ssd=sd(sample)
ssd
[1] 1.802711

Sample mean does not lie within one standard error of the population mean

ssd/sqrt(30)
[1] 0.3291285
Sample mean=70.57667
Population mean=69.99943
70.57667-69.99943=0.57724
0.57724>.3291285
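
This reply estimates the standard error from the sample's standard deviation, while several other replies use the population's. A sketch comparing the two on simulated data (an assumption, since the real data needs a student ID); `se_pop` and `se_samp` are illustrative names.

```r
# Simulated stand-in for the downloaded heights (assumption)
set.seed(2)
heights <- rnorm(100000, mean = 70, sd = 2)
n <- 30
ss <- sample(heights, n)

se_pop <- sd(heights) / sqrt(n)  # uses the population standard deviation
se_samp <- sd(ss) / sqrt(n)      # estimated from the sample alone

c(population = se_pop, sample = se_samp)
```

The sample-based value changes from one sample to the next, so the YES/NO answer can depend on which version is used.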

Alison

> dd <- read.csv('https://www.marksmath.org/cgi-bin/random_heights.csv?id=930272773')
> head(dd)

> heights=dd$heights
> pop_mean = mean(heights)
> pop_mean
[1] 69.99294

> ss = sample(heights, 30)
> mean(ss)
[1] 69.80167

> sd(heights)/sqrt(30)
[1] 0.3635401

69.80167 - 69.99294 = -0.19127 YES

Nonamaker

dd <- read.csv('https://www.marksmath.org/cgi-bin/random_heights.csv?id=YOUR_STUDENT_ID#')
heights = dd$heights
length(heights)

100000

Calculating Population Mean

mean(dd$heights)

70.0048

Calculating Sample Mean

my_sample=sample(heights,30)
mean(my_sample)

70.5

Calculating Standard Error

sd(my_sample)/sqrt(30)

0.3930225

Difference of Sample and Population Mean

mean(my_sample)-mean(dd$heights)

0.4952012

0.4952012 > 0.3930225
Sample mean does not lie within 1 standard error of the population mean

Amelia

Grabbing the data and computing the mean

dd <- read.csv('https://www.marksmath.org/cgi-bin/random_heights.csv?id=ID')

Sample

[1] 70.25 68.37 69.46 71.14 73.00 62.29 70.60 69.13 69.04 72.70 70.53 73.24
[13] 71.71 69.50 70.36 64.76 71.60 70.64 71.49 69.35 65.25 75.63 68.27 69.17
[25] 70.48 67.43 72.18 70.45 72.42 70.35

Pop mean

heights = dd$heights
pop_mean = mean(heights)
pop_mean
## Out:
# 69.99394

Sample mean

sample= sample(heights, 30)
mean(sample)

## Out:
# 70.02633

Check if within one standard error

sd(heights)/sqrt(30)
## Out: 
# 0.367057

Difference between the means

mean(heights) - mean(sample)
## Out:
# -0.03239373

So, YES

PaulWall

Grabbing the data and computing the mean

dd <- read.csv('https://www.marksmath.org/cgi-bin/random_heights.csv?id=MY_ID#')
heights = dd$heights
pop_mean = mean(heights)
pop_mean

## Out: 
70.00587

Grabbing the sample and computing the mean

ss = sample(heights, 30)
ss
[1] 74.06 67.90 70.95 66.86 68.87 70.22 69.29 69.48 70.71 69.09 72.26 74.42
[13] 70.00 71.21 72.14 66.01 68.07 66.08 70.49 68.73 70.13 71.49 66.65 70.06
[25] 68.95 67.80 73.62 70.17 68.29 63.10

mean(ss)
## Out:
69.57

Check if within one standard error

My population standard error is

sd(heights)/sqrt(30)
# Out:
0.3641824

And the difference between my population mean and sample mean is

abs(69.57 - 70.00587) = 0.43587

So NO!

Sierra

Grabbing the data:

dd <- read.csv('https://www.marksmath.org/cgi-bin/random_heights.csv?id=930359479')

Computing the mean:

heights = dd$heights
pop_mean = mean(heights)
pop_mean

## Out: 
# 70.00702

Grabbing a sample size of 30:

sample(heights, 30)

## Out: 
# 70.95 72.81 70.08 73.55 68.97 71.15 68.61 72.50 74.37 68.18 71.53
# 72.37 66.31 69.28 71.12 67.39 67.68 73.15 69.75 68.46 68.31 68.94 68.09
# 70.30 68.11 70.96 71.55 73.28 72.53 70.59

Computing the sample mean:

ss = sample(heights, 30)
mean(ss)

## Out:
# 69.644

Does sample mean lie within 1 standard error of the population mean?

My population standard error is:

sd(heights)/sqrt(30)

## Out: 
# 0.3658712

The absolute value of the difference between my population mean and sample mean is:

abs(69.644-70.00702)

## Out:
# 0.36302

So, yes, the sample mean lies within 1 standard error of the population mean.
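
Note that each call to `sample` draws a fresh random sample, so the values printed by the first call in this reply are not the ones averaged in `ss`. A small sketch of storing the sample once and reusing it, on simulated data (an assumption, since the real data needs a student ID); `first_draw` and `second_draw` are illustrative names.

```r
# Simulated stand-in for the downloaded heights (assumption)
set.seed(3)
heights <- rnorm(100000, mean = 70, sd = 2)

# Two separate calls produce two different samples
first_draw <- sample(heights, 30)
second_draw <- sample(heights, 30)
identical(first_draw, second_draw)

# Store the sample once, then print it and average the same values
ss <- sample(heights, 30)
ss
mean(ss)
```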

Jenna

dd <- read.csv('https://www.marksmath.org/cgi-bin/random_heights.csv?id=930332725')
heights = dd$heights
length(heights)

## Out: [1] 100000

pop_mean = mean(heights)
pop_mean
## Out: [1] 70.00037

head(dd)
heights

1 68.53
2 70.01
3 71.45
4 68.25
5 66.17
6 70.65

sample<- sample(heights, 30)
sample

[1] 72.56 70.06 73.08 68.46 69.18 70.97 68.18 69.96 71.27 66.91 70.50 69.56 69.64 70.12 69.01 73.73 67.59 69.76 73.05
[20] 69.28 70.37 66.34 67.35 67.92 70.86 70.21 69.67 71.78 72.87 72.11

mean(sample)
## Out: [1] 70.07833

ssd=sd(sample)
ssd
## Out: [1] 1.921907

Sample mean lies within one standard error of the population mean

ssd/sqrt(30)
## Out: [1] 0.3508905
Sample mean = 70.07833
Population mean =  70.00037

70.07833 - 70.00037 = 0.07796
0.07796 < 0.3508905

monehish

First, I'll pull my data:

dd <- read.csv('https://www.marksmath.org/cgi-bin/random_heights.csv?id=#########')
heights = dd$heights

Here's the head of my data:

Finding the mean of my heights

pop_mean = mean(heights)
pop_mean

## Out:
# 70.01347

Grabbing the sample and computing mean

ss = sample(heights, 30)
mean(ss)

## Out:
# 70.61767

Check if within one standard error

My population standard error is

sd(heights)/sqrt(30)

##Out
# 0.3660575

70.61767 - 70.01347 = 0.6042

So....no

not_sam

dd <- read.csv('https://www.marksmath.org/cgi-bin/random_heights.csv?id=930255908')
heights = dd$heights
length(heights)

# Out: [1] 100000

pop_mean = mean(heights)
pop_mean
# Out: [1] 70.00258

sample<- sample(heights, 30)
mean(sample)

# Out: [1] 70.34867

sd(sample)/sqrt(30)
# Out: [1] 0.2983432

abs(mean(heights) - mean(sample)) = 0.34609

0.34609 > 0.2983432, so no

Kristian

My Data:

dd <- read.csv('https://www.marksmath.org/cgi-bin/random_heights.csv?id=mydata')

Here's the Head of My Data:

head(dd)

 heights
1   70.96
2   72.87
3   68.03
4   68.24
5   68.93
6   70.71

Finding the Mean of the Heights:

heights=dd$heights
pop_mean=mean(heights)
pop_mean

Out: 69.9981

Grabbing the Sample and Computing the Mean:

ss=sample(heights, 30)
mean(ss)

Number Out: 70.58933

Check if Within One Standard Error:

My population standard error is:

sd(heights)/sqrt(30)
Number Out: 0.3661043

The difference between my population mean and sample mean is:

70.58933 - 69.9981 = 0.59123

So my answer is NO.

Bellaj

dd <- read.csv('https://www.marksmath.org/cgi-bin/random_heights.csv?id=ID number')
heights= dd$heights 
length(heights)

## Out:
 # 100000

Grabbing the data and computing the mean

pop_mean = mean(heights)
pop_mean

## Out:
# 70.00764

Grabbing the sample and computing the mean

ss = sample(heights, 30)
mean(ss)

## Out:
# 70.526

Check if within one standard error

My standard error:

sd(heights)/sqrt(30)

## Out:
# 0.3658316

the difference between my population mean and sample mean is

70.526-70.00764 = 0.51836, so no!

Prestonw

Grabbing the data and computing the mean

dd <- read.csv('https://www.marksmath.org/cgi-bin/random_heights.csv?id=YOUR_STUDENT_ID#')
heights = dd$heights
pop_mean = mean(heights)
pop_mean

## Out:
# 70.00056

Grabbing the sample and computing the mean

ss = sample(heights, 30)
mean(ss)

## Out:
# 69.80867

Check to see if sample mean is within 1 standard error of the population mean

My population standard error is:

sd(heights)/sqrt(30)

## Out:
# 0.3650662

The absolute value of the difference between my population mean and sample mean is:

abs(69.80867 - 70.00056) = 0.19189

This number is less than my standard error, so YES.

FRD

Grabbing the data and computing the mean

dd <- read.csv('https://www.marksmath.org/cgi-bin/random_heights.csv?id=YOUR_STUDENT_ID#')
heights <-dd$heights
pop_mean<-mean(heights)
pop_mean
## Out:
# [1] 70.01041

Grabbing the sample and computing the mean

sample(heights,30)
sample<-sample(heights,30)
sampleMean<-mean(sample)
sampleMean
## Out:
# [1] 70.23167

Check if within one standard error

SE<-sd(heights)/sqrt(30)
SE
## Out:
# [1] 0.3668139
difference<-abs(sampleMean-pop_mean)
difference
## Out:
# [1] 0.2212544
SE-difference
## Out:
# [1] 0.1455596

(SE-difference)>0 so YES!

Andy

Grabbing data & computing mean:

dd <- read.csv('https://www.marksmath.org/cgi-bin/random_heights.csv?id=********')
heights = dd$heights
pop_mean = mean(heights)
pop_mean

Out:
70.00056

Grabbing the sample & computing the mean:

ss = sample(heights, 30)
mean(ss)

Out:
70.096

Check if within one standard error:

sd(heights)/sqrt(30)

Out:
0.3648623

Difference between my population mean and sample mean:

70.00056 - 70.096 = -0.0954438

So, yes.
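
The replies in this thread split between YES and NO, which is to be expected: when the sampling distribution of the mean is roughly normal, a sample mean should fall within one standard error of the population mean about 68% of the time. A simulation sketch of that claim, on simulated normal heights (an assumption, since the real data needs a student ID); `trials` and `hits` are illustrative names.

```r
# Simulated stand-in for the downloaded heights (assumption)
set.seed(4)
heights <- rnorm(100000, mean = 70, sd = 2)
pop_mean <- mean(heights)
n <- 30
se <- sd(heights) / sqrt(n)

# Repeat the assignment many times: draw a sample of size n and record
# whether its mean lies within one standard error of the population mean
trials <- 2000
hits <- replicate(trials, abs(mean(sample(heights, n)) - pop_mean) <= se)

# Proportion of samples that would have answered YES
mean(hits)
```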

mark

@Bellaj I think we're looking for a YES or a NO at the end.

mark

@not_sam I edited your post for formatting a bit - please take note.