An archive of the questions from Mark's Fall 2018 Stat 225.

Is my coin fair?

Mark

(10 pts)

I’ve got a program on my web space that generates a random sequence of coin flips and returns the result as a CSV file. You can access it here:
https://marksmath.org/cgi-bin/coin_flips.csv

Again, it looks like a CSV file. In fact, it is a CSV file by the time you get it. On my web server, though, it’s a computer program that generates a CSV file. If you load it again, you should get a different sequence of flips. You can even specify some query parameters. For example, you can ask for more flips:
https://marksmath.org/cgi-bin/coin_flips.csv?n=100
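
The query string can also be assembled programmatically rather than typed by hand. The following is a minimal sketch using Python's standard library; the helper name `flips_url` is hypothetical, and the only query parameter it is shown with is `n`, as above.

```python
from urllib.parse import urlencode

BASE_URL = "https://marksmath.org/cgi-bin/coin_flips.csv"

def flips_url(**params):
    """Build the request URL from keyword arguments (hypothetical helper)."""
    if not params:
        return BASE_URL
    # urlencode turns {'n': 100} into the query string 'n=100'
    return BASE_URL + "?" + urlencode(params)

print(flips_url(n=100))
```

The resulting string can be passed directly to a CSV reader such as `pandas.read_csv`.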

Here’s the thing: The coin is not necessarily fair. The first thing the program does is pick a number p with probability distribution

  • P(0.4<p<0.5)=1/3
  • P(p=0.5)=1/3
  • P(0.5<p<0.6)=1/3

The program then generates a sequence of Heads and Tails with distribution

  • P(X=H)=p
  • P(X=T)=1-p

Thus, the coin is fair one-third of the time but it might be biased in one direction or the other.

The program does accept a seed parameter that allows you to seed the random number generator for the choice of p. For example, I could seed the random number generator with my name like so:
https://marksmath.org/cgi-bin/coin_flips.csv?n=100&seed=mark

Note that the sequence of coin flips still varies. Thus, if I reload that page, I get a different sequence of coin flips but that sequence will be generated from the same distribution - i.e. with the same choice of p.
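
To make the two-stage process concrete, here is a rough local simulation of what such a program might do. It is not the code running on the server: the function names are invented for illustration, and drawing p uniformly within each interval is an assumption, since only the probability of each interval is specified above.

```python
import random

def choose_p(seed=None):
    """Pick the heads probability p; the same seed always gives the same p."""
    rng = random.Random(seed)      # seeded generator used only for p
    case = rng.randrange(3)        # three equally likely cases
    if case == 0:
        return rng.uniform(0.4, 0.5)   # assumption: uniform within the interval
    if case == 1:
        return 0.5                     # the fair coin
    return rng.uniform(0.5, 0.6)       # assumption: uniform within the interval

def flip_coins(n, seed=None):
    """Generate n flips; p is fixed by the seed, the flips themselves are not."""
    p = choose_p(seed)
    rng = random.Random()          # unseeded, so the sequence varies per call
    return ['H' if rng.random() < p else 'T' for _ in range(n)]

flips = flip_coins(100, seed='mark')
print(flips.count('H') / len(flips))   # sample proportion of heads
```

Calling `flip_coins(100, seed='mark')` twice should give different sequences drawn with the same p, which mirrors the behavior described above.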

Your mission

Seed the program above with your forum login name and run a hypothesis test to determine if the resulting coin is fair.
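
As a rough guide to the computation involved, here's a sketch of a two-sided test for a single proportion using the normal approximation. The function name and the counts in the usage line are made up for illustration, and it uses `NormalDist` from Python's standard library where `scipy.stats.norm` would do equally well.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z_test(heads, n, p0=0.5):
    """Two-sided z-test of H_0: p = p0 against H_A: p != p0."""
    phat = heads / n                    # sample proportion of heads
    se = sqrt(p0 * (1 - p0) / n)        # standard error under H_0
    z = (phat - p0) / se                # test statistic
    # area in both tails beyond |z| under the standard normal curve
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return phat, z, p_value

# hypothetical data: 60 heads in 100 flips
phat, z, p_value = proportion_z_test(60, 100)
print(phat, z, p_value)
```

The null hypothesis is rejected when the returned p-value is smaller than the chosen significance level α.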

Garrett

I seeded the program with my forum login name and ran a hypothesis test to determine if the resulting coin is fair.
The question is whether the proportion of heads is 1/2, using a 99% confidence level. With a confidence of 99%, the p-value must be less than .01 to reject the null hypothesis.

Written as a hypothesis test:

H_0:p=1/2
H_A:p≠1/2

import pandas as pd
df = pd.read_csv('https://marksmath.org/cgi-bin/coin_flips.csv?n=1000&seed=garrett')
df.head()

Next, calculate the proportion of heads

p = df[df.flips == 'H'].shape[0]/df.shape[0]
p

Out[9]:0.486

Calculate the T value:

import numpy as np
T = (p-0.5)/np.sqrt(0.5*(1-0.5)/1000)
T

Out[11]: -0.8854377448471471

Calculate the p-value:

from scipy.stats import norm
2*(norm.cdf(-0.885))

Out[13]: 0.3761566313613266

This is much larger than α (0.01),

so I fail to reject the null hypothesis: the data give no evidence that my coin is unfair.
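
The final comparison can be written out explicitly. This is only a sketch; the helper name `decide` is hypothetical, and the numbers are the p-value and α from this post.

```python
def decide(p_value, alpha):
    """Return the conclusion of a test given its p-value and significance level."""
    # Reject only when the p-value falls below alpha.
    if p_value < alpha:
        return "reject H_0"
    return "fail to reject H_0"

print(decide(0.376, 0.01))  # prints "fail to reject H_0"
```

A large p-value means the observed proportion is unsurprising under H_0, so it never leads to rejection.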

vscala

With a confidence of 95%, the p-value must be less than .05 to reject the null hypothesis.
The null hypothesis is p=1/2.
The alternative hypothesis is that p is not equal to 1/2.

import pandas as pd
import numpy as np
from scipy.stats import norm
df = pd.read_csv('https://marksmath.org/cgi-bin/coin_flips.csv?n=1000&seed=vscala')
df.head()
# output of df.head():
#   flips
# 0 H
# 1 T
# 2 H
# 3 T
# 4 H
p = len(df[df.flips=='H'])/1000
T = (p-.5)/np.sqrt(.5*(1-.5)/1000)
pv = 2*(1-norm.cdf(T))
[p, pv]

[0.514, 0.37592058254807426]

The computed proportion is .514, which gives a p-value of about .376. That is above .05, so I fail to reject the null hypothesis; the data are consistent with a fair coin.

dennis

Hypothesis test, the coin is fair:

  • H_0 : p = 1/2
  • H_a : p \neq 1/2

with 95% confidence, so α=0.05

Seed the program, generate flips, and import into Jupyter:

import pandas as pd
df = pd.read_csv('https://marksmath.org/cgi-bin/coin_flips.csv?n=1000&seed=dennis')
df.head()

Output:

||flips|
| --- | --- |
|0|H|
|1|H|
|2|T|
|3|H|
|4|H|

Next, calculate proportion of Heads:

p = df[df.flips == 'H'].shape[0]/df.shape[0]
p

= 0.503

Calculate T value:

import numpy as np
T = (p-0.5)/np.sqrt(0.5*(1-0.5)/1000)
T

= 0.18973665961010294

Calculate p-value:

from scipy.stats import norm
2*(1-norm.cdf(T))

=0.84951549236503476

This is much larger than α (0.05),
so I fail to reject the null hypothesis: the data give no evidence that my coin is unfair.

megan

I read my csv file, seeded with my name, into the program like so

import pandas as pd
df = pd.read_csv('https://marksmath.org/cgi-bin/coin_flips.csv?n=100&seed=megan')
df

I wanted to see if the coin was fair or not at a 95% confidence level, so I wrote my hypothesis test:

H_0 : p = 1/2
H_A : p \neq 1/2
with \alpha = .05

I found the sample proportion of heads to be 0.47

phat = df[df.flips=='H'].shape[0]/df.shape[0]

Then I computed my test statistic as -0.6

import numpy as np
T = (phat-0.5)/np.sqrt(0.5*(1-0.5)/100)  # n = 100 flips

and then my p-value as 0.549

from scipy.stats import norm
pval = 2*norm.cdf(T)

Since this is greater than .05, we fail to reject the null hypothesis.

joshua

The null hypothesis is H_0: p=1/2
The alternative hypothesis is H_A: p≠1/2

First I imported the data…

import pandas as pd
df = pd.read_csv('https://marksmath.org/cgi-bin/coin_flips.csv?n=100&seed=joshua')
df.head()

Then I found the p…

p = df[df.flips=='H'].shape[0]/df.shape[0]
p
OUTPUT: 0.48

Then I calculated the T…

import numpy as np
T = (p-0.5)/np.sqrt(0.5*(1-0.5)/100)
T
OUTPUT: -0.40000000000000036

Then I calculated the p-value…

from scipy.stats import norm
2*norm.cdf(-0.4)
OUTPUT:0.68915651677935164

p = len(df[df.flips=='H'])/100
T = (p-.5)/np.sqrt(.5*(1-.5)/100)
pv = 2*norm.cdf(-abs(T))  # two-sided p-value, valid for either sign of T
[p, pv]
OUTPUT: [0.48, 0.68915651677935141]

Based on all of this, I would fail to reject the null hypothesis, since a p-value of about 0.69 is far above any conventional significance level.
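
One detail in the p-value step is worth spelling out: a formula like 2*norm.cdf(T) is a valid two-sided p-value only when T is negative. Below is a sketch that handles either sign, assuming a standard normal reference distribution. The function name is hypothetical, and `NormalDist` from the standard library stands in for `scipy.stats.norm`.

```python
from statistics import NormalDist

def two_sided_p_value(T):
    """Area in both tails beyond |T| under the standard normal curve."""
    # -abs(T) always points at the lower tail, whatever the sign of T
    return 2 * NormalDist().cdf(-abs(T))

print(two_sided_p_value(-0.4))   # lower-tail statistic
print(two_sided_p_value(0.4))    # same value by symmetry
print(2 * NormalDist().cdf(0.4)) # naive formula: exceeds 1, so not a probability
```

Using the absolute value means the same line of code works whether the sample proportion lands above or below 1/2.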

john

Given my coin flips:

import pandas as pd
df = pd.read_csv('https://marksmath.org/cgi-bin/coin_flips.csv?n=100&seed=john')
df.head()

The question is whether this coin is a fair coin. To determine this we run a hypothesis test, where

H_0: p=1/2
H_a: p\neq 1/2

I use a 95% confidence level, so \alpha=0.05

Now I calculate the proportion of flips with a head:

p = df[df.flips=='H'].shape[0]/df.shape[0]
print(p)
0.45

Then using this p, I calculate the T value:

import numpy as np
T = (p-0.5)/np.sqrt(0.5*(1-0.5)/100)
print(T)
-1.0

Then I calculate the area under the normal distribution:

from scipy.stats import norm
area = norm.cdf(T)
print(2*area)
0.317310507863

This is greater than 0.05, so we fail to reject the null hypothesis. The data are consistent with a fair coin.
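
Failing to reject does not by itself show that p is exactly 1/2; with 100 flips the test often misses a mild bias. As a hedged illustration, the sketch below approximates the probability that this test rejects H_0 when the true p takes some other value, using the normal approximation. The function name and the choice p=0.55 are assumptions made for the example.

```python
from math import sqrt
from statistics import NormalDist

def rejection_probability(p_true, n, p0=0.5, alpha=0.05):
    """Approximate chance that the two-sided z-test rejects H_0: p = p0."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # critical value, about 1.96
    se0 = sqrt(p0 * (1 - p0) / n)                  # standard error under H_0
    lo, hi = p0 - z_crit * se0, p0 + z_crit * se0  # reject if phat is outside [lo, hi]
    # approximate sampling distribution of phat when the true probability is p_true
    sampling = NormalDist(p_true, sqrt(p_true * (1 - p_true) / n))
    return sampling.cdf(lo) + (1 - sampling.cdf(hi))

print(rejection_probability(0.55, 100))    # small: a mild bias is usually missed
print(rejection_probability(0.55, 1000))   # much larger with more flips
```

Requesting more flips through the n query parameter is the natural way to make the test more sensitive.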

mac

First I import my data into Python

import pandas as pd
df = pd.read_csv('https://marksmath.org/cgi-bin/coin_flips.csv?n=100&seed=mac')
df.head()

Next I write out my hypothesis test with a confidence level of 95%

H_0:p=1/2
H_A:p≠1/2
with α=0.05

Next I calculate my sample proportion of heads

p = df[df.flips == 'H'].shape[0]/df.shape[0]
p

This returned 0.46

I then calculate my test statistic

import numpy as np
T = (p-0.5)/np.sqrt(0.5*(1-0.5)/100)
T

This returned -0.60000000000000053

Finally I calculate my p-value

from scipy.stats import norm
2*norm.cdf(-0.6)

This returned 0.54850623550014688

The computed value of 0.5485 is much larger than 0.05, therefore we fail to reject the null hypothesis.

btucker

Hypothesis:

H_0 : p=1/2
H_A : p \neq 1/2

Import data:

import pandas as pd
df = pd.read_csv('https://marksmath.org/cgi-bin/coin_flips.csv?n=100&seed=btucker')
df.head()

Proportion of heads:

p = df[df.flips=='H'].shape[0]/df.shape[0]
p
out: 0.51

Here’s the test statistic:

import numpy as np
T = (p-.5)/np.sqrt(.5*(1-.5)/100)
T
out: 0.20000000000000018

p-value:

from scipy.stats import norm
2*(1-norm.cdf(T))  # two-sided, using the test statistic T = 0.2
out: approximately 0.8415

Based on a 99% confidence level, since my p-value is greater than .01, I fail to reject the null hypothesis.

goodmorning

In order to show at a 95% confidence level that the coin is not fair, the p-value must fall below .05.

First I imported the seeded coin data

import pandas as pd
df = pd.read_csv("https://marksmath.org/cgi-bin/coin_flips.csv?n=100&seed=omartin1")
df.head(100)

Then, to find the sample proportion of heads, I used

p = df[df.flips=="H"].shape[0]/df.shape[0]
p

which gave me a proportion of .56

I used it to find the test statistic

import numpy as np
T = (p-0.5)/np.sqrt(0.5*(1-0.5)/100)  # n = 100 flips, matching the URL
T

which is about 1.2, and then I used norm.cdf to get the upper-tail area

from scipy.stats import norm
1-norm.cdf(T)

giving me a one-sided p-value of about 0.115.
We then multiply it by 2 to account for both tails, which gives
about 0.230. This is above .05, so I fail to reject the null hypothesis that the coin is fair.

Tripp

I grabbed my data by seeding the program with my name and read it in like so

import pandas as pd
df = pd.read_csv('https://marksmath.org/cgi-bin/coin_flips.csv?n=100&seed=tripp')
df.head()


flips
0 T
1 H
2 H
3 T
4 T

Calculated my Proportion

p = df[df.flips == 'H'].shape[0]/df.shape[0]  # fraction of rows that are heads
p
.49

Calculated Test statistic

import numpy as np 
T = (p-0.5)/np.sqrt(0.5*(1-0.5)/100)
T
-.200000

Calculated p-value

from scipy.stats import norm
2*norm.cdf(-.20)

0.84148