Some probability computations

In this notebook, we'll see how to use Python to compute a few simple probabilities and we'll take the opportunity to introduce a couple of programming constructs.

What is probability?

In section 2.2, our textbook gives us a pretty good attempt to define probability. Intuitively, the probabiltiy of an event should be defined as the limit of its relative frequency. In practice, we can estimate if we can run a large number of trials. This is called the frequentist interpretation of probability. We can illustrate this idea by running a computer experiment:

In [1]:
# Load some standard libraries
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np

# A couple of functions for random number generation
from numpy.random import choice, seed

# seed the random number generator for reproducibilituy
seed(2)

# Number of flips
n=100

# Probability of a head on each flip
p = 0.6

# Generate the flips - 0 or 1 for tail or head.
flips = choice([0.0,1.0], size=n, p=[1-p,p])

# Generate the cummulative sums, 
# i.e. [1,0,1,0,0,1] -> [1,1,2,2,2,3]
sums = np.cumsum(flips)

# Convert sums to proportions
# i.e. [1,1,2,2,2,3] -> [1,1/2,2/3,1/2,2/5,1/2]
moving_averages = sums/np.arange(1,n+1)

# Plot the result
plt.plot(moving_averages, 'b-')
plt.plot(moving_averages, 'b.')
plt.plot([0,n],[p,p],'k--')
ax = plt.gca()
ax.set_ylim(0,1);

I think this frequentist approach is relevant for one of your WebWork problems! An alternative approach is the Bayesian. In any event, the axioms we ultimately write down should capture the properties yielded by our intution.

Programming note: building lists with list comprehensions

Here's a little programming construct that weill come in handy on the next couple of examples. The use of a loop to build an object, like a list, is common in computer programming. For example, here's a list of the squares of the prime numbers less than 50 using a library function to check primality.

In [2]:
from sympy.ntheory import isprime
prime_list = []
for k in range(50):
    if isprime(k):
        prime_list.append(k**2)
print(prime_list)
[4, 9, 25, 49, 121, 169, 289, 361, 529, 841, 961, 1369, 1681, 1849, 2209]

This can be written more compactly using a list comprehension:

In [3]:
[k**2 for k in range(50) if isprime(k)]
Out[3]:
[4, 9, 25, 49, 121, 169, 289, 361, 529, 841, 961, 1369, 1681, 1849, 2209]

Combinations

Suppose we pick a random subset from our class. What’s the probability that subset has fewer than five students?

According to the binomial coefficient formula, the total number of such subsets is $$\sum_{k=1}^4 \begin{pmatrix}13 \\ k \end{pmatrix},$$ where $$\begin{pmatrix}n \\ k \end{pmatrix} = \frac{n!}{k!(n-k)!}.$$ We can compute this with Python as follows:

In [4]:
from scipy.special import binom
total = sum([binom(13,k) for k in range(5)])
total
Out[4]:
1093.0

Of course, the total number of all subsets of the class of size 13 is $2^{13}$. Thus, the probability is

In [5]:
total/2**13
Out[5]:
0.1334228515625

Programming note: defining functions

We can define our own functions with Python using the def operator. For example, here's how to define $f(x)=x^2$ and compute $f(-2)$:

In [6]:
def f(x):
    return x**2
f(-2)
Out[6]:
4

Of course, we'll often have more "programmy" things! We'll see an example in a min.

Counting with the computer

On one of your WebWork problems, you're asked to compute the proportion of palindromic numbers from a superset. As a similar example, let's as the following:

Suppose we pick a number between 10 and 10000 inclusive randomly - all with equal probability. What is the probability that number is a palindrome?

I guess there are $10000-9 = 9991$ numbers between 10 and 10000; we just have to figure out how many of them are palindromes. If there are $n$ of them, the answer should be $n/9991$. One approach is to build a list of palindromes in that range (using a list comprehension) and then divide the length of that list by 9991. To do so, it makes sense to first write and is_palindrome command to check if a number is a palindrome:

In [7]:
def is_palindrome(n):
    digits = list(str(n))
    digits_too = digits.copy()
    digits_too.reverse()
    return digits == digits_too
[is_palindrome(100), is_palindrome(101)]
Out[7]:
[False, True]

Cool! And now the computation:

In [8]:
palindromes = [n for n in range(10,10000) if is_palindrome(n)]
len(palindromes)/9991
Out[8]:
0.01891702532279051