Random variables

On page 105, the concept of a random variable \(X\) is defined to be a random process with some numerical outcome.

An important distinction:


Roughly, the distribution of a random variable tells you how likely that random variable is to produce certain outputs. The specifics depend on the distribution in question, but distributions divide into two main classes: discrete distributions and continuous distributions.

Distributions for discrete random variables

On page 82, the distribution of a discrete random variable is defined to be a table of all the possible outcomes together with their probabilities. For examples 1, 2, and 3 above, the distributions are as follows:

  • Example 1: \(P(X=1) = P(X=0) = 1/2\)
  • Example 2: \(P(X=1) = P(X=2) = P(X=3) = P(X=4) = P(X=5) = P(X=6) = 1/6\)
  • Example 3:
    • \(P(X=1) = 3/10\)
    • \(P(X=2) = 4/10\)
    • \(P(X=3) = 3/10\)
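
As a rough illustration, a discrete distribution like the one in Example 3 can be stored as a table mapping each outcome to its probability. The sketch below uses only Python's standard library; the names `example3`, `is_valid_distribution`, and `sample` are made up for this illustration and do not come from the text.

```python
from fractions import Fraction
import random

# Distribution from Example 3, stored as outcome -> probability.
example3 = {1: Fraction(3, 10), 2: Fraction(4, 10), 3: Fraction(3, 10)}

def is_valid_distribution(dist):
    """Check that all probabilities are non-negative and sum to exactly 1."""
    return all(p >= 0 for p in dist.values()) and sum(dist.values()) == 1

def sample(dist, size, seed=None):
    """Draw `size` outcomes according to the probabilities in `dist`."""
    rng = random.Random(seed)
    outcomes = list(dist)
    weights = [float(p) for p in dist.values()]
    return rng.choices(outcomes, weights=weights, k=size)

print(is_valid_distribution(example3))  # True
print(sample(example3, 10, seed=1))     # ten random draws from {1, 2, 3}
```

Using `Fraction` keeps the check that the probabilities sum to 1 exact, which floating-point arithmetic would not guarantee.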


  • We’ve introduced the common notation \[P(X=x_i)=p_i\]
  • There can be any number of outcomes
  • Those outcomes need not be equally likely
  • We can visualize a discrete distribution using a basic plot. Here’s the third one:

Often, it makes more sense to do this with a larger distribution. Here’s a uniform distribution on 50 numbers, i.e. the numbers 1 through 50 are all equally likely.
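
To make the uniform distribution on 1 through 50 concrete, here is a minimal sketch that builds it as a table and adds up the probabilities of the outcomes in an event. The helper name `prob_of` is hypothetical and chosen only for this example.

```python
from fractions import Fraction

# Uniform distribution on the integers 1 through 50: each has probability 1/50.
uniform50 = {k: Fraction(1, 50) for k in range(1, 51)}

def prob_of(dist, event):
    """Add up the probabilities of all outcomes x for which event(x) is true."""
    return sum(p for x, p in dist.items() if event(x))

print(prob_of(uniform50, lambda x: x <= 10))  # 1/5
```

Since ten of the fifty equally likely outcomes are at most 10, the result is \(10/50 = 1/5\).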

Distributions for continuous random variables

Section 2.5 discusses the somewhat deeper idea of a distribution for a continuous random variable. Rather than assigning a probability to the choice of any single number, we assign probabilities to intervals of numbers. Thus, if \(X\) is a continuous random variable, we might write \[P(0<X<5) = 0.2\] or, more generally, \[P(a<X<b) = p.\]

One convenient way to describe this sort of thing is as the area under the graph of a function. As we’ve already seen, this is exactly how the normal distribution works. Here’s a representation of \(P(-0.5<X<2)\) when \(X\) has a standard normal distribution:

Another example is the so-called uniform distribution on an interval, which states that the probability of picking a number out of a subinterval is proportional to the length of that subinterval. For example, if our main interval is \[[-1,1] = \{x: -1\leq x \leq 1\},\] then the probability of picking a number out of the left half is \(1/2\); in symbols: \[P(-1<X<0) = 1/2.\]
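
The "proportional to length" rule amounts to dividing the length of the subinterval by the length of the main interval. A small sketch follows; the function name `uniform_interval_prob` and its parameters are assumptions made for this illustration.

```python
def uniform_interval_prob(a, b, lo=-1.0, hi=1.0):
    """P(a < X < b) for X uniformly distributed on [lo, hi]."""
    # Clip the subinterval to [lo, hi]; anything outside has probability 0.
    left, right = max(a, lo), min(b, hi)
    return max(right - left, 0.0) / (hi - lo)

print(uniform_interval_prob(-1, 0))  # 0.5
```

The left half of \([-1,1]\) has length 1 and the whole interval has length 2, which gives the value \(1/2\) stated above.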

Note that the uniform distribution is the simplest of all continuous distributions. Most often, though, we will be interested in the normal distribution and will read such areas (or probabilities) off of a table.
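
As an alternative to a table, areas under the standard normal curve can be computed numerically. The sketch below relies on the standard identity \(\Phi(z) = \tfrac{1}{2}\left(1 + \operatorname{erf}(z/\sqrt{2})\right)\) and Python's built-in `math.erf`. The function names are made up for this example, and this is not necessarily how the text's table was produced.

```python
from math import erf, sqrt

def standard_normal_cdf(z):
    """P(X < z) for a standard normal X, computed via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def normal_prob(a, b):
    """P(a < X < b) for a standard normal X: area under the curve from a to b."""
    return standard_normal_cdf(b) - standard_normal_cdf(a)

print(round(normal_prob(-0.5, 2), 4))  # 0.6687
```

This is the probability \(P(-0.5<X<2)\) discussed above for a standard normal \(X\).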

The binomial distribution

The binomial distribution is a discrete distribution that plays a special role in statistics for many reasons. Importantly for us, the binomial distribution allows us to see how a bell curve (in fact the normal curve) arises as a limit of other types of distributions.

The general idea of the binomial distribution is as follows: Suppose that a single experiment has probability of success \(p\) and probability of failure \(1-p\). We turn this into a random variable by assigning numeric values, say success yields a \(1\) and failure yields a \(0\). We then run the experiment a number of times, say \(n\), and count the number of successes. This yields an integer between \(0\) and \(n\) inclusive. The binomial distribution tells us the probability of each of those \(n+1\) outcomes.
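
The procedure just described can be mimicked by simulation. The following sketch assumes the \(n\) runs of the experiment are independent of one another, which the binomial distribution requires. The function names and the number of repetitions are arbitrary choices made for illustration.

```python
import random
from collections import Counter

def count_successes(n, p, rng):
    """Run n independent trials with success probability p; return the number of successes."""
    return sum(1 if rng.random() < p else 0 for _ in range(n))

def estimate_distribution(n, p, repetitions=100_000, seed=0):
    """Estimate P(X = k) for k = 0..n by repeating the n-trial experiment many times."""
    rng = random.Random(seed)
    counts = Counter(count_successes(n, p, rng) for _ in range(repetitions))
    return {k: counts[k] / repetitions for k in range(n + 1)}

# Relative frequency of each possible number of successes with n = 5 and p = 1/2.
for k, freq in estimate_distribution(5, 0.5).items():
    print(k, round(freq, 3))
```

The printed frequencies are only estimates; they should settle near the exact binomial probabilities as the number of repetitions grows.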

Flipping a fair coin

Suppose our experiment is just flipping a coin, that a head represents success, and that a tail represents failure. Thus, with one flip, we can get a \(0\) or a \(1\) with equal probability \(1/2\) each.

Now suppose we flip a coin 5 times and count how many heads we get. This will generate a random number \(X\) between 0 and 5, but these values are not all equally likely. The probabilities are:

  • \(P(X=0)=1/32\)
  • \(P(X=1)=5/32\)
  • \(P(X=2)=10/32\)
  • \(P(X=3)=10/32\)
  • \(P(X=4)=5/32\)
  • \(P(X=5)=1/32\)
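
One way to check these values is to list every possible sequence of 5 flips and count how many contain each number of heads. The sketch below does this by brute force and compares each count against the binomial coefficient \(\binom{5}{k}\), computed with `math.comb`. The variable names are illustrative only.

```python
from collections import Counter
from itertools import product
from math import comb

n = 5

# All 2**5 = 32 sequences of heads (H) and tails (T); each is equally likely for a fair coin.
sequences = list(product("HT", repeat=n))

# Number of sequences containing exactly k heads, for each k.
head_counts = Counter(seq.count("H") for seq in sequences)

for k in range(n + 1):
    # The count of sequences with k heads matches the binomial coefficient C(n, k).
    assert head_counts[k] == comb(n, k)
    print(f"P(X={k}) = {head_counts[k]}/{len(sequences)}")
```

The output reproduces the six probabilities in the list, with numerators 1, 5, 10, 10, 5, 1 over the common denominator 32.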

Note that the probability of getting any particular sequence of 5 heads and tails is \[\frac{1}{2^5} = \frac{1}{32}.\] That explains the denominator of 32 in the list of probabilities. The numerator is the number of ways to get that value for the sum. For example, there are 10 ways to get 2 heads in 5 flips: