Lecture 10 – Permutation Testing

Permutation testing

Hypothesis testing vs. permutation testing

"Standard" hypothesis testing helps us answer questions of the form:

I have a population distribution, and I have one sample. Does this sample look like it was drawn from the population?

It does not help us answer questions of the form:

I have two samples, but no information about any population distributions. Do these samples look like they were drawn from the same population?

That's where permutation testing comes in.
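To make the two-sample question concrete, here is a minimal sketch of a permutation test. It rests on several assumptions that are not from the lecture: the test statistic is the absolute difference in group means, and the function name `permutation_test` and its parameters `n_repetitions` and `seed` are hypothetical. The idea is that if both samples came from the same population, the group labels are arbitrary, so shuffling them should produce differences about as large as the one we observed.

```python
import numpy as np

def permutation_test(sample_a, sample_b, n_repetitions=1000, seed=0):
    """Estimate a p-value for the absolute difference in group means.

    Hypothetical helper for illustration; the choice of test statistic
    (absolute difference in means) is an assumption.
    """
    rng = np.random.default_rng(seed)
    sample_a = np.asarray(sample_a, dtype=float)
    sample_b = np.asarray(sample_b, dtype=float)

    # Observed statistic, computed on the original grouping.
    observed = abs(sample_a.mean() - sample_b.mean())

    # Under the null hypothesis, labels are interchangeable, so pool the data.
    pooled = np.concatenate([sample_a, sample_b])
    n_a = len(sample_a)

    simulated = np.empty(n_repetitions)
    for i in range(n_repetitions):
        # Shuffle the pooled values and split them back into two groups
        # with the same sizes as the original samples.
        shuffled = rng.permutation(pooled)
        simulated[i] = abs(shuffled[:n_a].mean() - shuffled[n_a:].mean())

    # Proportion of shuffles at least as extreme as what we observed.
    p_value = (simulated >= observed).mean()
    return observed, p_value

# Made-up values, only to show the call.
observed, p_value = permutation_test([0, 0, 0, 0, 0], [10, 10, 10, 10, 10])
print(observed, p_value)
```

A small p-value suggests the two samples are unlikely to have come from the same population; a large one means the observed difference is consistent with random labeling.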

Example: Birth weight and smoking 🚬

Note: For familiarity, we'll start with an example from DSC 10. This means we'll move quickly!

Let's start by loading in the data.

We're only interested in the 'Birth Weight' and 'Maternal Smoker' columns.
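As a rough sketch of these two steps, the snippet below reads a CSV with pandas and keeps only the two columns of interest. The rows are made up for illustration and are not the lecture's data; the CSV is read from an in-memory string because the actual file path isn't given here, and `'Some Other Column'` is a hypothetical extra column included only so there is something to drop.

```python
import io
import pandas as pd

# Made-up rows for illustration only; NOT the lecture's dataset.
csv_text = """Birth Weight,Some Other Column,Maternal Smoker
120,284,False
113,282,False
128,279,True
108,282,True
136,286,False
"""

# In practice this would be pd.read_csv(<path to the data file>).
baby = pd.read_csv(io.StringIO(csv_text))

# Keep only the two columns we care about.
baby = baby[['Birth Weight', 'Maternal Smoker']]
print(baby.head())
```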

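Note that there are two samples: the birth weights of babies whose mothers smoked ('Maternal Smoker' is True), and the birth weights of babies whose mothers did not ('Maternal Smoker' is False).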

Exploratory data analysis

How many babies are in each group? What is the average birth weight within each group?
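One way to answer both questions is with `groupby`. The sketch below uses a small made-up DataFrame (not the lecture's data) with the same two column names; `group_sizes` and `group_means` are hypothetical variable names.

```python
import pandas as pd

# Made-up values for illustration only; NOT the lecture's dataset.
baby = pd.DataFrame({
    'Birth Weight': [120, 113, 128, 108, 136, 115],
    'Maternal Smoker': [False, False, True, True, False, True],
})

# Number of babies in each group.
group_sizes = baby.groupby('Maternal Smoker').size()

# Average birth weight within each group.
group_means = baby.groupby('Maternal Smoker')['Birth Weight'].mean()

print(group_sizes)
print(group_means)
```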

Note that birth weights are recorded in ounces. There are 16 ounces in 1 pound, so the above weights are roughly 7-8 pounds.

Visualizing birth weight distributions

Below, we draw the distributions of both sets of birth weights.
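A minimal sketch of such a plot is below, assuming matplotlib (the lecture may use a different plotting library). The weights are synthetic, drawn from normal distributions with arbitrary parameters, and are not the lecture's data. Both histograms share the same bins and are normalized to densities, so the two groups are comparable even when they differ in size.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic weights (in ounces) with arbitrary parameters; NOT the real data.
rng = np.random.default_rng(0)
smoker_weights = rng.normal(115, 18, size=200)
non_smoker_weights = rng.normal(123, 17, size=300)

# Shared bin edges so the two histograms line up.
bins = np.arange(40, 200, 10)

fig, ax = plt.subplots()
# density=True normalizes each histogram, which matters when group sizes differ.
ax.hist(non_smoker_weights, bins=bins, density=True, alpha=0.5,
        label='Maternal Smoker = False')
ax.hist(smoker_weights, bins=bins, density=True, alpha=0.5,
        label='Maternal Smoker = True')
ax.set_xlabel('Birth Weight (ounces)')
ax.set_ylabel('Density')
ax.legend()
# In a script, call plt.show() to display the figure.
```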