Lecture 10 – Permutation Testing

DSC 80, Spring 2023


Permutation testing

Hypothesis testing vs. permutation testing

"Standard" hypothesis testing helps us answer questions of the form:

I have a population distribution, and I have one sample. Does this sample look like it was drawn from the population?

It does not help us answer questions of the form:

I have two samples, but no information about any population distributions. Do these samples look like they were drawn from the same population?

That's where permutation testing comes in.
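To make the two-sample question concrete, here is a minimal sketch of a permutation test. The two samples are made-up numbers, and the choice of test statistic (the absolute difference in group means) and of 1,000 repetitions are assumptions for illustration, not something fixed by the discussion above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up samples, for illustration only.
sample_a = np.array([120.0, 113.0, 136.0, 125.0, 129.0])
sample_b = np.array([128.0, 108.0, 118.0, 110.0, 115.0])

def diff_in_means(values, n_a):
    """Mean of the first n_a values minus the mean of the rest."""
    return values[:n_a].mean() - values[n_a:].mean()

pooled = np.concatenate([sample_a, sample_b])
n_a = len(sample_a)
observed = diff_in_means(pooled, n_a)

# Under the null hypothesis both samples come from the same population,
# so the group labels are interchangeable: shuffle and recompute.
n_repetitions = 1000
simulated = np.array([
    diff_in_means(rng.permutation(pooled), n_a)
    for _ in range(n_repetitions)
])

# Proportion of shuffles at least as extreme as what we observed.
p_value = np.mean(np.abs(simulated) >= np.abs(observed))
print(observed, p_value)
```

Notice that no population distribution appears anywhere: the shuffled versions of the pooled data play that role.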

Example: Birth weight and smoking 🚬

Note: For familiarity, we'll start with an example from DSC 10. This means we'll move quickly!

Birth weight and smoking 🚬

Let's start by loading in the data.

We're only interested in the 'Birth Weight' and 'Maternal Smoker' columns.
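A minimal sketch of these two steps, assuming the data lives in a CSV file. The rows below are made up and stand in for the real file, and the extra columns ('Gestational Days', 'Maternal Age') are hypothetical; only 'Birth Weight' and 'Maternal Smoker' come from the text above.

```python
import io
import pandas as pd

# Stand-in for the real CSV file: a few made-up rows, NOT the actual dataset.
# With the real file, pass its path to pd.read_csv instead of a StringIO.
csv_text = """Birth Weight,Gestational Days,Maternal Age,Maternal Smoker
120,284,27,False
113,282,33,False
128,279,28,True
108,282,23,True
136,286,25,False
"""

baby = pd.read_csv(io.StringIO(csv_text))

# Keep only the two columns we're interested in.
baby = baby[['Maternal Smoker', 'Birth Weight']]
print(baby.head())
```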

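Note that there are two samples: the birth weights of babies whose mothers smoked, and the birth weights of babies whose mothers did not, as recorded in the 'Maternal Smoker' column.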

Exploratory data analysis

How many babies are in each group? What is the average birth weight within each group?
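One way to answer both questions is with `groupby`. This is a sketch on a small made-up DataFrame (not the real data), assuming 'Maternal Smoker' is a Boolean column and 'Birth Weight' is in ounces.

```python
import pandas as pd

# Made-up rows for illustration only.
baby = pd.DataFrame({
    'Maternal Smoker': [False, False, True, True, False, True],
    'Birth Weight': [120, 113, 128, 108, 136, 118],
})

# Number of babies in each group.
group_sizes = baby.groupby('Maternal Smoker').size()

# Average birth weight (in ounces) within each group.
group_means = baby.groupby('Maternal Smoker')['Birth Weight'].mean()

print(group_sizes)
print(group_means)
```

Both results are Series indexed by the values of 'Maternal Smoker', so the two groups line up row by row.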

Birth weights are measured in ounces. Note that there are 16 ounces in 1 pound, so the above weights are ~7-8 pounds.

Visualizing birth weight distributions

Below, we draw the distributions of both sets of birth weights.
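A sketch of one way to draw such a plot, using overlaid density histograms in matplotlib. The data is made up, and the plotting library, the number of bins, and the labels are all assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, so this runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up rows for illustration only.
baby = pd.DataFrame({
    'Maternal Smoker': [False, False, True, True, False, True, False, True],
    'Birth Weight': [120, 113, 128, 108, 136, 118, 125, 110],
})

smokers = baby.loc[baby['Maternal Smoker'], 'Birth Weight']
non_smokers = baby.loc[~baby['Maternal Smoker'], 'Birth Weight']

# Shared bin edges, so the two histograms are directly comparable.
bins = np.linspace(baby['Birth Weight'].min(), baby['Birth Weight'].max(), 6)

fig, ax = plt.subplots()
# density=True normalizes each histogram, since the groups may differ in size.
ax.hist(non_smokers, bins=bins, density=True, alpha=0.5, label='Non-smoker')
ax.hist(smokers, bins=bins, density=True, alpha=0.5, label='Smoker')
ax.set_xlabel('Birth Weight (ounces)')
ax.set_ylabel('Density')
ax.legend()
```

In a notebook, drop the `matplotlib.use("Agg")` line and the figure will render inline.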