Lecture 10 – Permutation Testing

DSC 80, Winter 2023

Announcements

Agenda

Example: Penguins (again!)


Consider the penguins dataset from a few lectures ago.

Average bill length by island

It appears that penguins on Torgersen Island have shorter bills on average than penguins on other islands.
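As a rough illustration of how a per-island average like this can be computed, here is a minimal sketch using pandas. The DataFrame toy_penguins and its values are made up for this example; only the column names island and bill_length_mm are assumed to match the real dataset.

```python
import pandas as pd

# Hypothetical toy data, NOT the real penguins dataset.
toy_penguins = pd.DataFrame({
    "island": ["Torgersen", "Torgersen", "Biscoe", "Biscoe", "Dream", "Dream"],
    "bill_length_mm": [39.0, 37.0, 45.0, 47.0, 44.0, 46.0],
})

# Group the rows by island, then average the bill lengths within each group.
mean_by_island = toy_penguins.groupby("island")["bill_length_mm"].mean()
print(mean_by_island)
```

The result is a Series indexed by island name, so a single island's average can be looked up with mean_by_island["Torgersen"].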

Setup

The plan

Simulation

Again, while you could do this with a for-loop (and you can use a for-loop for hypothesis tests in labs and projects), we'll use the faster approach of passing the size argument here, which draws all of the simulated samples in a single call.

Instead of using np.random.multinomial, which samples from a categorical distribution, we'll use np.random.choice, which samples from a known sequence of values.
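The size-based simulation described above might look roughly like the sketch below. It assumes the test statistic is the mean bill length of a random sample as large as the Torgersen group, drawn from all penguins' bill lengths. The array all_bill_lengths, the value of group_size, and num_reps are hypothetical stand-ins, not values from the real dataset. Note that np.random.choice samples with replacement by default, which this sketch keeps.

```python
import numpy as np

np.random.seed(80)  # only so the sketch is reproducible

# Hypothetical stand-in for the bill_length_mm column of the full dataset.
all_bill_lengths = np.random.normal(loc=44, scale=5, size=330)

group_size = 50    # hypothetical number of penguins in the group of interest
num_reps = 10_000  # number of simulated samples

# One call draws every sample: each row is one simulated group.
samples = np.random.choice(all_bill_lengths, size=(num_reps, group_size))

# Averaging across each row gives one simulated test statistic per repetition.
simulated_means = samples.mean(axis=1)
```

Passing a tuple as size replaces the for-loop: the first dimension indexes repetitions and the second indexes the penguins within one simulated sample.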

Visualizing the empirical distribution of the test statistic
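One way to summarize the empirical distribution before plotting it is to bin the simulated statistics and compare them with the observed one. The sketch below uses np.histogram rather than any particular plotting library. All data here is hypothetical, including observed_mean, and treating smaller means as more extreme is an assumption based on the claim that Torgersen bills appear shorter.

```python
import numpy as np

np.random.seed(80)  # only so the sketch is reproducible

# Hypothetical stand-ins, as in a simulation under the null hypothesis.
all_bill_lengths = np.random.normal(loc=44, scale=5, size=330)
num_reps, group_size = 10_000, 50
simulated_means = np.random.choice(
    all_bill_lengths, size=(num_reps, group_size)
).mean(axis=1)

# Bin the simulated statistics; counts and edges can be passed to any
# bar-plotting function to draw the histogram.
counts, edges = np.histogram(simulated_means, bins=30)

# Hypothetical observed statistic (not a value from the real data).
observed_mean = 42.5

# Proportion of simulated means at least as small as the observed mean.
p_value = (simulated_means <= observed_mean).mean()
print(p_value)
```

In a plot, the observed statistic is usually drawn as a vertical line over the histogram, so it is easy to see how far into the tail it falls.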