Lecture 13 – Imputation

DSC 80, Spring 2023

Midterm Exam Logistics

Agenda

Recap: Identifying missingness mechanisms

Review: Missingness mechanisms

Deciding between MAR and MCAR

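Recall that the "missing value flowchart" says we should first rule out missing by design (MD) and not missing at random (NMAR) by reasoning about how the data was generated.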

To decide between MAR and MCAR, we can look at the data itself.
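One way to "look at the data" is a permutation test: compare the distribution of another column when the column of interest is missing against when it is not. The sketch below is illustrative only. The helper name missingness_permutation_test, the synthetic heights data, and the choice of the absolute difference in means as the test statistic are all assumptions made for this example, not the functions in util.py or the lecture's dataset.

```python
import numpy as np
import pandas as pd


def missingness_permutation_test(df, missing_col, other_col,
                                 n_repetitions=1000, seed=0):
    """Hypothetical helper: test whether the missingness of `missing_col`
    depends on `other_col`, using the absolute difference in group means."""
    rng = np.random.default_rng(seed)
    is_missing = df[missing_col].isna().to_numpy()
    values = df[other_col].to_numpy()

    def abs_diff_in_means(labels):
        # Mean of `other_col` where missing vs. where not missing.
        return abs(values[labels].mean() - values[~labels].mean())

    observed = abs_diff_in_means(is_missing)

    # Under the null (missingness unrelated to `other_col`), shuffling the
    # missing/not-missing labels should give similar statistics.
    simulated = np.array([
        abs_diff_in_means(rng.permutation(is_missing))
        for _ in range(n_repetitions)
    ])
    p_value = (simulated >= observed).mean()
    return observed, p_value


# Synthetic stand-in for a heights dataset (assumed, not the real data).
rng = np.random.default_rng(42)
n = 500
father = rng.normal(69, 2.5, n)
child = 0.5 * father + rng.normal(33, 2, n)
heights = pd.DataFrame({'father': father, 'child': child})

# MCAR: every 'child' value has the same chance of being missing.
mcar = heights.copy()
mcar.loc[rng.random(n) < 0.2, 'child'] = np.nan

# MAR: 'child' is more likely to be missing when 'father' is tall.
mar = heights.copy()
mar.loc[(father > 71) & (rng.random(n) < 0.7), 'child'] = np.nan

obs_mcar, p_mcar = missingness_permutation_test(mcar, 'child', 'father')
obs_mar, p_mar = missingness_permutation_test(mar, 'child', 'father')
print(f"MCAR data: observed = {obs_mcar:.3f}, p-value = {p_mcar:.3f}")
print(f"MAR data:  observed = {obs_mar:.3f}, p-value = {p_mar:.3f}")
```

A small p-value suggests that missingness depends on the other column (consistent with MAR). A large p-value means we fail to reject the hypothesis that missingness is unrelated to that column, which is consistent with MCAR but does not prove it. The difference in means only detects shifts in center; a statistic such as the Kolmogorov-Smirnov statistic may be a better choice when the two distributions differ in shape.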

Example: Heights

Today, we'll use the same heights dataset as we did last time.

Example: Missingness of 'child' heights on 'father' heights (MCAR)

Aside: In util.py, there are several functions that we've created to help us with this lecture.
