Nonparametric tests of Missing Completely At Random

Thomas Berrett (University of Warwick)

June 20, 2024, 11:00–12:15


Room: Auditorium 5

MAD-Stat. Seminar


One of the most commonly encountered discrepancies between real data sets and the models hypothesised in theoretical work is missing data. When faced with incomplete data, the primary concern is to understand the relationship between the data-generating and missingness mechanisms. In the ideal situation, these two sources of randomness are independent, a setting known as Missing Completely At Random (MCAR), but this is often too restrictive in practice. In this talk I will discuss hypothesis tests of the MCAR assumption, with material based on joint work with Richard Samworth (paper 1) and Alberto Bordino (paper 2). It turns out that there are deep connections between this problem and ideas from copula theory and convex optimisation. Our methods in the first work use linear programming to test the compatibility of distributions. In the second, we draw connections with the matrix completion literature and thus develop tests based on semidefinite programming. In both cases our methods are more widely applicable than existing methods and, in cases where existing methods are applicable, we see strong empirical performance, with power comparable to theirs.
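
To make the idea of testing compatibility by linear programming concrete, here is a minimal toy sketch. It is not the test presented in the talk: it works with exact (population) pairwise distributions of three binary variables rather than with data, and only asks whether some joint distribution on {0,1}^3 has the given pairwise marginals. Under MCAR, the distributions seen under different missingness patterns must be compatible in this sense. The function name `marginals_compatible` and the example tables are assumptions made for illustration; the solver call is `scipy.optimize.linprog`.

```python
import itertools

import numpy as np
from scipy.optimize import linprog


def marginals_compatible(pair_marginals, n_vars=3):
    """Check whether binary pairwise marginals admit a common joint law.

    pair_marginals maps an index pair (i, j) to a 2x2 table whose entry
    [a][b] is the claimed value of P(X_i = a, X_j = b).
    (Hypothetical helper, written for this illustration only.)
    """
    # Unknowns: one probability per joint state in {0,1}^n_vars.
    states = list(itertools.product([0, 1], repeat=n_vars))

    rows, rhs = [], []
    for (i, j), table in pair_marginals.items():
        for a in (0, 1):
            for b in (0, 1):
                # Summing the joint over states with X_i = a and X_j = b
                # must reproduce the claimed marginal entry.
                rows.append(
                    [1.0 if (s[i] == a and s[j] == b) else 0.0 for s in states]
                )
                rhs.append(table[a][b])

    # Pure feasibility problem: zero objective, equality constraints,
    # and non-negative probabilities.
    result = linprog(
        c=np.zeros(len(states)),
        A_eq=np.array(rows),
        b_eq=np.array(rhs),
        bounds=(0, None),
        method="highs",
    )
    return bool(result.success)


# Three independent fair coins: every pairwise table is uniform.
uniform = [[0.25, 0.25], [0.25, 0.25]]
independent = {(0, 1): uniform, (1, 2): uniform, (0, 2): uniform}

# X0 = X1 and X1 = X2 almost surely, yet X0 != X2 almost surely.
# The one-dimensional marginals agree, but no joint law exists.
equal = [[0.5, 0.0], [0.0, 0.5]]
opposite = [[0.0, 0.5], [0.5, 0.0]]
cyclic = {(0, 1): equal, (1, 2): equal, (0, 2): opposite}

print(marginals_compatible(independent))
print(marginals_compatible(cyclic))
```

The second example shows why compatibility is a stronger requirement than agreement of the shared lower-dimensional marginals: all single-variable distributions match, yet the linear program is infeasible. A practical test would additionally have to account for sampling error in the estimated marginals, which this sketch ignores.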