Machine learning beyond the data range: an extreme value perspective

Sebastian Engelke (Université de Genève)

May 2, 2024, 11:00–12:15


Room Auditorium 5

MAD-Stat. Seminar


Machine learning methods perform well in prediction tasks within the range of the training data. They typically break down when interest lies in (1) prediction in regions of the predictor space with few or no training observations, or (2) prediction of quantiles of the response that go beyond the observed records. Extreme value theory provides the mathematical foundation for extrapolation beyond the range of the training data, both in the predictor space and in the response variable. In this talk we present recent methodology that combines this extrapolation theory with flexible machine learning methods to tackle the out-of-distribution generalization problem (1) and the extreme quantile regression problem (2). We show the practical importance of prediction beyond the training observations in environmental and climate applications, where domain shifts in the predictor space occur naturally due to climate change and where risk assessment requires estimates of extreme quantiles.