In statistics and machine learning, the evaluation of algorithms typically relies on their performance on data. This is because, in contrast to a theoretical guarantee (e.g. a consistency result), it is in general not possible to prove that an algorithm performs well on a particular (unseen) data set. Therefore, it is of vital importance that we ensure the reliability of data-based evaluations. This requirement poses a wide range of open research problems and challenges. These include

  1. the lack of a ground truth to validate results in real-world applications,
  2. the high instability of empirical results in many settings,
  3. the difficulty of making statistics and machine learning research reproducible,
  4. the general over-optimism of published research findings due to pre-publication optimization of algorithms and publication bias.

This workshop brings together scientists from statistics, machine learning, and their application fields to tackle these challenges. The workshop serves as a platform to critically discuss current shortcomings, to exchange new approaches, and to identify promising future directions of research.

Organizing Committee