Much of machine learning and data mining has so far concentrated on analyzing data that have already been collected, rather than on collecting data. While experimental design is a well-developed discipline of statistics, data collection practitioners often neglect to apply its principled methods. As a result, the data collected and made available to data analysts, who are in charge of explaining them and building predictive models, are not always of good quality and are often plagued by experimental artifacts. In reaction to this situation, some researchers in machine learning and data mining have become interested in experimental design as a way to close the gap between data acquisition or experimentation and model building. This has given rise to the discipline of active learning. In parallel, researchers in causal studies have started raising awareness of the differences between passive observations, active sampling, and interventions. In this domain, only interventions qualify as true experiments capable of unravelling cause-effect relationships.

In this challenge, which follows on from two very successful earlier challenges (“Causation and Prediction” and “Competition Pot-luck”) sponsored by PASCAL, we evaluated methods of experimental design that involve the data analyst in the process of data collection. From our perspective, to build good models, we need good data. However, collecting good data comes at a price. Interventions are usually expensive to perform and sometimes unethical or impossible, while observational data are available in abundance at low cost. For instance, in policy-making, one may want to predict the effect on a population’s health status of forbidding the use of cell phones when driving, before passing a law to that effect. This example illustrates the case of an experiment that is possible but expensive, particularly if it turns out to have no effect. Practitioners must identify data-collection strategies that are cost-effective and feasible, yielding the best possible models at the lowest possible price. Hence, both efficiency and efficacy were factors of evaluation in this new challenge. Evaluating experimental design methods requires performing actual experiments. Because of the difficulty of experimenting on real systems in the context of our challenge, experiments were carried out on realistic simulators of real systems, trained on real data or incorporating real data. The tasks were taken from a variety of domains, including medicine, pharmacology, manufacturing, plant biology, sociology, and marketing. Typical examples of tasks include: evaluating the therapeutic potential or the toxicity of a drug, optimizing the throughput of an industrial manufacturing process, and assessing the potential impact of a promotion on sales.

The participants carried out virtual experiments by intervening on the system, e.g., by clamping variables to given values. We made use of our recently developed Virtual Laboratory. In this environment, participants pay a price in virtual cash for each experiment they perform, so they must optimize their design to reach their goal at the lowest possible cost. This challenge helped bring to light new methods for integrating modeling and experimental design in an iterative process, and new methods for combining observational and experimental data in modeling.
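To make the protocol concrete, the following is a minimal, hypothetical sketch (in Python) of such an iterative design loop. The `run_experiment` stub stands in for the Virtual Laboratory interface, which is not reproduced here; the variable names, costs, and synthetic response are illustrative assumptions, not the actual challenge API. Each intervention clamps a variable to a chosen value and deducts its cost from a virtual-cash budget, and a simple model is refit on the interventional data collected so far.

```python
import numpy as np

# Hypothetical stand-in for the Virtual Laboratory: each intervention
# (clamping a variable to a value) costs virtual cash and returns a
# noisy observation of the outcome. The true response is unknown to
# the analyst; here it is a synthetic linear function for illustration.
rng = np.random.default_rng(0)
TRUE_EFFECT = np.array([0.0, 1.5, -0.5])  # per-variable effects (hidden from the analyst)

def run_experiment(clamped):
    """Clamp a subset of variables to given values; return a noisy outcome."""
    x = rng.normal(size=3)          # unclamped variables vary freely
    for idx, val in clamped.items():
        x[idx] = val                # intervention: force the variable to a value
    return float(x @ TRUE_EFFECT + rng.normal(scale=0.1))

budget = 100.0                      # virtual cash available
cost_per_experiment = 5.0
data_X, data_y = [], []

# Iterative experimental design loop: spend budget on interventions,
# record the interventional data, and refit a model when the budget runs out.
while budget >= cost_per_experiment:
    # Naive design heuristic: probe each variable in turn at +1, then -1.
    step = len(data_X)
    var, val = step % 3, (-1.0) ** (step // 3)
    y = run_experiment({var: val})
    budget -= cost_per_experiment

    x = np.zeros(3)
    x[var] = val
    data_X.append(x)
    data_y.append(y)

# Fit a simple linear model to the interventional data collected so far.
X, y = np.array(data_X), np.array(data_y)
estimated_effect, *_ = np.linalg.lstsq(X, y, rcond=None)
print("Estimated effects:", np.round(estimated_effect, 2))
print("Remaining budget:", budget)
```

In the actual challenge, the naive probing heuristic above would be replaced by an active-learning or experimental-design criterion that selects the next intervention expected to be most informative per unit of cost.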