This challenge addresses machine learning problems in which labeling data is expensive but large amounts of unlabeled data are available at low cost. Examples include handwriting and speech recognition; document classification (including Internet web pages); vision tasks; and drug design using recombinant molecules or protein engineering. Such problems can be tackled from different angles: learning from unlabeled data or active learning. In the former case, the algorithms must make do with the limited amount of labeled data and capitalize on the unlabeled data using semi-supervised learning methods; several past challenges have addressed this problem. In the latter case, referred to as active learning, the algorithms may place a limited number of queries to obtain labels for new samples, and the goal is to optimize those queries. In most past challenges we organized, we used the same datasets during the development period and the test period. In this challenge we used two sets of datasets, one for development and one for the final test, drawn from embryology, cancer diagnosis, chemoinformatics, handwriting recognition, text ranking, ecology, and marketing.
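To make the active learning setting concrete, the following is a minimal sketch of a pool-based query loop under a fixed label budget. It is an illustration only, not the challenge protocol or any participant's method: the 1-D synthetic pool, the threshold classifier, and the names `oracle`, `fit_threshold`, `active_learning`, and `accuracy` are all assumptions introduced here. The query rule shown is uncertainty sampling (ask for the label of the sample closest to the current decision boundary), which is one common strategy among many.

```python
import random


def oracle(x):
    """Hypothetical labeling oracle: stands in for the costly annotator.
    Assumed ground truth for this toy problem: class 1 iff x > 0.5."""
    return 1 if x > 0.5 else 0


def fit_threshold(labeled):
    """Fit a 1-D threshold classifier: the midpoint between the largest
    labeled negative and the smallest labeled positive."""
    negatives = [x for x, y in labeled if y == 0]
    positives = [x for x, y in labeled if y == 1]
    low = max(negatives) if negatives else 0.0
    high = min(positives) if positives else 1.0
    return (low + high) / 2.0


def active_learning(pool, budget, seed_labeled):
    """Spend at most `budget` label queries on samples from `pool`."""
    labeled = list(seed_labeled)
    seen = {x for x, _ in labeled}
    unlabeled = [x for x in pool if x not in seen]
    for _ in range(budget):
        if not unlabeled:
            break
        threshold = fit_threshold(labeled)
        # Uncertainty sampling: query the sample nearest the boundary.
        query = min(unlabeled, key=lambda x: abs(x - threshold))
        unlabeled.remove(query)
        labeled.append((query, oracle(query)))  # one paid query
    return fit_threshold(labeled), labeled


def accuracy(threshold, points):
    """Fraction of points whose predicted class matches the oracle."""
    hits = sum((1 if x > threshold else 0) == oracle(x) for x in points)
    return hits / len(points)


rng = random.Random(0)
pool = [rng.random() for _ in range(1000)]  # cheap unlabeled samples
seed = [(0.0, 0), (1.0, 1)]                 # small initial labeled set
threshold, labeled = active_learning(pool, budget=10, seed_labeled=seed)
print("labels used:", len(labeled))
print("accuracy on pool:", accuracy(threshold, pool))
```

In this toy case each query roughly halves the interval that can contain the decision boundary, which is why a handful of well-chosen labels goes a long way. Real tasks are rarely this clean, and the best query strategy depends on the data and the model.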