The Synthetic Visual Reasoning Test Challenge

The first deadline has been moved to August 31, 2010.

This competition is part of the PASCAL2 challenge program.


We are pleased to announce a new challenge for machine learning and
computer vision: The Synthetic Visual Reasoning Test (SVRT). One
motivation is to expose some limitations of current methods for
pattern recognition, and thereby to argue for making a larger
investment in other paradigms and strategies, emphasizing the pivotal
role of relationships among parts, complex hidden states and a rich
dependency structure.

This test consists of a series of 23 hand-designed, image-based,
binary classification problems. The images are binary, with a
resolution of 128×128. For each problem we have implemented a
generator in C++, which allows one to produce as many i.i.d. samples
as desired.
A PDF document containing examples of images is available at

The Bayes error rate of each problem is virtually zero, and nearly all
of them can be perfectly solved by humans after seeing fewer than ten
examples from each class. Nonetheless, some of them are probably as
difficult as various “real” problems featured in previous challenges
and widely known data-sets. In particular, solving these synthetic
visual tasks with high accuracy requires “reasoning” about
relationships among shapes and their poses.

Human experiments were conducted in the laboratory of Prof. Steven
Yantis, a cognitive psychologist at Johns Hopkins University; those
results will appear in a future publication. Subjects were asked to
solve the problems, and the number of samples each required to master
each concept was recorded.

SVRT challenge participants who follow the rules described below and
whose results are noteworthy for either their originality or sheer
performance will be invited to co-author a comprehensive, and
hopefully visible, article summarizing the performance of their
methods, including a discussion of the performance of humans (and
possibly monkeys) on the same tasks.


The generators for a randomly-selected subset of 13 problems are made
available to participants. Using these 13 problems as “case studies,”
the challenge is to develop or adapt a learning algorithm that takes
a training set as input and outputs a classifier for labeling a
binary image.

An important performance metric is the number of training examples
required to obtain any given accuracy. Algorithms should be designed
to be trained on sets of varying sizes.
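To make this protocol concrete, here is a minimal sketch of an evaluation loop over varying training-set sizes. The `sample_problem` generator and the trivial majority-vote baseline below are hypothetical placeholders; participants would substitute the actual C++ problem generators and their own learning algorithm.

```python
import random

def sample_problem(n_per_class):
    """Hypothetical stand-in for a problem generator: returns a
    balanced list of (image, label) pairs, where each binary
    128x128 image is packed into a single integer bitfield."""
    data = []
    for label in (0, 1):
        for _ in range(n_per_class):
            image = random.getrandbits(128 * 128)
            data.append((image, label))
    return data

def train_majority(train_set):
    """Trivial baseline 'learning algorithm': predict whichever
    label is most frequent in the training set (label 0 on ties)."""
    counts = {0: 0, 1: 0}
    for _, label in train_set:
        counts[label] += 1
    majority = max(counts, key=counts.get)
    return lambda image: majority

# Learning curve: test error rate as a function of training-set size.
for n_train in (10, 100, 1000):
    classifier = train_majority(sample_problem(n_train))
    test_set = sample_problem(10000)  # 10,000 test samples per class
    errors = sum(classifier(img) != lbl for img, lbl in test_set)
    print(n_train, errors / len(test_set))
```

On these random images the baseline necessarily stays at chance level; the point is only the shape of the loop, which reports one error rate per training-set size.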

Participants have until August 31, 2010, for development, and are
required to make public the results achieved on the 13 problems as
well as the source code required to reproduce these results and to
test the algorithm on other problems.

The source code and test error rates must be sent to the challenge
organizers Francois Fleuret (francois.fleuret(at) and Donald
Geman (geman(at) before midnight EST, August 31, 2010.

The test error rates must be provided in a single text file, with one
line per combination of problem and number of training examples. At
minimum, results are to be provided for exactly 10, 100, and 1000
training examples per class per problem. Participants may also choose
to send results for higher powers of ten. Each line should contain
the problem number, followed by the number of training samples,
followed by ten test error rates estimated on ten different runs,
with 10,000 test samples per class. Numbers should be separated by
commas.
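As an illustration, a results line in the required format could be produced as follows. The error-rate values are made up, and `format_result_line` is just a hypothetical helper, not part of the challenge:

```python
def format_result_line(problem, n_train, error_rates):
    """Build one line of the results file: problem number,
    number of training samples per class, then the ten test
    error rates, all separated by commas."""
    assert len(error_rates) == 10, "ten runs are required"
    fields = [str(problem), str(n_train)] + [f"{e:.4f}" for e in error_rates]
    return ",".join(fields)

# Made-up example: problem 7, 100 training examples per class,
# identical error rates across the ten runs for simplicity.
line = format_result_line(7, 100, [0.1812] * 10)
print(line)
```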

On September 1, 2010, we will publish the ten remaining problems
(i.e., make the generators available). Participants will measure the
performance of their algorithms *with no additional change* on this
new set of problems and send the results by email to the challenge
organizers before midnight EST, September 30, 2010. At that point, we
may use the participants’ code to verify the reported performance.


The source code of the generators can be downloaded from

A PDF document containing ten samples of each class of each problem,
together with the error rate of a baseline classifier trained with
Boosting, is available at


François Fleuret, Idiap Research Institute

Donald Geman, Johns Hopkins University