Introduction
The goal of this challenge is to recognize objects from a number of visual object classes in realistic scenes (i.e. not pre-segmented objects). It is fundamentally a supervised learning problem in that a training set of labelled images is provided. The twenty object classes that have been selected are:
- Person: person
- Animal: bird, cat, cow, dog, horse, sheep
- Vehicle: aeroplane, bicycle, boat, bus, car, motorbike, train
- Indoor: bottle, chair, dining table, potted plant, sofa, tv/monitor
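For convenience, the class names above can be gathered into a single machine-readable list. The following Python sketch is illustrative only: the variable name is not part of the development kit, and the concatenated spellings (e.g. "diningtable") are assumed to match the identifiers used in the annotation files; check the development kit documentation to confirm.

    # Illustrative only: the twenty VOC2009 class names, grouped as in the list above.
    # The variable name is not part of the development kit; the concatenated spellings
    # are assumed to match the identifiers used in the annotation files.
    VOC2009_CLASSES = [
        # Person
        "person",
        # Animal
        "bird", "cat", "cow", "dog", "horse", "sheep",
        # Vehicle
        "aeroplane", "bicycle", "boat", "bus", "car", "motorbike", "train",
        # Indoor
        "bottle", "chair", "diningtable", "pottedplant", "sofa", "tvmonitor",
    ]

    assert len(VOC2009_CLASSES) == 20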
There will be three main competitions (classification, detection, and segmentation) and a single smaller-scale “taster” competition (person layout):
Classification/Detection Competitions
- Classification: For each of the twenty classes, predicting presence/absence of an example of that class in the test image.
- Detection: Predicting the bounding box and label of each object from the twenty target classes in the test image.
Participants may enter either (or both) of these competitions, and can choose to tackle any (or all) of the twenty object classes. The challenge allows for two approaches to each of the competitions:
- Participants may use systems built or trained using any methods or data excluding the provided test sets.
- Systems are to be built or trained using only the provided training/validation data.
The intention in the first case is to establish just what level of success can currently be achieved on these problems and by what method; in the second case the intention is to establish which method is most successful given a specified training set.
Segmentation Competition
- Segmentation: Generating pixel-wise segmentations giving the class of the object visible at each pixel, or “background” otherwise.
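As an informal illustration of what such a segmentation looks like as data, the Python sketch below reads one class-segmentation image. It assumes, as in previous VOC releases, that these are indexed PNGs whose pixel values are class indices (0 for background, 255 for void/border regions); the file path is a placeholder.

    import numpy as np
    from PIL import Image

    # H x W array of per-pixel class indices (0 = background, 255 = void/border),
    # assuming the class-segmentation images are indexed PNGs as in previous VOC
    # releases; the path is a placeholder.
    seg = np.array(Image.open("SegmentationClass/2009_000001.png"))
    print("class indices present:", np.unique(seg))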
Person Layout Taster Competition
- Person Layout: Predicting the bounding box and label of each part of a person (head, hands, feet).
Data
To download the training/validation data, see the development kit.
The training data provided consists of a set of images; each image has an annotation file giving a bounding box and object class label for each object in one of the twenty classes present in the image. Note that multiple objects from multiple classes may be present in the same image. Some example images can be viewed online. A subset of images are also annotated with pixel-wise segmentation of each object present, to support the segmentation competition. Some segmentation examples can be viewed online.
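As an informal illustration of how this annotation can be read outside of the provided MATLAB code, the Python sketch below parses one annotation file using the standard library. It assumes the usual VOC XML layout (one <object> element per object, containing <name> and <bndbox>); the development kit documentation and reader code are authoritative, and the file path here is a placeholder.

    import xml.etree.ElementTree as ET

    # Read object class labels and bounding boxes from one annotation file,
    # assuming the standard VOC XML layout (<object> elements containing <name>
    # and <bndbox>); the path is a placeholder.
    root = ET.parse("Annotations/2009_000001.xml").getroot()
    for obj in root.findall("object"):
        name = obj.find("name").text
        box = obj.find("bndbox")
        xmin, ymin, xmax, ymax = (float(box.find(tag).text)
                                  for tag in ("xmin", "ymin", "xmax", "ymax"))
        print(name, (xmin, ymin, xmax, ymax))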
Annotation was performed according to a set of guidelines distributed to all annotators.
The data will be made available in two stages; in the first stage, a development kit will be released consisting of training and validation data, plus evaluation software (written in MATLAB). One purpose of the validation set is to demonstrate how the evaluation software works ahead of the competition submission.
In the second stage, the test set will be made available for the actual competition. As in the VOC2008 challenge, no ground truth for the test data will be released.
The data has been split into 50% for training/validation and 50% for testing. The distributions of images and objects by class are approximately equal across the training/validation and test sets. In total there are 14,743 images. Further statistics are online.
Example images
Example images and the corresponding annotation for the classification/detection/segmentation tasks and the person layout taster can be viewed online:
- Classification/detection example images
- Segmentation example images
- Person Layout taster example images
Development Kit
The development kit consists of the training/validation data, MATLAB code for reading the annotation data, support files, and example implementations for each competition.
- Download the training/validation data (900MB tar file)
- Download the development kit code and documentation (250KB tar file) (updated 14-Aug-09)
- Download the PDF documentation (200KB PDF)
- Browse the HTML documentation
- View the guidelines used for annotating the database
Test Data
The test data is now available. Note that the only annotation in the data is for the layout taster challenge – disjoint from the main challenge. As in 2008, there are no current plans to release full annotation – evaluation of results will be provided by the organizers.
The test data can now be downloaded from the evaluation server. You can also use the evaluation server to evaluate your method on the test data.
Useful Software
Below is a list of software you may find useful, contributed by participants to previous challenges.
- Discriminatively Trained Deformable Part Models
  Pedro Felzenszwalb, Ross Girshick, David McAllester, Deva Ramanan.
- Color Descriptors
  Koen van de Sande, Theo Gevers, Cees Snoek.
Timetable
- 15 May 2009: Development kit (training and validation data plus evaluation software) made available.
- 15 June 2009: Test set made available.
- 14 September 2009 (extended): Deadline for submission of results. There will be no further extensions.
- 3 October 2009: Workshop in association with ICCV 2009, Kyoto, Japan.
Submission of Results
Participants are expected to submit a single set of results per method employed; participants who have investigated several algorithms may submit one set of results per method. Changes in algorithm parameters do not constitute a different method: all parameter tuning must be conducted using the training and validation data alone.
Results must be submitted using the automated evaluation server.
It is essential that your results files are in the correct format. Details of the required file formats for submitted results can be found in the development kit documentation. The results files should be collected in a single archive file (tar/tgz/tar.gz). An example of the correct format is available:
- Example results file (2.5MB gzipped tar file)
The format of your results must match that specified in the development kit and in the example file, including both file names and directory structure. If you are not entering a particular competition, just omit the corresponding files.
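As an informal illustration only, the Python sketch below writes a single detection results file in the layout used in previous VOC challenges (one detection per line: image identifier, confidence, then bounding box coordinates). The file name, class and detections are placeholders; defer to the development kit documentation and the example archive above for the definitive format.

    # Illustrative sketch only: one detection results file in the layout used in
    # previous VOC challenges, i.e. one detection per line:
    #   <image identifier> <confidence> <xmin> <ymin> <xmax> <ymax>
    # The file name, class and detections below are placeholders; the development
    # kit documentation and the example archive are authoritative.
    detections = [
        ("2009_000001", 0.87, 48.0, 30.0, 320.0, 240.0),
    ]

    with open("comp3_det_test_car.txt", "w") as f:  # placeholder file name
        for image_id, score, xmin, ymin, xmax, ymax in detections:
            f.write(f"{image_id} {score:.6f} {xmin:.1f} {ymin:.1f} {xmax:.1f} {ymax:.1f}\n")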
Participants submitting results for several different methods (noting the definition of different methods above) should produce a separate archive for each method.
In addition to the results files, participants will need to specify:
- contact details and affiliation
- list of contributors
- brief description of the method
If you would like to submit a more detailed description of your method, for example a relevant publication, this can be included in the results archive.
Prizes
The following prizes were announced at the challenge workshop:
Classification
- Winner: NEC/UIUC (NEC Laboratories America; University of Illinois at Urbana-Champaign): Yihong Gong, Fengjun Lv, Jinjun Wang, Chen Wu, Wei Xu, Jianchao Yang, Kai Yu, Xi Zhou, Thomas Huang
- Honourable mentions:
  - UVA/SURREY (University of Amsterdam; University of Surrey): Koen van de Sande, Fei Yan, Atif Tahir, Jasper Uijlings, Mark Barnard, Hongping Cai, Theo Gevers, Arnold Smeulders, Krystian Mikolajczyk, Josef Kittler
  - CVC (Computer Vision Centre Barcelona): Fahad Shahbaz Khan, Joost van de Weijer, Andrew Bagdanov, Noha Elfiky, David Rojas, Marco Pedersoli, Xavier Boix, Pep Gonfaus, Hany salahEldeen, Robert Benavente, Jordi Gonzalez, Maria Vanrell
Detection
- Joint Winners:
  - UoC/TTI Chicago (University of Chicago; Toyota Technological Institute at Chicago): Pedro Felzenszwalb, Ross Girshick, David McAllester
  - Oxford/MSR India (University of Oxford; Microsoft Research India): Andrea Vedaldi, Varun Gulshan, Manik Varma, Andrew Zisserman
Segmentation
- Winner: Bonn (University of Bonn): Joao Carreira, Fuxin Li, Cristian Sminchisescu
- Runner-up: CVC (Computer Vision Center Barcelona): Xavier Boix, Josep Maria Gonfaus, Fahad Khan, Joost van de Weijer, Andrew Bagdanov, Marco Pedersoli, Jordi Gonzalez, Joan Serrat
Publication Policy
The main mechanism for dissemination of the results will be the challenge webpage.
The detailed output of each submitted method will be published online e.g. per-image confidence for the classification task, and bounding boxes for the detection task. The intention is to assist others in the community in carrying out detailed analysis and comparison with their own methods. The published results will not be anonymous – by submitting results, participants are agreeing to have their results shared online.
Citation
If you make use of the VOC2009 data, please cite the following reference (to be prepared after the challenge workshop) in any publications:
@misc{pascal-voc-2009,
    author = "Everingham, M. and Van~Gool, L. and Williams, C. K. I. and Winn, J. and Zisserman, A.",
    title = "The {PASCAL} {V}isual {O}bject {C}lasses {C}hallenge 2009 {(VOC2009)} {R}esults",
    howpublished = "http://www.pascal-network.org/challenges/VOC/voc2009/workshop/index.html"
}
Database Rights
The VOC2009 data includes images obtained from the “flickr” website. Use of these images must respect the corresponding terms of use.
For the purposes of the challenge, the identity of the images in the database, e.g. source and name of owner, has been obscured. Details of the contributor of each image can be found in the annotation to be included in the final release of the data, after completion of the challenge. Any queries about the use or ownership of the data should be addressed to the organizers.
Organizers
- Mark Everingham (University of Leeds), me@comp.leeds.ac.uk
- Luc van Gool (ETHZ, Zurich)
- Chris Williams (University of Edinburgh)
- John Winn (Microsoft Research Cambridge)
- Andrew Zisserman (University of Oxford)
Acknowledgements
We gratefully acknowledge the following, who spent many long hours providing annotation for the VOC2009 database: Jan Hendrik Becker, Patrick Buehler, Kian Ming Adam Chai, Miha Drenik, Chris Engels, Hedi Harzallah, Nicolas Heess, Sam Johnson, Markus Mathias, Alastair Moore, Maria-Elena Nilsback, Patrick Ott, Florian Schroff, Alexander Sorokin, Paul Sturgess, David Tingdahl. We also thank Andrea Vedaldi for additional assistance.