In recent years, hierarchical text classification has been addressed in the machine learning literature, but handling it at large scale (i.e., with several thousand categories) remains an open research issue. Combined with the increasing demand for practical systems of this kind, this suggests a need for a significant push in this research area. This motivated the present PASCAL challenge, which aims to assess models, methods and tools for classification in very large, hierarchically organized category systems. We prepared large datasets for experimentation, based on the ODP Web directory (www.dmoz.org), as well as baseline classifiers based on k-NN and logistic regression. We used two of these datasets for the challenge: a very large one (around 30,000 categories) and a smaller one (around 3,000 categories). The participants were given the chance to dry-run their classification methods on the smaller dataset. They were then asked to train their systems on the training and validation parts of the larger dataset and to provide their classification results on the test part. The participating methods were evaluated on two sides: classification performance and computational performance. Work on this challenge resulted in a new EU project, “BioASQ: A challenge on large-scale biomedical semantic indexing and question answering”, which started on October 1, 2012.