Zulu Challenge

January 15, 2010

PASCAL2

http://labh-curien.univ-st-etienne.fr/zulu/
Supported by Pascal 2 Network of Excellence

Keywords: Active learning, grammatical inference, DFA and grammars

Abstract:

Zulu is an active learning competition. Participants are to build
algorithms that can learn deterministic finite automata (DFA) by making
the smallest number of membership queries to the server/oracle.

Motivations:

When learning language models, techniques usually make use of huge corpora
that are unavailable in many less resourced languages (such as the Zulu
language). One possible way around this problem is to interrogate an
expert with a number of chosen queries, in an interactive mode, until a
satisfying language model is reached. In this case, an important indicator
of success is the amount of energy the expert has spent in order for
learning to be successful. A nice learning paradigm covering this
situation is that of Query Learning, introduced by Dana Angluin.

In the field of Grammatical Inference, Query Learning was thoroughly
investigated to learn deterministic finite automata (DFA). As negative
results, it was proved that DFA could not be learned from just a
polynomial number of membership queries nor from just a polynomial number
of strong equivalence queries. On the other hand, algorithm L* designed by
Angluin, was proved to learn DFA from a polynomial number of both
membership and equivalence queries. These results yield several
successfull applications in Robotics, Games and Agents Technologies,
Information Retrieval, Hardware and Software Verification.

However, what has not been hardly studied is how to optimise the learning
task by trying to minimize the number of queries while making queries for
which the Oracle’s work and answers are simple. These are strong
motivations for stemming research in the direction of developing new
interactive learning strategies and algorithms, that is the aim of this
competition.

The competition:

Zulu (http://labh-curien.univ-st-etienne.fr/zulu/) is both a web based
platform simulating an Oracle in a DFA learning task and a competition.

As a web platform, Zulu allows users to generate tasks, to interact with
the Oracle in learning sessions and to record the results of the users. It
provides the users with a baseline algorithm written in JAVA, or the
elements allowing to build from scratch a new learning algorithm capable
of interacting with the server.

The server can be accessed by any user/learner who has opened an account.
The server acts as an Oracle for membership queries. A player can log in
and ask for a target DFA. The server then computes how many queries it
needs to learn a reasonable machine (reasonable means less than 30%
classification errors), and invites the player to interact in a learning
session in which he can ask up to that number of queries.

At the end of the learning process the server gives the learner a set of
unlabelled strings (a test set). The labels the learner submits are used
to compute his score.

As a starting point the baseline algorithm, which is a simple variation of
L*, with some sampling done to simulate equivalence queries, is given to
the user, who can therefore play with some simple JAVA code for a start.

The competition itself will be held in the spring of 2010 and the results
will be presented during a special session at the International Colloquium
on Grammatical Inference in Valencia, Spain, September 13-16, 2010
(http://users.dsic.upv.es/workshops/icgi2010/)

Schedule:

* from now to March 1st, 2010: Zulu platform is open, anyone may
register and have fun
* March 1st: Official beginning of the competition
* May 15th: Deadline for scoring, submissions closed
* June 1st: Notifications of the results
* June 20th: Deadline for submission of abstracts explaining
participants strategies
* September 13-16th: workshop at ICGI

Prizes and publications:
The winner of the Zulu competition will receive a prize, to be announced
on the Zulu webpage. Participants are encouraged to present their
innovations either as full papers to the ICGI 2010 conference, or as
extended abstracts to the Zulu workshop that will be organised during
ICGI. A journal special issue will also be considered.

Scientific committee:
* Dana Angluin, Yale University, USA
* Leo Becerra Bonache, Universidad de Tarragona, Spain
* François Coste, IRISA, Rennes, France
* Alex Clark, Royal Holloway University of London, UK
* Ricard Gavalda, Universidad Politecnica de Barcelona, Spain
* Colin de la Higuera, University of Nantes, France
* Jean-Christophe Janodet, University of Lyon, France
* Aurelien Lemay, University of Lille, France
* Laurent Miclet, ENSAT Lannion and IRISA, France
* Tim Oates, University of Maryland, USA
* Anssi Yli Jyra, University of Helsinki, Finland
* Menno van Zaanen, Tilburg University, The Netherlands