Finite state automata (or machines) are well-known models for characterizing the behaviour of systems or processes. They have been used for several decades in computer and software engineering to model the complex behaviours of electronic circuits and of software such as communication protocols. In their probabilistic form, they are equivalent to Hidden Markov Models, which are used in a number of applications. The state of the art in learning either type of model from strings is unclear, as there has never been a challenge or even a benchmark on which learning algorithms have been compared.

The goal of PAutomaC is to provide an overview of which probabilistic automaton learning technique works best in which setting, and to stimulate the development of new techniques for learning distributions over strings. Such an overview will be very helpful for practitioners of automata learning and will provide direction for future theoretical work and algorithm development. PAutomaC will provide the first elaborate test suite for learning string distributions. The task is of interest to:

  • Grammatical inference theoreticians who want to find out how good their ideas and algorithms really are;
  • Pattern recognition practitioners who have developed finely tuned EM-inspired techniques to estimate the parameters of HMMs or related models;
  • Statistical modelling experts who have to deal with strings or sequences.