**Summary**

Efficient approximate inference in large Hybrid Networks (graphical models with discrete and continuous variables) is one of the major unsolved problems in machine learning, and insight into good solutions would be beneficial in advancing the application of sophisticated machine learning to a wide range of real-world problems.

Such research could benefit applications in Speech Recognition, Visual Object Tracking and Machine Vision, Robotics, Music Scene Analysis, analysis of complex time series, understanding and modelling of complex computer networks, condition monitoring, and the study of other complex phenomena.

This **theory challenge** specifically addresses a central component area of PASCAL, namely Bayesian Statistics and statistical modelling, and is also related to the other central areas of Computational Learning, Statistical Physics and Optimisation techniques.

The **aim of this challenge** is to bring together leading researchers in graphical models and related areas to develop and improve on existing methods for tackling the fundamental intractability in HNs. We do not believe that a single best approach will necessarily emerge, although we would expect successes in one application area to be transferable to related areas. Many leading machine learning researchers are currently working on applications that involve HNs, and we invite participants to suggest their own applications. Ideally, this would be in the form of a dataset along the lines of PASCAL.

**Graphical Models**

**Hybrid Networks**

Hybrid Networks (HNs) are Graphical Models with both **continuous and discrete variables**. For example, imagine that a musical instrument is played at a time t, which we can model with a switch variable s(t)=1: future sound generation can be modelled as a hidden Gaussian linear dynamical system with transition dynamics p(h(t+1)|h(t),s(t)=1). When the instrument is turned off at a later time t, s(t)=0, and a different dynamics occurs, p(h(t+1)|h(t),s(t)=0). The observation (visible) process p(v(t)|h(t)) is typically a noisy projection of the hidden state h(t), say in the case of sound, to a one-dimensional pressure displacement. Based on the observed sequence v(1),…,v(T), we may wish to infer the switch variables s(1),…,s(T). This particular kind of HN is called a switching Kalman Filter.
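The generative process described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the model used for the challenge data: the function name `simulate_switching_lds`, the helper `damped_rotation`, the two-dimensional hidden state, all numerical parameter values, and the Markov prior `trans` on the switch variable are hypothetical choices made for the example.

```python
import numpy as np


def simulate_switching_lds(T, A, Q, C, r, trans, seed=0):
    """Sample s(1..T), h(1..T), v(1..T) from a switching linear dynamical system.

    A[k], Q[k]: transition matrix and noise covariance used when s(t) = k.
    C: projection from the hidden state to a scalar observation.
    r: observation noise variance.
    trans[i, j]: assumed Markov switch prior p(s(t+1)=j | s(t)=i).
    """
    rng = np.random.default_rng(seed)
    n_switch = len(A)
    H = A[0].shape[0]
    s = np.zeros(T, dtype=int)
    h = np.zeros((T, H))
    v = np.zeros(T)

    # Initial switch state, hidden state and observation (arbitrary choices).
    s[0] = rng.integers(n_switch)
    h[0] = rng.normal(size=H)
    v[0] = C @ h[0] + np.sqrt(r) * rng.normal()

    for t in range(T - 1):
        k = s[t]
        # p(h(t+1) | h(t), s(t)=k): linear-Gaussian, parameters chosen by the switch.
        h[t + 1] = A[k] @ h[t] + rng.multivariate_normal(np.zeros(H), Q[k])
        # Assumed switch dynamics p(s(t+1) | s(t)).
        s[t + 1] = rng.choice(n_switch, p=trans[k])
        # p(v(t+1) | h(t+1)): noisy one-dimensional projection of the hidden state.
        v[t + 1] = C @ h[t + 1] + np.sqrt(r) * rng.normal()
    return s, h, v


def damped_rotation(omega, damping):
    """2x2 rotation by angle omega, scaled by a damping factor."""
    c, sn = np.cos(omega), np.sin(omega)
    return damping * np.array([[c, -sn], [sn, c]])


# Index 0: instrument off (fast decay, little excitation).
# Index 1: instrument on (sustained oscillation, more excitation).
A = [damped_rotation(0.3, 0.80), damped_rotation(0.3, 0.999)]
Q = [1e-4 * np.eye(2), 1e-2 * np.eye(2)]
C = np.array([1.0, 0.0])
trans = np.array([[0.98, 0.02], [0.02, 0.98]])

s, h, v = simulate_switching_lds(200, A, Q, C, r=1e-3, trans=trans)
```

Sampling from the model is cheap. The hard direction is the reverse one: recovering `s` from `v` alone.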

**The Challenge**

The **fundamental difficulty** with Hybrid Networks is their intractability: unfortunately, the marriage of two tractable models, the Kalman Filter and the Hidden Markov Model, does not result in a tractable Hybrid Network. Recently, developing approximate inference and learning methods in HNs has been a major research activity across a large range of fields, including speech, music transcription, condition monitoring, control, robotics, and computer-brain interfaces.
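To see where the intractability comes from, consider the filtered posterior over the hidden state in a switching Kalman Filter. The following is a sketch under assumptions not stated in the text above: S denotes the number of switch states, the initial hidden state is Gaussian, and v(1:t) is shorthand for v(1),…,v(t).

```latex
p\bigl(h(t) \mid v(1{:}t)\bigr)
  = \sum_{s(1),\dots,s(t)}
    p\bigl(s(1{:}t) \mid v(1{:}t)\bigr)\,
    p\bigl(h(t) \mid s(1{:}t),\, v(1{:}t)\bigr)
```

For any fixed switch sequence the model is linear-Gaussian, so each term p(h(t)|s(1:t),v(1:t)) is a single Gaussian that a Kalman Filter can compute. The sum, however, runs over S^t switch sequences, so the exact posterior is in general a Gaussian mixture whose number of components grows exponentially with t. Approximate schemes therefore have to limit or summarise these components in some way.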

As a **concrete problem**, we will make available a dataset of acoustic recordings, for example of a live piano recording. The challenge would be to infer what notes were played and when, that is, to perform a transcription of the performance. We will know the ground truth (since we generated the data), against which we can compare competing solutions. Music transcription is a difficult, largely unsolved problem, although initial attempts using HNs have demonstrated the effectiveness of the HN solution. However, developing faster approximation schemes in this area would overcome the current barrier to commercialisation of such techniques.