This supplement issue consists of 10 peer-reviewed papers and one review article based on the NIPS Workshop on New Problems and Methods in Computational Biology held at Whistler, Canada on December 18th, 2004. This workshop is designed to bring together machine learning and computational biology researchers to develop fundamentally new methods for analyzing biological data.
We received submissions from both workshop presenters and non-presenters. Submitted manuscripts were rigorously reviewed by at least two referees. Each paper was evaluated on its contribution to biology as well as its novelty as a machine learning method. Since the NIPS conference is a leading machine learning conference, we required technical novelty and mathematical rigor in methodology.
We would like to thank the workshop presenters and participants who made this special issue possible. Special thanks go to the editors of BMC Bioinformatics who advised us in preparing the manuscripts. Finally, we acknowledge the financial support of PASCAL (Pattern Analysis, Statistical Modelling and Computational Learning), a newly launched European Network of Excellence (NoE).

Program Committee

  • Pierre Baldi, UC Irvine
  • Kristin Bennett, Rensselaer Polytechnic Institute
  • Nello Cristianini, UC Davis
  • Eleazar Eskin, UC San Diego
  • Nir Friedman, Hebrew University and Harvard
  • Dan Geiger, The Technion
  • Michael I. Jordan, UC Berkeley
  • Alexander Hartemink, Duke University
  • Klaus-Robert Müller, Fraunhofer FIRST
  • William Stafford Noble, University of Washington
  • Bernhard Schölkopf, Max Planck Institute for Biological Cybernetics
  • Alexander Schliep, Max Planck Institute for Molecular Genetics
  • Eran Segal, Stanford University
  • Jean-Philippe Vert, Ecole des Mines de Paris

It's been over 30 years since the foundations of sample-complexity-based learning theory were laid, and now seems a good time to assess the program. Has this branch of learning theory been useful?

The purpose of this workshop is not merely progress assessment. The sample complexity bounds community has internal disagreements about what is (and is not) a useful bound, what is (and is not) a tight bound, how (and where) bounds might reasonably be used, and which bounds-related questions should be answered. One goal of this workshop is to debate the merits of these different issues in order to foster better understanding internally as well as externally.

It is not the purpose of the workshop to converge on the one right way to assess sample complexity, learning performance, etc.; rather, we seek to understand the relative merits of diverse approaches and how they relate, recognising that it is very unlikely there is one true and best solution.

The workshop is generally focused on answers to the above questions. Some specific topics include:

  1. Quantitatively tight bounds. (What are they, how are they useful, etc.)
  2. Position statements and arguments about what bounds should deliver.
  3. Bounds for clustering and other "non-standard" learning problems
  4. The relationship between bounds and algorithms
  5. When are bounds useless?
  6. Issues in bound use (computational and informational complexities)
  7. What quantities should bounds depend on? (a priori knowledge of the task? Unlabeled training data? All training data?)

Organizers

The theoretical analysis of systems that learn from data has been an important topic of study in statistics, machine learning, and information theory. In all these paradigms, distinct methods have been developed to deal with inference when the models under consideration can be arbitrarily large. Recently, there has been a fruitful cross-fertilization of ideas and proof techniques. To give but one example, very recently, minimax optimal convergence rates of the information-theoretic MDL method were proved using ideas from the (computational) PAC-Bayesian paradigm and (statistical) empirical process techniques. The goal of this workshop is to bring together leading theoreticians to allow them to debate, compare and cross-fertilise ideas from these distinct inductive principles. At the workshop, we will establish a PASCAL special interest group for 'merging computational and information-theoretic learning with statistics'.

These pages record the Sheffield Machine Learning Workshop held at the Marriott Hotel Sheffield in September 2004. Thank you to all the invited speakers and attendees who made the workshop such a success. The sponsors for this workshop are:

AMI (Augmented Multiparty Interaction, http://www.amiproject.org) is a newly launched (January 2004) European Integrated Project (IP) funded under Framework FP6 as part of its IST program. AMI targets computer-enhanced multi-modal interaction in the context of meetings. The project aims at substantially advancing the state of the art in important underpinning technologies (such as human-human communication modeling, speech recognition, computer vision, multimedia indexing and retrieval). It will also produce tools for off-line and on-line browsing of multi-modal meeting data, including meeting structure analysis and summarizing functions. The project also makes recorded and annotated multimodal meeting data widely available for the European research community, thereby contributing to the research infrastructure in the field.

PASCAL (Pattern Analysis, Statistical Modelling and Computational Learning, http://www.pascal-network.org) is a newly launched (December 2003) European Network of Excellence (NoE) as part of its IST program. The NoE brings together experts from basic research areas such as Statistics, Optimisation and Computational Learning and from a number of application areas, with the objective of integrating research agendas and improving the state of the art in all concerned fields.

IM2 (Interactive Multimodal Information Management, http://www.im2.ch) is a Swiss National Center of Competence in Research (NCCR) aiming at the advancement of research, and the development of prototypes, in the field of man-machine interaction. IM2 is particularly concerned with technologies coordinating natural input modes (such as speech, image, pen, touch, hand gestures, head and/or body movements, and even physiological sensors) with multimedia system outputs, such as speech, sounds, images, 3D graphics and animation. Among other applications, IM2 is also targeting research and development in the context of smart meeting rooms.

M4 (Multi-Modal Meeting Manager, http://www.m4project.org) is an EU IST project launched in March 2002 concerned with the construction of a demonstration system to enable structuring, browsing and querying of an archive of automatically analysed meetings. The archived meetings will have taken place in a room equipped with multimodal sensors.

Given the multiple links between AMI, PASCAL, IM2 and M4, it was decided to organize a joint workshop in order to bring together researchers from the different communities around the common theme of advanced machine learning algorithms for processing and structuring multimodal human interaction in meetings.

 

The aim of the meeting is to encourage a closer interaction between the computer vision community and the machine learning and statistical pattern recognition communities.

Overview

How can we make computers interact more intelligently with us? Does the field of Human/Computer Interface (HCI) suggest challenging new problems for machine learning? This workshop will address these and other related questions. We will focus discussion on five topics in HCI which have the greatest connection to machine learning (shown below). The goal of the workshop is to cross-fertilize HCI with machine learning by fostering discussion between researchers in the two fields.

Topics

  1. User modeling and personalization --- making predictive models of human state and preferences, in order to serve them better.
  2. Multimodal and perceptual user interfaces --- giving computers one or more senses, to make interaction with people more natural.
  3. Computerized support for meetings --- meeting capture and retrieval, to make meetings more effective.
  4. Direct brain-computer interface --- input to computers directly from the brain.
  5. Intelligent dialog systems --- systems that can engage in conversations with people.

Invited Speakers

Organizers