Physical and economic limitations have forced computer architecture towards parallelism and away from exponential frequency scaling. Meanwhile, increased access to ubiquitous sensing and the web has resulted in an explosion in the size of machine learning tasks. In order to benefit from current and future trends in processor technology we must discover, understand, and exploit the available parallelism in machine learning. This workshop will achieve four key goals:

  • Bring together people with varying approaches to parallelism in machine learning to identify tools, techniques, and algorithmic ideas which have led to successful parallel learning.
  • Invite researchers from related fields, including parallel algorithms, computer architecture, scientific computing, and distributed systems, who will provide new perspectives to the NIPS community on these problems, and may also benefit from future collaborations with the NIPS audience.
  • Identify the next key challenges and opportunities for parallel learning.
  • Discuss large-scale applications, e.g., those with real-time demands, that might benefit from parallel learning.

Prior NIPS workshops have focused on the topic of scaling machine learning, which remains an important developing area. We introduce a new perspective by focusing on how large-scale machine learning algorithms should be informed by future parallel architectures.

Topics of Interest

While we are interested in a wide range of topics associated with large-scale, parallel learning, the following list provides a flavor of some of the key topics:

  • Multicore / Cluster-Based Learning Techniques
  • Machine Learning on Alternative Hardware (GPUs, Cell Processors, FPGAs, iPhone, ...)
  • Distributed Learning
  • Learning Results and Techniques on Massive Datasets
  • Large-Scale Kernel Methods
  • Fast Online Algorithms for Large Datasets
  • Parallel Computing Tools and Libraries

Organizers

Now is the time to revisit some of the fundamental grammar/language learning tasks, such as grammar acquisition, language acquisition, language change, and the general problem of automatically inferring generic representations of language structure in a data-driven manner.

Though the underlying problems have long been known to be computationally intractable for the standard representations of the Chomsky hierarchy, such as regular grammars and context-free grammars, progress has been made by modifying or restricting these classes to make them more observable. Generalisations of distributional learning have shown promise in unsupervised learning of linguistic structure using tree-based representations, or using non-parametric approaches to inference. More radically, significant advances in this domain have been made by switching to different representations, such as the work of Clark, Eyraud & Habrard (2008), which addresses the issue of language acquisition but has the potential to cross-fertilise a wide range of problems requiring data-driven representations of language. Such approaches are starting to make inroads into one of the fundamental problems of cognitive science: that of learning complex representations that encode meaning. This adds a further motivation for returning to this topic at this point.

Grammar induction was the subject of intense study in the early days of Computational Learning Theory, with the theory of query learning largely developing out of this research. More recently, new methods of representing language and grammars through complex kernels and probabilistic modelling, together with algorithms such as structured output learning, have enabled machine learning methods to be applied successfully to a range of language-related tasks, from simple topic classification through part-of-speech tagging to statistical machine translation. These methods typically rely on more fluid structures than those derived from formal grammars, and yet are able to compete favourably with classical grammatical approaches that require significant input from domain experts, often in the form of annotated data.

Organisers

During the last decade, many areas of Bayesian machine learning have reached a high level of maturity. This has resulted in a variety of theoretically sound and efficient algorithms for learning and inference in the presence of uncertainty. However, in the context of control, robotics, and reinforcement learning, uncertainty has not yet been treated with comparable rigor despite its central role in risk-sensitive control, sensorimotor control, robust control, and cautious control. A consistent treatment of uncertainty is also essential when dealing with stochastic policies, incomplete state information, and exploration strategies.

A typical situation where uncertainty comes into play is when the exact state-transition dynamics are unknown and only limited or no expert knowledge is available or affordable. One option is to learn a model from data. However, if the model is too far off, this approach can result in arbitrarily bad solutions. This model bias can be sidestepped by using flexible model-free methods. The disadvantage of model-free methods is that they do not generalize and often make less efficient use of data; they therefore often need more trials than are feasible on a real-world system. A probabilistic model can make efficient use of data while alleviating model bias by explicitly representing and incorporating uncertainty.

The use of probabilistic approaches requires (approximate) inference algorithms, where Bayesian machine learning can come into play. Although probabilistic modeling and inference conceptually fit into this context, they are not yet widespread in robotics, control, and reinforcement learning. Hence, this workshop aims to bring researchers together to discuss the need for, the theoretical properties of, and the practical implications of probabilistic methods in control, robotics, and reinforcement learning.

One particular focus will be on probabilistic reinforcement learning approaches that profit from recent developments in optimal control, which show that the problem can be substantially simplified if certain structure is imposed. The simplifications include linearity of the (Hamilton-Jacobi) Bellman equation. The duality with Bayesian estimation allows for analytical computation of the optimal control laws and closed-form expressions for the optimal value functions.
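As a sketch of the simplification alluded to above, the discrete-time linearly-solvable formulation makes the linearity explicit; the notation below is our own illustrative choice, not the workshop's:

```latex
% Linearly-solvable control (discrete-time sketch, notation illustrative).
% With state cost $q(x)$, passive dynamics $p(x' \mid x)$, and value function
% $v(x)$, the desirability $z(x) = \exp(-v(x))$ turns the Bellman equation
% into a *linear* equation in $z$:
z(x) \;=\; \exp\bigl(-q(x)\bigr)\,\sum_{x'} p(x' \mid x)\, z(x'),
% and the optimal controlled dynamics follow in closed form, with a
% Bayes-rule-like structure that underlies the estimation duality:
u^*(x' \mid x) \;=\; \frac{p(x' \mid x)\, z(x')}{\sum_{x''} p(x'' \mid x)\, z(x'')}.
```

Solving for z then reduces to a linear (eigenvalue-type) problem rather than a nonlinear dynamic program.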

Organizers

  • Marc Peter Deisenroth
  • Bert Kappen
  • Emanuel Todorov
  • Duy Nguyen-Tuong
  • Carl Edward Rasmussen
  • Jan Peters

Clustering is one of the most widely used techniques for exploratory data analysis. In the past five decades, many clustering algorithms have been developed and applied to a wide range of practical problems. There has also been very exciting theoretical work, proving guarantees for algorithms and developing new frameworks for analysis.

Yet in many ways we are only beginning to understand some of the most basic issues in clustering. While there have been some remarkable successes, we believe more is possible. In particular, work addressing issues that are independent of any specific clustering algorithm, objective function, or specific data generative model, is still in its infancy.

In his famous Turing Award lecture, Donald Knuth said of computer programming: "It is clearly an art, but many feel that a science is possible and desirable." In the case of clustering, we believe that an even better and deeper science than what we currently offer is possible and highly desirable.

Goals of the Workshop

This workshop aims to initiate a dialogue between theoreticians and practitioners and to bridge the theory-practice gap in this area. The workshop will be built around three main questions:

  1. FROM THEORY TO PRACTICE:
    Which abstract theoretical characterizations / properties / statements about clustering algorithms exist that can be helpful for practitioners and should be adopted in practice?
  2. FROM PRACTICE TO THEORY:
    What concrete questions would practitioners like to see addressed by theoreticians? Can we identify de-facto practices in clustering in need of theoretical grounding? Which obscure (but seemingly needed or useful) practices are in need of rationalization?
  3. FROM ART TO SCIENCE:
    In contrast to supervised learning, where there is general consensus on how to assess the quality of an algorithm, the frameworks for analyzing clustering are only beginning to be developed and clustering is still largely an art. How can we progress towards a deeper understanding of the space of clustering problems and objectives, including the introduction of falsifiable hypotheses and properly designed experimentation? How could one set up a clustering challenge to compare different clustering algorithms? What could be scientific standards to evaluate a clustering algorithm in a paper?

The workshop will also serve as a follow-up meeting to the NIPS 2005 “Theoretical Foundations of Clustering” workshop, and as a venue for the different research groups working on these issues to take stock, exchange viewpoints, and discuss the next challenges in this ambitious quest for theoretical foundations of clustering.

Organizers

  • Shai Ben-David is a CS professor at the University of Waterloo, Canada.
  • Avrim Blum is a professor of CS at Carnegie Mellon University.
  • Ulrike von Luxburg is a Senior Research Scientist at the Max Planck Institute in Tübingen, Germany.
  • Isabelle Guyon is an independent engineering consultant, working from California.
  • Reza Bosagh Zadeh is a graduate student at Carnegie Mellon University.
  • Margareta Ackerman is a graduate student at the University of Waterloo.
  • Robert C. Williamson is the Scientific Director of NICTA and a Professor in the Research School of Information Sciences and Engineering at the Australian National University.

Statistical topic models are a class of Bayesian latent variable models, originally developed for analyzing the semantic content of large document corpora. With the increasing availability of other large, heterogeneous data collections, topic models have been adapted to model data from fields as diverse as computer vision, finance, bioinformatics, cognitive science, music, and the social sciences. While the underlying models are often extremely similar, these communities use topic models in different ways in order to achieve different goals. This one-day workshop will bring together topic modeling researchers from multiple disciplines, providing an opportunity for attendees to meet, present their work and share ideas, as well as inform the wider NIPS community about current research in topic modeling. This workshop will address the following specific goals:
  • Identify and formalize open research areas
  • Propose, explore, and discuss new application areas
  • Discuss how best to facilitate transfer of research ideas between application domains
  • Direct future work and generate new application areas
  • Explore novel modeling approaches and collaborative research directions

Program Committee

  • Edo Airoldi
  • Hal Daumé
  • Tom Dietterich
  • Laura Dietz
  • Jacob Eisenstein
  • Tom Griffiths
  • John Lafferty
  • Li-Jia Li
  • Andrew McCallum
  • David Mimno
  • Dave Newman
  • Padhraic Smyth
  • Erik Sudderth
  • Yee Whye Teh
  • Chong Wang
  • Max Welling
  • Sinead Williamson
  • Frank Wood
  • Jerry Zhu

Organizers

The field of computational biology has seen dramatic growth over the past few years. A wide range of high-throughput technologies developed in the last decade now enable us to measure parts of a biological system at various resolutions—at the genome, epigenome, transcriptome, and proteome levels. These technologies are now being used to collect data for an increasingly diverse set of problems, ranging from classical problems such as predicting differentially regulated genes between time points and predicting subcellular localization of RNA and proteins, to models that explore complex mechanistic hypotheses bridging the gap between genetics and disease, population genetics and transcriptional regulation. Fully realizing the scientific and clinical potential of these data requires developing novel supervised and unsupervised learning methods that are scalable, can accommodate heterogeneity, are robust to systematic noise and confounding factors, and provide mechanistic insights.

The goals of this workshop are to i) present emerging problems and innovative machine learning techniques in computational biology, and ii) generate discussion on how best to model the intricacies of biological data and synthesize and interpret results in light of the current work in the field. We will invite several rising leaders from the biology/bioinformatics community who will present current research problems in computational biology and lead these discussions based on their own research and experiences. We will also have the usual rigorous screening of contributed talks on novel learning approaches in computational biology. We encourage contributions describing either progress on new bioinformatics problems or work on established problems using methods that are substantially different from established alternatives. Kernel methods, graphical models, feature selection, non-parametric models, and other techniques applied to relevant bioinformatics problems would all be appropriate for the workshop. We are particularly keen to consider contributions related to predicting function from genotype, as well as contributions targeting data generated by novel technologies such as gene editing and single-cell genomics, though we will consider all submissions that highlight applications of machine learning in computational biology. The target audience is anyone with an interest in learning and its applications to relevant problems from the life sciences, including NIPS participants without any existing research link to computational biology.

Organizers

Program Committee

  • Alexis Battle, JHU
  • Michael A. Beer, JHU
  • Andreas Beyer, TU Dresden
  • Karsten Borgwardt, ETH Zurich
  • Gal Chechik, Gonda brain center, Bar Ilan University
  • Chao Cheng, Dartmouth Medical School
  • Manfred Claassen, ETH Zurich
  • Florence d'Alche-Buc, Université d'Evry-Val d'Essonne, Genopole
  • Saso Dzeroski, Jozef Stefan Institute
  • Jason Ernst, UCLA
  • Pierre Geurts, University of Liège
  • James Hensman, The University of Sheffield
  • Antti Honkela, University of Helsinki
  • Laurent Jacob, Mines Paris Tech
  • Samuel Kaski, Aalto University
  • Seyoung Kim, CMU
  • David Knowles, Stanford
  • Anshul Kundaje, Stanford
  • Neil Lawrence, University of Sheffield
  • Su-In Lee, University of Washington
  • Shen Li, Mount Sinai, New York
  • Michal Linial, Hebrew University
  • John Marioni, EMBL-EBI
  • Martin Renqiang Min, NEC Labs America
  • Yves Moreau, KU Leuven
  • Alan Moses, University of Toronto
  • Bernard Ng, UBC
  • William Noble, University of Washington
  • Uwe Ohler, MDC Berlin & Humboldt University
  • Yongjin Park, MIT
  • Leopold Parts, University of Toronto
  • Dana Pe'er, Columbia University
  • Nico Pfeifer, Max Planck Institute
  • Magnus Rattray, University of Manchester
  • Simon Rogers, University of Glasgow
  • Juho Rousu, Aalto University
  • Guido Sanguinetti, University of Edinburgh
  • Alexander Schliep, Rutgers University
  • Jean-Philippe Vert, Ecole des Mines de Paris
  • Jinbo Xu, Toyota Technological Institute of Chicago
  • Chun (Jimmie) Ye, UCSF

Undirected graphical models provide a powerful framework for representing dependency structure between random variables. Learning the parameters of undirected models plays a crucial role in solving key problems in many machine learning applications, including natural language processing, visual object recognition, speech perception, information retrieval, computational biology, and many others.

Learning in undirected graphical models of large treewidth is difficult because of the hard inference problem induced by the partition function for maximum-likelihood learning, or by finding the MAP assignment for margin-based loss functions. Over the last decade, there has been considerable progress in developing algorithms for approximating the partition function and MAP assignment, both via variational approaches (e.g., belief propagation) and sampling algorithms (e.g., MCMC). More recently, researchers have begun to apply these methods to learning large, densely connected undirected graphical models that may contain millions of parameters. A notable example is the learning of Deep Belief Networks and Deep Boltzmann Machines, which employ an MCMC strategy to greedily learn deep hierarchical models.
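To make the MCMC-based learning strategy concrete, here is a minimal sketch of one contrastive-divergence (CD-1) update for a binary Restricted Boltzmann Machine. All names, shapes, and the learning rate are illustrative assumptions, not a reference implementation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(W, b, c, v0, rng, lr=0.1):
    """One CD-1 update for a binary RBM with weights W (visible x hidden),
    visible bias b, and hidden bias c, given a batch of visible data v0."""
    # Positive phase: hidden probabilities and a sample, given the data.
    ph0 = sigmoid(v0 @ W + c)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: one step of block Gibbs sampling from the model.
    pv1 = sigmoid(h0 @ W.T + b)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(v1 @ W + c)
    # Approximate log-likelihood gradient: data statistics minus model statistics.
    n = v0.shape[0]
    W += lr * (v0.T @ ph0 - v1.T @ ph1) / n
    b += lr * (v0 - v1).mean(axis=0)
    c += lr * (ph0 - ph1).mean(axis=0)
    return W, b, c
```

The single Gibbs step replaces the intractable expectation under the model distribution; running the chain longer (CD-k, or persistent chains) trades computation for a better gradient estimate.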

The goal of this workshop is to assess the current state of the field and explore new directions in both theoretical foundations and empirical applications. In particular, we shall be interested in discussing the following topics:

  • State of the field: What are the existing methods and what is the relationship between them? Which problems can be solved using existing algorithms and which cannot?
  • The use of approximate inference in learning: There are many algorithms for approximate inference. In principle, all of these can be "plugged into" learning algorithms. What are the relative merits of using one approximation vs. another (e.g., an MCMC approximation vs. a variational one)? Are there effective combined strategies?
  • Learning with latent variables: Graphical models with latent (or hidden) variables often possess more expressive power than models with only observed variables. However, introducing hidden variables makes learning far more difficult. Can we develop better optimization and approximation techniques that would allow us to learn parameters in such models more efficiently?
  • Learning in models with deep architectures: Recently, there has been notable progress in learning deep probabilistic models, including Deep Belief Networks and Deep Boltzmann Machines, that contain many layers of hidden variables and millions of parameters. The success of these models relies heavily on the greedy layer-by-layer unsupervised learning of a densely connected undirected model called a Restricted Boltzmann Machine (RBM). Can we develop efficient and more accurate learning algorithms for RBMs and deep multilayer generative models? How can learning be extended to the semi-supervised setting and made more robust to highly ambiguous or missing inputs? What sort of theoretical guarantees can be obtained for such greedy learning schemes?
  • Scalability and success in real-world applications: How well do existing approximate learning algorithms scale to large-scale problems including problems in computer vision, bioinformatics, natural language processing, information retrieval? How well do these algorithms perform when applied to modeling high-dimensional real-world distributions (e.g. the distribution of natural images)?
  • Theoretical foundations: What are the theoretical guarantees of the learning algorithms (e.g., accuracy of the learned parameters relative to the best possible, or asymptotic convergence guarantees such as almost-sure convergence to the maximum-likelihood estimator)? What are the tradeoffs between running time and accuracy?
  • Loss functions: In the supervised learning setting, two popular loss functions are log-loss (e.g., in conditional random fields) and margin-based-loss (e.g., in maximum margin Markov networks). In intractable models these approaches result in rather different approximation schemes (since the former requires partition function estimation, whereas the latter only requires MAP estimates). What can be said about the differences between these schemes? When is one model more appropriate than the other? Can margin-based models be applied in the unsupervised case?
  • Structure vs. accuracy: Which graph structures are more amenable to approximations and why? How can structure learning be combined with approximate learning to yield models that are both descriptive and learnable with good accuracy?

Organizers

Over the past decade, brain connectivity has become a central theme in the neuroimaging community. At the same time, causal inference has recently emerged as a major research topic in machine learning. Even though the two research questions are closely related, interactions between the neuroimaging and machine-learning communities have been limited.

The aim of this workshop is to initiate productive interactions between neuroimaging and machine learning by introducing the workshop audience to the different concepts of connectivity/causal inference employed in each of the communities. Special emphasis is placed on discussing commonalities as well as distinctions between various approaches in the context of neuroimaging. Due to the increasing relevance of brain connectivity for analyzing mental states, we also highly welcome contributions discussing applications of brain connectivity measures to real-world problems such as brain-computer interfacing or mental state monitoring.

Topics

We solicit contributions on new approaches to connectivity and/or causal inference for neuroimaging data as well as on applications of connectivity inference to real-world problems. Contributions might address, but are not limited to, the following topics:

  • Effective connectivity & causal inference
    • Dynamic causal modelling
    • Granger causality
    • Structural equation models
    • Causal Bayesian networks
    • Non-Gaussian linear causal models
    • Causal additive noise models
  • Functional connectivity
    • Canonical correlation analysis
    • Phase-locking
    • Imaginary coherence
    • Independent component analysis
  • Applications of brain connectivity to real-world problems
    • Brain-computer interfaces
    • Mental state monitoring
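Among the effective-connectivity approaches listed above, Granger causality has a particularly compact formulation: a signal y is said to Granger-cause x if past values of y improve the prediction of x beyond what x's own past provides. A minimal order-1 least-squares sketch (the function name and the lag-1 restriction are our own illustrative choices, not a reference implementation):

```python
import numpy as np

def granger_f_stat(x, y, lag=1):
    """F-statistic testing whether past values of y help predict x
    beyond x's own past (order-1 Granger causality)."""
    T = len(x)
    target = x[lag:]
    # Restricted model: predict x[t] from x[t-1] only.
    X_r = np.column_stack([np.ones(T - lag), x[:-lag]])
    # Full model: additionally include y[t-1].
    X_f = np.column_stack([np.ones(T - lag), x[:-lag], y[:-lag]])
    # Residual sums of squares from ordinary least squares fits.
    rss_r = np.sum((target - X_r @ np.linalg.lstsq(X_r, target, rcond=None)[0]) ** 2)
    rss_f = np.sum((target - X_f @ np.linalg.lstsq(X_f, target, rcond=None)[0]) ** 2)
    # F-statistic: large values suggest y Granger-causes x.
    df = T - lag - X_f.shape[1]
    return (rss_r - rss_f) / (rss_f / df)
```

In practice one would compare the statistic against an F distribution, select the lag order by a model-selection criterion, and use multivariate (vector autoregressive) models rather than the bivariate case shown here.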

Organization committee

Program committee

Learning from multiple sources denotes the problem of jointly learning from a set of (partially) related learning problems / views / tasks. This general concept underlies several subfields receiving increasing interest from the machine learning community, which differ in terms of the assumptions made about the dependency structure between learning problems. In particular, the concept includes topics such as data fusion, transfer learning, multitask learning, multiview learning, and learning under covariate shift. Several approaches for inferring and exploiting complex relationships between data sources have been presented, including both generative and discriminative approaches.

The workshop will provide a unified forum for cutting-edge research on learning from multiple sources; it will examine the general concept, theory, and methods, and will also examine robotics as a natural application domain for learning from multiple sources. The workshop will address methodological challenges in the different subtopics and foster interaction between them. The intended audience is researchers working in multi-modal learning, data fusion, and robotics.

The workshop includes a morning session focused on the robotics application, and an afternoon session focused on theory/methods.

Organisers

Programme Committee

  • Cedric Archambeau - Xerox Research.
  • Andreas Argyriou - Toyota Technological Institute.
  • Claudio Gentile - Università dell'Insubria.
  • Mark Girolami - University of Glasgow.
  • Samuel Kaski - Helsinki University of Technology.
  • Arto Klami - Helsinki University of Technology.
  • John Shawe-Taylor - University College London.
  • Giorgio Valentini - Università degli Studi di Milano.

Accounting for dependencies between outputs has important applications in several areas. In sensor networks, for example, missing signals from temporarily failing sensors can be predicted thanks to correlations with signals acquired from other sensors. In geostatistics, the concentration of heavy pollutant metals (for example, copper), which is expensive to measure, can be predicted using inexpensive and oversampled variables (for example, pH).

Multi-task learning is a general learning framework in which it is assumed that learning multiple tasks simultaneously leads to better models and performance than learning the same tasks individually. By exploiting correlations and dependencies among tasks, it becomes possible to handle common practical situations such as missing data, or to increase the effective amount of data when only a small amount is available per task.

In this workshop we will consider the use of kernel methods for multiple outputs and multi-task learning. The aim of the workshop is to bring together Bayesian and frequentist researchers to establish common ground and shared goals.

Motivation

In the last few years there has been an increasing amount of work on multi-task learning, with hierarchical Bayesian approaches and neural networks among the proposed methods. More recently, the Gaussian process framework has been considered, where the correlations among tasks can be captured by appropriate choices of covariance function. Many of these choices have been inspired by the geostatistics literature, in which a similar problem is known as cokriging. From the frequentist perspective, regularization theory has provided a natural framework to deal with multi-task problems: assumptions about the relations between the different tasks translate into the design of suitable regularizers. Despite the common traits of the proposed approaches, so far the different communities have worked independently. For example, it is natural to ask whether the proposed choices of covariance function can be interpreted from a regularization perspective, or, in turn, whether each regularizer induces a specific form of the covariance/kernel function. By bringing together the latest advances from both communities, we aim to establish the state of the art and the possible future challenges in multi-task learning.
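One covariance construction that directly mirrors the cokriging idea is the intrinsic coregionalization model, in which a positive semi-definite task-covariance matrix multiplies a shared input kernel. The sketch below is illustrative (the function names and the RBF input kernel are our own assumptions):

```python
import numpy as np

def rbf(X1, X2, lengthscale=1.0):
    """Squared-exponential (RBF) kernel matrix between rows of X1 and X2."""
    d2 = np.sum((X1[:, None, :] - X2[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * d2 / lengthscale ** 2)

def icm_covariance(X, A, lengthscale=1.0):
    """Joint covariance over all (task, input) pairs:
    K((x, i), (x', j)) = B[i, j] * k(x, x'), arranged as kron(B, K_x)."""
    B = A @ A.T                      # task covariance, PSD by construction
    Kx = rbf(X, X, lengthscale)      # shared input kernel
    return np.kron(B, Kx)
```

Writing B = A Aᵀ guarantees a valid joint covariance; from the regularization perspective, B plays the role of the task-coupling matrix in a suitably designed multi-task regularizer.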

Organisers