This workshop addresses the problem of learning from data that are not independently and identically distributed (IID), a common assumption in statistical machine learning. While this assumption helps in studying the properties of learning procedures (e.g. generalization ability) and guides the design of new algorithms, there are many real-world situations where it does not hold. This is particularly the case for many challenging machine learning tasks that have recently received much attention, such as (but not limited to) ranking, active learning, hypothesis testing, learning with graphical models, prediction on graphs, mining (social) networks, and multimedia or language processing. The goal of this workshop is to bring together research that identifies problems where the assumption of identical distribution, of independence, or both is violated, and where carefully taking the non-IIDness into account is expected to be of primary importance.
Examples of such problems are:
- Bipartite ranking or, more generally, pairwise classification, where pairing up IID variables entails non-IIDness: while the data may still be identically distributed, they are no longer independent;
- Active learning, where labels for specific data are requested by the learner: the independence assumption is also violated;
- Learning with covariate shift, where the training and test marginal distributions of the data differ: the assumption of identical distribution does not hold;
- Online learning with streaming data, where the distribution of the incoming examples changes over time: the examples are not identically distributed.
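As a small illustration of the first example above, the sketch below counts how many of the pairs formed in a bipartite ranking setting share an underlying example. Two pairs that share an example cannot be independent, even when the original examples are IID. The function names and the sample sizes are hypothetical and chosen only for illustration; this is not taken from any workshop contribution.

```python
from itertools import combinations, product


def build_pairs(n_pos, n_neg):
    """Return all (positive index, negative index) pairs, as in bipartite ranking."""
    return list(product(range(n_pos), range(n_neg)))


def count_overlapping(pairs):
    """Count unordered couples of distinct pairs that share at least one example."""
    return sum(
        1
        for (i, j), (k, l) in combinations(pairs, 2)
        if i == k or j == l
    )


# Hypothetical sample sizes: 3 positive and 2 negative examples.
pairs = build_pairs(3, 2)
total_couples = len(pairs) * (len(pairs) - 1) // 2
overlapping = count_overlapping(pairs)
print(f"{overlapping} of {total_couples} couples of pairs share an example")
```

Even for this tiny sample, most couples of pairs overlap, which is why results that assume independent training points do not apply directly to the pairs.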
We see the workshop not only as a venue for the presentation of papers that carefully deal with non-IID data, but also as a forum for sharing ideas across different application domains. Hence, it will be an opportunity for discussions on methods that address non-IIDness from the following standpoints:
- Theoretical: results on generalization bounds and learnability, contributions that mathematically formalize the types of non-IIDness encountered, results on the extent to which non-IIDness does not harm the validity of theoretical results built on the IID assumption, helpfulness of the online learning framework,
- Algorithmic: theoretically motivated algorithms designed to handle non-IID data, approaches that make it possible for classical learning results to carry over, online learning procedures,
- Practical: successful applications of non-IID learning methods to learning from streaming data, web data, biological data, multimedia, natural language, social network mining.
Organizers
- Massih-Reza Amini, National Research Council, Canada
- Amaury Habrard, University of Marseille, France
- Liva Ralaivola, University of Marseille, France
- Nicolas Usunier, University Pierre et Marie Curie, France
Program Committee
- Shai Ben-David, University of Waterloo, Canada
- Gilles Blanchard, Fraunhofer FIRST (IDA), Germany
- Stéphan Clémençon, Télécom ParisTech, France
- François Denis, University of Provence, France
- Claudio Gentile, University dell'Insubria, Italy
- Balaji Krishnapuram, Siemens Medical Solutions, USA
- François Laviolette, Université Laval, Canada
- Xuejun Liao, Duke University, USA
- Richard Nock, University Antilles-Guyane, France
- Daniil Ryabko, Institut National de Recherche en Informatique et Automatique, France
- Marc Sebban, University of Saint-Etienne, France
- Ingo Steinwart, Los Alamos National Labs, USA
- Masashi Sugiyama, Tokyo Institute of Technology, Japan
- Nicolas Vayatis, École Normale Supérieure de Cachan, France
- Zhi-Hua Zhou, Nanjing University, China