News Archives

8th Summer School on Data Mining, Maastricht, The Netherlands

http://www.cs.unimaas.nl/datamining/

Summer School: Data Mining

An intensive 4-day introduction to methods and applications

Department of Knowledge Engineering, Maastricht University,
Maastricht, The Netherlands
August 30 – September 2, 2010

Introduction
Most business organizations collect terabytes of data about business
processes and resources. Usually these data provide just “facts and
figures”, not knowledge that can be used to understand and eventually
re-engineer business processes and resources. The scientific
communities in academia and business have addressed this problem over
the last 20 years by developing a new applied field of study known as
data mining. In practice, data mining is a process of extracting
implicit, previously unknown, and potentially useful knowledge from
data. It employs techniques from statistics, artificial intelligence,
and computer science. Data mining has been successfully applied to
acquire new knowledge in many domains, such as business, medicine,
biology, economics, and the military. As a result, most business
organizations urgently need data-mining specialists, and this is
where this course can help.

Course Description
The course balances theory and practice and focuses on techniques with
a direct practical use. Each lecture is accompanied by a lab in which
course participants experiment with the techniques introduced in the
lecture. The lab tool is Weka, one of the most advanced data-mining
environments. A step-by-step introduction to powerful (freeware)
data-mining tools will give you specific skills, autonomy and hands-on
experience. A number of real data sets will be analysed and discussed.
By the end of the course, participants will be able to apply
data-mining techniques for business and research purposes.

Course Content
The course will cover the topics listed below.
– The Knowledge Discovery Process
– Data Preparation
– Basic Techniques for Data Mining:
+ Decision-Tree Induction
+ Rule Induction
+ Instance-Based Learning
+ Bayesian Learning
+ Support Vector Machines
+ Regression Techniques
+ Clustering Techniques
+ Association Rules
– Tools for Data Mining
– How to Interpret and Evaluate Data-Mining Results

Intended Audience
This course is intended for four groups of data-mining beginners:
students, scientists, engineers and experts in specific fields who need
to apply data-mining techniques to their scientific research, business
management, or other related applications.

SIKS
Participation in this course is part of the advanced components stage
of SIKS’ educational program. SIKS has reserved a number of places for
Ph.D. students working on the course topics.

Prerequisites
The course does not require any background in databases, statistics,
artificial intelligence, or machine learning. A general background in
science is sufficient, along with a high degree of enthusiasm for new
scientific approaches.

Certificate
Upon request a certificate of full participation will be provided after
the course.

Registration
To register for the course please send an email to the registration office
specifying the following information:
– Name
– University / Organisation
– Address
– Phone
– E-Mail

Please register before August 9, 2010.

Registration fees
Academic fee 600 Euros
Non-academic fee 850 Euros

The price includes course material and coffee breaks. The local
cafeteria will be available for lunch (not included).

SIKS-Ph.D. students
SIKS-Ph.D. students interested in taking the course should NOT contact
the local organization, but should send an e-mail to office(at)siks.nl
and confirm that their supervisor supports their participation.

E-mail should be sent to: smirnov(at)maastrichtuniversity.nl

Regular mail should be sent to:

Evgueni Smirnov
Department of Knowledge Engineering
Faculty of Humanities and Sciences
Maastricht University
P.O. Box 616
6200 MD Maastricht
The Netherlands
Phone: +31 (0) 43 38 82023
Fax: +31 (0) 43 38 84897

Call for papers – Machine Learning in Systems Biology

Call for Papers

MLSB 2010
The Fourth International Workshop on Machine Learning in Systems Biology
15-16 October 2010, Edinburgh, Scotland
http://mlsb10.ijs.si/

MOTIVATION

Molecular biology and all the biomedical sciences are undergoing a
true revolution as a result of the emergence and growing impact of a
series of new disciplines/tools sharing the “-omics” suffix in their
name. These include in particular genomics, transcriptomics,
proteomics and metabolomics, devoted respectively to the examination
of the entire systems of genes, transcripts, proteins and metabolites
present in a given cell or tissue type.

The availability of these new, highly effective tools for biological
exploration is dramatically changing the way one performs research in
at least two respects. First, the amount of available experimental
data is not a limiting factor any more; on the contrary, there is a
plethora of it. Given the research question, the challenge has
shifted towards identifying the relevant pieces of information and
making sense of them (a “data mining” issue). Second, rather
than focusing on components in isolation, we can now try to understand
how biological systems behave as a result of the integration of and
interaction between the individual components that one can now monitor
simultaneously (so-called “systems biology”).

Taking advantage of this wealth of “genomic” information has become a
conditio sine qua non for anyone who aims to remain competitive in
molecular biology and in the biomedical sciences in general. Machine
learning naturally appears as one of the main drivers of progress in
this context, where most of the targets of interest deal with complex
structured objects: sequences, 2D and 3D structures or interaction
networks. At the same time bioinformatics and systems biology have
already induced significant new developments of general interest in
machine learning, for example in the context of learning with
structured data, graph inference, semi-supervised learning, system
identification, and novel combinations of optimization and learning
algorithms.

The workshop is organized as a “core event” of the Pattern Analysis,
Statistical Modelling and Computational Learning Network of Excellence
2 (PASCAL 2, http://www.pascal-network.org/).

OBJECTIVE

The aim of this workshop is to contribute to the cross-fertilization
between the research in machine learning methods and their
applications to systems biology (i.e., complex biological and medical
questions) by bringing together method developers and
experimentalists. We encourage submissions bringing forward methods
for discovering complex structures (e.g. interaction networks,
molecule structures) and methods supporting genome-wide data analysis.

LOCATION AND CO-LOCATION

The workshop will take place 15-16 October 2010 at the Edinburgh
International Conference Centre and the Informatics Forum of the
University of Edinburgh. It will be part of the workshop program of
ICSB 2010, The 11th International Conference on Systems Biology
(11-14 October 2010, http://www.icsb2010.org.uk/).

SUBMISSION INSTRUCTIONS

We invite you to submit an extended abstract of up to 4 pages
describing new or recently published (2010) results, formatted
according to the Springer Lecture Notes in Computer Science
style. Each extended abstract must be submitted online via the Easychair
submission system: http://www.easychair.org/conferences/?conf=mlsb10

The extended abstracts will be reviewed by the scientific programme
committee. They will be selected for oral or poster presentation
according to their originality and relevance to the workshop topics.
Electronic versions of the extended abstracts will be accessible to the
participants prior to the conference, distributed in hardcopy form to
participants at the conference, and will be made publicly available
on the conference web site after the conference. However, the
book of abstracts will not be published and the extended abstracts
will not constitute a formal publication.

We expect that authors of selected contributions will be invited to
submit full papers to special issues of high-ranking
Machine Learning/Systems Biology journals.

KEY DATES

15 May: Submission site opens
25 June: Deadline for submission of extended abstracts
25 July: Notification of acceptance
15-16 October: Workshop

TOPICS

A non-exhaustive list of topics suitable for this workshop is given
below:

Methods

Machine learning algorithms
Bayesian methods
Data integration/fusion
Feature/subspace selection
Clustering
Biclustering/association rules
Kernel methods
Probabilistic inference
Structured output prediction
Systems identification
Graph inference, completion, smoothing
Semi-supervised learning

Applications

Sequence annotation
Gene expression and post-transcriptional regulation
Inference of gene regulation networks
Gene prediction and whole genome association studies
Metabolic pathway modeling
Signaling networks
Systems biology approaches to biomarker identification
Rational drug design methods
Metabolic reconstruction
Protein function and structure prediction
Protein-protein interaction networks
Synthetic biology

INVITED SPEAKERS (confirmed)

Florence d’Alche Buc, Universite d’Evry-Val d’Essonne, Evry, France
Nir Friedman, The Hebrew University of Jerusalem, Jerusalem, Israel
Ursula Kummer, BIOQUANT, University of Heidelberg, Germany
Hans Lehrach, Max Planck Institute for Molecular Genetics, Berlin, Germany
Vebjorn Ljosa, The Broad Institute of MIT and Harvard, USA

MLSB10 PROGRAM CHAIRS

Sašo Džeroski, Jozef Stefan Institute, Ljubljana, Slovenia
Simon Rogers, University of Glasgow, UK
Guido Sanguinetti, University of Sheffield/University of Edinburgh, UK

UAI 2010 Approximate Inference Evaluation – Invitation to participate

Dear Colleagues,

We would like to invite you to participate in the UAI approximate inference challenge, which is now accepting submissions.

Results are published on an online leaderboard, which is updated regularly (you can use pseudonyms in submissions if you prefer anonymity).

A prize of four free UAI registrations will be awarded to the winning teams, as well as an opportunity to give a short presentation about their algorithms at the UAI conference in July.

More details are available at: http://www.cs.huji.ac.il/project/UAI10/index.php

The deadline for submissions is July 1st.

Looking forward to your submissions,

Gal Elidan and Amir Globerson

Open Postdoc Positions in Bandits and Reinforcement Learning at INRIA Lille

The project team SEQUEL (Sequential Learning) of INRIA Lille, France, http://sequel.lille.inria.fr/ is seeking to appoint several Postdoctoral Fellows. We welcome applicants with a strong mathematical background who are interested in theory and applications of reinforcement learning and bandit algorithms.
The research will be conducted under the supervision of Remi Munos, Mohammad Ghavamzadeh and/or Daniil Ryabko, depending on the chosen topics.

The positions are research-only and are for one year, with the possibility of extension.
The starting date is flexible, from Fall 2010 to Spring 2011.

INRIA is France’s leading institution in Computer Science, with over 2800 scientists employed, of whom around 250 are in Lille. Lille is the capital of the north of France, a metropolis with 1 million inhabitants and excellent train connections to Brussels (30 min), Paris (1h) and London (1h30).
The Sequel lab is a dynamic lab at INRIA with over 25 researchers (including PhD students) which covers several aspects of machine learning from theory to applications, including statistical learning, reinforcement learning, and sequential learning.

The positions will be funded by the EXPLO-RA project (Exploration-Exploitation for efficient Resource Allocation), a project in collaboration with ENS Ulm (Gilles Stoltz), Ecole des Ponts (Jean Yves Audibert), INRIA team TAO (Olivier Teytaud), Univ. Paris Descartes (Bruno Bouzy), and Univ. Paris Dauphine (Tristan Cazenave).
See: http://sites.google.com/site/anrexplora/ for some of our activities.

Possible topics include:
– In Reinforcement learning: RL in high dimensions. Sparse representations, use of random projections in RL.
– In Bandits: Bandit algorithms in complex environments. Contextual bandits, Bandits with dependent arms, Infinitely many arms bandits. Links between the bandit and other learning problems.
– In hierarchical bandits / Monte-Carlo Tree Search: Analysis and development of MCTS / hierarchical bandit algorithms, planning with MCTS for solving MDPs.
– In Statistical learning: Compressed learning, use of random projections, link with compressed sensing.
– In sequential learning: Sequential prediction of time series

Candidates must have a Ph.D. degree (by the starting date of the position) in machine learning, statistics, or related fields, possibly with a background in reinforcement learning, bandits, or optimization.

To apply, please send a CV and a proposed research topic to remi.munos(at)inria.fr, mohammad.ghavamzadeh(at)inria.fr, or daniil.ryabko(at)inria.fr.

If you are planning to go to ICML / COLT this year, we could set up an appointment there.

ICMLA 2010 Speaker Clustering Challenge

Call for Papers

ICMLA 2010 Speaker Clustering Challenge
Washington DC, USA, 12-14 Dec. 2010

http://www.icmla-conference.org/icmla10/CFP_Challenge1_files/CFP_Challenge1.html

OVERVIEW:
Learning methods for sequential data have received widespread attention in recent years. This kind of data arises in many interesting scenarios where the individual semantic units are no longer single vectors but collections of vectors. Examples include multimedia analysis (e.g., video understanding, speaker recognition) and bioinformatics (e.g., DNA or protein sequences). Sequences can have different lengths, so standard distance measures for vector spaces are not directly applicable.
Moreover, the information conveyed by the sequences is sometimes encoded not just in the individual vectors themselves, but also in the dynamics under which these vectors evolve over time. To capture such information, it is usual to employ dynamic models such as hidden Markov models or more general dynamic Bayesian networks. Distances between sequences can then be defined using the learned models.
However, there are many scenarios where the sequences can be accurately classified or clustered without attending to their dynamic characteristics. Examples include bag-of-words models for image analysis and speech-independent speaker verification. In these cases the sequences can be viewed as sets of independent and identically distributed (i.i.d.) samples, and can thus be characterized in terms of their underlying probability density function (PDF). There are many ways of defining affinities or distances between PDFs, from the classic Kullback-Leibler or Bhattacharyya divergences (even in feature space) to the recently proposed Probability Product Kernels.
In this challenge we propose to focus on unsupervised methods for sequential data, specifically clustering of speech data. Clustering tries to find coherent (in some sense) disjoint groups within a dataset. It does not require any training examples, so it is a very important tool for exploratory data analysis. Furthermore, clustering algorithms can easily be extended into semi-supervised methods, which are very useful when the labelling process is costly.
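As a rough illustration of the PDF-based distances mentioned above, the following Python sketch fits a diagonal-covariance Gaussian to each sequence and computes a symmetric Kullback-Leibler divergence between the two fits. The single-Gaussian model, the variance floor and all function names are assumptions made for this example only; they are not part of the challenge, and any density model or divergence could be used instead.

```python
import math


def fit_diag_gaussian(seq, var_floor=1e-6):
    """Per-dimension mean and variance of a sequence of equal-length vectors.

    var_floor is an illustrative safeguard against zero variance.
    """
    n = len(seq)
    dim = len(seq[0])
    means = [sum(v[d] for v in seq) / n for d in range(dim)]
    variances = [
        max(sum((v[d] - means[d]) ** 2 for v in seq) / n, var_floor)
        for d in range(dim)
    ]
    return means, variances


def kl_diag(p, q):
    """KL(p || q) for two diagonal Gaussians given as (means, variances)."""
    (mean_p, var_p), (mean_q, var_q) = p, q
    return 0.5 * sum(
        math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
        for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q)
    )


def symmetric_kl(seq_a, seq_b):
    """Symmetrised KL divergence between Gaussian fits of two sequences."""
    p = fit_diag_gaussian(seq_a)
    q = fit_diag_gaussian(seq_b)
    return kl_diag(p, q) + kl_diag(q, p)


if __name__ == "__main__":
    # Two toy sequences of different lengths (values are made up).
    a = [[0.0, 1.0], [1.0, 3.0], [2.0, 2.0]]
    b = [[5.0, 1.0], [6.0, 2.0]]
    print(symmetric_kl(a, b))
```

Because each sequence is summarised by a fixed-size model, sequences of different lengths can be compared directly, and the resulting pairwise values could feed any standard clustering algorithm.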

CHALLENGE FORMAT
This challenge proposes two different tasks:
* 2-class speaker clustering
* Multiclass speaker clustering
The first task is 2-class speaker clustering. For this task we provide 7 datasets, each comprising speech from two different speakers. The participants should then identify two clusters within each dataset.
The more advanced task is multiclass speaker clustering. This task is to be carried out on a single dataset, which is formed by sequences coming from an unknown number of speakers in the range. Participants should discover the number of speakers and perform an adequate clustering.
Both tasks are based on a speech database recorded using a PDA. It includes both male and female speakers. Each subject recorded 50 isolated words, and the mean length of each utterance is around 1.3 seconds. The original audio files were processed using the HTK software, yielding a standard parametrization consisting of 12 Mel-frequency cepstral coefficients (MFCCs), an energy term and their respective increments, giving a total of 26 parameters. These parameters were obtained every 10 ms with a 25 ms analysis window, yielding 26-dimensional sequences of around 130 samples. Any further pre-processing (normalization, filtering, …) is up to the participants.
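The figures quoted above can be checked with a little arithmetic. The Python sketch below assumes that only complete analysis windows are counted, which may differ slightly from the convention HTK actually uses; the function names are hypothetical and chosen for this example.

```python
def feature_dim(n_mfcc=12, n_energy=1, with_increments=True):
    """Static coefficients plus, optionally, one increment per static coefficient."""
    static = n_mfcc + n_energy
    return static * 2 if with_increments else static


def frame_count(duration_ms, hop_ms=10, window_ms=25):
    """Number of complete analysis windows in an utterance (assumed convention)."""
    if duration_ms < window_ms:
        return 0
    return (duration_ms - window_ms) // hop_ms + 1


if __name__ == "__main__":
    print(feature_dim())      # 12 MFCCs + 1 energy term, doubled by increments
    print(frame_count(1300))  # a 1.3 s utterance with a 10 ms step, 25 ms window
```

Under this convention a 1.3-second utterance gives 128 frames of 26 values each, consistent with the "around 130 samples" stated above.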
Participants can submit their results for one or both of the tasks. For details on how to format the results, please contact the organizers.

SUBMISSION AND EVALUATION:
Apart from the actual results, a short paper (4 pages) describing the proposed algorithms should be submitted through the main conference submission website. These papers will be reviewed mainly based on:
• Originality and technical soundness of the employed distance measures
• Coherence of the discovered clusters w.r.t. the speakers
• In the multiclass task, special attention will be paid to the steps toward the correct identification of the number of speakers

PUBLICATION:
Accepted papers will be published in the ICMLA’10 conference proceedings.

IMPORTANT DATES:
Paper Submission Deadline: July 15, 2010
Notification of acceptance: September 7, 2010
Camera-ready papers & Pre-registration: October 1, 2010

ICMLA 2010 Challenge Organizers:
* Darío García-García, University Carlos III Madrid, Spain (dggarcia(at)tsc.uc3m.es)
* Raúl Santos-Rodríguez, University Carlos III Madrid, Spain (rsrodriguez(at)tsc.uc3m.es)

CFP: ECCV10 Workshop: Sign Gesture & Activity

International Workshop on Sign Gesture and Activity 2010

Saturday September 11, 2010, Hersonissos, Heraklion, Crete, Greece in conjunction with ECCV 2010.

www.ee.surrey.ac.uk/Personal/R.Bowden/SGA2010

Topic: The workshop will bring together researchers from vision, learning and related areas to present and discuss the recognition of spatio-temporal motion of people across a broad range of application areas ranging from sign language recognition through to gesture and activity.

Important Dates:

Submission deadlines: Wed, 16th June 2010

Acceptance decisions: Thurs, 8th July 2010

Camera-ready papers: Tue, 13th July 2010

Topics include (but are not limited to):

• Continuous Sign Language Recognition & analysis

• Non-Manual Features and Facial expression recognition

• Feature Extraction for recognition

• Human torso tracking and modelling

• Hand Shape Classification

• Gesture Recognition

• Activity and Action Recognition

• Facial expression analysis

• Lip Reading

• Fusion methods for Recognition

• Multimodal human behaviour analysis

• Non Verbal Communication

• Affective Computing

• Hand and Face Tracking

• Corpora for training and testing

• Semi-automatic corpora annotation tools

• Probabilistic sequence modelling

Invited Speakers: Ivan Laptev, INRIA, France, Dimitris Metaxas, Rutgers, USA

Workshop organisers: Richard Bowden, Uni of Surrey, UK <r.bowden@surrey.ac.uk>

Philippe Dreuw, RWTH Aachen Uni, DE

Petros Maragos, NTUA, Greece

Justus Piater, Uni of Liège, Belgium

Submission site: https://cmt.research.microsoft.com/SGA2010

Workshop site: http://www.ee.surrey.ac.uk/Personal/R.Bowden/SGA2010

Main Conference site: http://www.ics.forth.gr/eccv2010/intro.php

Postdoc Positions in New Sheffield Centre

We would like to announce two post-doctoral researcher positions at a
new Sheffield-based Centre for Biosystems Modelling and Inference.
Funded by significant investment from the Faculties of Medicine and
Engineering, the new centre has made three faculty appointments:
Neil Lawrence, Magnus Rattray and John R. Terry. It is located in a
new institute in a brand new building. The focus of research in the
centre will be probabilistic inference and dynamical modeling.

The two post-doctoral positions are associated with grants
investigating the use of Gaussian process models in biological
systems. The successful candidates will work with Professor Magnus
Rattray and Professor Neil Lawrence on these projects. The
appointments represent an excellent chance to work with a dynamic
group of individuals applying state-of-the-art machine learning
techniques to problems in computational biology.

More details are available here:

Postdoc on the SYNERGY project: http://bit.ly/cLSA9h
Postdoc on Experimental Design: http://bit.ly/duFqN6

Note that the closing date for application for the first position is
soon: 7th July 2010. The second position has a closing date of 23rd
July.

Please contact Magnus or myself if you have any informal queries.

Neil Lawrence
Magnus Rattray

CfA: BCCN 2010 – Berlin Sept 27 – Oct 1

=== Call for Abstracts ===

Bernstein Conference on Computational Neuroscience (BCCN 2010)

The Bernstein Conference on Computational Neuroscience (BCCN) is an
annual meeting of researchers working in Computational Neuroscience
and Neurotechnology. It has grown out of the annual Symposia of the
German National Bernstein Network for Computational Neuroscience,
which have been held since 2005. Now in its 6th year, organized by the
Bernstein Focus: Neurotechnology at the Berlin Institute of
Technology, it has been opened as an international conference. The
BCCN is a single track conference that covers all aspects of
Computational Neuroscience and Neurotechnology. We invite the
submission of abstracts from all relevant areas. Selected abstracts
will be published in the journal Frontiers in Computational
Neuroscience.

The meeting is open for contributions from all relevant areas of
computational neuroscience including, but not limited to: learning and
plasticity, sensory processing, motor control, reward system, brain
computer interface, neural encoding and decoding, decision making,
information processing in neurons and networks, dynamical systems and
recurrent networks, and neurotechnology.

CONFERENCE DATE AND VENUE:
September 27 – October 1, 2010
Technische Universität Berlin
Berlin, Germany

http://www.bccn2010.de/

PHD STUDENT-SYMPOSIUM:
October 1st, 2010
Technische Universität Berlin
Berlin, Germany

IMPORTANT DATES:
Abstract submission deadline: July 2, 2010
Poster submission deadline: July 2, 2010
Notification of acceptance: August 2, 2010
Early registration closes: August 18, 2010

CONFIRMED INVITED SPEAKERS:
Lars-Kai Hansen (Technical University of Denmark)
Ernst Fehr (University of Zurich)
Pascal Fries (Ernst Strüngmann Institute)
Peter Jonas (Albert-Ludwigs-Universität Freiburg)
Misha Tsodyks (Weizmann Institute of Science)
Gero Miesenböck (University of Oxford)

ORGANIZING COMMITTEE:
General Chair: Klaus-Robert Müller
Conference Office: Matthias L. Jugel, Imke Weitkamp

PROGRAM COMMITTEE
Demian Battaglia, Matthias Bethge, Armin Biess, Benjamin Blankertz,
Axel Borst, Martin Burghoff, Gabriel Curio, Ulrich Egert,
Roland Fleming, Alexander Gail, Jan Gläscher, Tim Gollisch,
Ralf Haefner, John-Dylan Haynes, Leo van Hemmen, Andreas Herz,
Frank Hesse, Christian Igel, Dirk Jancke, Christoph Kayser,
Richard Kempter, Peter König, Christian Leibold, Sebastian Möller,
Klaus-Robert Müller, Andreas Neef, Klaus Obermayer, Stefano Panzeri,
Petra Ritter, Constantin Rothkopf, Gregor Schöner, Jens Steinbrink,
Jochen Triesch, Thomas Wachtler, Felix Wichmann, Laurenz Wiskott,
Annette Witt, Gabriel Wittum

Postdoc in Computer Vision and Machine Learning, University of Leeds

Research Fellow in Computer Vision and Machine Learning
University of Leeds – School of Computing

(Full-time, fixed term position for 14 months)

You will work with Dr Mark Everingham (http://www.comp.leeds.ac.uk/me/) on an EPSRC funded project investigating new methods for learning human pose estimation from weak or approximate supervision. The project has three main aims: (i) producing a large dataset of approximately annotated consumer images, at least two orders of magnitude larger than available datasets; (ii) developing machine learning methods to learn from approximate annotation and “side information” for example simple models of human anatomy; (iii) developing strong models of appearance to give robust pose estimation, using the developed machine learning approach. This will include higher order cues modelling appearance of limbs, dependencies between limbs and appearance of joints and configurations of limbs.

You are expected to have a PhD (or to be awarded one shortly) in Computer Vision or Machine Learning. You should have experience in developing and applying computer vision and machine learning algorithms, especially probabilistic methods. Expertise in graphical models, structured learning or human pose estimation would be a particular advantage. You should be a proficient programmer in MATLAB and C/C++. You should be self-motivated, good at time management and planning, and have a proven ability to meet deadlines. Good communication and presentation skills are also important.

Salary: Grade 7 (£29,853 – £35,646 p.a.)

Apply using: Application form, CV and Equal Opportunities Monitoring form

Application forms:

http://www.leeds.ac.uk/hr/forms/recruitment/app_form.pdf
http://www.leeds.ac.uk/hr/forms/recruitment/newapplicationform.doc

Informal enquiries: Dr Mark Everingham, tel +44 (0)113 343 5370, email m.everingham(at)leeds.ac.uk

Send completed applications to: Judi Drew, email j.a.drew@leeds.ac.uk, or by post to:

Judi Drew
School of Computing
University of Leeds
Leeds
LS2 9JT

Closing date: 18 June 2010

Apply online: http://hr.leeds.ac.uk/jobs/

PhD/Postdoc position in machine learning & signal processing

The Robotics Group of the University of Bremen, Germany, is looking for a highly motivated researcher (PhD student or Postdoc) to work on a project dealing with the online evaluation of EEG data using machine-learning algorithms. Behavioral and fMRI data will also be analyzed during the project. Applicants without a PhD are expected to use their work to write a dissertation. The position is planned for 3 years.

The interdisciplinary project cooperates very closely with the German Research Center for Artificial Intelligence (Deutsches Forschungszentrum für künstliche Intelligenz, DFKI). The successful applicant will work together with scientists from many different disciplines (e.g. engineers, computer scientists, physicists, biologists, mathematicians).

Requirements: Applicants must have a university degree (M.Sc. or equivalent) in computer science (or a related field covering the topic, e.g. computational neuroscience or physics) with a strong background in machine learning. Experience with Brain-Computer-Interfaces (BCIs), autonomous mobile robotic systems and/or acquisition of EEG, fMRI or behavioral data is advantageous. Furthermore, experience in programming with C/C++ and/or Python is a benefit.

While English is mandatory, knowledge of German is a plus.

Applications should be sent electronically to verena.tenzer(at)dfki.de using the key number A42/10.