ICML 2009 Workshop on Numerical Mathematics in Machine Learning

NUMML 2009 Numerical Mathematics in Machine Learning


Most machine learning (ML) algorithms rely fundamentally on concepts of numerical mathematics. Standard reductions to black-box computational primitives do not usually meet real-world demands and have to be modified at all levels. The increasing complexity of ML problems requires layered approaches, where algorithms are components rather than stand-alone tools fitted individually with much human effort. In this modern context, predictable run-time and numerical stability behavior of algorithms become fundamental. Unfortunately, these aspects are widely ignored today by ML researchers, which limits the applicability of ML algorithms to complex problems, and therefore the practical scope of ML as a whole.

Background and Objectives

Our workshop aims to address these shortcomings. Ideally, a code of conduct can be established for MLers combining and modifying numerical primitives, a set of essential rules as a compromise between inadequate black-box reductions and highly involved complete numerical analyses. We will invite speakers with interest in *both* numerical methodology *and* real problems in applications close to machine learning. While numerical software packages of ML interest will be pointed out, our focus will rather be on how to best bridge the gaps between ML requirements and these computational libraries. A subordinate goal will be to address the role of parallel numerical computation in ML. One running example will be the linear model, or Gaussian Markov random field, a building block behind sparse estimation, Kalman smoothing and filtering, Gaussian process models, state space models, or (multi-layer) perceptrons. Basic tasks in this model require the solution of large linear systems, eigenvector approximations, matrix factorizations and their low-rank updates. In turn, model structure can often be used to drastically speed up, or even precondition, these low-level numerical computations.
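As a rough illustration of the kind of task meant here, the sketch below computes the mean of a chain-structured Gaussian Markov random field by solving the linear system A x = b with linear conjugate gradients. It is a minimal sketch under stated assumptions, not a reference implementation: the function names, the tridiagonal precision matrix, and the `coupling` and `noise` parameters are all hypothetical choices made for this example. The point is that the solver only needs a matrix-vector product, which the chain structure provides in O(n) time without ever forming the matrix.

```python
def conjugate_gradient(matvec, b, tol=1e-10, max_iter=None):
    """Solve A x = b for a symmetric positive definite A,
    given only a function computing v -> A v."""
    n = len(b)
    if max_iter is None:
        max_iter = 10 * n
    x = [0.0] * n
    r = list(b)                      # residual b - A x for the start x = 0
    p = list(r)                      # first search direction
    rs_old = sum(ri * ri for ri in r)
    b_norm = rs_old ** 0.5 or 1.0    # avoid dividing the tolerance by zero
    for _ in range(max_iter):
        if rs_old ** 0.5 <= tol * b_norm:
            break                    # relative residual is small enough
        Ap = matvec(p)
        alpha = rs_old / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = sum(ri * ri for ri in r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


def chain_precision_matvec(v, coupling=1.0, noise=0.5):
    """Multiply v by A = noise * I + coupling * L, where L is the graph
    Laplacian of a chain (hypothetical example model). Costs O(n)."""
    out = [noise * vi for vi in v]
    for i in range(len(v) - 1):
        d = coupling * (v[i] - v[i + 1])
        out[i] += d
        out[i + 1] -= d
    return out


if __name__ == "__main__":
    b = [float(i % 3) for i in range(50)]
    mean = conjugate_gradient(chain_precision_matvec, b)
    residual = [ai - bi for ai, bi in zip(chain_precision_matvec(mean), b)]
    print("max abs residual:", max(abs(ri) for ri in residual))
```

The tolerance here is a relative residual threshold. In practice the stopping rule and any preconditioner would be chosen with the conditioning of the specific model in mind, which is exactly the kind of decision the workshop aims to discuss.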

Impact and Expected Outcome

We will call the community’s attention to the increasingly critical issue of numerical considerations in algorithm design and implementation. A set of essential rules for how to use and modify numerical software in ML is required, for which we aim to lay the groundwork in this workshop. These efforts should lead to an awareness of the problems, as well as increased focus on efficient and stable ML implementations. We will encourage speakers to point out useful software packages, together with their caveats, asking them to focus on examples of ML interest.

* Raising awareness about the increasing importance of stability and predictable run-time behaviour of numerical machine learning algorithms and primitives.
* Establishing a code of conduct for how to best select and modify existing numerical mathematics code for machine learning problems.
* Learning about developments in current numerical mathematics, a major backbone of most machine learning methods.

Potential Subtopics

* Solving large linear systems
    o Arise in the linear model/Gaussian MRF (mean computations), nonlinear optimization methods (Newton-Raphson, IRLS, …)
    o Linear conjugate gradients
    o Preconditioning, use of model structure
* Numerical linear algebra packages relevant to ML
    o LAPACK, BLAS, GotoBLAS, MKL, UMFPACK, …
* Eigenvector approximation
    o Arise in the linear model (covariance estimation), spectral clustering and graph Laplacian methods, PCA
    o Lanczos algorithm and specialized variants
* Exploiting matrix/model structure, fast matrix-vector multiplication
    o Matrix decompositions/approximations
    o Multi-pole methods
    o FFT-based multiplication
* Matrix factorizations, low-rank updates
    o Arise in the linear model, Gaussian process/kernel methods
    o Cholesky updates/downdates
* Parallel numerical computation for ML
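
To make the "Cholesky updates" subtopic above concrete, here is a small sketch of a rank-one update: given the lower-triangular factor L of A = L Lᵀ, it computes the factor of A + x xᵀ in O(n²) operations instead of refactorizing in O(n³). It is only an illustration under stated assumptions. The function names and the 3x3 test matrix are hypothetical, and the code is plain Python with no pivoting or error handling, so production work would rely on a tested numerical library instead.

```python
import math


def cholesky(A):
    """Plain Cholesky factorization: returns lower-triangular L with A = L L^T."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def cholesky_rank_one_update(L, x):
    """Return the Cholesky factor of L L^T + x x^T without refactorizing.
    Works on copies, so the inputs are left unchanged."""
    n = len(x)
    L = [row[:] for row in L]
    x = list(x)
    for k in range(n):
        r = math.hypot(L[k][k], x[k])   # new diagonal entry
        c = r / L[k][k]
        s = x[k] / L[k][k]
        L[k][k] = r
        for i in range(k + 1, n):
            L[i][k] = (L[i][k] + s * x[i]) / c
            x[i] = c * x[i] - s * L[i][k]   # uses the already updated L[i][k]
    return L


if __name__ == "__main__":
    # Hypothetical symmetric positive definite example matrix.
    A = [[4.0, 2.0, 0.6],
         [2.0, 2.0, 0.5],
         [0.6, 0.5, 3.0]]
    x = [1.0, 0.5, -0.3]
    L_updated = cholesky_rank_one_update(cholesky(A), x)
    A_updated = [[A[i][j] + x[i] * x[j] for j in range(3)] for i in range(3)]
    L_reference = cholesky(A_updated)
    diff = max(abs(L_updated[i][j] - L_reference[i][j])
               for i in range(3) for j in range(3))
    print("max abs difference to full refactorization:", diff)
```

The update shown adds x xᵀ and is numerically benign. The corresponding downdate (subtracting x xᵀ) can lose positive definiteness and needs more care, which is one of the stability questions this workshop is concerned with.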