Postdoctoral Position in Machine Learning at INRIA Lille – Team SequeL

Applications are invited for a Postdoctoral position in the general area of “Sequential Decision-making in Online Marketing” at INRIA Lille – Team SequeL. The details of this call are given below.

Title: Sequential Decision-making in Online Marketing: Optimizing the Lifetime Value of Customers

Keywords: sequential decision-making, reinforcement learning, online marketing and advertising, exploration/exploitation dilemma, bandit algorithms, adaptive resource allocation, regret minimization, optimization

Research Program:

The candidate is expected to conduct research on both theoretical and applied aspects of “Sequential Decision-making in Online Marketing” and related problems (see the description below), collaborate with researchers and Ph.D. students at INRIA and elsewhere, and publish the results of her/his research in conferences and journals. The candidate will work with Mohammad Ghavamzadeh and other researchers at Team SequeL.

With the growth of online marketing, customers visit websites on a regular basis (sometimes daily in the case of banking, e-commerce, and media websites), and at each visit a stream of interactions occurs between the company (promotion, advertisement, email) and the customer (purchase, click on an ad, signing up for a newsletter). This creates many opportunities for a company to reach a customer and make decisions that optimize its objective function (revenue, customer satisfaction, etc.). Today, these marketing decisions are mostly made in a myopic way (mainly using contextual bandit algorithms), without taking into account the lifetime value of the customer. This myopic approach assumes that a decision made by the company does not affect the customer’s future interactions with the company. However, in many applications the sequential nature of the problem is significant (having a long-term relationship with customers is important for the company), and thus myopic decisions perform poorly in these problems. This creates an opportunity for reinforcement learning (RL) techniques to play a significant role in this emerging field.

The objective of this research program is to answer fundamental questions related to the use of myopic (contextual bandit algorithms) and non-myopic (sequential decision-making and RL algorithms) decision-making methods in the growing field of online marketing. These questions include:

– Feature selection and dealing with high-dimensional data: discovering the right representation for the problem at hand and dealing with the size and dimensionality of the data are among the most important questions in these applications. The size and dimensionality of the data create difficulties for standard sequential decision-making algorithms. This is closely related to another growing research direction: sequential decision-making with big data.

– Off-policy evaluation: how to evaluate a policy learned from a batch of historical data generated by a different policy (usually the company’s policy) with minimum interaction with the real-world environment. Running a strategy on the real system can be costly: it usually takes a long time to obtain a reasonable evaluation of its quality and, more importantly, there is the risk of a big loss if the strategy is not good.

– Discovering patterns in the sales funnel in order to find strategies to direct more customers through the funnel to the final sale.

– Dealing with non-stationarity (mainly caused by changes in customer preferences, the arrival and departure of customers, the evolution of webpage contents, etc.) and with delayed feedback (a significant delay between an action taken by the marketer and its effect on the customer) in online marketing applications.


The applicant must hold a Ph.D. degree (by the starting date of the postdoctoral position) in Computer Science, Statistics, or a related field, with a background in reinforcement learning, bandit algorithms, statistics, and optimization. Programming skills will be considered a plus. The working language of the group is English, so the candidate is expected to have good communication skills in English.

About INRIA and Team SequeL:

SequeL is one of the most dynamic teams at INRIA, with over 25 researchers and Ph.D. students working on several aspects of machine learning from theory to application, including statistical learning, reinforcement learning, and sequential decision-making. The SequeL team is involved in national and European research projects and collaborates with international research groups. This allows the postdoctoral candidate to collaborate with leading researchers in the field at top universities in Europe and North America, such as University College London (UCL), University of Alberta, and McGill University. Moreover, this project offers the possibility of close collaboration with an online marketing company in the US. Lille is the capital of the north of France, a metropolis with over one million inhabitants and excellent train connections to Brussels (30 min), Paris (1 h), and London (1 h 30).


– Duration: 16 months; starting date of the contract: November 1, 2013
– Salary: 2620.84 Euros gross per month
– Monthly salary after taxes: around 2138 Euros (medical insurance included)
– Possibility of French courses
– Help with housing
– Contribution toward transportation costs
– Scientific Resident card and help with a visa for the candidate's spouse

Application Submission:

The application should include a brief description of the applicant’s research interests and past experience, plus a CV that lists her/his degrees, GPAs, relevant publications, and the names and contact information of up to three references, along with any other relevant documents. Please send your application to The deadline for applications is April 15, but applicants are encouraged to submit as soon as possible.

This call has also been posted on

1) my webpage at

2) the INRIA website at: