Introduction
Touch Clarity (www.touchclarity.com) provides realtime optimisation of websites. Touch Clarity chooses, from a number of options, the most popular content to display on a page. This decision is made by tracking how many visitors respond to each of the options, by clicking on them. This is a direct commercial application of the multiarmed bandit problem – each of the items which might be shown is a separate bandit, with a separate response rate.
As in the multiarmed bandit problem, there is a tradeoff between exploration and exploitation – it is necessary to sometimes serve items other than the most popular in order to measure their response rate with sufficient precision to correctly identify which is the most popular. However, in this application there is a further complication – typically the rates of response to each item will vary over time, so continuous exploration is necessary in order to track this time variation, as old knowledge becomes out of date. An extreme example of this might be in choosing which news story to serve as the main story on a news page – interest in one story will decrease over time while interest in another will increase. In addition, the interest in several stories might vary in a similar, coherent way – for example a general increase in interest in sports stories at weekends, or in political stories near to an election. So there are typically two types of variation to consider – where response rates vary together, and where response rates vary completely independently.
The Challenge
The challenge aims to stimulate research into the optimal way to tradeoff exploration vs exploitation where the response rate for each option is varying over time. Instead of a static data set, data for the challenge is provided in the form of a visitor behaviour simulation engine, written in java, and also made available as a Matlab function.
Entrants should write a program which will repeatedly choose from a range of possible display options, and query the visitor simulation engine through an interface e.g.
public Boolean selectOption(int option)The engine will return true or false, depending on whether the simulated visitor responded to the option that was served to them. The objective is to maximize the number of positive responses returned by the engine. The response rate for each option varies over time. The visitor simulation engine maintains time dependent continuous functions for the response rate of each of the options. When the engine is queried with a given option, it calculates the value of the function at that time. The engine returns true with a probability given by this value. The time variable used is the number of requests made to the engine, rather than real time. This is to remove dependencies on the machine hardware, and application performance of the entrants. In this application there is no fixed horizon, that is to say the number of trials is not fixed, and so solutions should not use the number of trials as a parameter.
To prevent the engine from repeating the same behaviour every time it is restarted, the response rate functions are parameterized. On restart, new values for these parameters are randomly generated based on a seed.
The challenge is broken into a number of individual tasks each relating to a particular problem that is frequently encountered in real website data. The engine addresses each of these problems in turn.
Tasks
There are now six tasks, the old tasks are now invalid. In each task there are 5 options. The best achievable response rate is approximately 0.18 for all tasks. All tasks are designed to display the appropriate behaviour over 1M requests. (Although during the judging phase more may be required.)
Task 1. SimulatedVisitorFrequentSwap
To check a candidate algorithm can respond to relatively frequent changes of the best option.
Task 2. SimulatedVisitorLongGaussians
To check a candidate algorithm can respond to changes of the best option over longer time periods.
Task 3. SimulatedVisitorWeeklyVariation
To check a candidate algorithm can handle coherent variation of response rates where there are two sinusoidal components to the variation of differing periods, the longer period being dominant. In addition, there is a gradual change in the relative ranking of the option. This is to ensure there is ongoing exploration throughout the run and that it can handle the consequent comparison of option response rates that have been served at different times. In addition, each option has a randomly chosen start time.
Task 4. SimulatedVisitorDailyVariation
To check a candidate algorithm can handle coherent variation of response rates where there are two sinusoidal components to the variation of differing periods, the shorter period being dominant.
Task 5. SimulatedVisitorWeeklyCloseVariation
To check a candidate algorithm can handle coherent variation where response rates are closely spaced.
Task 6. SimulatedVisitorConstant
We must ensure a candidate algorithm can handle the ideal case. An over exploratory algorithm can degrade to random performance under these conditions.
Organisers
Zakria Hussain, University College, London
Peter Auer, University of Leoben
Nicolò Cesa-Bianchi, Università degli Studi di Milano
Leonard Newnham, Touch Clarity
John Shawe-Taylor, University College, London
Submission of Results
Results can be submitted at any time via the submit tab, a set of random seeds is available here. Run your code using these seeds with each run of length 1M requests.
Judging
During the judging phase of the challenge, a further evaluation will take place for each algorithm. This is to ensure candidate algorithms can perform well over a variety of conditions. A second set of random seeds will be published along with a different run length. Each entrant will be re-evaluated against these seeds and run length. The winner of the challenge will be the entrant who obtains the lowest total regret on all six tasks. There is no longer the option of submitting for one task only.
Touch Clarity reserves the right to disqualify entrants if any of the following rules are broken:
- No algorithm should use the run length as an input parameter. In a real application optimisation does not terminate after a particular number of requests.
- Algorithms should NOT use information about the shape of the response functions obtained during the course of a run. The response function information available from the csv files is to assist in improving the performance of algorithms only. The sort of solution we are trying to avoid is “if resp rates constant over first 1000 responses then use USB otherwise use SomeOtherAlgorithm”.
Java code download
Java-code1.2.zip (version 1.2 released 21st May 2006)
There is an API written in Java which is distributed as a .jar file (.class executables only) and some baseline Java code that implements some simple multi-armed bandit problem algorithms. You will need at least Java version 1.5 to run the Java code. You can compile the baseline code like so:
javac -classpath challenge1.2.jar applicationtest/*.java applicationtest/simplewindowprovider/*.java
After this you can run ApplicationTest like so:
java -classpath .;challenge1.2.jar -Xmx256m applicationtest/ApplicationTest
OR
java -classpath :challenge1.2.jar -Xmx256m applicationtest/ApplicationTest
The Java code contains provider classes and several (bandit) algorithms written using the provider classes. ApplicationTest calls these algorithms and runs them on all 6 tasks. All participants will need to write their algorithm as a provider class. Please look at the provider classes in the sample Java code to understand how to do this. The code contains very simplistic algorithms for bandit problems when arms do not vary over time. However, please note that the challenge is for the case where the rewards for an arm varies over time.
Version 1.1 contains:
- Six new tasks
- detailed output every 1000 requests to csv file. This contains:
- the underlying response rates of the options in the task, and
- the proportion of serves each option is receiving
 
This allows the user to construct clear graphical views of how an algorithm is performing over time – see for example
- .
- An efficiency metric has been added to getStatistics(). This ranges from 0 (equivalent to random serving) to 1 (equalling the maximum theoretical success rate).
Version 1.2 contains
- UCBWindowProvider – a windowed version of UCB
- Improved regret summary
- improved commandline args:
- -h help
- -r number of requests
- -s random seed set
- -i produce results on interim seed set. The summary printed at the end of the test is sufficient for a submission
 
- better javadoc of applicationtest and challenge1.2.jar
Here is the javadoc1.2 for the challenge1.2.jar package.
Matlab code download
Matlab-code1.2.zip (version 1.2 released 21st May 2006)
The Matlab code is a wrapper for the Java API but does not contain any baseline code like that available for the Java download. The Matlab wrapper calls in the challenge1.2.jar executables and all its functions can be used as in Java. Please read the documentation (preamble) in the matlab file in order to set it up to run correctly.
Results of Phase 2 tasks
THE WINNING ALGORITHM WILL BE THE ONE THAT IS RANKED NUMBER 1 IN THE “TOTAL AVERAGE REGRET FOR TASKS 1-6” TABLE USING THE TEST SEEDS.
Total Average Regret for tasks 1-6
| Rank | Algorithm | Name | Institution | Total Average Regret | |
|---|---|---|---|---|---|
| Validation Seeds | Test Seeds | ||||
| 1 | MetaUCBFC | Nicolas Baskiotis, Cedric Hartland & Sylvain Gelly | Univ Paris XI Orsay – lri | 34599.203 | 34202.5 | 
| 2 | ProviderDiscountedUCB | Thomas Jaksch | University of Leoben | 35071.0 | 36073.797 | 
| 3 | ProviderExp4DWrapper | Thomas Jaksch | University of Leoben | 35120.0 | 36253.9 | 
| 4 | UCB1DiscountTuned | Flóra Adorján & György Ottucsák | Budapest University of Technology and Economics | 37066.2 | 37117.8 | 
| 5 | MetaUCB | Nicolas Baskiotis, Cedric Hartland & Sylvain Gelly | Univ Paris XI Orsay – lri | 35532.4 | 37282.203 | 
| 6 | UCB1DiscountTuned_C | Flóra Adorján & György Ottucsák | Budapest University of Technology and Economics | 36948.3 | 37744.7 | 
| 7 | UCB1DiscountTuned_B | Flóra Adorján & György Ottucsák | Budapest University of Technology and Economics | 37356.2 | 38194.6 | 
| 8 | TaoUcbForgetPascalChallenge | Nicolas Baskiotis, Cedric Hartland & Sylvain Gelly | Univ Paris XI Orsay – lri | 39605.4 | 43969.7 | 
| 9 | UCBWindowProvider | Baseline code | – | 180301.81 | 179246.31 | 
| 10 | ShiftingBanditProvider | Baseline code | – | 211875.2 | 211107.8 | 
| 11 | ProviderUCB | Baseline code | – | 222884.5 | 249174.6 | 
| 12 | RandomProvider | Baseline code | – | 322342.7 | 318739.6 | 
Task 1. SimulatedVisitorFrequentSwap
| Rank | Algorithm | Name | Institution | Validation Seeds | Test Seeds | ||
|---|---|---|---|---|---|---|---|
| Average Regret | STD | Average Regret | STD | ||||
| 1 | ProviderExp4DWrapper | Thomas Jaksch | University of Leoben | 9183.5 | 614.5174 | 9133.0 | 508.81235 | 
| 2 | MetaUCBFC | Nicolas Baskiotis, Cedric Hartland & Sylvain Gelly | Univ Paris XI Orsay – lri | 11052.9 | 1126.048 | 9906.5 | 780.7754 | 
| 3 | ProviderDiscountedUCB | Thomas Jaksch | University of Leoben | 10198.4 | 933.2443 | 10697.9 | 1380.2686 | 
| 4 | UCB1DiscountTuned_C | Flóra Adorján & György Ottucsák | Budapest University of Technology and Economics | 12059.8 | 884.5122 | 11983.7 | 618.63525 | 
| 5 | TaoUcbForgetPascalChallenge | Nicolas Baskiotis, Cedric Hartland & Sylvain Gelly | Univ Paris XI Orsay – lri | 12036.0 | 658.0817 | 12057.5 | 879.0806 | 
| 6 | UCB1DiscountTuned_B | Flóra Adorján & György Ottucsák | Budapest University of Technology and Economics | 11992.7 | 618.4238 | 12303.9 | 546.7468 | 
| 7 | UCB1DiscountTuned | Flóra Adorján & György Ottucsák | Budapest University of Technology and Economics | 12152.8 | 701.47064 | 12330.8 | 766.26135 | 
| 8 | MetaUCB | Nicolas Baskiotis, Cedric Hartland & Sylvain Gelly | Univ Paris XI Orsay – lri | 15298.1 | 1714.1592 | 15265.4 | 3004.444 | 
| 9 | ProviderUCB | Baseline code | – | 32832.0 | 942.6005 | 32392.5 | 828.2142 | 
| 10 | UCBWindowProvider | Baseline code | – | 36098.6 | 1702.405 | 35564.0 | 1964.2234 | 
| 11 | ShiftingBanditProvider | Baseline code | – | 40806.8 | 1312.9364 | 40947.7 | 1463.947 | 
| 12 | RandomProvider | Baseline code | – | 48835.7 | 460.32596 | 49054.1 | 503.07773 | 
Task 2. SimulatedVisitorLongGaussians
| Rank | Algorithm | Name | Institution | Validation Seeds | Test Seeds | ||
|---|---|---|---|---|---|---|---|
| Average Regret | STD | Average Regret | STD | ||||
| 1 | ProviderDiscountedUCB | Thomas Jaksch | University of Leoben | 3032.5 | 467.23615 | 3350.2 | 451.85638 | 
| 2 | UCB1DiscountTuned_C | Flóra Adorján & György Ottucsák | Budapest University of Technology and Economics | 3854.6 | 393.24185 | 4020.2 | 678.94574 | 
| 3 | UCB1DiscountTuned_B | Flóra Adorján & György Ottucsák | Budapest University of Technology and Economics | 3877.9 | 328.87567 | 4031.0 | 400.47943 | 
| 4 | UCB1DiscountTuned | Flóra Adorján & György Ottucsák | Budapest University of Technology and Economics | 3819.4 | 331.3753 | 4133.9 | 631.6608 | 
| 5 | MetaUCBFC | Nicolas Baskiotis, Cedric Hartland & Sylvain Gelly | Univ Paris XI Orsay – lri | 4550.2 | 1560.2434 | 4561.1 | 975.12006 | 
| 6 | ProviderExp4DWrapper | Thomas Jaksch | University of Leoben | 4767.9 | 389.79468 | 4622.5 | 283.90424 | 
| 7 | MetaUCB | Nicolas Baskiotis, Cedric Hartland & Sylvain Gelly | Univ Paris XI Orsay – lri | 4817.5 | 1670.4481 | 5636.5 | 1254.3213 | 
| 8 | TaoUcbForgetPascalChallenge | Nicolas Baskiotis, Cedric Hartland & Sylvain Gelly | Univ Paris XI Orsay – lri | 7352.9 | 2036.1892 | 8613.9 | 1643.2212 | 
| 9 | ShiftingBanditsProvider | Baseline code | – | 47240.3 | 3023.0012 | 46586.8 | 3428.574 | 
| 10 | ProviderUCB | Baseline code | – | 51547.9 | 10747.207 | 68992.3 | 29760.65 | 
| 11 | UCBWindowProvider | Baseline code | – | 64994.6 | 14387.951 | 69414.7 | 25192.545 | 
| 12 | RandomProvider | Baseline code | – | 113337.1 | 7096.978 | 109707.5 | 7496.9795 | 
Task 3. SimulatedVisitorWeeklyVariation
| Rank | Algorithm | Name | Institution | Validation Seeds | Test Seeds | ||
|---|---|---|---|---|---|---|---|
| Average Regret | STD | Average Regret | STD | ||||
| 1 | ProviderDiscountedUCB | Thomas Jaksch | University of Leoben | 3947.0 | 559.38635 | 3651.8 | 393.22647 | 
| 2 | UCB1DiscountTuned | Flóra Adorján & György Ottucsák | Budapest University of Technology and Economics | 4385.6 | 545.81683 | 4451.5 | 523.6664 | 
| 3 | UCB1DiscountTuned_B | Flóra Adorján & György Ottucsák | Budapest University of Technology and Economics | 4647.5 | 515.3112 | 4533.6 | 249.49202 | 
| 4 | MetaUCB | Nicolas Baskiotis, Cedric Hartland & Sylvain Gelly | Univ Paris XI Orsay – lri | 4534.5 | 723.832 | 4558.7 | 735.7625 | 
| 5 | UCB1DiscountTuned_C | Flóra Adorján & György Ottucsák | Budapest University of Technology and Economics | 4486.4 | 463.7919 | 4676.3 | 703.2815 | 
| 6 | MetaUCBFC | Nicolas Baskiotis, Cedric Hartland & Sylvain Gelly | Univ Paris XI Orsay – lri | 4764.3 | 556.171 | 5213.6 | 541.3687 | 
| 7 | ProviderExp4DWrapper | Thomas Jaksch | University of Leoben | 7511.4 | 1893.0953 | 7999.3 | 673.5912 | 
| 8 | TaoUcbForgetPascalChallenge | Nicolas Baskiotis, Cedric Hartland & Sylvain Gelly | Univ Paris XI Orsay – lri | 7084.7 | 950.5174 | 8206.8 | 787.7223 | 
| 9 | UCBWindowProvider | Baseline code | – | 29055.3 | 11124.4 | 21102.5 | 12444.306 | 
| 10 | ShiftingBanditsProvider | Baseline code | – | 40649.8 | 212.50716 | 40609.1 | 220.46388 | 
| 11 | RandomProvider | Baseline code | – | 57238.3 | 308.08838 | 57185.8 | 253.00189 | 
| 12 | ProviderUCB | Baseline code | – | 61824.0 | 2355.0454 | 63349.5 | 934.28564 | 
Task 4. SimulatedVisitorDailyVariation
| Rank | Algorithm | Name | Institution | Validation Seeds | Test Seeds | ||
|---|---|---|---|---|---|---|---|
| Average Regret | STD | Average Regret | STD | ||||
| 1 | UCB1DiscountTuned_B | Flóra Adorján & György Ottucsák | Budapest University of Technology and Economics | 4728.9 | 883.0477 | 5070.2 | 947.57245 | 
| 2 | UCB1DiscountTuned | Flóra Adorján & György Ottucsák | Budapest University of Technology and Economics | 5751.8 | 1262.6565 | 5097.0 | 1017.7952 | 
| 3 | ProviderExp4DWrapper | Thomas Jaksch | University of Leoben | 4884.9 | 492.63092 | 5140.2 | 365.1742 | 
| 4 | UCB1DiscountTuned_C | Flóra Adorján & György Ottucsák | Budapest University of Technology and Economics | 4751.9 | 1495.7024 | 5253.9 | 1112.1284 | 
| 5 | MetaUCBFC | Nicolas Baskiotis, Cedric Hartland & Sylvain Gelly | Univ Paris XI Orsay – lri | 5870.1 | 617.5396 | 6179.0 | 697.5103 | 
| 6 | MetaUCB | Nicolas Baskiotis, Cedric Hartland & Sylvain Gelly | Univ Paris XI Orsay – lri | 5657.9 | 712.6532 | 6188.6 | 1034.8442 | 
| 7 | TaoUcbForgetPascalChallenge | Nicolas Baskiotis, Cedric Hartland & Sylvain Gelly | Univ Paris XI Orsay – lri | 5946.6 | 786.82513 | 7897.1 | 4283.818 | 
| 8 | ProviderDiscountedUCB | Thomas Jaksch | University of Leoben | 7253.1 | 774.3874 | 8043.3 | 1215.4939 | 
| 9 | UCBWindowProvider | Baseline code | – | 30567.3 | 17000.268 | 34363.6 | 19650.63 | 
| 10 | ShiftingBanditsProvider | Baseline code | – | 42619.6 | 205.62112 | 42662.3 | 194.08247 | 
| 11 | RandomProvider | Baseline code | – | 57109.8 | 238.72942 | 57112.7 | 220.19237 | 
| 12 | ProviderUCB | Baseline code | – | 54405.2 | 9520.59 | 61281.9 | 6575.8525 | 
Task 5. SimulatedVisitorWeeklyCloseVariation
| Rank | Algorithm | Name | Institution | Validation Seeds | Test Seeds | ||
|---|---|---|---|---|---|---|---|
| Average Regret | STD | Average Regret | STD | ||||
| 1 | UCB1DiscountTuned | Flóra Adorján & György Ottucsák | Budapest University of Technology and Economics | 4947.8 | 375.51675 | 4981.1 | 498.89618 | 
| 2 | ProviderDiscountedUCB | Thomas Jaksch | University of Leoben | 5137.5 | 466.3166 | 5052.9 | 450.37573 | 
| 3 | MetaUCB | Nicolas Baskiotis, Cedric Hartland & Sylvain Gelly | Univ Paris XI Orsay – lri | 4656.4 | 555.3968 | 5088.1 | 739.60144 | 
| 4 | MetaUCBFC | Nicolas Baskiotis, Cedric Hartland & Sylvain Gelly | Univ Paris XI Orsay – lri | 5280.0 | 654.064 | 5162.6 | 756.14844 | 
| 5 | UCB1DiscountTuned_C | Flóra Adorján & György Ottucsák | Budapest University of Technology and Economics | 5440.7 | 326.4149 | 5243.8 | 421.37152 | 
| 6 | UCB1DiscountTuned_B | Flóra Adorján & György Ottucsák | Budapest University of Technology and Economics | 5584.6 | 286.54074 | 5634.7 | 401.08328 | 
| 7 | TaoUcbForgetPascalChallenge | Nicolas Baskiotis, Cedric Hartland & Sylvain Gelly | Univ Paris XI Orsay – lri | 6813.1 | 804.2061 | 6722.4 | 845.91504 | 
| 8 | ProviderExp4DWrapper | Thomas Jaksch | University of Leoben | 7119.8 | 1833.6073 | 7557.7 | 1736.7368 | 
| 9 | UCBWindowProvider | Baseline code | – | 15240.1 | 2971.088 | 14502.9 | 4624.025 | 
| 10 | ShiftingBanditsProvider | Baseline code | – | 21690.1 | 364.62173 | 21514.2 | 245.27754 | 
| 11 | ProviderUCB | Baseline code | – | 21909.6 | 2055.3625 | 22709.6 | 3313.902 | 
| 12 | RandomProvider | Baseline code | – | 25764.2 | 339.47043 | 25654.9 | 343.23346 | 
Task 6. SimulatedVisitorConstant
| Rank | Algorithm | Name | Institution | Validation Seeds | Test Seeds | ||
|---|---|---|---|---|---|---|---|
| Average Regret | STD | Average Regret | STD | ||||
| 1 | ProviderUCB | Baseline code | – | 365.8 | 74.90705 | 448.8 | 128.11522 | 
| 2 | TaoUcbForgetPascalChallenge | Nicolas Baskiotis, Cedric Hartland & Sylvain Gelly | Univ Paris XI Orsay – lri | 372.1 | 82.51525 | 472.0 | 142.9662 | 
| 3 | MetaUCB | Nicolas Baskiotis, Cedric Hartland & Sylvain Gelly | Univ Paris XI Orsay – lri | 568.0 | 313.2266 | 544.9 | 257.33353 | 
| 4 | ProviderExp4DWrapper | Thomas Jaksch | University of Leoben | 1652.5 | 228.85185 | 1801.2 | 176.14943 | 
| 5 | MetaUCBFC | Nicolas Baskiotis, Cedric Hartland & Sylvain Gelly | Univ Paris XI Orsay – lri | 3081.7 | 170.6159 | 3179.7 | 340.06668 | 
| 6 | UCBWindowProvider | Baseline code | – | 4345.9 | 250.39633 | 4298.6 | 278.8549 | 
| 7 | ProviderDiscountedUCB | Thomas Jaksch | University of Leoben | 5502.5 | 491.91965 | 5277.7 | 426.59714 | 
| 8 | UCB1DiscountTuned | Flóra Adorján & György Ottucsák | Budapest University of Technology and Economics | 6008.8 | 453.98135 | 6123.5 | 396.33356 | 
| 9 | UCB1DiscountTuned_C | Flóra Adorján & György Ottucsák | Budapest University of Technology and Economics | 6354.9 | 437.75983 | 6566.8 | 449.95624 | 
| 10 | UCB1DiscountTuned_B | Flóra Adorján & György Ottucsák | Budapest University of Technology and Economics | 6524.6 | 347.51184 | 6621.2 | 335.8319 | 
| 11 | ShiftingBanditProvider | Baseline code | – | 18868.6 | 104.250984 | 18787.7 | 135.26276 | 
| 12 | RandomProvider | Baseline code | – | 20057.6 | 104.72313 | 20024.6 | 152.13971 | 
Old Phase 1 tasks
These are the old tasks from phase 1 and are no longer used for phase 2.
Task 1
| Rank | Algorithm | Name | Institution | STD | Average Regret | 
|---|---|---|---|---|---|
| 1 | UCBg | Levente Kocsis | mta sztaki | 34.7 | 52.1 | 
| 2 | ProviderUCB (Baseline) | – | – | 24.28 | 65.6 | 
| 3 | Mean/variance regret | Craig Saunders | University of Southampton | 61.28 | 89.3 | 
| 4 | ShiftingBanditProvider (Baseline) | – | – | 21.58 | 141.2 | 
| 5 | RandomProvider (Baseline) | – | – | 25.42 | 151.4 | 
Task 2
| Rank | Algorithm | Name | Institution | STD | Average Regret | 
|---|---|---|---|---|---|
| 1 | UCBgper | Levente Kocsis | mta sztaki | 803.1 | 627.4 | 
| 2 | ShiftingBanditsProvider (Baseline) | – | – | 206.53 | 1839.3 | 
| 3 | Stupid Switcher | Craig Saunders | University of Southampton | 334.27 | 3669.3 | 
| 4 | Experts Prediction | Cedric Hartland | Laboratoire de Recherche en Informatique | 1283.36 | 6318.6 | 
| 5 | ProviderUCB (Baseline) | – | – | 4045.41 | 13270.7 | 
| 6 | UCBW | Levente Kocsis | mta sztaki | 2298.62 | 18430.1 | 
| 7 | UCBg | Levente Kocsis | mta sztaki | 8367.2 | 23552.7 | 
| 8 | RandomProvider (Baseline) | – | – | 3153.57 | 27434.2 | 
Task 3
| Rank | Algorithm | Name | Institution | STD | Average Regret | 
|---|---|---|---|---|---|
| 1 | Mean/variance regret | Craig Saunders | University of Southampton | 4.95 | 10.3 | 
| 2 | UCBS | Levente Kocsis | mta sztaki | 10.39 | 15.3 | 
| 3 | UCBg | Levente Kocsis | mta sztaki | 17.03 | 26.3 | 
| 4 | ProviderUCB (Baseline) | – | – | 11.85 | 80.7 | 
| 5 | ShiftingBanditsProvider (Baseline) | – | – | 40.24 | 765.2 | 
| 6 | RandomProvider (Baseline) | – | – | 41.14 | 892.5 | 
 
               
          