February 24th

14:30, R2014 Digiteo Shannon (660) (see location):

Madalina Drugan (Vrije Universiteit Brussel, Belgium)

Title: Multi-objective multi-armed bandits

Abstract:

The multi-objective multi-armed bandit (MOMAB) paradigm extends the
multi-armed bandit (MAB) setting from scalar rewards to reward vectors.
MOMAB differs from standard MAB in important ways, since several arms can
be optimal at once according to their reward tuples. Techniques from
multi-objective optimisation are used to design MOMAB algorithms with an
efficient exploration/exploitation trade-off for large, complex
multi-objective stochastic environments. Theoretical analysis is an
important aspect of MAB, which is a simplified theoretical framework for
reinforcement learning with a single state. We give an overview of MOMAB
algorithms, their analysis and the corresponding experimental methodology.
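
To illustrate how several arms can be optimal at once, here is a minimal sketch. It assumes Pareto dominance as the order relation on reward vectors, which is a common choice in multi-objective optimisation but is not specified in the abstract. The function names and the mean reward vectors are hypothetical and are not taken from the talk.

```python
from typing import List, Sequence


def dominates(u: Sequence[float], v: Sequence[float]) -> bool:
    """Return True if u Pareto-dominates v (assumed order relation).

    u dominates v when u is at least as good in every objective
    and strictly better in at least one.
    """
    at_least_as_good = all(a >= b for a, b in zip(u, v))
    strictly_better = any(a > b for a, b in zip(u, v))
    return at_least_as_good and strictly_better


def pareto_optimal_arms(means: List[Sequence[float]]) -> List[int]:
    """Return the indices of arms whose mean vector no other arm dominates."""
    return [
        i
        for i, u in enumerate(means)
        if not any(dominates(v, u) for j, v in enumerate(means) if j != i)
    ]


# Hypothetical mean reward vectors for four arms and two objectives.
arm_means = [(0.9, 0.1), (0.5, 0.5), (0.1, 0.9), (0.4, 0.4)]

print(pareto_optimal_arms(arm_means))  # → [0, 1, 2]
```

In this toy instance three arms are incomparable with each other and therefore all optimal, while the last arm is dominated by (0.5, 0.5). A bandit algorithm would have to estimate these mean vectors from noisy samples rather than read them directly.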

Contact: cyril.furtlehner à

Contributor(s) to this page: furtlehn.
Page last modified on Friday 06 March 2015 10:17:29 CET by furtlehn.