
Reinforcement Learning Module 2017, Michele Sebag, Diviyan Kalainathan, Laurent Cetinsoy

Presentations

21 February 2018, Shannon Amphitheatre, Building 660
23 February 2018, Room 2014, 2nd floor, Shannon Building (660)


https://docs.google.com/spreadsheets/d/1eidQleMOdpmXbr3tGaKUdhM0avHUGwBQTtW-lh2QrL4/edit?usp=sharing



Last year's exam


Projects


To do: code, experiments, analysis, written report.

Copy-pasting existing programs from the Internet will have consequences, which may include a mark of 0.

Projects involve at most 3 students, except for the last two (Halite & Alesia: 4 students).
Projects are due on February 15th, 23:59 GMT+1.

Each group must produce:
  1. A report of about 2 pages (max. 3 pages excluding references), as TeX and .pdf files, including a description of the approach, the results, and a comparison with other algorithms/the state of the art (when possible), using the ICML 2017 format. Those unable to write TeX may produce a .doc(x) document together with its .pdf. (Description | ICML 2017 TeX package)
  2. The code of your implemented approach. The code should work "out of the box"; add a notice/README listing the required packages/libraries and any special notes. Submitting code taken from the Internet with little or no modification may lead to unwanted consequences.

You can discuss your project's problems/ideas and ask for more information at: diviyan (at) lri (dot) fr


The subjects are the following (in increasing order of difficulty):
  1. Mountain car problem (compare two approaches; see the sketch after this list)
  2. Inverted pendulum (compare two representations of the problem)
  3. The acrobot
  4. Octopus
  5. TD-Gammon
  6. Bicycle: keeping balance + moving forward
  7. Anti-imitation policy learning: reproduce an experiment from mainDIVA.pdf
  8. halite.io
  9. The Alesia game (see Approximate Dynamic Programming for Two-Player Zero-Sum Markov Games, ICML 2015): Alesia_game.zip
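
As a starting point for subject 1, here is a minimal sketch assuming the OpenAI Gym implementation of the environment (MountainCar-v0) and the classic Gym reset/step API; the random policy is only a placeholder baseline, to be replaced by the two approaches you compare.

import gym

# Minimal sketch, assuming the OpenAI Gym MountainCar-v0 environment and
# the classic Gym API (reset() -> obs; step() -> obs, reward, done, info).
env = gym.make('MountainCar-v0')

for episode in range(5):
    obs = env.reset()          # obs = (car position, car velocity)
    total_reward, done = 0.0, False
    while not done:
        action = env.action_space.sample()           # random placeholder policy
        obs, reward, done, info = env.step(action)   # reward is -1 per time step
        total_reward += reward
    print('episode %d: return %.0f' % (episode, total_reward))

env.close()

If you use Gym, the same skeleton applies to subjects 2 and 3 by swapping in e.g. 'CartPole-v0' or 'Acrobot-v1'; whether to use Gym at all is left to each group.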

Pointers

  1. Video: Richard Sutton (2016), Tutorial: Introduction to Reinforcement Learning with Function Approximation, https://www.microsoft.com/en-us/research/video/tutorial-introduction-to-reinforcement-learning-with-function-approximation/
  2. Some videos from the Boston Dynamics group

Evaluation

  1. Written exam, 12 February
  2. Graded lab sessions (TP)
  3. Projects


13 Nov.: Michele Sebag

20 Nov.: MS + DK

27 Nov.: DK

4 Dec.: no class

11 Dec.: MS

8 Jan.: MS + DK

15 Jan.

24 Jan.



Date to be fixed: paper presentations

  1. Neural Optimizer Search with Reinforcement Learning, ICML 2017
  2. Boosted Fitted Q-Iteration
  3. Constrained Policy Optimization
  4. Curiosity-driven Exploration by Self-supervised Prediction, Pauline Brunet and Quentin Bouchut
  5. The K-armed Dueling Bandits Problem, Zizhao Li and Xudong Zhang
  6. DARLA: Improving Zero-Shot Transfer in Reinforcement Learning
  7. Coordinated Multi-Agent Imitation Learning, Ghiles SIDI SAID and Amine BIAD
  8. Local Bayesian Optimization of Motor Skills
  9. Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning, Thomas Gauthier and Pereira Abou Rejaili Rodrigo
  10. Designing Neural Network Architectures using Reinforcement Learning, Mohamed Ali Dargouth and Walid Belrhalmia
  11. Robot gains Social Intelligence through Multimodal Deep Reinforcement Learning, Eden Belouadah
  12. Abstraction Selection in Model-based Reinforcement Learning
  13. Universal Value Function Approximators
  14. Deterministic Policy Gradient Algorithms
  15. Dynamic Programming Boosting for Discriminative Macro-Action Discovery

Deep RL: DQN, AlphaGo Zero, AlphaZero

  1. Playing Atari with Deep Reinforcement Learning, Vincent Boyer and Ludovic Kun
  2. [|], Zhengying Liu
  3. Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm, Thomas Foltete and Guillaume Collin
  4. Deep Reinforcement Learning from Self-Play in Imperfect-Information Games, Adrien Pavao and Eleonor Bartenlian



